Genomic characterization of the Yersinia genus
* Corresponding author: Timothy D Read tread@emory.edu
- Equal contributors
1 Biological Defense Research Directorate, Naval Medical Research Center, 503 Robert Grant Avenue, Silver Spring, Maryland 20910, USA
2 University of Maryland Institute for Advanced Computer Sciences, Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland 20742, USA
3 Emerging Pathogens Institute and Department of Molecular Genetics and Microbiology, University of Florida College of Medicine, Gainesville, Florida 32610, USA
4 454 Life Sciences Inc., 15 Commercial Street, Branford, Connecticut 06405, USA
5 Department of Human Genetics, Emory University School of Medicine, 615 Michael Street, Atlanta, Georgia 30322, USA
6 Division of Infectious Diseases, Emory University School of Medicine, 615 Michael Street, Atlanta, Georgia 30322, USA
7 Current address: Computational and Mathematical Biology, Genome Institute of Singapore, Singapore-127726
Genome Biology 2010, 11:R1 doi:10.1186/gb-2010-11-1-r1
Published: 4 January 2010
Additional files
Additional file 1:
Statistics from running DIYA [89] and frameshift detection programs on the eight genomes sequenced in this study and various other enterobacterial genomes downloaded from NCBI.
Format: XLS Size: 39KB
Additional file 2:
Results of amosvalidate [37] analysis on the eight genomes of this study.
Format: DOC Size: 29KB
Additional file 3:
These consist of ISfinder [40], RepeatScout [41] and amosvalidate [37] results (GFF format); repeats found by RepeatScout in FASTA format; scaffold files (NCBI AGP format); and information about contig length, read count, estimated repeat number, count in scaffold and whether or not the contig was placed by SOMA [39].
Format: GZ Size: 278KB
Additional file 4:
Estimates for genome sizes (in Mbp) based on optical map data.
Format: DOC Size: 39KB
Additional file 5:
An E. coli strain with known plasmids was a positive control.
Format: DOC Size: 628KB
Additional file 7:
Y. pestis CO92 signatures longer than 100 bp computed by the Insignia [44] pipeline.
Format: TXT Size: 25KB
Additional file 8:
Sequences of the new genomes that match (that is, invalidate) the Y. pestis CO92 signatures listed in Additional file 7.
Format: TXT Size: 1KB
Additional file 9:
Y. enterocolitica signatures longer than 100 bp computed by the Insignia pipeline.
Format: TXT Size: 13KB
Additional file 10:
Sequences of the new genomes that match (that is, invalidate) the Y. enterocolitica signatures.
Format: TXT Size: 2KB
Additional file 11:
Y. pestis genome with the Insignia-identified repeats and the genome islands identified using IslandViewer [45] plotted. The figure was created using DNAPlotter [106].
Format: PNG Size: 67KB
Additional file 12:
Y. enterocolitica genome with the Insignia-identified repeats and the genome islands identified using IslandViewer [45] plotted. The figure was created using DNAPlotter [106].
Format: PNG Size: 71KB
Additional file 13:
The eight genomes sequenced in this study are represented as pseudocontigs, ordered by a combination of optical mapping and alignment to the closest completed reference genome.
Format: JPEG Size: 2.2MB
Additional file 14:
Whole genome multiple alignment produced by MAUVE of the 11 Yersinia genomes in XMFA format [106].
Format: ZIP Size: 15.1MB
Additional file 15:
The top-level directory consists of a directory called Additional_cluster_files and 5,010 directories, one for each multi-protein cluster family. This top-level directory has been split into three data files for uploading purposes (Additional files 15, 16 and 17).
Within the directory are the following files:
PGL1_unique_Yersinia_unclustered.out - list of all protein singletons that MCL did not group into a cluster (see Materials and Methods)
PGL1_Yersinia_unique_locus_tags.txt - names of the 11 locus tag prefixes used for each genome
PGL1_unique_Yersinia.gff - mapping of each Yersinia protein to a cluster in tab-delimited GFF
PGL1_unique_Yersinia.sigfile - list of the longest protein in each cluster
PGL1_unique_Yersinia.summary - summary table of features of each of the clusters
PGL1_unique_Yersinia.table - summary table of each protein in the clusters
Within each cluster directory are the following files, where 'x' is the cluster name:
PGL1_unique_Yersinia-x.faa - multifasta file of the proteins in the cluster
PGL1_unique_Yersinia-x.summary - summary of the properties of the proteins
PGL1_unique_Yersinia-x.matches - BLAST matches between the proteins of the cluster
PGL1_unique_Yersinia-x.muscle.fasta - MUSCLE alignment of the proteins
PGL1_unique_Yersinia-x.muscle.fasta.gblo - Gblocks output of the MUSCLE alignment (that is, auto-trimmed alignment)
PGL1_unique_Yersinia-x.muscle.fasta.gblo.htm - as above in HTML format
PGL1_unique_Yersinia-x.muscle.tree - treefile from the MUSCLE alignment
PGL1_unique_Yersinia-x.sif - matches between proteins in simple interaction format for display on graphing software
Format: ZIP Size: 18.2MB
Additional file 16:
One of the three parts (Additional files 15, 16 and 17) into which the top-level cluster directory was split for uploading purposes. The directory layout and file types are as described under Additional file 15.
Format: ZIP Size: 14.8MB
Additional file 17:
One of the three parts (Additional files 15, 16 and 17) into which the top-level cluster directory was split for uploading purposes. The directory layout and file types are as described under Additional file 15.
Format: ZIP Size: 10.4MB
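As a rough illustration of how the per-cluster files described for Additional files 15 to 17 might be read, the following Python sketch counts the proteins in each cluster's multifasta file. It assumes the three archives have been unpacked into one local directory and that each cluster directory holds a file matching PGL1_unique_Yersinia-x.faa as listed above. The function names read_fasta and cluster_sizes are illustrative and are not part of the published data.

```python
from pathlib import Path
from typing import Dict, Iterator, Tuple


def read_fasta(path: Path) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a multi-FASTA file."""
    header = None
    chunks = []
    with path.open() as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new record starts; emit the previous one first.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def cluster_sizes(top_dir: Path) -> Dict[str, int]:
    """Map each cluster directory name to the protein count in its .faa file.

    Assumption: every cluster directory sits directly under top_dir and
    contains one file named PGL1_unique_Yersinia-<cluster>.faa.
    """
    sizes = {}
    for faa in sorted(top_dir.glob("*/PGL1_unique_Yersinia-*.faa")):
        sizes[faa.parent.name] = sum(1 for _ in read_fasta(faa))
    return sizes
```

Calling cluster_sizes(Path("clusters")), where "clusters" is a hypothetical unpack location, would return a dictionary keyed by cluster directory name. Only the standard library is used, so the sketch does not depend on any particular bioinformatics package.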
Additional file 18:
Complete protein sets for the 11 species of Yersinia.
Format: ZIP Size: 13.5MB
Additional file 19:
To evaluate node support, a majority rule-consensus tree of 1,000 bootstrap replicates was computed. E. coli was used as an outgroup species.
Format: PDF Size: 96KB
Additional file 20:
To evaluate node support, a majority rule-consensus tree of 1,000 bootstrap replicates was computed. E. coli was used as an outgroup species.
Format: PDF Size: 95KB
Additional file 21:
A curve showing how the size of this set declines as more non-pathogen genomes are added is also included.
Format: DOC Size: 144KB
Additional file 22:
Phylogeny of TTSS component YscN in Yersinia and other enterobacteria species.
Format: DOC Size: 247KB
Additional file 23:
Putative antibiotic resistance genes in the Yersinia genus determined using the Antibiotic Resistance Genes Database [45].
Format: XLS Size: 23KB
Additional file 24:
Calculations for the estimation of Π from aligned Yersinia core genomes.
Format: DOC Size: 23KB
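The calculation itself is in Additional file 24 and is not reproduced here. As a hedged sketch only, nucleotide diversity is conventionally defined as the mean proportion of differing sites over all pairs of aligned sequences, and the Python function below implements that textbook definition. Skipping, within each pair, any column that contains a gap or an ambiguous base is an assumption of this sketch, and the function name is hypothetical; it is not claimed to be the authors' procedure.

```python
from itertools import combinations
from typing import Sequence


def nucleotide_diversity(alignment: Sequence[str]) -> float:
    """Mean pairwise proportion of differing sites for aligned sequences."""
    if len(alignment) < 2:
        raise ValueError("need at least two aligned sequences")
    length = len(alignment[0])
    if length == 0 or any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must be non-empty and of equal length")

    valid = set("ACGT")
    total = 0.0
    pairs = 0
    for a, b in combinations(alignment, 2):
        compared = 0
        diffs = 0
        for x, y in zip(a.upper(), b.upper()):
            # Assumption: columns with gaps or ambiguity codes are
            # ignored for this pair only (pairwise deletion).
            if x in valid and y in valid:
                compared += 1
                if x != y:
                    diffs += 1
        if compared:
            total += diffs / compared
            pairs += 1
    if pairs == 0:
        raise ValueError("no comparable sites in any sequence pair")
    return total / pairs
```

For three aligned sequences "ACGT", "ACGA" and "ACGT", the pairwise proportions are 1/4, 0 and 1/4, so the function returns their mean, 1/6.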
