Open Access Highly Accessed Research

Strong functional patterns in the evolution of eukaryotic genomes revealed by the reconstruction of ancestral protein domain repertoires

Christian M Zmasek and Adam Godzik*

Author Affiliations

Program in Bioinformatics and Systems Biology, Sanford-Burnham Medical Research Institute, 10901 North Torrey Pines Road, La Jolla, CA 92037, USA

Genome Biology 2011, 12:R4  doi:10.1186/gb-2011-12-1-r4

Published: 17 January 2011

Additional files

Additional file 1:

Table of genomes analyzed.

Format: XLS Size: 125KB

Open Data

Additional file 2:

Table of Pfam domain counts in extant species. Summary of conditions used: protein predictions as listed in Additional file 1, domain models from Pfam 24.0, analyzed with HMMER 3.0b2, Pfam 'gathering' cutoffs.

Format: XLS Size: 33KB

Additional file 3:

Domain gain and loss counts during eukaryote evolution. Inferred domainome sizes are shown in blue, domain gain counts in green, and domain loss counts in red. Numbers in brackets are average domainome sizes of all extant descendants of each node. Summary of conditions used: protein predictions as listed in Additional file 1, domain models from Pfam 24.0, analyzed with HMMER 3.0b2, Pfam 'gathering' cutoffs.

Format: PDF Size: 50KB

Additional file 4:

Domain gains and losses during eukaryote evolution. A phyloXML-formatted file [70], used to create Figure 2 and Additional file 3 and viewable with the Archaeopteryx software [71]. Summary of conditions used: protein predictions as listed in Additional file 1, domain models from Pfam 24.0, analyzed with HMMER 3.0b2, Pfam 'gathering' cutoffs.

Format: ZIP Size: 3.6MB

Additional file 5:

Domain gains and corresponding GO terms during eukaryote evolution. Summary of conditions used: protein predictions as listed in Additional file 1, model of eukaryote evolution as shown in Figure 2 (and in more detail in Additional files 3 and 4), domain models from Pfam 24.0, analyzed with HMMER 3.0b2, Pfam 'gathering' cutoffs, 'pfam2go' mappings dated 2009/10/01. GO namespaces are abbreviated as follows: B, biological process; C, cellular component; M, molecular function.

Format: ZIP Size: 54KB

Additional file 6:

Domain losses and corresponding GO terms during eukaryote evolution. Summary of conditions used: protein predictions as listed in Additional file 1, model of eukaryote evolution as shown in Figure 2 (and in more detail in Additional files 3 and 4), domain models from Pfam 24.0, analyzed with HMMER 3.0b2, Pfam 'gathering' cutoffs, 'pfam2go' mappings dated 2009/10/01. GO namespaces are abbreviated as follows: B, biological process; C, cellular component; M, molecular function.

Format: ZIP Size: 707KB

Additional file 7:

Domain gain and loss counts during eukaryote evolution under a coelomata model. Summary of conditions used: protein predictions as listed in Additional file 1, domain models from Pfam 24.0, analyzed with HMMER 3.0b2, Pfam 'gathering' cutoffs.

Format: PDF Size: 91KB

Additional file 8:

Domain gain and loss counts during eukaryote evolution under a 'crown group' model. Summary of conditions used: protein predictions as listed in Additional file 1, domain models from Pfam 24.0, analyzed with HMMER 3.0b2, Pfam 'gathering' cutoffs.

Format: PDF Size: 137KB

Additional file 9:

Table of enriched gained and lost GO terms under a coelomata model. The two terms with the lowest P-values are shown. Summary of conditions used: protein predictions as listed in Additional file 1, domain models from Pfam 24.0, analyzed with HMMER 3.0b2, Pfam 'gathering' cutoffs, model of eukaryote evolution as shown in Additional file 7, 'pfam2go' mappings dated 2009/10/01, Ontologizer 2.0 with Topology-Elim algorithm.

Format: PDF Size: 39KB

Additional file 10:

Table of enriched gained and lost GO terms under a 'crown group' model. The two terms with the lowest P-values are shown. Summary of conditions used: protein predictions as listed in Additional file 1, domain models from Pfam 24.0, analyzed with HMMER 3.0b2, Pfam 'gathering' cutoffs, model of eukaryote evolution as shown in Additional file 8, 'pfam2go' mappings dated 2009/10/01, Ontologizer 2.0 with Topology-Elim algorithm.

Format: PDF Size: 41KB

Additional file 11:

Functional analysis of the human domainome complemented with intestinal bacteria. Summary of conditions used: protein predictions as listed in Additional file 1, model of eukaryote evolution as shown in Figure 2 (and in more detail in Additional files 3 and 4), domain models from Pfam 24.0, analyzed with HMMER 3.0b2, Pfam 'gathering' cutoffs, 'pfam2go' mappings dated 2009/10/01.

Format: PDF Size: 127KB

Additional file 12:

Domain counts for a variety of cutoff values.

Format: PDF Size: 171KB

Additional file 13:

Domain gains and losses during eukaryote evolution for an E-value cutoff of 10⁻⁸. Summary of conditions used: protein predictions as listed in Additional file 1, domain models from Pfam 24.0, analyzed with HMMER 3.0b2.

Format: PDF Size: 96KB

Additional file 14:

Comparison of enriched gained and lost GO terms along the path from Unikonta to Mammalia using different calculation methods and different approaches for multiple testing correction. The two terms with the lowest P-values (calculated by the Ontologizer 2.0 software [63]) are shown; terms marked by an asterisk are exceptions, shown because of their relevance to this work. Prototypical regulatory terms are in red, prototypical metabolic terms are in blue.

Format: PDF Size: 99KB