Additional data file 1.

Superfamily and family assignments for each of the sequences and structures from this work, as well as the corresponding Pfam, SCOP, and SUPERFAMILY assignments. Pfam, SCOP, and SUPERFAMILY assignments are listed as names.

¹ National Center for Biotechnology Information GI number. Additional data file 3 contains the FASTA-format sequences corresponding to each GI number.

² Protein Data Bank identifier.

³ The gold and silver standard o-succinylbenzoate synthase (OSBS) families contain a more diverse set of enzymes than many other families listed in the table. All of the OSBS enzymes are believed to catalyze the same overall reaction via the same catalytic residues, and there is no convincing evidence to suggest convergent evolution from within the superfamily, so we believe that these enzymes meet our definition of a family. They appear, however, to utilize a different constellation of substrate-binding residues, and certain subclusters within the family catalyze the promiscuous N-acyl amino acid racemase reaction in addition to the OSBS reaction. Because the sequences that comprise this family are highly divergent, the family may pose special difficulties for automated clustering methods. Additional families that may be especially challenging include the extradiol dioxygenase families within the VOC superfamily, where a relatively high degree of sequence similarity and catalytic promiscuity make accurate clustering difficult.

⁴ Evidence code for gold and silver standard family assignment [59].

⁵ ID number for the literature reference upon which the gold/silver family assignment was based. When a sequence has been assigned to both a gold and a silver standard family, this reference applies to both family classifications. When it has been assigned only to a silver standard family, this reference applies to the silver standard family classification. The full reference may be obtained by cross-referencing the ID number with Additional data file 4.

Format: XLS. Size: 881 KB.

Brown et al. Genome Biology 2006 7:R8   doi:10.1186/gb-2006-7-1-r8