Figure 1.
Multiple sequence alignment of the 2OG-Fe(II) dioxygenase superfamily. Individual
protein families are separated by blank lines and a brief description of each family
is given to the right of the alignment. The numbers at the ends of the alignment indicate
the position of the first and last of the aligned residues in the respective protein
sequences. The consensus secondary structure is shown above the alignment in uppercase
letters. It was derived by taking those elements that are shared by the predicted
structures of individual families and the experimentally determined structures; H
indicates α helix and E indicates extended conformation (β strand). The lowercase
letters represent extensions of the secondary structure elements that are seen in
some, but not all, members of the superfamily. The conserved amino-terminal extensions
that are specific only to a given family are separated from the rest of the alignment
by vertical lines. The alignment columns are colored according to the 85% consensus
shown underneath the alignment, which includes the following categories of amino
acid residues: h, hydrophobic; l, aliphatic; a, aromatic (Y, F, W, H, L, I, V, M, A,
all shaded yellow); s, small (S, A, G, T, V, P, N, H, D, shaded blue); b, big (K,
R, E, Q, W, F, Y, L, M, I, shaded gray); +, positively charged (K, R, H; colored magenta).
The (predicted) catalytic residues are indicated by asterisks and with reverse red
shading. The proteins are designated by the protein/gene name, the species abbreviation
and the gene identification (GI) number. Protein abbreviations are: CAS, clavaminic
acid synthase; DAOCS, deacetoxycephalosporin C synthetase; EFE, ethylene-forming enzyme;
FLAS, flavonol synthase; Ga20Ox, gibberellin 20-oxidase; IPNS, isopenicillin N synthase;
LDOX, leucoanthocyanidin hydroxylase; Lep, leprecan; P4HA, prolyl-4-hydroxylase; PLO,
lysyl hydroxylase; SanF and SanC, enzymes involved in nikkomycin biosynthesis. The
remaining names are the standard names of the genes that encode the respective proteins.
Species abbreviations: At, Arabidopsis thaliana; Bb, Borrelia burgdorferi; Cc, Caulobacter crescentus; Ce, Caenorhabditis elegans; Ci, Ciona intestinalis; Dm, Drosophila melanogaster; Ec, Escherichia coli; Em, Emericella nidulans; Hs, Homo sapiens; Lc, Lysobacter lactamgenus; Le, Lycopersicon esculentum; Mtu, Mycobacterium tuberculosis; Nc, Neurospora crassa; Pa, Pseudomonas aeruginosa; Pet, Petunia hybrida; Rr, Rattus rattus; Sc, Saccharomyces cerevisiae; Sp, Schizosaccharomyces pombe; Sot, Solanum tuberosum; Scoe, Streptomyces coelicolor; Scan, Streptomyces ansochromogenes; Scla, Streptomyces clavuligerus; Ssp, Synechocystis; Vc, Vibrio cholerae; ASPV, apple stem pitting virus; ACLSV, apple chlorotic leaf spot virus; BSV, blueberry
scorch virus; GLV, garlic latent virus; GVA, grapevine virus A; PBCV, Paramecium bursaria chlorella virus; PMV, papaya mosaic virus; SHVX, shallot virus X.
Aravind and Koonin Genome Biology 2001 2:research0007.1 doi:10.1186/gb-2001-2-3-research0007
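
The 85% consensus line described in the caption can be illustrated with a minimal Python sketch. This is not the authors' procedure, which the caption does not specify. The function names, the treatment of gaps (counted as non-matching), the rule that a single dominant residue takes precedence over a class, and the order in which classes are tried are all assumptions made for illustration. Because the caption gives one combined residue set for the yellow-shaded categories (h, l, a), the sketch merges them under the symbol "h".

```python
from typing import Dict, FrozenSet, List

# Residue sets copied from the caption. The yellow-shaded categories
# (h, l, a) are listed there as one combined set, so they are merged
# under "h" here (an assumption of this sketch).
CLASSES: Dict[str, FrozenSet[str]] = {
    "+": frozenset("KRH"),         # positively charged
    "h": frozenset("YFWHLIVMA"),   # hydrophobic / aliphatic / aromatic
    "s": frozenset("SAGTVPNHD"),   # small
    "b": frozenset("KREQWFYLMI"),  # big
}


def column_consensus(column: str, threshold: float = 0.85) -> str:
    """Return a consensus symbol for one alignment column.

    Gaps and unknown characters count toward the column size but never
    match a residue or class (assumption).
    """
    n = len(column)
    if n == 0:
        return "."
    residues = column.upper()
    # A single residue that reaches the threshold is reported as itself.
    for aa in set(residues):
        if aa.isalpha() and residues.count(aa) / n >= threshold:
            return aa
    # Otherwise try the classes, smallest set first (assumed ordering).
    for symbol, members in sorted(CLASSES.items(), key=lambda kv: len(kv[1])):
        if sum(r in members for r in residues) / n >= threshold:
            return symbol
    return "."  # no consensus at this threshold


def consensus_line(alignment: List[str], threshold: float = 0.85) -> str:
    """Build the consensus string for equal-length aligned sequences."""
    return "".join(
        column_consensus("".join(col), threshold) for col in zip(*alignment)
    )


# Toy alignment (hypothetical sequences, not taken from the figure).
toy = ["KLSAK", "RIGAL", "HVTAS", "KLAAD"]
print(consensus_line(toy))  # +hsA.
```

With only four toy sequences, a column must be uniform in class to reach 85%; in a real alignment of many sequences, a column passes when no more than 15% of its residues fall outside the class.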