Figure 1.
A multiple alignment of the GOLD domain was constructed using T-Coffee [32], with the sequences
realigned by parsing high-scoring pairs from PSI-BLAST search results. The PHD secondary-structure
prediction [15] is shown above the alignment, with E representing a β strand (upper-case
letters indicate predictions with > 82% accuracy, and lower-case letters denote predictions
with > 72% accuracy). A search with the lumenal region of Caenorhabditis elegans p24-family member K08E4.6 (region 20-191) recovers RALBP from Todarodes pacificus, the squid ortholog of SEC14L2 (E = 7 × 10⁻³, iteration 1). A reciprocal search with RALBP_Todarodes pacificus (region 189-343) recovers GCP60 (E = 2 × 10⁻¹⁰, iteration 1), FYCO1 (8 × 10⁻⁴, iteration 1), SPAC23H4.01c (10⁻³, iteration 2), KIAA0420 (10⁻³, iteration 2) and K08E4.6 (7 × 10⁻³, iteration 2). The 80% consensus shown below the alignment was derived using the
following amino-acid classes: h, hydrophobic (ALICVMYFW, yellow shading); l, the aliphatic
subset of the hydrophobic class (ALIVMC, yellow shading); a, aromatic (FHWY, yellow
shading); s, small (ACDGNPSTV, green letters); u, the tiny subset of the
small class (GAS, green shading); and p, polar (CDEHKNQRST, blue letters). Y denotes
a conserved tyrosine residue. The limits of the domains are indicated by the residue
positions on each side. The numbers within the alignment indicate poorly conserved
inserts that are not shown. The different families are shown on the right. A, p24
family; B, Osh3p family; C, CG9528 family (Sec14-like proteins with an amino-terminal
PRELI/MSF1p' domain); D, Sec14-like proteins; E, GCP60 family; and F, FYCO1. The sequences
are denoted by their gene name followed by the species abbreviation and GenBank Identifier.
Ce, Caenorhabditis elegans; Dm, Drosophila melanogaster; Hs, Homo sapiens; Sc, Saccharomyces cerevisiae; Sp, Schizosaccharomyces pombe; Top, Todarodes pacificus.
Anantharaman and Aravind Genome Biology 2002 3:research0023.1 doi:10.1186/gb-2002-3-5-research0023
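
The 80% consensus described in the caption can be illustrated with a minimal sketch. The class memberships are taken from the caption; everything else is an assumption made for illustration and is not stated in the source: the function names `column_consensus` and `consensus`, the rule that a single conserved residue is reported in upper case, the choice of the smallest qualifying class when several classes pass the threshold, the treatment of gaps as non-matching, and the use of `.` for unconserved columns. The toy alignment is invented and is not taken from the figure.

```python
# Amino-acid classes exactly as listed in the caption.
CLASSES = {
    "h": set("ALICVMYFW"),   # hydrophobic
    "l": set("ALIVMC"),      # aliphatic subset of hydrophobic
    "a": set("FHWY"),        # aromatic
    "s": set("ACDGNPSTV"),   # small
    "u": set("GAS"),         # tiny subset of small
    "p": set("CDEHKNQRST"),  # polar
}


def column_consensus(column, threshold_percent=80):
    """Return a consensus symbol for one alignment column.

    Assumptions (not from the source): a residue conserved on its own is
    reported in upper case, otherwise the smallest class reaching the
    threshold wins, gaps never match, and '.' marks no consensus.
    """
    residues = [r.upper() for r in column]
    n = len(residues)
    if n == 0:
        return "."

    def passes(count):
        # Integer comparison avoids floating-point edge cases at exactly 80%.
        return count * 100 >= threshold_percent * n

    # 1. A single residue conserved at or above the threshold.
    for res in sorted(set(residues)):
        if res.isalpha() and passes(residues.count(res)):
            return res

    # 2. Otherwise the most specific (smallest) class that qualifies.
    by_size = sorted(CLASSES.items(), key=lambda kv: (len(kv[1]), kv[0]))
    for symbol, members in by_size:
        if passes(sum(r in members for r in residues)):
            return symbol

    return "."


def consensus(alignment, threshold_percent=80):
    """Build the consensus line for equal-length aligned sequences."""
    return "".join(
        column_consensus(col, threshold_percent) for col in zip(*alignment)
    )


# Invented toy alignment, five sequences by four columns.
toy = ["YAGF", "YLSW", "YIAY", "YVGH", "YMSK"]
print(consensus(toy))  # prints "Ylua"
```

In the toy example the first column is an invariant tyrosine, the second is entirely aliphatic, the third is entirely tiny, and the fourth is aromatic in four of five sequences, which just reaches the 80% threshold.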