|
Figure 1.
Evolution and sequence features of Atg8 genes. (a) Atg8 genes in fungi, animals and intermediate-branching species. A schematic tree,
based on a sequence-derived phylogenetic tree, showing the three animal Atg8 subfamilies
and their presence in key lineages. The subfamilies appear only in animals. Branching
between animals and fungi are Atg8 proteins from the few known unicellular species
that diverged after the emergence of fungi and before the emergence of multicellular
animals. Shown here are Atg8 proteins from the choanoflagellate Monosiga brevicollis (Mbe) [6], the ichthyosporean Sphaeroforma arctica (Sar) and the amoeba Capsaspora owczarzaki (Cow) [111]. The scheme is based on a tree calculated from protein multiple alignment of Atg8
proteins from representative species with complete and almost complete genomic data.
The alignment included 117 conserved amino acid positions. The tree was calculated
using the PhyML program version 2.4.4, with 100 bootstrap replicates, four substitution
rate categories, the HKY nucleotide substitution model and program-estimated Ts/Tv
ratios, gamma shape parameters and invariant proportions as previously described [112]. The subfamily clusters are supported by bootstrap values ranging from 37/100 to
95/100 and also appeared with significant bootstrap values in other trees similarly
calculated with different sets of Atg8 genes. The representative species for this
scheme were: human, Danio rerio, Xenopus tropicalis, Branchiostoma floridae, Ciona savignyi, Oikopleura
dioica, Strongylocentrotus purpuratus, Aplysia californica, Schistosoma mansoni, Schmidtea
mediterranea, Drosophila melanogaster, Caenorhabditis elegans, Capitella teleta, Nematostella
vectensis, Trichoplax adhaerens, Amphimedon queenslandica, Monosiga brevicollis, Sphaeroforma
arctica, Capsaspora owczarzaki, Saccharomyces cerevisiae, Schizosaccharomyces pombe,
Allomyces macrogynus, and Tuber melanosporum. (b) Atg8 subfamily sequence features. Sequence logos [113] show the conservation (overall height) and residue prevalence of multiple alignment
positions. The alignment includes the core conserved sequence regions, only excluding
short non-conserved distal regions of some sequences. The subfamilies are numbered
by the coordinates of the human GATE-16, GABARAP and LC3 proteins. Plus signs indicate
similar positions between alignments of the GATE-16 and GABARAP subfamilies and between
the GABARAP and LC3 subfamilies. Each of the three subfamilies is very well conserved
across its entire length (apart from the few amino-terminal residues in LC3). The
three subfamilies are also very similar to each other at most positions. The
few positions that are conserved within each subfamily but differ between subfamilies
may account for some of the functional differences between them. The alignments
and logos were constructed as previously described [112], taking into account sequence redundancy and expected amino acid frequencies. Sequences
for the alignments were taken from the CDD [114] and PFAM [115] database entries cd01611 and PF02991, respectively, and sequences similar to ones
in these entries, from protein sequences and translated genomic and EST sequences
found in public sequence databases.
Shpilka et al. Genome Biology 2011 12:226 doi:10.1186/gb-2011-12-7-226 |
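
As an illustration of how the logo heights in panel (b) relate to conservation, the following minimal Python sketch computes the information content of each alignment column and splits it among the residues by their frequency. It is a simplified, textbook-style calculation and not the procedure used for the figure: it assumes a uniform background over 20 amino acids and applies no sequence weighting, whereas the caption states that sequence redundancy and expected amino acid frequencies were taken into account. The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=20):
    """Information content (bits) of one alignment column, ignoring gaps.

    Assumes a uniform background distribution (a simplification).
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy


def logo_heights(alignment):
    """For each column, map residue -> letter height (bits).

    The letter heights in a column sum to that column's information content,
    so well-conserved positions produce taller stacks.
    """
    heights = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        total = len(residues)
        info = column_information(column)
        counts = Counter(residues)
        heights.append({r: info * n / total for r, n in counts.items()})
    return heights


# Hypothetical toy alignment (equal-length rows); not data from the figure.
toy_alignment = ["FKYL", "FKYI", "FRYL", "FKFV"]

for position, letters in enumerate(logo_heights(toy_alignment), start=1):
    print(position, {r: round(h, 2) for r, h in letters.items()})
```

In this sketch a fully conserved column reaches the maximum of log2(20) bits, and a column with several different residues yields a shorter stack in which the most frequent residue is the tallest letter.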