|
Figure 1.
Schematic representation of the phylogenetic pipeline used to reconstruct the human
phylome. Each protein sequence encoded in the human genome is compared against a database
of proteins from 39 fully sequenced eukaryotic genomes (Table 1) to select putative
homologous proteins. Groups of homologous sequences are aligned and subsequently trimmed
to remove gap-rich regions. The refined alignment is used to build a neighbor-joining (NJ) tree, which
is then used as a seed tree for a maximum likelihood (ML) analysis as implemented in
PhyML, using four different evolutionary models (five in the case of mitochondrially
encoded proteins). The ML tree with the highest likelihood is further refined with
a Bayesian analysis using MrBayes. Finally, different algorithms are used to search
for specific topologies in the phylome or to define orthology and paralogy relationships.
Huerta-Cepas et al. Genome Biology 2007 8:R109 doi:10.1186/gb-2007-8-6-r109 |
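
Two steps of the pipeline described in the caption, trimming gap-rich alignment columns and keeping the ML tree with the highest likelihood, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the 0.5 gap-fraction threshold, and the model labels are hypothetical and chosen only for illustration, and the log-likelihood values are made-up inputs standing in for what a program such as PhyML would report.

```python
def trim_gap_rich_columns(alignment, max_gap_fraction=0.5):
    """Drop alignment columns whose gap fraction exceeds max_gap_fraction.

    `alignment` is a list of equal-length aligned sequences using '-' for gaps.
    The 0.5 default is an illustrative assumption, not the published setting.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    n_seqs = len(alignment)
    # Keep only the columns where gaps are not too frequent.
    keep = [
        col
        for col in range(length)
        if sum(seq[col] == "-" for seq in alignment) / n_seqs <= max_gap_fraction
    ]
    return ["".join(seq[col] for col in keep) for seq in alignment]


def pick_best_model(log_likelihoods):
    """Return the model whose ML tree has the highest log-likelihood.

    `log_likelihoods` maps a model label to the log-likelihood of its tree.
    """
    if not log_likelihoods:
        raise ValueError("no models were evaluated")
    return max(log_likelihoods, key=log_likelihoods.get)


if __name__ == "__main__":
    toy_alignment = ["MK-LV", "MR-LV", "MKALV", "M--LV"]
    print(trim_gap_rich_columns(toy_alignment))

    # Hypothetical model labels and made-up log-likelihoods.
    toy_scores = {"model_A": -1234.5, "model_B": -1230.1, "model_C": -1241.0}
    print(pick_best_model(toy_scores))
```

In the toy alignment, only the third column is mostly gaps, so it is the only one removed. Log-likelihoods are negative, so the highest value is the one closest to zero.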