|
Figure 1.
The general mechanism of inference for each of the four methods used by the Proteome Navigator.
(a) The gene neighbor (GN) method identifies protein pairs encoded in close proximity across multiple genomes. In this example, genes A and B are gene neighbors, whereas A and C are not.
(b) The Rosetta Stone (RS) method searches for gene fusion events. Proteins A and B are expressed as separate proteins in one organism, but a second organism contains a sequence that represents the fusion of the two. The fusion protein is termed the Rosetta Stone protein because it allows us to infer that A and B are functionally linked.
(c) The construction of phylogenetic profiles (PP) begins with four sequenced genomes, from which the protein sequences have been predicted. The sequence of protein A from E. coli is compared with the proteins encoded by the other genomes to identify homologs. If a genome contains a homolog of A, a 1 is placed in the corresponding position of the phylogenetic profile; otherwise a 0 is placed. Genes with similar phylogenetic profiles are likely to participate in the same pathway.
(d) The gene cluster (GC) or operon method identifies closely spaced genes and assigns a probability P of observing a particular gap distance (or smaller), as judged by the collective set of inter-gene distances.
Bowers et al. Genome Biology 2004 5:R35 |
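Panels (c) and (d) can be illustrated with a minimal Python sketch. It is not the Proteome Navigator implementation. The genome names other than E. coli, the homolog sets, the gap values, and all function names are invented for illustration. Two further assumptions: profile similarity is measured here with a simple Hamming distance, and P is read as the empirical fraction of inter-gene distances that are no larger than the observed gap. The caption does not specify either choice.

```python
from bisect import bisect_right


def phylogenetic_profile(homolog_genomes, genomes):
    """Return one 1/0 flag per genome: 1 if a homolog was found there, else 0."""
    return tuple(1 if g in homolog_genomes else 0 for g in genomes)


def profile_distance(p, q):
    """Hamming distance: number of genomes in which the two profiles differ."""
    return sum(a != b for a, b in zip(p, q))


def gap_probability(gap, all_gaps):
    """Empirical fraction of inter-gene distances that are <= gap (assumed reading of P)."""
    ordered = sorted(all_gaps)
    return bisect_right(ordered, gap) / len(ordered)


# Hypothetical toy data: four genomes, and the genomes in which a homolog
# of each protein was detected (homolog detection itself is not shown).
genomes = ["E. coli", "genome 2", "genome 3", "genome 4"]
homologs = {
    "A": {"E. coli", "genome 2", "genome 4"},
    "B": {"E. coli", "genome 2", "genome 4"},
    "C": {"E. coli", "genome 3"},
}
profiles = {name: phylogenetic_profile(found, genomes)
            for name, found in homologs.items()}

# Hypothetical inter-gene distances (in base pairs) collected across a genome.
all_gaps = [5, 20, 150, 400, 1000]

print(profiles["A"], profiles["B"], profiles["C"])
print(profile_distance(profiles["A"], profiles["B"]))
print(profile_distance(profiles["A"], profiles["C"]))
print(gap_probability(20, all_gaps))
```

In this toy data, A and B have identical profiles and would be candidates for functional linkage, whereas C differs from A in three of the four genomes. A small value from `gap_probability` would indicate an unusually short gap relative to the collected distances.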