|
Figure 1.
Schematic of the COMIT algorithm for identifying unusually conserved motifs in coding
regions. The example illustrates how the score would be calculated for the motif ACAAAG,
using genome-wide coding sequence alignments for two species. Each instance of the
motif is identified in species 1, and the observed conservation (that is, whether
all bases are identical between the two species) is calculated. The expected conservation
at each instance is modeled from genome-wide frequencies of nucleotide-level conservation
patterns conditional on the aligned amino acids. For each instance, the expected conservation
is calculated from all possible ways in which the motif could be conserved at that
location given the amino acids in each species, using values from Table 1 (typically
some of these quantities, such as (H, Y)_111, will be zero). The observed and expected
conservation levels are compared and normalized to yield a conservation z-score for each motif.
Kural et al. Genome Biology 2009 10:R133 doi:10.1186/gb-2009-10-11-r133 |
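
The comparison described in the caption can be sketched in a few lines of Python. This is a minimal illustration, not the published COMIT implementation. It assumes an ungapped pairwise alignment, so the same index refers to the same position in both sequences. It also assumes that the per-instance expected conservation probability is supplied by a hypothetical callable `expected_prob`, which stands in for the amino-acid-conditioned lookup from Table 1. The normalization shown, which treats instances as independent Bernoulli trials, is likewise an assumption; the caption states only that observed and expected levels are "compared and normalized".

```python
import math


def find_instances(seq1, motif):
    """Return start indices of every (possibly overlapping) motif occurrence in seq1."""
    starts = []
    pos = seq1.find(motif)
    while pos != -1:
        starts.append(pos)
        pos = seq1.find(motif, pos + 1)
    return starts


def conservation_z_score(seq1, seq2, motif, expected_prob):
    """Compare observed and expected motif conservation between two aligned sequences.

    expected_prob is a hypothetical callable mapping an instance start index to
    the modeled probability that the instance is fully conserved.
    Returns None when the variance is zero (z-score undefined).
    """
    k = len(motif)
    starts = find_instances(seq1, motif)

    # Observed: an instance counts as conserved only if every base matches in species 2.
    observed = sum(1 for s in starts if seq2[s:s + k] == motif)

    # Expected: sum of per-instance conservation probabilities.
    probs = [expected_prob(s) for s in starts]
    expected = sum(probs)

    # Assumed normalization: variance of a sum of independent Bernoulli trials.
    variance = sum(p * (1.0 - p) for p in probs)
    if variance == 0:
        return None
    return (observed - expected) / math.sqrt(variance)


# Toy alignment: two instances of ACAAAG in species 1, one conserved in species 2.
species1 = "ACAAAGTTACAAAGCC"
species2 = "ACAAAGTTACAGAGCC"
z = conservation_z_score(species1, species2, "ACAAAG", lambda start: 0.2)
```

In the toy alignment, one of two instances is conserved while the assumed model expects 0.4, so the z-score is positive. A genome-wide analysis would accumulate the same sums over all coding alignments before normalizing.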