|
Figure 1.
A compendium of insertions and deletions in primates and flies.
(a) Gaps and their boundaries. The bar charts represent the total number of insertions
(Ins) and deletions (Del) on each lineage, resolved using the known phylogenetic relations
between the species. Gaps with ambiguous boundaries or flanking gapless matches of
less than 20 bp were filtered out since they represented either superimposed events
or alignment problems. D. Sec, D. sechellia; D. Sim, D. simulans.
(b) Filtering questionable indels. Shown are the numbers of non-repetitive primate indels
that were retained or filtered as questionable based on direct genomic searches (see
Materials and methods).
(c) Insertion and deletion length distributions. Shown are the distributions of insertion
and deletion lengths in our primate and fly compendia. Graphs are drawn on a log scale.
A non-geometric trend is apparent in the primate and fly insertion data and in the
primate deletion data (a single exponential fit is rejected with P < 10^-150, KS goodness-of-fit test).
The peaks at, for example, primate insertions of length 4 and
8 are not understood but are statistically significant. The fluctuation around the trend
for longer events reflects smaller sample sizes.
Tanay and Siggia Genome Biology 2008 9:R37 doi:10.1186/gb-2008-9-2-r37 |