Figure 5.
Indels are clustered throughout the genome. (A) A representative multikilobase span, where ‘X’ indicates an indel and dashes signify
non-polymorphic repeat sequences. (B) The number of ‘–’ characters between each pair of consecutive indels (‘X’) was counted across the genome
and compiled into a histogram (purple). The gray curve is the exponential distribution expected
from the observed indel probability, assuming random dispersion of indels.
Inset: the analogous plot for ‘dense’ regions identified by the hidden Markov model
(HMM). (C) (i) Schematic of the HMM used to distinguish indel-dense from indel-sparse regions.
(ii) Fractional share of total indels (left) and number of bases in the genome (right)
present in ‘dense’ (blue) and ‘sparse’ (red) regions. (D) Relative enrichment of three different sequence features between ‘dense’ and ‘sparse’
regions. Error bars indicate ±S.E.M. across regions, propagated through division.
(E) The indel concentration, measured as indels-per-repeat sequence, in 7.5 kb windows
centered at replication origins was calculated as a function of replication-origin
offset (that is, 0 kb is the native origin location). Step size is 1 kb, and the average
value across three adjacent windows is plotted. (F) The total number of repeat sequences present in non-overlapping 1 kb windows centered
at replication origins.
Muzzey et al. Genome Biology 2013 14:R97 doi:10.1186/gb-2013-14-9-r97
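
The gap counting described for panel (B) can be sketched roughly as follows. This is an illustrative sketch, not the authors' code. The function names `gap_lengths` and `expected_gap_counts` are hypothetical, and the track is assumed to be a plain string containing only 'X' (indel) and '-' (non-polymorphic repeat sequence). The sketch also assumes that the "observed indel probability" is the fraction of track positions marked 'X', and it uses the geometric distribution as the discrete counterpart of the exponential curve named in the caption.

```python
from collections import Counter


def gap_lengths(track: str) -> list[int]:
    """Return the number of characters between each pair of consecutive 'X' marks.

    Assumes the track holds only 'X' and '-', so every character between
    two 'X' marks is a non-polymorphic repeat sequence.
    """
    positions = [i for i, ch in enumerate(track) if ch == "X"]
    return [later - earlier - 1 for earlier, later in zip(positions, positions[1:])]


def expected_gap_counts(track: str, max_gap: int) -> list[float]:
    """Expected number of gaps of length 0..max_gap if indels were randomly dispersed.

    Assumption: each position is independently an indel with probability
    p = (number of 'X') / (track length), which gives geometric gap lengths,
    P(gap = k) = p * (1 - p) ** k.
    """
    n_indels = track.count("X")
    if n_indels == 0:
        return [0.0] * (max_gap + 1)
    p = n_indels / len(track)
    n_gaps = n_indels - 1  # one gap per pair of consecutive indels
    return [n_gaps * p * (1 - p) ** k for k in range(max_gap + 1)]


# Toy example (made-up track, not data from the paper).
toy_track = "X--X-XX---X"
observed = Counter(gap_lengths(toy_track))
expected = expected_gap_counts(toy_track, max_gap=max(observed))
for gap in range(max(observed) + 1):
    print(gap, observed.get(gap, 0), round(expected[gap], 3))
```

If indels are clustered, short gaps should be over-represented in the observed counts relative to the expected ones, which is the comparison the purple histogram and gray curve make.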