## Figure 4.
Evidence for the 'buffering' of deleterious TFBS variation by neighboring homotypic
motifs in Drosophila.
(a) Distributions of average motif load per 100 kb window along Drosophila chromosome 2R and chromosome X (yellow; see Figure S5 in Additional file 1 for other chromosomes). Recombination rate distributions along the chromosomes (dashed
lines) are from [22] (and are near-identical to an earlier analysis [43]); note that there is no apparent correlation between these two parameters. Regions
of high average motif load marked with asterisks are further examined in (b). Average
motif load is computed excluding a single maximum value to reduce the impact of outliers.
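The windowed average described for panel (a) can be sketched as follows. This is a minimal illustration, not the authors' code: the `(position_bp, load)` input layout, the function names, and the choice to return `None` for windows with fewer than two motifs are all assumptions made for the example.

```python
from collections import defaultdict

WINDOW = 100_000  # 100 kb window size, as stated for panel (a)


def mean_excluding_max(loads):
    """Mean of `loads` after dropping one occurrence of the maximum.

    Assumption: with fewer than two values there is nothing left to
    average once the maximum is removed, so None is returned.
    """
    if len(loads) < 2:
        return None
    return (sum(loads) - max(loads)) / (len(loads) - 1)


def windowed_load(motifs, window=WINDOW):
    """Average load per window for one chromosome.

    `motifs` is an iterable of (position_bp, load) pairs; this layout is
    hypothetical. Returns {window_start_bp: average load or None}.
    """
    bins = defaultdict(list)
    for pos, load in motifs:
        bins[pos // window].append(load)
    return {idx * window: mean_excluding_max(vals)
            for idx, vals in sorted(bins.items())}
```

Dropping a single maximum keeps one extreme motif from dominating a window while leaving windows with several high-load motifs visibly elevated.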
(b) Examples of motif arrangement at regions that fall within 100 kb windows having high
average motif load (L > 5e-3). Motifs with no detected deleterious variation (L = 0)
are colored grey, and those with non-zero load pink (low load) to red (high load).
Asterisks refer to similarly labeled peaks from (a). Note that most high-load motifs
found in these regions have additional motifs for the same TF in their proximity.
(c) Distributions of average load across ranges of phylogenetic conservation for motifs
with a single match within a bound region ('singletons', blue) versus those found
in pairs ('duplets', red). For equivalent comparison, a random motif out of the duplet
was chosen for each bound region and the process was repeated 100 times. Results are
shown for the four TFs for which appreciable differences between 'singletons' and
'duplets' were detected. Phylogenetic conservation is expressed in terms of branch
length score (BLS) ranges, similarly to Figure 2b. The P-value is from a permutation test for the sum of average load differences for each
range between 'singleton' and 'duplet' motifs. Average load was computed excluding
a single maximum value.
(d) Relationship between the average load per TF and the average number of motifs per
bound region. Average load was computed excluding a single maximum value; r is Pearson's
correlation coefficient and the P-value is from the correlation test.
(e) The difference in motif score between motif pairs mapping to the same bound regions:
the one with the highest load versus one with a zero load ('constant'; left) or in
random pairs (right). These results suggest that the major alleles of motifs with
a high load are generally not 'weaker' than their non-varying neighbors (the P-value is from the Wilcoxon test).
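The permutation test mentioned for panel (c) can be illustrated with a short sketch. It assumes the test statistic is the sum, over BLS ranges, of the difference in average load (excluding a single maximum) between 'singleton' and 'duplet' motifs, and that labels are shuffled within each range. The null model, the one-sided direction, and all names here are assumptions for illustration; the repeated random choice of one motif per duplet is not reproduced.

```python
import random


def trimmed_mean(loads):
    """Mean after dropping one occurrence of the maximum (needs >= 2 values)."""
    return (sum(loads) - max(loads)) / (len(loads) - 1)


def summed_difference(groups):
    """Sum over conservation ranges of (singleton mean - duplet mean)."""
    return sum(trimmed_mean(s) - trimmed_mean(d) for s, d in groups)


def permutation_p_value(groups, n_perm=2000, seed=0):
    """Return (observed statistic, one-sided permutation P-value).

    `groups` is a list of (singleton_loads, duplet_loads) pairs, one pair
    per BLS range. Labels are shuffled within each range, which is an
    assumption about the null model.
    """
    rng = random.Random(seed)
    observed = summed_difference(groups)
    hits = 0
    for _ in range(n_perm):
        shuffled = []
        for singles, duplets in groups:
            pooled = list(singles) + list(duplets)
            rng.shuffle(pooled)
            # Keep the original group sizes after shuffling the labels.
            shuffled.append((pooled[:len(singles)], pooled[len(singles):]))
        if summed_difference(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate from being exactly zero.
    return observed, (hits + 1) / (n_perm + 1)
```

Shuffling within each range preserves any overall dependence of load on conservation, so only the singleton-versus-duplet labeling is tested.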
Spivakov |