Figure 3.

Motif mutational load of Drosophila and human TFBSs located within different genomic contexts. (a) Examples of mutational load values for individual instances of four human TFs (ranging from high to very low), showing different combinations of the two parameters that are combined in this metric: the reduction of the PWM match score at the minor allele ('ΔPWM score') and the number of genotypes with the mutation in the population (minor allele frequency (MAF)). (b) Relationship between phylogenetic conservation and motif mutational load for D. melanogaster (left) and human (right) TFs included in this study. Conservation is expressed as per-instance branch length scores (BLSs) computed against the phylogenetic tree of 12 Drosophila species. The average load for D. melanogaster-specific sites (BLS = 0) is shown separately, as these sites have an exceptionally high motif load. (c) Relationship between motif stringency and motif load in Drosophila (left) and humans (right). Motif stringency is expressed as scaled ranked PWM scores grouped into five incremental ranges of equal size (left to right), with the average motif load shown for each range. (d) Relationship between distance from the transcription start site (TSS) and motif load in Drosophila (left) and humans (right) for all analyzed TFs excluding CTCF (top) and for CTCF alone (bottom), with the average motif load shown for each distance range. (b-d) Average motif load is computed excluding a single maximum value to reduce the impact of outliers. The P-values are from permutation tests, in which permutations are performed separately for each TF and combined into a single statistic as described in Materials and methods.
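
The averaging rule used in panels b-d (dropping a single maximum value before taking the mean) can be illustrated with a minimal sketch. The function name `mean_excluding_max` and the example values are hypothetical and are not taken from the paper; the sketch assumes that when the maximum is tied, only one occurrence is dropped, and it does not reproduce how the motif load values themselves are calculated.

```python
def mean_excluding_max(loads):
    """Average of `loads` after dropping one occurrence of the largest value.

    Hypothetical helper: illustrates the 'exclude a single maximum value'
    rule stated in the caption, not the authors' actual code.
    """
    values = list(loads)
    if len(values) < 2:
        raise ValueError("need at least two values to drop the maximum")
    values.remove(max(values))  # removes only the first occurrence of the maximum
    return sum(values) / len(values)


# Made-up load values for one bin; 10.0 plays the role of an outlier.
example_loads = [0.0, 0.5, 1.0, 10.0]
trimmed = mean_excluding_max(example_loads)
plain = sum(example_loads) / len(example_loads)
```

With these made-up values the plain mean is dominated by the single outlier, whereas the trimmed mean reflects the remaining three values.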

Spivakov et al. Genome Biology 2012 13:R49   doi:10.1186/gb-2012-13-9-r49