Figure 4.

Primate CNV hotspots overlap functional regions. (a) CNVs were classified as intergenic, intronic, or exonic, based on whether any part of them overlapped with genes. Promoter regions were defined as the 2-kb region immediately upstream of the transcription start site of a gene. (b) Enrichment analysis for the HCR CNVs that overlap with exons in the UCSC gene track. To assess significance independently of the size distribution, we generated 1,000 intervals that mimic the size distribution of the HCR CNVs and plotted the distribution of their genic overlap. The dotted red line indicates the observed value, whereas the red bins show the expected distribution (P < 0.001, Kolmogorov-Smirnov test). (c) The K values for all genes, genes that overlap with human CNVs, and genes that overlap with HCR CNVs. We plot here the density of K values, which are a measure of positive selection [18]. The genes that overlap with HCR CNVs have significantly lower K values (P < 0.001, Kolmogorov-Smirnov test), indicating that they are more likely to evolve under positive selection. (d) A cumulative fraction plot of conservation for primate hotspot and non-hotspot CNVs. Conservation scores were obtained from the phastCons 17-species multiz track of the UCSC Genome Browser [31]. The D and P-values were calculated using a Kolmogorov-Smirnov test. Combined, these data indicate that most of the genes overlapping HCR CNVs are evolving under lineage-specific positive selection.
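
The size-matched randomization described for panel (b) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it uses a single toy coordinate axis of length `genome_length`, assumes exons are merged, sorted, non-overlapping half-open intervals, reads the "1,000 intervals" as 1,000 randomized size-matched sets, and reports an empirical permutation P-value in place of the Kolmogorov-Smirnov test given in the caption. All function and parameter names are hypothetical.

```python
import random
from bisect import bisect_left


def count_overlapping(intervals, exon_starts, exon_ends):
    """Count intervals overlapping at least one exon (half-open coordinates).

    Assumes exons are sorted and non-overlapping, so exon_ends is sorted too.
    """
    hits = 0
    for start, end in intervals:
        # exon_starts[:i] are all the exons that begin before `end`
        i = bisect_left(exon_starts, end)
        # the last such exon has the largest end; overlap if it ends after `start`
        if i > 0 and exon_ends[i - 1] > start:
            hits += 1
    return hits


def size_matched_null(cnvs, genome_length, exon_starts, exon_ends,
                      n_iter=1000, seed=0):
    """Null distribution of exon-overlap counts for randomly placed,
    size-matched interval sets (one set per iteration)."""
    rng = random.Random(seed)
    sizes = [end - start for start, end in cnvs]
    null_counts = []
    for _ in range(n_iter):
        shuffled = []
        for size in sizes:
            # place each interval uniformly so it fits inside the toy genome
            start = rng.randrange(0, genome_length - size + 1)
            shuffled.append((start, start + size))
        null_counts.append(count_overlapping(shuffled, exon_starts, exon_ends))
    return null_counts


def empirical_p(observed, null_counts):
    """One-sided empirical P-value with the usual +1 correction."""
    exceed = sum(1 for c in null_counts if c >= observed)
    return (exceed + 1) / (len(null_counts) + 1)
```

In use, `observed = count_overlapping(cnvs, exon_starts, exon_ends)` would correspond to the dotted line and `size_matched_null(...)` to the histogram of expected values; a real analysis would additionally handle multiple chromosomes and assembly gaps.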

Gokcumen et al. Genome Biology 2011 12:R52   doi:10.1186/gb-2011-12-5-r52
