Figure 2.

Graphical summary of IP enrichment. In addition to summary statements, CHANCE produces graphical visualizations of IP strength by separating background regions from ChIP-enriched regions. For a complete discussion of the statistical interpretation of these plots, see [1]. Briefly, points on the x-axis correspond to percentages of the genome, and points on the y-axis correspond to percentages of the total number of reads. The point at which the distance between the IP and Input percentages is maximized is denoted by the green line; the greater the separation between IP and Input at this point, the better the IP enrichment. The shapes of the two curves also provide useful information about the data. (a) The IP curve for H3K4me3 in human embryonic stem cells (HESCs; GEO GSM727572) stays near 0 until the x-axis value reaches 0.6, indicating that 60% of the genome did not have sufficient coverage in the IP channel. CHANCE detects this insufficient sequencing depth and indicates the percentage of uncovered genome by a black line. (b) For H3K4me3 in mouse neural stem cells (NSCs), CHANCE indicates amplification bias with a turquoise line, showing that over 60% of the reads map to a small percentage of the genome. (c) The same sample as in (b) is shown after de-duplication. CHANCE does not detect any amplification bias after de-duplication. (d) This panel exemplifies a weak IP (CARM1 in HESCs; GEO GSM801064), where the IP and Input curves are not well separated.
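The construction of such curves can be illustrated with a minimal sketch. It is not CHANCE's implementation; see [1] for the actual method. The sketch assumes the genome has been divided into equal-sized bins with a read count per bin, that each sample's bins are sorted independently by ascending count, and that the y-axis is the cumulative fraction of reads. The function names and toy counts are hypothetical.

```python
import numpy as np

def cumulative_curve(bin_counts):
    """Return (genome_fraction, read_fraction) with bins sorted by ascending count.

    Assumption: equal-sized bins and at least one read in total.
    """
    counts = np.sort(np.asarray(bin_counts, dtype=float))
    read_fraction = np.cumsum(counts) / counts.sum()
    genome_fraction = np.arange(1, counts.size + 1) / counts.size
    return genome_fraction, read_fraction

def max_separation(ip_counts, input_counts):
    """Locate the genome fraction where Input exceeds IP by the largest margin.

    Assumption: both samples use the same number of bins.
    """
    x, ip_curve = cumulative_curve(ip_counts)
    _, input_curve = cumulative_curve(input_counts)
    gap = input_curve - ip_curve
    k = int(np.argmax(gap))
    return float(x[k]), float(gap[k])

def uncovered_fraction(bin_counts):
    """Fraction of bins with no reads, i.e. where the curve stays at 0."""
    return float(np.mean(np.asarray(bin_counts) == 0))

# Hypothetical toy data: 10 bins, with the IP reads concentrated in a few bins
# and the Input reads spread uniformly.
ip_counts = [0, 0, 0, 0, 0, 0, 1, 1, 8, 10]
input_counts = [2] * 10

x_star, gap = max_separation(ip_counts, input_counts)
uncovered = uncovered_fraction(ip_counts)
print(x_star, gap, uncovered)
```

In this toy example, 60% of the IP bins are empty, so the IP curve stays at 0 until the x-axis value reaches 0.6, mirroring the pattern described for panel (a). The largest gap between the Input and IP curves corresponds to the position marked by the green line in the figure.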

