Figure 3.

Bis-SNP error frequencies in detecting SNPs on the Illumina 1 M SNP array. Receiver operating characteristic (ROC) curves are shown for Bis-SNP accuracy at detecting SNPs in Bisulfite-seq data derived from human colonic mucosa tissue. The 'true' genotypes were determined using an Illumina Duo 1 M Human SNP array, and Bis-SNP results were evaluated only at these million genomic positions. All datasets were from [6]. The three ROC curves at the top (a-c) show accuracy at positions corresponding to 435,120 homozygous cytosines on the 1 M SNP array. By randomly downsampling from the average 32× read depth of the Bisulfite-seq data, we are able to show results corresponding to 8× coverage (a), 16× coverage (b), and the full 32× coverage (c). Bis-SNP under three different conditions is compared to Bismark and the method used in 'Berman2012' [6], both of which restrict their results to reference cytosines. For 'Berman2012', we varied the number of reverse-strand G reads required in order to plot a range of stringencies. The three plots at the bottom (d-f) show accuracy at the 303,656 positions that are heterozygous according to the 1 M SNP array. For comparison, we show results from the k-allele method (similar to the approach of [30]), Shoemaker2010 [20], and bisReadMapper [3].
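
The two operations behind these panels, random downsampling of reads and tracing an ROC curve over a range of stringencies, can be illustrated with a minimal Python sketch. It is not the evaluation code used for the figure: the function names `downsample_reads` and `roc_points`, the per-position confidence scores, and the toy inputs are all hypothetical, and the array genotypes are simply assumed to be available as a list of booleans.

```python
import random


def downsample_reads(reads, target_depth, full_depth, seed=0):
    """Keep each read independently with probability target_depth / full_depth.

    Assumption: per-read random thinning is used here for illustration;
    for example, 8 / 32 = 0.25 approximates 8x coverage from 32x data.
    """
    keep_prob = target_depth / full_depth
    rng = random.Random(seed)
    return [read for read in reads if rng.random() < keep_prob]


def roc_points(scores, truth, thresholds):
    """Return one (false positive rate, true positive rate) pair per threshold.

    scores: hypothetical per-position confidence that a SNP is present
    truth:  True where the SNP array reports a SNP, False otherwise
    """
    positives = sum(truth)
    negatives = len(truth) - positives
    points = []
    for threshold in thresholds:
        # A position is called a SNP when its score reaches the threshold.
        tp = sum(1 for s, y in zip(scores, truth) if s >= threshold and y)
        fp = sum(1 for s, y in zip(scores, truth) if s >= threshold and not y)
        points.append((fp / negatives, tp / positives))
    return points


if __name__ == "__main__":
    toy_reads = list(range(3200))  # stand-ins for aligned reads
    print(len(downsample_reads(toy_reads, target_depth=8, full_depth=32)))

    toy_scores = [0.9, 0.8, 0.3, 0.1]
    toy_truth = [True, False, True, False]
    print(roc_points(toy_scores, toy_truth, thresholds=[0.0, 0.5, 1.0]))
```

Sweeping the threshold plays the same role here as varying the number of reverse-strand G reads required for 'Berman2012': each stringency setting contributes one point to the curve.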

Liu et al. Genome Biology 2012 13:R61 doi:10.1186/gb-2012-13-7-r61
