Figure 4.

SNP calls for a high-coverage 1000 Genomes Project individual. (a) Concordance rate (Matching/(Matching + Discordant) SNP calls) across a range of stringency thresholds (MAP through P < 10⁻¹⁶⁰) between Sniper SNP calls and SNP calls from the March 2010 release of the 1000 Genomes Project Consortium for human individual NA19240, chromosomes 20 to 22. Solid lines depict concordance rates using all SNP calls; dashed lines depict concordance rates using the subset of Sniper SNPs covered exclusively by unique alignments. (b) Estimates of read depth, stringency, variant allele frequency, alignments per read, and proportion of non-unique SNPs averaged over the concordant, Sniper-specific, and 1000 Genomes Project-specific (1 KG) SNP calls (using the Q ≥ 40 cutoff for Sniper). Significant comparisons resulting from a two-sample unequal variance t-test are indicated. ***P < 10⁻¹⁰. No test was performed for the lower right panel due to sample sizes of 1. (c) Comparison of Sniper SNPs across mapping strategies, showing total loci assayed and SNP calls at Q ≥ 13 and Q ≥ 40 stringency cutoffs. See Figure 1b for complete descriptions of mapping strategies. (d) Comparison of SNPs across SNP calling algorithms (Sniper, SAMtools/Maq, GATK) given an identical read map generated with Bowtie using the best no-guess (BESTNO) strategy. SNPs were filtered using Q ≥ 13 (Sniper and SAMtools) and recommended settings for GATK (DP > 100 || MQ0 > 40 || SB > -0.1). (e) Isolation of high-confidence false positive (FP; left) and true positive (TP; right) SNPs by intersecting 1 KG-specific and Sniper-specific SNPs with calls from GATK to control for read map differences between this study and the 1 KG study [1]. ALL, unique and multiply mapped reads; BEST, best guess; BESTNO, best no-guess mapping; FDR, false discovery rate; FP, false positive; MAP, maximum a posteriori; TP, true positive; UNI, strictly unique mapping.

Simola and Kim Genome Biology 2011 12:R55   doi:10.1186/gb-2011-12-6-r55
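
The concordance rate in panel (a) is defined as Matching/(Matching + Discordant) SNP calls. A minimal sketch of that ratio follows. The function name and the choice to return NaN when no calls are compared are illustrative assumptions, not part of the original analysis, and how calls are classified as matching or discordant is left to the caller.

```python
import math


def concordance_rate(matching: int, discordant: int) -> float:
    """Return Matching / (Matching + Discordant), as defined in the caption.

    Hypothetical helper: the counts are assumed to be tallied upstream
    at a single stringency threshold.
    """
    if matching < 0 or discordant < 0:
        raise ValueError("counts must be non-negative")
    total = matching + discordant
    # Assumption: with no compared calls the rate is undefined, so return NaN.
    if total == 0:
        return math.nan
    return matching / total
```

Calling the function once per stringency threshold, with the counts obtained at that threshold, would yield one point on a curve like those in panel (a).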
