Table 4

Comparison between SNP predictions made by the three base callers and the 'CG-HQ' high-quality set of variant calls from Complete Genomics.

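| Base caller | Match | Δ% | Mismatch | Extra | Complex | Total | FDR |
|---|---|---|---|---|---|---|---|
| AYB | 339908 | +6.65 | 299 | 36524 | 13251 | 389982 | 8.79 × 10⁻⁴ |
| Bustard | 318725 | | 284 | 32165 | 12476 | 363650 | 8.90 × 10⁻⁴ |
| Ibis | 319907 | +0.37 | 342 | 42865 | 14210 | 377324 | 10.68 × 10⁻⁴ |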


SNP predictions for sites that are SNPs in CG-HQ either 'match' or 'mismatch' depending on whether they agree; for AYB and Ibis matches we also record the percent improvement over Bustard (Δ%). 'Extra' SNP predictions are those for sites that are not present in the CG-HQ data. Predicted SNPs at sites for which the CG-HQ results indicate complex variations from GRCh37, mainly insertions and deletions, are counted as 'complex'. The false discovery rate (FDR) is calculated across matches and mismatches only, since the high level of congruence across base callers in the 'extra' category suggests many of these predicted SNPs may be correct (see text for details). Only variant calls for the autosomal, X or mitochondrial chromosomes are considered.
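
The derived columns can be recomputed from the raw counts. The following is a minimal sketch, not the authors' code: it assumes that Total is the sum of the four count columns, that FDR is mismatch / (match + mismatch), and that Δ% is the relative change in matches with respect to Bustard. These readings are inferred from the note above and happen to reproduce the tabulated values; the names `counts`, `total`, `fdr` and `delta_percent` are hypothetical.

```python
# Counts transcribed from Table 4: (match, mismatch, extra, complex).
counts = {
    "AYB": (339908, 299, 36524, 13251),
    "Bustard": (318725, 284, 32165, 12476),
    "Ibis": (319907, 342, 42865, 14210),
}


def total(caller):
    """Sum of all four categories (assumed definition of 'Total')."""
    return sum(counts[caller])


def fdr(caller):
    """Mismatches as a fraction of matches plus mismatches (assumed FDR formula)."""
    match, mismatch, _extra, _complex = counts[caller]
    return mismatch / (match + mismatch)


def delta_percent(caller, baseline="Bustard"):
    """Percent change in matches relative to the baseline caller (assumed Δ% formula)."""
    match = counts[caller][0]
    base = counts[baseline][0]
    return 100.0 * (match - base) / base


for name in counts:
    print(
        name,
        total(name),
        f"FDR = {fdr(name) * 1e4:.2f} x 10^-4",
        f"delta = {delta_percent(name):+.2f}%",
    )
```

Under these assumptions the script gives totals of 389982, 363650 and 377324, FDR values of 8.79, 8.90 and 10.68 (× 10⁻⁴), and Δ% values of +6.65 for AYB and +0.37 for Ibis, in agreement with the table.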

Massingham and Goldman Genome Biology 2012 13:R13   doi:10.1186/gb-2012-13-2-r13
