Table 1

Genotyping accuracy on a subset of the HapMap CEU population

| Location | Predicted location (cnvHiTSeq) | Predicted length (bp, cnvHiTSeq) | cnvHap r2 | SCIMM r2 | cnvHiTSeq* r2 | cnvHiTSeq* accuracy | cnvHiTSeq* missing rate | cnvHiTSeq† r2 | cnvHiTSeq† accuracy | cnvHiTSeq† missing rate |
|---|---|---|---|---|---|---|---|---|---|---|
| chr1:35098051-35115368 | chr1:35100671-35112111 | 11,440 | 1.00 | 1.00 | 0.77 | 0.88 | 0.23 | 1.00 | 1.00 | 0.00 |
| chr1:152759872-152770356 | chr1:152760173-152770753 | 10,580 | 0.90 | 0.94 | 0.85 | 0.82 | 0.23 | 1.00 | 1.00 | 0.00 |
| chr3:151625213-151657165 | N/A | N/A | 0.00 | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| chr7:97395305-97402641 | chr7:97395365-97402646 | 7,281 | 1.00 | 1.00 | 0.66 | 0.81 | 0.05 | 1.00 | 1.00 | 0.00 |
| chr7:115930472-115941073 | chr7:115931453-115941632 | 10,179 | 1.00 | 1.00 | N/A | 0.95 | 0.00 | 1.00 | 1.00 | 0.00 |
| chr8:51030941-51038331 | chr8:51031082-51038282 | 7,200 | 0.92 | 0.93 | 0.63 | 0.94 | 0.11 | 1.00 | 1.00 | 0.00 |
| chr8:144700485-144714694 | chr8:144700505-144714606 | 14,101 | 1.00 | 0.97 | 0.54 | 0.67 | 0.00 | 1.00 | 1.00 | 0.00 |
| chr10:71280989-71291079 | chr10:71280949-71291070 | 10,121 | 0.90 | 0.82 | 0.58 | 0.86 | 0.05 | 0.89 | 0.95 | 0.00 |
| chr11:5783630-5809284 | chr11:5784450-5809211 | 24,761 | 1.00 | 1.00 | 0.61 | 0.83 | 0.00 | 0.94 | 0.94 | 0.00 |
| chr11:107238222-107244154 | chr11:107238422-107244103 | 5,681 | 0.94 | 0.97 | 0.83 | 0.90 | 0.00 | 1.00 | 1.00 | 0.00 |
| chr15:34694542-34817215 | chr15:34701483-34817043 | 115,560 | 0.80 | 1.00 | 0.69 | 1.00 | 0.05 | 1.00 | 1.00 | 0.00 |
| chr15:76884597-76907042 | chr15:76884597-76896918 | 12,321 | 0.56 | N/A | 0.96 | 0.95 | 0.00 | 1.00 | 1.00 | 0.00 |
| chr19:35851153-35861684 | chr19:35851134-35863213 | 12,079 | 1.00 | 1.00 | 0.81 | 0.90 | 0.05 | 1.00 | 1.00 | 0.00 |
| chr19:52132525-52148984 | chr19:52132606-52149186 | 16,580 | 0.96 | 0.96 | 0.88 | 0.86 | 0.00 | 1.00 | 1.00 | 0.00 |
| chr20:1558407-1585809 | chr20:1561187-1585928 | 24,741 | 0.90 | N/A | 0.95 | 0.94 | 0.00 | 1.00 | 1.00 | 0.00 |
| chr22:23154417-23243496 | chr22:23186037-23241798 | 55,761 | 0.22 | N/A | 0.61 | 0.73 | 0.00 | 0.77 | 0.80 | 0.09 |
| chr22:24323894-24418396 | chr22:24343395-24397295 | 53,900 | 0.00 | N/A | 0.91 | 0.86 | 0.00 | 1.00 | 1.00 | 0.00 |
| chr22:39366812-39386139 | chr22:39358773-39383652 | 24,879 | 1.00 | 0.96 | 0.46 | 0.85 | 0.00 | 1.00 | 1.00 | 0.00 |

Genotyping accuracy as measured by the concordance between copy number estimates on 22 HapMap CEU samples from the low-coverage pilot of the 1000 Genomes Project and reference copy number estimates obtained using PCR. Concordance is quantified using two different metrics: the correlation coefficient r2 between the reference and the predicted genotypes as well as the fraction of calls with the correct genotype for both alleles. r2 measurements for SCIMM (SNP-Conditional Mixture Modeling) were obtained from the supplementary material of [16]. r2 measurements for cnvHap were obtained from the supplementary material of [9]. Two different versions of cnvHiTSeq were used: cnvHiTSeq*, which is a single-sample version of the algorithm that does not take advantage of the population modeling capabilities, and cnvHiTSeq†, which trains the parameters of the model using the entire low-coverage HapMap CEU population from the 1000 Genomes Project (currently consisting of 94 samples). The genotyping accuracy was calculated using 22 of the 94 samples, since these were the only samples for which PCR copy number estimates were available. When all the samples are predicted to be copy neutral for a given location, the accuracy and r2 are undefined and denoted by N/A. cnvHiTSeq calls with posterior probabilities lower than 80% were excluded and declared as missing.
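
The three per-locus quantities described in the legend can be illustrated with a minimal sketch. It is not the authors' code, and it rests on several assumptions that the legend does not spell out: genotypes are represented as unordered pairs of per-allele copy numbers, r2 is taken to be the squared Pearson correlation of total copy number, and both r2 and accuracy are computed only over calls that pass the 80% posterior threshold. The function name `concordance` and its parameters are hypothetical.

```python
from typing import Optional, Sequence, Tuple

Genotype = Tuple[int, int]  # assumed encoding: copy number on each allele


def concordance(
    reference: Sequence[Genotype],
    predicted: Sequence[Genotype],
    posterior: Sequence[float],
    min_posterior: float = 0.80,
) -> Tuple[Optional[float], Optional[float], float]:
    """Return (r2, accuracy, missing_rate) for one locus.

    r2 or accuracy is None where the quantity is undefined (N/A).
    """
    if not (len(reference) == len(predicted) == len(posterior)):
        raise ValueError("inputs must have one entry per sample")
    n = len(reference)
    if n == 0:
        raise ValueError("at least one sample is required")

    # Calls below the posterior threshold are declared missing.
    kept = [
        (ref, pred)
        for ref, pred, post in zip(reference, predicted, posterior)
        if post >= min_posterior
    ]
    missing_rate = 1.0 - len(kept) / n
    if not kept:
        return None, None, missing_rate

    # Accuracy: fraction of kept calls where both alleles are correct
    # (allele order is ignored, which is an assumption of this sketch).
    correct = sum(sorted(ref) == sorted(pred) for ref, pred in kept)
    accuracy = correct / len(kept)

    # r2: squared Pearson correlation of total copy number.
    x = [sum(ref) for ref, _ in kept]
    y = [sum(pred) for _, pred in kept]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

    # With no variation (e.g. every sample predicted copy neutral),
    # the correlation is undefined.
    r2 = None if var_x == 0 or var_y == 0 else cov * cov / (var_x * var_y)
    return r2, accuracy, missing_rate
```

For example, calling `concordance` with four samples, one of which has a posterior of 0.5, gives a missing rate of 0.25 and computes r2 and accuracy on the remaining three. This sketch still reports accuracy when r2 is undefined; the legend states that both are undefined when all samples are predicted to be copy neutral, so a stricter variant would return None for both in that case.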

Bellos et al. Genome Biology 2012 13:R120   doi:10.1186/gb-2012-13-12-r120
