Table 2

Improved mutation calling at higher coverage.

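| α^a | n.Answer^b | n.Predict^c (40×) | n.Correct^d (40×) | Precision (40×) | Recall (40×) | F-score (40×) | n.Predict^c (100×) | n.Correct^d (100×) | Precision (100×) | Recall (100×) | F-score (100×) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 70% | 2265 | 1999 | 1976 | 0.98 | 0.87 | 0.93 | 2208 | 2198 | 1.00^e | 0.97 | 0.98 |
| 80% | 2265 | 1551 | 1516 | 0.98 | 0.67 | 0.79 | 2142 | 2133 | 1.00^f | 0.94 | 0.97 |
| 90% | 2265 | 638 | 611 | 0.96 | 0.27 | 0.42 | 1572 | 1545 | 0.98 | 0.68 | 0.81 |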

Precision and recall are compared between two coverages (40× and 100×) for highly contaminated data (α = 70%, 80%, and 90%). Much deeper coverage is required when severe contamination is expected: at α = 90%, recall rises from 0.27 at 40× to 0.68 at 100×.

^a α is the proportion of the control sample in the disease sample (contamination level).

^b n.Answer is the number of true somatic mutations.

^c n.Predict is the number of mutations predicted by Virmid.

^d n.Correct is the number of correct mutations in the prediction.

^e Rounded up from 0.9954.

^f Rounded up from 0.9958.
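
The table does not spell out how the derived columns are computed. The sketch below assumes the standard definitions: precision = n.Correct / n.Predict, recall = n.Correct / n.Answer, and F-score as the harmonic mean of the two. It also assumes n.Answer = 2265 applies to all three contamination levels, which is consistent with the reported recall values. The function name `call_metrics` and the `ROWS` layout are illustrative and not taken from Virmid. Under these assumptions, the computed values agree with the table to within rounding in the last digit.

```python
from typing import NamedTuple


class Metrics(NamedTuple):
    precision: float
    recall: float
    f_score: float


def call_metrics(n_answer: int, n_predict: int, n_correct: int) -> Metrics:
    """Precision, recall and F-score under the standard (assumed) definitions."""
    precision = n_correct / n_predict if n_predict else 0.0
    recall = n_correct / n_answer if n_answer else 0.0
    denom = precision + recall
    # F-score: harmonic mean of precision and recall (0 if both are 0).
    f_score = 2 * precision * recall / denom if denom else 0.0
    return Metrics(precision, recall, f_score)


# Assumption: the same n.Answer applies to every contamination level.
N_ANSWER = 2265

# (alpha, coverage) -> (n.Predict, n.Correct), copied from Table 2.
ROWS = {
    ("70%", 40): (1999, 1976),
    ("70%", 100): (2208, 2198),
    ("80%", 40): (1551, 1516),
    ("80%", 100): (2142, 2133),
    ("90%", 40): (638, 611),
    ("90%", 100): (1572, 1545),
}

for (alpha, coverage), (n_predict, n_correct) in ROWS.items():
    m = call_metrics(N_ANSWER, n_predict, n_correct)
    print(
        f"alpha={alpha} coverage={coverage}x "
        f"precision={m.precision:.4f} recall={m.recall:.4f} F={m.f_score:.4f}"
    )
```

Printing four decimals rather than two makes the effect of the rounding noted in footnotes e and f visible.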

Kim et al. Genome Biology 2013 14:R90   doi:10.1186/gb-2013-14-8-r90
