Table 1

Types of error

Error type                            Number of occurrences  Percent of errors            Error rate
Insertions                            58,337                 36%                          0.18%
  Homopolymer extension and CAFIE     32,858                 20%                          0.10%
  Not associated with homopolymers    25,479                 16%                          0.08%
Deletions                             43,107                 27%                          0.13%
  Incomplete homopolymer extension    13,868                 9%                           0.04%
  Not associated with homopolymers    29,239                 18%                          0.09%
Mismatches                            25,281                 16%                          0.08%
  Homopolymer extension and CAFIE     16,725                 10%                          0.05%
  Not associated with homopolymers    8,556                  5%                           0.03%
Ambiguous base calls (N)              34,184                 21%                          0.10%

Read errors                           Number of occurrences  Cumulative percent of reads  Percent of reads
Reads with no errors (perfect match)  279,468                82%                          82%
Reads with no more than one error     35,813                 93%                          11%
Reads with no more than two errors    11,651                 96%                          3%
Reads with more than two errors       13,218                 100%                         4%


Errors were classified as insertions, deletions, mismatches (substitutions) and ambiguous base calls (Ns). We further classified insertions, deletions and mismatches by their association with homopolymer effects. Deletions of a base that matches an adjacent base are considered incomplete extensions. Insertions and mismatches of the same base as an adjacent base are considered extensions. Insertions and mismatches that match an upcoming homopolymer with no more than two intervening bases are considered carry-forward errors. CAFIE stands for carry forward and incomplete extension.
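The classification rules in the legend can be illustrated with a short Python sketch. This is not the authors' code. The function names, the coordinate convention (`pos` indexes the reference; an insertion sits between `ref[pos - 1]` and `ref[pos]`), the reading of "adjacent" as the immediately flanking reference bases, the reading of "upcoming" as downstream in the read direction, and the minimum homopolymer length of two are all assumptions made for illustration.

```python
def neighbours(kind, ref, pos):
    """Reference bases flanking the error site (assumed meaning of 'adjacent')."""
    if kind == "insertion":
        left, right = pos - 1, pos       # inserted base sits between these two
    else:
        left, right = pos - 1, pos + 1   # deletion/mismatch affects ref[pos]
    return [ref[i] for i in (left, right) if 0 <= i < len(ref)]


def upcoming_homopolymer(base, ref, start, max_gap=2, min_run=2):
    """True if a run of `base` begins at most `max_gap` bases downstream of `start`.

    min_run=2 is an assumed definition of a homopolymer.
    """
    for gap in range(max_gap + 1):
        i = start + gap
        if ref[i:i + min_run] == base * min_run:
            return True
    return False


def classify_error(kind, base, ref, pos):
    """Assign one error to a category from Table 1.

    kind: "insertion", "deletion" or "mismatch"
    base: the base called in the read (ignored for deletions)
    """
    if base == "N":
        return "ambiguous base call"
    if kind == "deletion":
        # The deleted reference base matches a flanking base.
        if ref[pos] in neighbours(kind, ref, pos):
            return "incomplete homopolymer extension"
        return "not associated with homopolymers"
    # Insertions and mismatches share the remaining rules.
    if base in neighbours(kind, ref, pos):
        return "homopolymer extension"
    downstream = pos if kind == "insertion" else pos + 1
    if upcoming_homopolymer(base, ref, downstream):
        return "carry forward"
    return "not associated with homopolymers"


if __name__ == "__main__":
    ref = "ACGTTAGC"  # made-up reference, not data from the paper
    print(classify_error("deletion", "T", ref, 3))
    print(classify_error("insertion", "T", ref, 1))
    print(classify_error("mismatch", "C", ref, 3))
```

In Table 1 the "homopolymer extension" and "carry forward" outcomes for insertions and mismatches are reported together in the "Homopolymer extension and CAFIE" rows; the sketch keeps them separate only to show which rule fired.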

Huse et al. Genome Biology 2007 8:R143   doi:10.1186/gb-2007-8-7-r143
