Table 2

Data sets used in this study

Region name         Type (a)  Loci (b)  Difficulty (c)  % deg. (d)  APR (e)  % ≥ 3    % ≥ 5    % ≥ 10
Human 261 k         Exp.      261,475   Low             90.4        1.91     0.21     0.05     0.02
Yeast RPL           Sim.      94,678    Low             85.1        1.88     2.60     0.00     0.00
Yeast 2 × RPL +0%   Sim.      189,356   Extreme         97.1        3.01     96.90    3.20     0.00
Yeast 2 × RPL +2%   Sim.      189,356   High            98.6        2.18     29.90    0.52     0.00
Yeast 2 × RPL +5%   Sim.      189,356   Medium/high     97.9        2.05     6.83     0.06     0.00
Yeast 2 × RPL +10%  Sim.      189,356   Medium          86.8        2.01     1.86     0.57     0.00


Genomic templates used in this study: Human 261 k [9]; synthetic concatenation of yeast ribosomal proteins (RPL); perfect tandem duplication of RPL (2 × RPL +0%); tandem duplications of RPL with 2%, 5%, or 10% sequence divergence added to the second copy (2 × RPL +2%, 2 × RPL +5%, and 2 × RPL +10%, respectively). See main text for more details.

(a) Whether the sequenced reads were obtained experimentally (Exp.) or by simulation (Sim.).
(b) The number of positions genotyped.
(c) Determined in the following manner: 100,000 36-nucleotide PE reads were sampled randomly from the template region with perfect fidelity. Sampled reads were aligned to the template with k = 3 mismatches.
(d) Percentage of sampled reads with more than one alignment (percentage degeneracy).
(e) Average number of alignments per sampled read, taken across all reads (alignments per read).

The last three columns report the percentage of reads aligning to at least three, five, or ten different loci in the template.
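The difficulty statistics defined in the table notes can be illustrated with a minimal Python sketch. It assumes the alignment step has already been run and that the only input is a list giving the number of alignments found for each sampled read. The function name `summarize_alignment_counts` and the dictionary keys are hypothetical and are not taken from the authors' software; the sketch also assumes every sampled read aligns at least once, which should hold when reads are sampled from the template with perfect fidelity.

```python
from typing import Dict, Sequence


def summarize_alignment_counts(counts: Sequence[int]) -> Dict[str, float]:
    """Summarize per-read alignment counts (hypothetical helper).

    counts[i] is the number of loci in the template to which read i aligns.
    Returns the percentage of degenerate reads, the alignments per read (APR),
    and the percentage of reads aligning to at least 3, 5, or 10 loci.
    """
    n = len(counts)
    if n == 0:
        raise ValueError("at least one read is required")

    def pct(predicate) -> float:
        # Percentage of reads whose alignment count satisfies the predicate.
        return 100.0 * sum(1 for c in counts if predicate(c)) / n

    return {
        # Reads with more than one alignment (percentage degeneracy).
        "pct_degenerate": pct(lambda c: c > 1),
        # Mean number of alignments across all reads.
        "apr": sum(counts) / n,
        "pct_ge_3": pct(lambda c: c >= 3),
        "pct_ge_5": pct(lambda c: c >= 5),
        "pct_ge_10": pct(lambda c: c >= 10),
    }


if __name__ == "__main__":
    # Toy input: alignment counts for ten reads (illustrative values only).
    toy_counts = [1, 2, 2, 3, 5, 10, 1, 1, 2, 4]
    for name, value in summarize_alignment_counts(toy_counts).items():
        print(f"{name}: {value:.2f}")
```

The toy counts are for illustration only and do not correspond to any row of the table; in practice the list would hold one entry for each of the 100,000 sampled reads.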

Simola and Kim, Genome Biology 2011, 12:R55. doi:10.1186/gb-2011-12-6-r55
