Table 2

Gap closure results obtained on the eukaryotic datasets

Method                          Original      SOAPdenovo    GapFiller

Saccharomyces cerevisiae
  Genome size (bp)              11,388,647    11,388,600    11,388,609
  Scaffolds                     334           334           334
  Gap count                     283           67            45
  Total gap length (bp)         19,358        994           2,873
  Errors (SNPs)                 890           1,033         931
  Errors (indels)               565           754           648
  Errors (misjoins)             23            42            31
  N50                           84,640        84,640        84,649

Homo sapiens (chromosome 14)
  Genome size (bp)              95,081,274    95,059,687    95,072,801
  Scaffolds                     19,249        19,249        19,249
  Gap count                     2,820         1,986         1,682
  Total gap length (bp)         949,137       423,107       699,550
  Errors (SNPs)                 76,653        79,266        76,928
  Errors (indels)               21,261        23,144        22,338
  Errors (misjoins)             179           224           187
  N50                           7,748         8,262         8,469


The results obtained with SOAPdenovo and GapFiller for the S. cerevisiae and human genomes show that both methods are also suitable for closing gaps in eukaryotic genomes. The patterns are similar to those observed for bacteria: overall, GapFiller yields the most reliable results and the lowest gap count, whereas SOAPdenovo yields a substantially shorter total gap length, though at the cost of a higher error rate. In human, the reduced genome size and total gap length obtained with SOAPdenovo, together with the increased indel and misjoin error counts, might indicate that some gaps are in fact closed by collapsing (repeated) elements.
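The trade-off described above can be made concrete with a little arithmetic on the values in Table 2. The following is a minimal sketch rather than part of the original analysis: the dictionary layout, the helper name `closure_summary`, and the choice to sum SNPs, indels, and misjoins into a single "added errors" figure are all assumptions made for illustration.

```python
# Counts transcribed from Table 2. The dictionary layout and the helper name
# closure_summary are illustrative assumptions, not part of the original study.
TABLE2 = {
    "S. cerevisiae": {
        "Original":   {"gaps": 283, "gap_bp": 19_358, "snps": 890, "indels": 565, "misjoins": 23},
        "SOAPdenovo": {"gaps": 67, "gap_bp": 994, "snps": 1_033, "indels": 754, "misjoins": 42},
        "GapFiller":  {"gaps": 45, "gap_bp": 2_873, "snps": 931, "indels": 648, "misjoins": 31},
    },
    "H. sapiens chr14": {
        "Original":   {"gaps": 2_820, "gap_bp": 949_137, "snps": 76_653, "indels": 21_261, "misjoins": 179},
        "SOAPdenovo": {"gaps": 1_986, "gap_bp": 423_107, "snps": 79_266, "indels": 23_144, "misjoins": 224},
        "GapFiller":  {"gaps": 1_682, "gap_bp": 699_550, "snps": 76_928, "indels": 22_338, "misjoins": 187},
    },
}


def closure_summary(original, closed):
    """Compare one gap-closed assembly against the original assembly."""
    return {
        # Percentage of gaps that disappeared after gap closure.
        "gaps_closed_pct": 100 * (original["gaps"] - closed["gaps"]) / original["gaps"],
        # Percentage of the total gap length that was filled (or removed).
        "gap_bp_closed_pct": 100 * (original["gap_bp"] - closed["gap_bp"]) / original["gap_bp"],
        # Errors added on top of those already present in the original assembly.
        "added_errors": sum(closed[k] - original[k] for k in ("snps", "indels", "misjoins")),
    }


SUMMARY = {
    (genome, method): closure_summary(rows["Original"], rows[method])
    for genome, rows in TABLE2.items()
    for method in ("SOAPdenovo", "GapFiller")
}

for (genome, method), s in SUMMARY.items():
    print(
        f"{genome:18s} {method:11s} "
        f"gaps closed: {s['gaps_closed_pct']:5.1f}%  "
        f"gap length closed: {s['gap_bp_closed_pct']:5.1f}%  "
        f"added errors: {s['added_errors']}"
    )
```

Under these assumptions, GapFiller removes a larger share of the gaps in both genomes while adding fewer errors, and SOAPdenovo removes a larger share of the total gap length, which matches the pattern described in the text.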

Boetzer and Pirovano Genome Biology 2012 13:R56   doi:10.1186/gb-2012-13-6-r56
