Genome Biology


Method

ALLPATHS 2: small genomes assembled accurately and with high continuity from short paired reads

Iain MacCallum¹, Dariusz Przybylski¹, Sante Gnerre¹, Joshua Burton¹, Ilya Shlyakhter¹, Andreas Gnirke¹, Joel Malek²,³, Kevin McKernan², Swati Ranade²,⁴, Terrance P Shea¹, Louise Williams¹, Sarah Young¹, Chad Nusbaum¹ and David B Jaffe¹*

Author Affiliations

1 Broad Institute of MIT and Harvard, Charles Street, Cambridge, MA 02141, USA

2 Life Technologies, Cummings Center, Beverly, MA 01915, USA

3 Current address: Department of Medical Genetics, Weill Cornell Medical College in Qatar, 24144 Al-Rayaan St., Doha, Qatar

4 Current address: Pacific Biosciences, Adams Drive, Menlo Park, CA 94025, USA


Genome Biology 2009, 10:R103 doi:10.1186/gb-2009-10-10-r103

Published: 1 October 2009

Abstract

We demonstrate that genome sequences approaching finished quality can be generated from short paired reads. Using 36-base (fragment) and 26-base (jumping) reads from five microbial genomes of varied GC composition and sizes up to 40 Mb, ALLPATHS2 generated assemblies with long, accurate contigs and scaffolds. Velvet and EULER-SR were less accurate. For example, for Escherichia coli, the fraction of 10-kb stretches that were perfect was 99.8% (ALLPATHS2), 68.7% (Velvet), and 42.1% (EULER-SR).
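The "fraction of 10-kb stretches that were perfect" can be illustrated with a small sketch. The abstract does not say how stretches are chosen or how errors are located, so the code below makes two assumptions: stretches are full, non-overlapping 10-kb windows along the reference, and every base where the assembly disagrees with the reference (or fails to cover it) is already available as a 0-based reference coordinate. The function name and its parameters are hypothetical and are not part of ALLPATHS 2.

```python
def fraction_perfect_windows(genome_length, error_positions, window=10_000):
    """Fraction of full, non-overlapping windows that contain no error position.

    Assumption (not from the paper): windows tile the reference from position 0,
    and a trailing partial window is ignored.
    """
    n_windows = genome_length // window
    if n_windows == 0:
        raise ValueError("genome is shorter than one window")
    covered = n_windows * window
    # Indices of windows hit by at least one error; errors outside the
    # tiled region (trailing partial window or invalid coordinates) are skipped.
    bad = {pos // window for pos in error_positions if 0 <= pos < covered}
    return (n_windows - len(bad)) / n_windows


# Toy example: a 50-kb reference with errors falling in two of its five windows.
print(fraction_perfect_windows(50_000, [12, 13, 27_500]))  # 0.6
```

Under this definition a single wrong base spoils its whole window, which is why the measure separates assemblers more sharply than a per-base error rate would.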