Method (Open Access)

Prediction of effective genome size in metagenomic samples

Jeroen Raes*1, Jan O Korbel*1,2, Martin J Lercher1, Christian von Mering1,3 and Peer Bork1

1European Molecular Biology Laboratory, Meyerhofstrasse 1, D-69117 Heidelberg, Germany

2Molecular Biophysics & Biochemistry Department, Yale University, Whitney Avenue, New Haven, Connecticut, USA

3Institute of Molecular Biology, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland

* Contributed equally

Genome Biology 2007, 8:R10 doi:10.1186/gb-2007-8-1-r10

Published: 15 January 2007

Subject areas: Genome studies, Bioinformatics, Microbiology and parasitology

Abstract

We introduce a novel computational approach to predict effective genome size (EGS; a measure that includes multiple plasmid copies, inserted sequences, and associated phages and viruses) from the short sequencing reads of environmental genomics (or metagenomics) projects. We observe considerable EGS differences between environments and link them to ecological complexity as well as species composition (for instance, the presence of eukaryotes). For example, we estimate the EGS of a complex, organism-dense farm soil sample at about 6.3 megabases (Mb), whereas that of the bacteria therein is only 4.7 Mb; for bacteria in a nutrient-poor, organism-sparse ocean surface water sample, EGS is as low as 1.6 Mb. The method also permits evaluation of completion status and assembly bias in single-genome sequencing projects.

