Method (Open Access)

Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments

Brian J Haas1,2*, Steven L Salzberg3, Wei Zhu1, Mihaela Pertea3, Jonathan E Allen3,4, Joshua Orvis1,5, Owen White1,5, C Robin Buell1,6 and Jennifer R Wortman1,5

Author Affiliations

1 J Craig Venter Institute, The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, Maryland 20850, USA

2 Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, Massachusetts 02142, USA

3 Center for Bioinformatics and Computational Biology, Department of Computer Science, 3125 Biomolecular Sciences Bldg #296, University of Maryland, College Park, Maryland 20742, USA

4 Computation Directorate, Lawrence Livermore National Laboratory, 7000 East Avenue, Livermore, California 94550, USA

5 Institute for Genome Sciences, University of Maryland Medical School, Baltimore, Maryland 21201, USA

6 Department of Plant Biology, Michigan State University, East Lansing, Michigan 48824, USA

Genome Biology 2008, 9:R7  doi:10.1186/gb-2008-9-1-r7

Published: 11 January 2008

Abstract

EVidenceModeler (EVM) is presented as an automated eukaryotic gene structure annotation tool that reports gene structures as a weighted consensus of all available evidence. Combined with the Program to Assemble Spliced Alignments (PASA), EVM yields a comprehensive, configurable annotation system that predicts protein-coding genes and alternatively spliced isoforms. Our experiments on both rice and human genome sequences demonstrate that EVM produces automated gene structure annotation approaching the quality of manual curation.
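
The idea of a weighted consensus can be illustrated with a toy sketch. This is not EVM's algorithm, only a minimal example of weighted voting, assuming each evidence source proposes exact exon coordinates and carries a single numeric weight. The source names, coordinates, weights, and the `score_exons` helper are all hypothetical.

```python
from collections import defaultdict


def score_exons(evidence, weights):
    """Sum the weights of all evidence sources that propose each exact exon.

    evidence: dict mapping a source name to a list of (start, end) exons.
    weights:  dict mapping a source name to its numeric weight.
    """
    scores = defaultdict(float)
    for source, exons in evidence.items():
        # set() ensures a source votes at most once per distinct exon.
        for exon in set(exons):
            scores[exon] += weights[source]
    return dict(scores)


# Hypothetical evidence: three sources agree on the first exon
# but disagree on the 3' boundary of the second.
evidence = {
    "ab_initio_predictor": [(100, 200), (300, 450)],
    "protein_alignment": [(100, 200), (300, 420)],
    "transcript_alignment": [(100, 200), (300, 420)],
}

# Hypothetical weights: transcript alignments are trusted most here.
weights = {
    "ab_initio_predictor": 1.0,
    "protein_alignment": 2.0,
    "transcript_alignment": 5.0,
}

scores = score_exons(evidence, weights)

# Rank candidate exons from best to least supported.
ranked = sorted(scores, key=scores.get, reverse=True)
for exon in ranked:
    print(exon, scores[exon])
```

In this toy setting the exon boundary backed by the higher-weighted sources outranks the one proposed only by the lower-weighted predictor, which is the intuition behind weighting evidence types differently.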