Open Access Highly Accessed Method

CONTRAST: a discriminative, phylogeny-free approach to multiple informant de novo gene prediction

Samuel S Gross1*, Chuong B Do1, Marina Sirota2 and Serafim Batzoglou1

Author Affiliations

1 Computer Science Department, Stanford University, Stanford, CA, USA

2 Biomedical Informatics, Stanford University, Stanford, CA, USA

Genome Biology 2007, 8:R269  doi:10.1186/gb-2007-8-12-r269

Published: 20 December 2007

Abstract

We describe CONTRAST, a gene predictor which directly incorporates information from multiple alignments rather than employing phylogenetic models. This is accomplished through the use of discriminative machine learning techniques, including a novel training algorithm. We use a two-stage approach, in which a set of binary classifiers designed to recognize coding region boundaries is combined with a global model of gene structure. CONTRAST predicts exact coding region structures for 65% more human genes than the previous state-of-the-art method, misses 46% fewer exons and displays comparable gains in specificity.