

Open Access | Highly Accessed | Research

Assessing the impact of comparative genomic sequence data on the functional annotation of the Drosophila genome

Casey M Bergman1,9, Barret D Pfeiffer1,9, Diego E Rincón-Limas2,6, Roger A Hoskins1, Andreas Gnirke3, Chris J Mungall4, Adrienne M Wang1,7, Brent Kronmiller1,8, Joanne Pacleb1, Soo Park1, Mark Stapleton1, Kenneth Wan1, Reed A George1, Pieter J de Jong5, Juan Botas2, Gerald M Rubin1,4 and Susan E Celniker1

1Berkeley Drosophila Genome Project, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA

2Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA

3Exelixis Inc., South San Francisco, CA 94080, USA

4Howard Hughes Medical Institute, Department of Molecular and Cellular Biology, University of California, Berkeley, CA 94720, USA

5Children's Hospital and Research Center at Oakland, Oakland, CA 94609, USA

6Current address: Departamento de Biologia Molecular, Universidad Autonoma de Tamaulipas-UAMRA, Reynosa, CP 88740, Mexico

7Current address: Department of Physiology, University of California, San Francisco, CA 94143, USA

8Current address: Department of Bioinformatics and Computational Biology, Iowa State University, Ames, IA 50011, USA

9These authors contributed equally to this work


Genome Biology 2002, 3:research0086.1-0086.20 doi:10.1186/gb-2002-3-12-research0086

Published: 30 December 2002


This article is part of a series of refereed research articles from the Berkeley Drosophila Genome Project, FlyBase and colleagues describing Release 3 of the Drosophila genome. The articles are freely available at http://genomebiology.com/drosophila/.

Subject areas: Evolution, Genome studies, Model organisms, Bioinformatics

Abstract

Background

It is widely accepted that comparative sequence data can aid the functional annotation of genome sequences; however, the most informative species and features of genome evolution for comparison remain to be determined.

Results

We analyzed conservation in eight genomic regions (apterous, even-skipped, fushi tarazu, twist, and Rhodopsins 1, 2, 3 and 4) from four Drosophila species (D. erecta, D. pseudoobscura, D. willistoni, and D. littoralis) covering more than 500 kb of the D. melanogaster genome. All D. melanogaster genes (and 78-82% of coding exons) identified in divergent species such as D. pseudoobscura show evidence of functional constraint. Addition of a third species can reveal functional constraint in otherwise non-significant pairwise exon comparisons. Microsynteny is largely conserved, with rearrangement breakpoints, novel transposable element insertions, and gene transpositions occurring in similar numbers. Rates of amino-acid substitution are higher in uncharacterized genes than in genes that have previously been studied. Conserved non-coding sequences (CNCSs) tend to be spatially clustered, with conserved spacing between CNCSs, and clusters of CNCSs can be used to predict enhancer sequences.

Conclusions

Our results provide the basis for choosing species whose genome sequences would be most useful in aiding the functional annotation of coding and cis-regulatory sequences in Drosophila. Furthermore, this work shows how decoding the spatial organization of conserved sequences, such as the clustering of CNCSs, can complement efforts to annotate eukaryotic genomes on the basis of sequence conservation alone.

