

Open Access, Highly Accessed, Research

Heterochromatic sequences in a Drosophila whole-genome shotgun assembly

Roger A Hoskins1,8, Christopher D Smith2,8, Joseph W Carlson1, A Bernardo Carvalho3, Aaron Halpern4, Joshua S Kaminker2, Cameron Kennedy5, Chris J Mungall6, Beth A Sullivan5, Granger G Sutton4, Jiro C Yasuhara7, Barbara T Wakimoto7, Eugene W Myers4, Susan E Celniker1, Gerald M Rubin1,2,6 and Gary H Karpen5

1Department of Genome Sciences, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA

2Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720, USA

3Departamento de Genética, Universidade Federal do Rio de Janeiro, CEP 21944-970, Rio de Janeiro, Brazil

4Celera Genomics, 45 West Gude Drive, Rockville, MD 20850, USA

5Molecular and Cell Biology Laboratory, Salk Institute, La Jolla, CA 92037, USA

6Howard Hughes Medical Institute, University of California, Berkeley, CA 94720, USA

7Department of Zoology, University of Washington, Seattle, WA 98195, USA

8These authors contributed equally to this work


Genome Biology 2002, 3:research0085.1-0085.16 doi:10.1186/gb-2002-3-12-research0085

Published: 31 December 2002


This article is part of a series of refereed research articles from the Berkeley Drosophila Genome Project, FlyBase and colleagues describing Release 3 of the Drosophila genome. The articles are freely available at http://genomebiology.com/drosophila/.

Subject areas: Molecular biology, Genome studies, Model organisms, Bioinformatics

Abstract

Background

Most eukaryotic genomes include a substantial repeat-rich fraction termed heterochromatin, which is concentrated in centric and telomeric regions. The repetitive nature of heterochromatic sequence makes it difficult to assemble and analyze. To better understand the heterochromatic component of the Drosophila melanogaster genome, we characterized and annotated portions of a whole-genome shotgun sequence assembly.

Results

WGS3, an improved whole-genome shotgun assembly, includes 20.7 Mb of draft-quality sequence not represented in the Release 3 sequence spanning the euchromatin. We annotated this sequence using the methods employed in the re-annotation of the Release 3 euchromatic sequence. This analysis predicted 297 protein-coding genes and six non-protein-coding genes, including known heterochromatic genes, and regions of similarity to known transposable elements. Bacterial artificial chromosome (BAC)-based fluorescence in situ hybridization analysis was used to correlate the genomic sequence with the cytogenetic map in order to refine the genomic definition of the centric heterochromatin; on the basis of our cytological definition, the annotated Release 3 euchromatic sequence extends into the centric heterochromatin on each chromosome arm.

Conclusions

Whole-genome shotgun assembly produced a reliable draft-quality sequence of a significant part of the Drosophila heterochromatin. Annotation of this sequence defined the intron-exon structures of 30 known protein-coding genes and 267 protein-coding gene models. The cytogenetic mapping suggests that an additional 150 predicted genes are located in heterochromatin at the base of the Release 3 euchromatic sequence. Our analysis suggests strategies for improving the sequence and annotation of the heterochromatic portions of the Drosophila and other complex genomes.

