This article is part of the supplement: Beyond the Genome: The true gene count, human evolution and disease genomics
Genes and genomes, an imperfect world: comparison of gene annotations of two Bos taurus draft assemblies
Genome Biology 2010, 11(Suppl 1):P13 doi:10.1186/gb-2010-11-s1-p13
The electronic version of this article is the complete one and can be found online at: http://genomebiology.com/2010/11/S1/P13
Published: 11 October 2010
© 2010 Florea et al; licensee BioMed Central Ltd.
Background
Gene annotation is the first and most important step in analyzing a genome. As an increasing number of species are sequenced and assembled to various degrees of completion, two key questions arise: how does the quality of the assembled sequence affect gene annotation, and what is the impact on scientists who use the annotation?
Results
We compared the gene annotations produced with the GNOMON [1] annotation pipeline for two Bos taurus genome assemblies generated at the University of Maryland [2,3]. We find that the changes to the assembly from one release to the next, which were quite substantial, led to significant differences in the gene sets and gene structures and, by implication, in the predicted proteins. The annotation was affected by the availability of new gene evidence and by seemingly rare genome mis-assembly events and local sequence variations. For instance, although the later assembly is generally superior, hundreds of protein-coding genes in the earlier assembly are missing from the annotation of the later genome, and ~15% (~3600) of the genes have complex structural differences between the two assemblies. In addition, 15-20% of the predicted proteins have relatively large sequence differences when compared to their RefSeq models.
Conclusions
These findings highlight the consequences of genome quality for gene annotation, and argue for continued improvement of a genome sequence until it is truly finished. They also demonstrate the challenges that confront a biologist tracking a gene of interest between different assembly versions. In addition, these analyses have helped us identify specific loci for improvement, which will benefit the user community of the Bos taurus genome.
References
1. The NCBI GNOMON eukaryotic gene prediction tool [http://www.ncbi.nlm.nih.gov/genome/guide/gnomon.shtml]
2. Zimin AV, Delcher AL, Florea L, Kelley DR, Schatz MC, Puiu D, Hanrahan F, Pertea G, Van Tassell CP, Sonstegard TS, et al.: A whole-genome assembly of the domestic cow, Bos taurus. Genome Biol 2009, 10:R42.
3. Bos taurus assembly site at the University of Maryland [http://www.cbcb.umd.edu/research/bos_taurus_assembly.shtml]