Table 3

Evidence supporting the euchromatic protein-coding gene models*

| Data category | Number of Release 3 protein-coding genes | % of Release 3 protein-coding genes |
|---|---|---|
| Total | 13,379 | 100 |
| Release 2 annotations | 12,549 | 94 |
| Gene-prediction data only | 815 | 6 |
| Genie gene predictions | 12,427 | 93 |
| GENSCAN gene predictions | 12,853 | 96 |
| BLASTX/TBLASTX homologies | 10,996 | 82 |
| ESTs and DGC cDNA sequencing reads | 10,406 | 78 |
| GenBank accessions | 3,104 | 23 |
| ARGS (RefSeq) | 795 | 6 |
| Error report submissions | 825 | 6 |
| Full-insert DGC cDNAs§ | 9,297 | 69 |

*Determined by assessment of alignment overlap of data category versus gene model. Not including those contributed by the BDGP (not mutually exclusive categories: many of these genes also have representative cDNA clones in the DGC). ARGS, annotated reference gene sequence; high-quality FlyBase gene-level annotations that include data from the published literature; contributed to the NCBI reference sequence (RefSeq) project. §For a rigorous assessment of the quality of the DGC cDNAs, see [30].
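The percentage column appears to be each count divided by the total of 13,379 Release 3 protein-coding genes, rounded to the nearest whole number. The following is a minimal sketch of that arithmetic, assuming simple rounding; the names `COUNTS`, `TOTAL`, and `percent_of_total` are illustrative and do not come from the paper.

```python
# Counts copied from Table 3; the dictionary and function names are hypothetical.
TOTAL = 13_379

COUNTS = {
    "Release 2 annotations": 12_549,
    "Gene-prediction data only": 815,
    "Genie gene predictions": 12_427,
    "GENSCAN gene predictions": 12_853,
    "BLASTX/TBLASTX homologies": 10_996,
    "ESTs and DGC cDNA sequencing reads": 10_406,
    "GenBank accessions": 3_104,
    "ARGS (RefSeq)": 795,
    "Error report submissions": 825,
    "Full-insert DGC cDNAs": 9_297,
}


def percent_of_total(count: int, total: int = TOTAL) -> int:
    """Return count as a whole-number percentage of total (assumed rounding)."""
    return round(100 * count / total)


if __name__ == "__main__":
    for category, count in COUNTS.items():
        print(f"{category}: {count:,} ({percent_of_total(count)}%)")
```

Because the categories are not mutually exclusive (a gene model can be supported by several kinds of evidence), the percentages are not expected to sum to 100.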

Misra et al. Genome Biology 2002 3:research0083.1   doi:10.1186/gb-2002-3-12-research0083
