Table 3

Evidence supporting the euchromatic protein-coding gene models*

| Data category | Number of Release 3 protein-coding genes | % of Release 3 protein-coding genes |
|---|---|---|
| Total | 13,379 | 100 |
| Release 2 annotations | 12,549 | 94 |
| Gene-prediction data only | 815 | 6 |
| Genie gene predictions | 12,427 | 93 |
| GENSCAN gene predictions | 12,853 | 96 |
| BLASTX/TBLASTX homologies | 10,996 | 82 |
| ESTs and DGC cDNA sequencing reads | 10,406 | 78 |
| GenBank accessions† | 3,104 | 23 |
| ARGS (RefSeq)‡ | 795 | 6 |
| Error report submissions | 825 | 6 |
| Full-insert DGC cDNAs§ | 9,297 | 69 |

*Determined by assessment of alignment overlap of data category versus gene model.
†Not including those contributed by the BDGP (not mutually exclusive categories: many of these genes also have representative cDNA clones in the DGC).
‡ARGS, annotated reference gene sequence; high-quality FlyBase gene-level annotations that include data from the published literature; contributed to the NCBI reference sequence (RefSeq) project.
§For a rigorous assessment of the quality of the DGC cDNAs, see [30].
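
As a quick arithmetic check, the percentage column of Table 3 can be recomputed from the gene counts. The sketch below assumes that each percentage is the category count divided by the 13,379-gene total and rounded to the nearest integer; the table does not state the rounding rule, and the names `TOTAL`, `COUNTS`, `percent_of_total` and `PERCENTAGES` are illustrative rather than taken from the paper.

```python
# Counts transcribed from Table 3 (Release 3 euchromatic protein-coding genes).
TOTAL = 13_379

COUNTS = {
    "Release 2 annotations": 12_549,
    "Gene-prediction data only": 815,
    "Genie gene predictions": 12_427,
    "GENSCAN gene predictions": 12_853,
    "BLASTX/TBLASTX homologies": 10_996,
    "ESTs and DGC cDNA sequencing reads": 10_406,
    "GenBank accessions": 3_104,
    "ARGS (RefSeq)": 795,
    "Error report submissions": 825,
    "Full-insert DGC cDNAs": 9_297,
}


def percent_of_total(count, total=TOTAL):
    """Return count as a percentage of total, rounded to the nearest integer.

    Assumption: the table reports whole-number percentages obtained by
    ordinary rounding; this is not stated in the source.
    """
    return round(100 * count / total)


# Recomputed percentage for every evidence category.
PERCENTAGES = {category: percent_of_total(n) for category, n in COUNTS.items()}

if __name__ == "__main__":
    for category, pct in PERCENTAGES.items():
        print(f"{category}: {COUNTS[category]:,} genes ({pct}%)")
```

Because the evidence categories are not mutually exclusive, the percentages are not expected to sum to 100.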

Misra et al. Genome Biology 2002 3:research0083.1-0083.22   doi:10.1186/gb-2002-3-12-research0083
