Table 1

Statistics of the 25 selected tracks, arranged in the order of the UCSC genome browser

| UCSC track | Model with introns | Model with introns and CDS | Single exon model (some clipped) | Unique introns in mRNA | All introns in mRNA | Input or method |
|---|---|---|---|---|---|---|
| HAVANA Gencode (Sanger, UK) known + putative | 1,691 | 649 | 70 | 3,618 | 9,693 | MEP,CA,H |
| EGASP model submissions | | | | | | |
| AceView (NCBI, US) | 1,630 | 1,460 | 24 | 3,530 | 9,597 | ME,(H) |
| UP Dogfish (Sanger, UK) | 204 | 204 | 15 | 1,679 | 1,679 | CA |
| Exogean (ENS, France) | 554 | 538 | 2 | 2,855 | 6,178 | MEP,CA |
| UP ExonHunter (U Waterloo, Canada) | 807 | 807 | 220 | 3,237 | 3,237 | MEP,CA |
| Fgenesh (U London, UK) | 462 | 458 | 97 | 2,610 | 3,241 | P,CA |
| UP GeneId (IMIM, Spain) | 267 | 267 | 51 | 1,905 | 1,905 | A |
| UP GeneMark (Georgia IT, US) | 551 | 551 | 81 | 2,185 | 2,185 | A |
| UP Jigsaw (TIGR, US) | 259 | 259 | 67 | 2,168 | 2,168 | MEP,CA |
| PairagonAny (Wash U, US) | 471 | 437 | 38 | 2,300 | 3,470 | MEP?,CA |
| UP SGP2 (IMIM, Spain) | 552 | 552 | 159 | 2,645 | 2,645 | P,CA |
| P Twinscan-MARS (Wash U, US) | 547 | 547 | 108 | 2,501 | 4,943 | CA |
| UP Augustus Any (U Göttingen, Germany) | 312 | 316 | 87 | 2,291 | 2,291 | MEP,CA |
| UP GeneZilla (TIGR, US) | 477 | 477 | 179 | 2,758 | 2,758 | A |
| UP Saga (UC Berkeley, US) | 331 | 331 | 47 | 1,737 | 1,737 | CA |
| UCSC gene tracks | | | | | | |
| *Known Gene (UCSC) | 501 | 477 | 53 | 2,264 | 4,427 | MP |
| *P CCDS | 201 | 201 | 14 | 1,296 | 1,508 | MP,H |
| *RefSeq (NCBI, US) | 342 | 325 | 41 | 2,082 | 2,922 | M(E)P,H |
| *MGC | 323 | 310 | 19 | 1,400 | 2,101 | M |
| *Ensembl (EBI, UK) | 427 | 418 | 58 | 2,429 | 3,548 | MEP,CA |
| *AceView (Aug 2005 NCBI) | 1,792 | 1,627 | 902 | 3,812 | 9,792 | ME, (H) |
| *ECgene (Korea) | 3,851 | 3,551 | 2,569 | 3,942 | 30,660 | ME,C |
| *U NscanEst (Wash U, US) | 282 | 252 | 27 | 2,292 | 2,292 | ME,CA |
| *UP GenScan (MIT, US) | 395 | 395 | 59 | 3,042 | 3,042 | A |

The number of models with or without introns (after clipping at region boundaries), the number of spliced coding models, and the numbers of unique and multiply used introns are given over the 31 ENCODE test regions. Codes precede the track name: an asterisk marks a standard gene track, available genome-wide, as opposed to an ENCODE-only track; U marks a track that predicts a unique model per gene; P marks a track that predicts protein-coding regions only. According to their documentation, the programs use different inputs or methods: M, E, and P stand for human mRNA, EST, and protein sequences or alignments, respectively; C stands for conservation, or use of cDNA or protein evidence from other species; A stands for ab initio prediction; H stands for hand curation; and parenthesized letters indicate minimal use of that type of input. Notice the low proportion of Gencode mRNA models with an annotated CDS (in bold).

Thierry-Mieg and Thierry-Mieg Genome Biology 2006 7(Suppl 1):S12 doi:10.1186/gb-2006-7-s1-s12
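
As an illustration of how the table can be read, the following minimal Python sketch computes the proportion of spliced models that carry an annotated CDS for a few tracks, and decodes the "Input or method" codes according to the legend above. The counts are copied from Table 1; the names `TRACKS`, `INPUT_CODES`, `cds_fraction`, and `decode_inputs` are hypothetical and chosen only for this example. The decoder ignores the `?` that marks an uncertain input, which is a simplification.

```python
# Counts copied from Table 1: (models with introns, models with introns and CDS).
# Only a handful of tracks are listed; the selection is arbitrary.
TRACKS = {
    "HAVANA Gencode": (1691, 649),
    "AceView (NCBI, US)": (1630, 1460),
    "Exogean (ENS, France)": (554, 538),
    "*Known Gene (UCSC)": (501, 477),
    "*RefSeq (NCBI, US)": (342, 325),
    "*Ensembl (EBI, UK)": (427, 418),
}

# Legend of the "Input or method" column, as given in the table caption.
INPUT_CODES = {
    "M": "human mRNA",
    "E": "human EST",
    "P": "human protein",
    "C": "conservation or cross-species evidence",
    "A": "ab initio prediction",
    "H": "hand curation",
}


def cds_fraction(with_introns, with_introns_and_cds):
    """Fraction of spliced models that also have an annotated CDS."""
    return with_introns_and_cds / with_introns


def decode_inputs(code):
    """Decode a code such as 'M(E)P,H' into (meaning, minimal_use) pairs.

    Letters inside parentheses are flagged as minimal use. Commas, spaces,
    and '?' are skipped (assumption: '?' only marks uncertainty).
    """
    decoded = []
    minimal = False
    for ch in code:
        if ch == "(":
            minimal = True
        elif ch == ")":
            minimal = False
        elif ch in INPUT_CODES:
            decoded.append((INPUT_CODES[ch], minimal))
    return decoded


for name, (spliced, coding) in TRACKS.items():
    print(f"{name}: {100 * cds_fraction(spliced, coding):.1f}% with CDS")

print(decode_inputs("M(E)P,H"))
```

With these numbers, about 38% of the spliced Gencode models have an annotated CDS, against roughly 90% or more for the other tracks listed in the sketch.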