Table 1

Results for gene mention normalization

| Short description of the submitted run | Precision (%) | Recall (%) | F measure (%) | True positives (n) | False positives (n) | False negatives (n) |
|---|---|---|---|---|---|---|
| Training set | 82.1 | 81.6 | 81.8 | 522 | 114 | 118 |
| Training set, no filtering, no disambiguation | 20.2 | 92.7 | 33.1 | 593 | 2,348 | 47 |
| Test set | 78.9 | 83.3 | 81.0 | 654 | 175 | 131 |
| Test set, no disambiguation | 49.6 | 87.5 | 63.3 | 687 | 699 | 98 |
| Test set, unextended lexicon | 70.7 | 72.5 | 71.6 | 569 | 236 | 216 |
| Test set, current performance | 90.7 | 82.4 | 86.4 | 647 | 66 | 138 |

Performance of the gene mention normalization component on the BioCreative II gene normalization sets. Unless indicated otherwise, each run includes the extended gene name lexicon, all false-positive filters, and disambiguation. Results on the test set are the official results from the external evaluation; the last row shows current performance, which reflects improvements made after BioCreative II.
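The percentage columns can be recomputed from the raw counts. The caption does not state the formulas, so the minimal sketch below assumes the standard definitions: precision = TP / (TP + FP), recall = TP / (TP + FN), and the balanced F measure as their harmonic mean. The names `prf` and `runs` are illustrative and do not come from the source.

```python
def prf(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F measure) in percent, rounded to one decimal.

    Assumes the standard definitions; the balanced F measure is the
    harmonic mean of precision and recall.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return tuple(round(100 * x, 1) for x in (precision, recall, f_measure))


# Raw counts (TP, FP, FN) copied from Table 1.
runs = {
    "Training set": (522, 114, 118),
    "Training set, no filtering, no disambiguation": (593, 2348, 47),
    "Test set": (654, 175, 131),
    "Test set, no disambiguation": (687, 699, 98),
    "Test set, unextended lexicon": (569, 236, 216),
    "Test set, current performance": (647, 66, 138),
}

for name, (tp, fp, fn) in runs.items():
    p, r, f = prf(tp, fp, fn)
    # TP + FN is the number of gold-standard identifiers in the set.
    print(f"{name}: P={p} R={r} F={f} gold={tp + fn}")
```

Under these assumed definitions the recomputed values agree with the table. TP + FN is the same for every run on a given set (640 for training, 785 for test), which is a quick consistency check on the counts.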

Hakenberg et al. Genome Biology 2008 9(Suppl 2):S14   doi:10.1186/gb-2008-9-s2-s14
