This article is part of the supplement: The BioCreative II - Critical Assessment for Information Extraction in Biology Challenge

Open Access Research

Gene mention normalization and interaction extraction with context models and sentence motifs

Jörg Hakenberg1,2,3*, Conrad Plake1,4, Loic Royer1, Hendrik Strobelt1, Ulf Leser2 and Michael Schroeder1

Author Affiliations

1 Bioinformatics Group, Biotechnological Centre, Technische Universität Dresden, Tatzberg, 01307 Dresden, Germany

2 Knowledge Management in Bioinformatics, Computer Science Department, Humboldt-Universität zu Berlin, Unter den Linden, 10099 Berlin, Germany

3 Current affiliation: BioAI Lab, Arizona State University, S Mill Avenue, Tempe/Phoenix, Arizona 85281-8809, USA

4 Transinsight GmbH, Tatzberg, 01307 Dresden, Germany

Genome Biology 2008, 9(Suppl 2):S14  doi:10.1186/gb-2008-9-s2-s14

Published: 1 September 2008

Additional files

Additional file 1:

Regular expressions that catch gene names that are too unspecific, that is, names that produce false positives in the BioCreative II gene normalization data. Such names result, for example, from protein families, tissues, or other 'spurious' synonyms in the original lexicon.

Format: TXT, Size: 3 KB

Additional file 2:

A list of verbs, nouns, and adjectives that can refer to protein-protein interactions (PPIs).

Format: TXT, Size: 16 KB
