This article is part of the supplement: The BioCreative II - Critical Assessment for Information Extraction in Biology Challenge
Research
Overview of BioCreative II gene mention recognition
1 National Center for Biotechnology Information, Bethesda, Maryland, USA
2 IBM TJ Watson Research Center, Yorktown Heights, NY, USA
3 Institute of Bioinformatics, National Yang-Ming University, Taipei, Taiwan
4 Institute of Information Science, Academia Sinica, Taipei, Taiwan
5 Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Department of Bioinformatics, Schloss Birlinghoven, Sankt Augustin, Germany
6 Department of Computer and Information Science, University of Pennsylvania, Philadelphia, Pennsylvania, USA
7 Department of Biostatistics, Bioinformatics and Biomathematics, Georgetown University Medical Center, Washington, District of Columbia, USA
8 School of Informatics, University of Edinburgh, UK
9 Department of Mathematics, Statistics and Computer Science, Marquette University, Milwaukee, Wisconsin, USA
10 Department of Electrical and Computer Engineering, Marquette University, Milwaukee, Wisconsin, USA
11 Computer Laboratory, University of Cambridge, Cambridge, UK
12 University of Colorado School of Medicine, Center for Computational Pharmacology, Denver, Colorado, USA
13 Alias-i, Inc., Brooklyn, New York, USA
14 Institute of Information Science, Academia Sinica, Taipei, Taiwan
15 Department of Computer Science & Engineering, Yuan Ze University, Taoyuan City, Taiwan
16 Department of Computer Science, National Tsing-Hua University, Hsinchu, Taiwan
17 Computational Modeling Laboratory, Vrije Universiteit Brussels, Belgium
18 School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
19 Human Computer Studies Laboratory, Institute of Informatics, University of Amsterdam, Amsterdam, The Netherlands
20 Bioalma, Tres Cantos (Madrid), Spain
21 Facultad de Informática, Universidad Complutense de Madrid, Madrid, Spain
22 Department of Electrical Engineering and Computer Sciences, Computer Science Division, University of California, Berkeley, California, USA
23 Bulgarian Academy of Sciences, Institute for Parallel Processing, Linguistic Modeling Department, Sofia, Bulgaria
24 School of Information, University of California, Berkeley, California, USA
25 Departamento de Tecnologías de la Información, Universidad de Huelva, Huelva, Spain
Genome Biology 2008, 9(Suppl 2):S2 doi:10.1186/gb-2008-9-s2-s2
Published: 1 September 2008
Abstract
Nineteen teams presented results for the Gene Mention Task at the BioCreative II Workshop. In this task, participants designed systems to identify substrings in sentences that correspond to gene name mentions. A variety of methods were used and the results varied, with a highest F1 score of 0.8721. Here we present brief descriptions of all the methods used and a statistical analysis of the results. We also demonstrate that, by combining the results from all submissions, an F score of 0.9066 is feasible, and furthermore that the best combined result makes use of the lowest-scoring submissions.
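For readers unfamiliar with the metric, the following is a minimal sketch of how an F1 score can be computed for a mention-identification task of this kind. It assumes that each mention is represented as a hypothetical `(sentence_id, start, end)` tuple and that only exact span matches count as correct; the official task scoring may treat mention boundaries differently, so this is an illustration of the metric rather than the evaluation procedure itself.

```python
def f1_score(gold, predicted):
    """Compute F1 over two sets of mention spans.

    Both arguments are sets of (sentence_id, start, end) tuples.
    Assumption: a prediction is correct only if it matches a gold
    span exactly (illustrative, not the official scoring rule).
    """
    true_pos = len(gold & predicted)    # spans found in both sets
    false_pos = len(predicted - gold)   # predicted but not in gold
    false_neg = len(gold - predicted)   # gold spans that were missed

    precision = true_pos / (true_pos + false_pos) if predicted else 0.0
    recall = true_pos / (true_pos + false_neg) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy data, invented purely for illustration.
gold = {("s1", 0, 4), ("s1", 10, 15), ("s2", 3, 8)}
predicted = {("s1", 0, 4), ("s2", 3, 8), ("s2", 20, 25)}
score = f1_score(gold, predicted)
```

In the toy data, two of the three predicted spans match gold spans, so precision and recall are both 2/3 and the F1 score is also 2/3.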



