Open Access

Method

ngLOC: an n-gram-based Bayesian method for estimating the subcellular proteomes of eukaryotes

Brian R King1,2 and Chittibabu Guda2,3

1Department of Computer Science, State University of New York at Albany, Washington Ave, Albany, New York 12222, USA

2Gen*NY*sis Center for Excellence in Cancer Genomics, State University of New York at Albany, Discovery Drive, Rensselaer, New York 12144-3456, USA

3Department of Epidemiology and Biostatistics, State University of New York at Albany, Discovery Drive, Rensselaer, New York 12144-3456, USA


Genome Biology 2007, 8:R68 doi:10.1186/gb-2007-8-5-r68

Published: 1 May 2007

Subject areas: Bioinformatics, Cell biology

Abstract

We present a method called ngLOC, an n-gram-based Bayesian classifier that predicts the localization of a protein sequence over ten distinct subcellular organelles. A tenfold cross-validation result shows an accuracy of 89% for sequences localized to a single organelle, and 82% for those localized to multiple organelles. An enhanced version of ngLOC was developed to estimate the subcellular proteomes of eight eukaryotic organisms: yeast, nematode, fruitfly, mosquito, zebrafish, chicken, mouse, and human.
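To make the idea of an n-gram-based Bayesian classifier concrete, the following is a minimal sketch of a generic multinomial naive Bayes model over overlapping sequence n-grams. It is not the ngLOC implementation: the class name `NGramNaiveBayes`, the add-one (Laplace) smoothing, the choice n = 2, and the toy sequences and labels are all assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def ngrams(seq, n):
    """Return all overlapping n-grams (substrings of length n) of a sequence."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


class NGramNaiveBayes:
    """Hypothetical multinomial naive Bayes over n-grams (illustrative only)."""

    def __init__(self, n=2, alpha=1.0):
        self.n = n                          # n-gram length (assumed value)
        self.alpha = alpha                  # smoothing pseudo-count (assumed)
        self.counts = defaultdict(Counter)  # label -> n-gram counts
        self.class_totals = Counter()       # label -> total n-grams seen
        self.class_docs = Counter()         # label -> number of sequences
        self.vocab = set()                  # all n-grams seen in training

    def fit(self, sequences, labels):
        for seq, label in zip(sequences, labels):
            grams = ngrams(seq, self.n)
            self.counts[label].update(grams)
            self.class_totals[label] += len(grams)
            self.class_docs[label] += 1
            self.vocab.update(grams)
        return self

    def log_scores(self, seq):
        """Unnormalized log posterior: log P(class) + sum of log P(n-gram | class)."""
        total_docs = sum(self.class_docs.values())
        v = len(self.vocab) + 1  # one extra slot reserves mass for unseen n-grams
        scores = {}
        for label in self.class_docs:
            score = math.log(self.class_docs[label] / total_docs)
            denom = self.class_totals[label] + self.alpha * v
            for g in ngrams(seq, self.n):
                score += math.log((self.counts[label][g] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, seq):
        scores = self.log_scores(seq)
        return max(scores, key=scores.get)


# Toy, made-up sequences and labels; not real proteins or real annotations.
model = NGramNaiveBayes(n=2).fit(
    ["KKKRKKKR", "LLLALLLA"],
    ["nuclear", "membrane"],
)
print(model.predict("KKRK"))  # prints "nuclear"
```

Because `log_scores` returns a score for every class rather than a single label, a sketch like this could in principle be extended to flag sequences whose top two scores are close, which is one plausible way to approach multi-organelle predictions; how ngLOC itself handles that case is described in the full paper, not here.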

