

Open Access

Method

ngLOC: an n-gram-based Bayesian method for estimating the subcellular proteomes of eukaryotes

Brian R King1,2 and Chittibabu Guda2,3

1Department of Computer Science, State University of New York at Albany, Washington Ave, Albany, New York 12222, USA

2Gen*NY*sis Center for Excellence in Cancer Genomics, State University of New York at Albany, Discovery Drive, Rensselaer, New York 12144-3456, USA

3Department of Epidemiology and Biostatistics, State University of New York at Albany, Discovery Drive, Rensselaer, New York 12144-3456, USA


Genome Biology 2007, 8:R68 doi:10.1186/gb-2007-8-5-r68

Published: 1 May 2007

Subject areas: Bioinformatics, Cell biology


Additional files

Additional data file 1:

Provided are all of the formulas used for performance measurements and Supplementary Tables 1-21.

Format: DOC Size: 528KB

This file can be viewed with: Microsoft Word Viewer

Additional data file 2:

Provided is the actual ngLOC dataset, a FASTA-formatted file. The header format for each sequence is > SP_name loc [/loc2], where SP_name is the Swiss-Prot name of the protein sequence (from release 50.0), loc is a single letter representing the subcellular localization of the sequence, and /loc2 is an optional field that is present only if the sequence is multi-localized. The letter codes for subcellular localization are as follows: C (CYT), cytoplasm; K (CSK), cytoskeleton; E (END), endoplasmic reticulum; S (EXC), extracellular/secreted; G (GOL), Golgi; L (LYS), lysosome; M (MIT), mitochondria; N (NUC), nucleus; P (PLA), plasma membrane; X (POX), peroxisome.

Format: TXT Size: 25.8MB
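
The header format described for Additional data file 2 can be read with a few lines of Python. The following is a minimal sketch, not the authors' code: the function names parse_header and read_fasta are hypothetical, the sample entry names in the usage note are invented, and the exact whitespace around the fields is an assumption (the parser accepts both "C/N" and "C /N").

```python
# Sketch of a reader for the ngLOC dataset headers: "> SP_name loc [/loc2]".
# Assumption: fields are whitespace-separated and the optional second
# localization is introduced by "/", with or without a preceding space.

# Letter codes as listed in the file description: letter -> (abbreviation, name)
LOC_CODES = {
    "C": ("CYT", "cytoplasm"),
    "K": ("CSK", "cytoskeleton"),
    "E": ("END", "endoplasmic reticulum"),
    "S": ("EXC", "extracellular/secreted"),
    "G": ("GOL", "Golgi"),
    "L": ("LYS", "lysosome"),
    "M": ("MIT", "mitochondria"),
    "N": ("NUC", "nucleus"),
    "P": ("PLA", "plasma membrane"),
    "X": ("POX", "peroxisome"),
}


def parse_header(header):
    """Return (SP_name, [loc codes]) for one FASTA header line."""
    if not header.startswith(">"):
        raise ValueError("not a FASTA header: %r" % header)
    fields = header[1:].split()
    if len(fields) < 2:
        raise ValueError("missing localization field: %r" % header)
    name = fields[0]
    # Rejoin the remaining fields so "C /N" and "C/N" are treated alike.
    codes = [c for c in "".join(fields[1:]).split("/") if c]
    unknown = [c for c in codes if c not in LOC_CODES]
    if unknown:
        raise ValueError("unknown localization code(s): %r" % unknown)
    return name, codes


def read_fasta(lines):
    """Yield (SP_name, [loc codes], sequence) for each record in `lines`."""
    name, codes, chunks = None, None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, codes, "".join(chunks)
            name, codes = parse_header(line)
            chunks = []
        else:
            if name is None:
                raise ValueError("sequence data before first header")
            chunks.append(line)
    if name is not None:
        yield name, codes, "".join(chunks)
```

For example, parse_header("> ABC1_HUMAN C /N") would give the name ABC1_HUMAN with the two codes C and N, marking a multi-localized sequence. Because read_fasta accepts any iterable of lines, an open file handle can be passed directly, which avoids loading the 25.8MB file into memory at once.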

