Genome Biology

Open Access Research

Towards defining the nuclear proteome

J Lynn Fink1, Seetha Karunaratne2, Amit Mittal3, Donald M Gardiner2, Nicholas Hamilton4,2, Donna Mahony2, Chikatoshi Kai5,6, Harukazu Suzuki5,6, Yosihide Hayashizaki5,6 and Rohan D Teasdale1,2*

Author Affiliations

1 ARC Centre of Excellence in Bioinformatics, The University of Queensland, St Lucia, Queensland, 4072, Australia

2 Institute for Molecular Bioscience, University of Queensland, St Lucia, Queensland, 4072, Australia

3 Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology, New Delhi, 110016, India

4 Advanced Computational Modelling Centre, Department of Mathematics, University of Queensland, St Lucia, Queensland, 4072, Australia

5 Genome Exploration Research, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045, Japan

6 Genome Science Laboratory, Discovery Research Institute, RIKEN Wako Institute, 2-1 Hirosawa, Wako, Saitama, 351-0198, Japan

Genome Biology 2008, 9:R15 doi:10.1186/gb-2008-9-1-r15

Published: 23 January 2008

Abstract

Background

The nucleus is a complex cellular organelle, and its protein content must be accurately defined before any systematic characterization can be considered.

Results

We report direct evidence for 2,568 mammalian proteins within the nuclear proteome. Of these, 1,529 were localized to the nucleus using a high-throughput subcellular localization protocol applied to full-length proteins, and a further 1,039 have clear experimental evidence documented in the published literature. This is direct evidence that the nuclear proteome comprises at least 14% of the entire proteome. The dataset was then used to evaluate computational approaches designed to identify additional nuclear proteins.
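The counts in this paragraph can be checked with a few lines of arithmetic. The sketch below sums the two evidence sources and back-calculates the proteome size implied by the reported 14%. The abstract does not state the proteome size used as the denominator, so the back-calculated figure is only a rough illustration that treats the rounded 14% as exact.

```python
# Counts reported in the abstract.
high_throughput = 1529  # proteins localized by the high-throughput protocol
literature = 1039       # proteins with published experimental evidence

# Total number of proteins with direct evidence of nuclear localization.
nuclear_total = high_throughput + literature

# Assumption: the abstract's "at least 14%" is treated as exactly 0.14,
# which gives only an approximate size for the whole proteome.
reported_fraction = 0.14
implied_proteome_size = round(nuclear_total / reported_fraction)

print(f"nuclear proteins with direct evidence: {nuclear_total}")
print(f"implied proteome size (approximate): {implied_proteome_size}")
```

Under that assumption the two sources sum to the reported 2,568 proteins, and the implied proteome is roughly 18,000 proteins.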

Conclusion

This represents direct experimental evidence that the nuclear proteome comprises at least 14% of the entire proteome. This high-quality nuclear proteome dataset was used to evaluate computational approaches designed to identify additional nuclear proteins. Based on this analysis, researchers can choose the stringency and the types of evidence they consider when inferring the size and complement of the nuclear proteome.
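One way a computational predictor might be scored against a reference dataset of this kind is sketched below. This is not the authors' evaluation procedure, which the abstract does not describe. The function name, the protein identifiers, and the use of a separate non-nuclear reference set are all hypothetical and chosen only for illustration.

```python
def evaluate_predictor(predicted_nuclear, reference_nuclear, reference_non_nuclear):
    """Score a set of predicted nuclear proteins against reference sets.

    Returns (sensitivity, precision). Predictions for proteins that appear
    in neither reference set are ignored, since their status is unknown.
    """
    true_pos = len(predicted_nuclear & reference_nuclear)
    false_pos = len(predicted_nuclear & reference_non_nuclear)
    false_neg = len(reference_nuclear - predicted_nuclear)

    sensitivity = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    return sensitivity, precision


# Hypothetical toy identifiers, not taken from the paper.
reference_nuclear = {"P1", "P2", "P3", "P4"}
reference_non_nuclear = {"P5", "P6"}
predicted_nuclear = {"P1", "P2", "P5", "P7"}  # P7 has no reference annotation

sensitivity, precision = evaluate_predictor(
    predicted_nuclear, reference_nuclear, reference_non_nuclear
)
print(f"sensitivity={sensitivity:.2f} precision={precision:.2f}")
```

Raising the stringency of a predictor would typically trade sensitivity for precision, which is the kind of choice the conclusion leaves to the researcher.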