This article has not been peer reviewed.

Deposited research article

Universality in large-scale structure of complete genomes

Li-Ching Hsieh1,2, Ta-Yuan Chen1, Chang-Heng Chang1, Wen-Lang Fan1 and Hoong-Chien Lee1,2,3*

Author Affiliations

1 Department of Physics, National Central University, Chungli, Taiwan 320

2 Department of Life Sciences, National Central University, Chungli, Taiwan 320

3 Center for Complex Systems, National Central University, Chungli, Taiwan 320


Genome Biology 2004, 5:P7 doi:10.1186/gb-2004-5-3-p7


This is the first version of this article to be made available publicly.

Published: 28 January 2004

Abstract

The abundance of duplications in genomes, in the form of paralogs, pseudogenes and a variety of repeats, suggests that genomes may have used duplication as one mode of growth. However, systematic knowledge of all possible duplications in whole genomes is still lacking. This paper reports the results of a detailed study of the occurrence frequencies of short oligonucleotides in all extant complete genomes. We found a systematic pattern of repeats of short oligonucleotides that places all the complete genomes except Plasmodium in a single universality class expressed by an extremely simple formula. Our analysis of the data, combined with computer simulation of genome growth models, suggests a simple coarse-grained representation of genome growth: the ancestors of the genomes began to grow when they were no greater than 300 b in length, via a mechanism whose main components were neutral stochastic segmental replicative translocations and random small mutations.
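The two ingredients named in the abstract, counting occurrence frequencies of short oligonucleotides and growing a sequence by segmental duplication with random point mutations, can be illustrated with a minimal Python sketch. This is not the authors' code or model: the function names (`kmer_counts`, `grow_by_duplication`), the uniform choice of segment length and insertion point, and all parameter values are assumptions made for illustration only.

```python
import random
from collections import Counter


def kmer_counts(seq, k):
    """Count overlapping occurrences of every k-letter word in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def grow_by_duplication(seed_len, target_len, max_segment, mutation_rate, rng):
    """Toy growth model (illustrative assumption, not the paper's model).

    Start from a random seed sequence, then repeatedly copy a randomly
    chosen segment and insert the copy at a random position. After each
    insertion, with probability mutation_rate, overwrite one random base
    with a randomly drawn letter (which may equal the old letter).
    """
    genome = [rng.choice("ACGT") for _ in range(seed_len)]
    while len(genome) < target_len:
        # Segment length is uniform on 1..max_segment, capped by genome size.
        seg_len = rng.randint(1, min(max_segment, len(genome)))
        start = rng.randrange(len(genome) - seg_len + 1)
        segment = genome[start:start + seg_len]
        # Insert the copied segment at a uniformly chosen position.
        insert_at = rng.randrange(len(genome) + 1)
        genome[insert_at:insert_at] = segment
        # Occasional point mutation.
        if rng.random() < mutation_rate:
            pos = rng.randrange(len(genome))
            genome[pos] = rng.choice("ACGT")
    # Trim any overshoot so the result has exactly target_len letters.
    return "".join(genome[:target_len])


# Example run with hypothetical parameters: a 300-letter seed grown to
# 5000 letters, then the frequency of each 4-letter word is tallied.
rng = random.Random(0)
toy_genome = grow_by_duplication(300, 5000, 500, 0.5, rng)
counts = kmer_counts(toy_genome, 4)
# Histogram: how many distinct 4-mers occur exactly n times.
spectrum = Counter(counts.values())
```

Comparing `spectrum` for a grown sequence against the same quantity for a uniformly random sequence of equal length is one simple way to see how duplication broadens the distribution of oligonucleotide frequencies.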