Genome Biology


Open Access Method

A method to assess compositional bias in biological sequences and its application to prion-like glutamine/asparagine-rich domains in eukaryotic proteomes

Paul M Harrison* and Mark Gerstein

Author Affiliations

Department of Molecular Biophysics and Biochemistry, Yale University, 266 Whitney Avenue, New Haven, CT 06520-8114, USA


Genome Biology 2003, 4:R40 doi:10.1186/gb-2003-4-6-r40

Published: 30 May 2003

Abstract

We have derived a novel method to assess compositional biases in biological sequences, which is based on finding the lowest-probability subsequences for a given residue-type set. As a case study, the distribution of prion-like glutamine/asparagine-rich ((Q+N)-rich) domains (which are linked to amyloidogenesis) was assessed for budding and fission yeasts and four other eukaryotes. We find more than 170 prion-like (Q+N)-rich regions in budding yeast, and, strikingly, many fewer in fission yeast. Also, some residues, such as tryptophan or isoleucine, are unlikely to form biased regions in any eukaryotic proteome.
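The idea of searching for the lowest-probability subsequence for a residue-type set can be illustrated with a short sketch. The abstract does not state the probability model, so everything specific here is an assumption made for illustration: a binomial tail probability, a single background frequency for the residue set, fixed minimum and maximum subsequence lengths, and the hypothetical function names `binomial_tail` and `lowest_probability_subsequence`. It is not presented as the authors' implementation.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance of seeing at least k
    residues from the set in n positions (assumed model, not from the paper)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def lowest_probability_subsequence(seq, residues, background,
                                   min_len=10, max_len=100):
    """Return (probability, start, end) for the subsequence seq[start:end]
    whose count of `residues` is least likely under `background`.

    `background` is the assumed proteome-wide frequency of the residue set;
    `min_len` and `max_len` are illustrative bounds, not values from the paper.
    """
    # Prefix sums give the residue-set count of any subsequence in O(1).
    prefix = [0]
    for aa in seq:
        prefix.append(prefix[-1] + (1 if aa in residues else 0))

    best = (1.0, 0, 0)
    n = len(seq)
    for start in range(n):
        for length in range(min_len, min(max_len, n - start) + 1):
            end = start + length
            k = prefix[end] - prefix[start]
            prob = binomial_tail(k, length, background)
            if prob < best[0]:
                best = (prob, start, end)
    return best


# Toy sequence: a (Q+N)-rich stretch flanked by unbiased residues.
flank = "MKLVAGTSDE" * 3
seq = flank + "QNQQNNQQNQNNQQQNNQ" + flank
prob, start, end = lowest_probability_subsequence(seq, set("QN"), background=0.1)
print(start, end, seq[start:end])
```

The exhaustive scan over start positions and lengths is quadratic in sequence length, which is acceptable for a single protein but would need pruning or a cheaper scoring scheme for a whole proteome.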