Paper report

Amino-acid substitution and protein function

Reiner Veitia

Genome Biology 2001, 2:reports0020  doi:10.1186/gb-2001-2-7-reports0020

The electronic version of this article is the complete one and can be found online at: http://genomebiology.com/2001/2/7/reports/0020


Received: 1 June 2001
Published: 27 June 2001

© 2001 BioMed Central Ltd

Significance and context

Until recently, determining the effects of a mutation on a protein's structure and function has been possible only through laborious physical and biological characterization of the mutant protein. With the advent of databases containing vast amounts of information on DNA and protein sequences and on protein structure and function, computational methods that speed up the assessment of the effects of mutation can now be envisaged. Sunyaev et al. describe a computational method for assessing the impact of amino-acid substitutions on the structure and function of a protein, drawing on a wide range of information from these databases. In future, this should allow a provisional prediction of the effects of a newly identified mutation or polymorphism to be made more rapidly.

Key results

The method (see Methodological innovations for details) was first evaluated on a set of known deleterious amino-acid replacements in human proteins. The analysis produced 10-30% false negatives, that is, deleterious changes that were predicted not to be deleterious. When applied to a data set of human proteins and their orthologs from other mammals, the method predicted that about 9% of the substitutions are damaging. Because substitutions that have been fixed in functional proteins in different lineages cannot be considered deleterious, this figure provides an estimate of the false-positive rate. When the method was applied to data on well-characterized polymorphisms (those for which the allele frequency, the three-dimensional structure of the protein and the disease association are all known), 73% of damaging changes were successfully predicted. The authors also estimate that an average human genome carries approximately 20,000 heterozygous changes with respect to a human consensus sequence; of these 'substitutions', about 2,000 would be predicted to be deleterious. The authors provide theoretical and experimental evidence, however, that most deleterious amino-acid replacements are not expected to abolish protein function. Even with all currently available information taken into account by the method, much work is still required to improve predictive performance.
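The error rates quoted above matter when reading the count of roughly 2,000 predicted deleterious changes. The sketch below is an illustrative back-of-envelope correction, not a calculation taken from the paper: it treats the report's rounded figures as exact, assumes that the 9% false-positive rate and the 10-30% false-negative rate apply unchanged to the variants in an individual genome, and uses a hypothetical function name (`corrected_true_count`).

```python
def corrected_true_count(n_total, n_predicted, false_positive_rate, sensitivity):
    """Estimate how many changes are truly damaging, given a predicted count.

    Solves  n_predicted = sensitivity * D + false_positive_rate * (n_total - D)
    for D, then clamps the answer to the range [0, n_total].
    """
    if sensitivity <= false_positive_rate:
        raise ValueError("sensitivity must exceed the false-positive rate")
    estimate = (n_predicted - false_positive_rate * n_total) / (
        sensitivity - false_positive_rate
    )
    return min(max(estimate, 0.0), float(n_total))


# Rounded figures from the report (assumed exact for illustration only).
N_TOTAL = 20_000      # heterozygous changes per genome
N_PREDICTED = 2_000   # changes predicted to be deleterious
FPR = 0.09            # false-positive rate from the ortholog data set

for false_negative_rate in (0.10, 0.30):
    sensitivity = 1.0 - false_negative_rate
    estimate = corrected_true_count(N_TOTAL, N_PREDICTED, FPR, sensitivity)
    print(f"false-negative rate {false_negative_rate:.0%}: "
          f"about {estimate:.0f} truly damaging changes")
```

Because the predicted fraction (about 10%) is so close to the false-positive rate (about 9%), the corrected estimate is very sensitive to small changes in either figure and should not be read as a result of the paper.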

Methodological innovations

The authors propose a combined prediction strategy that explores the physicochemical effects of each amino-acid change and exploits the information available in sequence and structure databases. They take into account whether the replacement lies in an annotated active or binding site; affects the interaction with ligands; leads to a change in hydrophobicity or electrostatic charge at a buried site; destroys a disulfide bond; inserts a proline in an α-helix; or is incompatible with the profile of substitutions observed in a set of aligned homologous proteins. An evolutionary conservation analysis has also been integrated into the prediction tool. For this, a set of proteins homologous (more than 30% identity) to the one under analysis is assembled and a sequence profile is extracted from the aligned sequences. A scoring system is then used to evaluate whether the amino-acid replacement alters protein structure and/or function.
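The conservation step described above can be illustrated with a minimal sketch. It is not the authors' scoring system, which this report does not specify: the log-ratio score, the pseudocount, the threshold of 1.5 and all function names are assumptions chosen for illustration. Only the 30% identity cut-off comes from the report. Structural findings (for example, a proline inserted in an α-helix) are passed in as ready-made labels, because deriving them requires structure annotation that is outside the scope of the sketch.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def fraction_identity(query, other):
    """Fraction of non-gap query positions at which the aligned sequence matches."""
    pairs = [(q, o) for q, o in zip(query, other) if q != "-"]
    if not pairs:
        return 0.0
    return sum(q == o for q, o in pairs) / len(pairs)


def select_homologs(query, aligned, min_identity=0.30):
    """Keep aligned sequences with more than min_identity identity to the query."""
    return [seq for seq in aligned if fraction_identity(query, seq) > min_identity]


def column_profile(column, pseudocount=1.0):
    """Smoothed residue frequencies for one alignment column (gaps are ignored)."""
    counts = Counter(res for res in column if res in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}


def profile_score_difference(column, wild_type, variant, pseudocount=1.0):
    """Log ratio of wild-type to variant frequency; larger means less tolerated."""
    profile = column_profile(column, pseudocount)
    return math.log(profile[wild_type] / profile[variant])


def assess_substitution(query, aligned, position, variant,
                        structural_flags=(), threshold=1.5):
    """Return the list of reasons (possibly empty) to suspect a damaging change."""
    reasons = list(structural_flags)
    homologs = select_homologs(query, aligned)
    # The column holds the query residue plus the residue from each homolog.
    column = query[position] + "".join(seq[position] for seq in homologs)
    if profile_score_difference(column, query[position], variant) > threshold:
        reasons.append("incompatible with homolog profile")
    return reasons


# Toy alignment (hypothetical sequences, all pre-aligned to the same length).
query = "MKTAYIAKQR"
aligned = ["MKTAYIAKQR", "MRTAYLAKQR", "MKSAYIARER", "GGGGGGGGGG"]

print(assess_substitution(query, aligned, 3, "P"))
print(assess_substitution(query, aligned, 5, "L",
                          structural_flags=["buried charge change"]))
```

In the toy alignment, the last sequence shares no residues with the query and is dropped by the identity filter. Position 3 is invariant in the remaining homologs, so a proline there is flagged by the profile, whereas the isoleucine-to-leucine change at position 5 is already observed in one homolog and is reported only because of the structural label supplied by the caller.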

Links

The data used to compile this study are available from the website of the Bork lab. Another approach to predicting the effects of mutation is described in a related report (Genome Biology 2(7):reports0019).

Reporter's comments

Unfortunately, ab initio predictions of the impact of an amino-acid replacement are not yet possible (if indeed they ever will be) and, at the very minimum, homologous protein sequence data are required for the method presented here to work. Although this might be a problem today, it is likely to become less so in the near future, as the sequences of more genomes become available. From the data and analyses presented in the paper, it is clear that each of us has always been, and will always be, an 'average' individual from a selective point of view.

Table of links

Assumptions that are made about each paper that is the subject of a report, unless otherwise specified:
The full text and figures are available only to subscribers of the journal, but are available over the internet from the journal's website. The paper itself is abstracted by PubMed. There is no supplementary material.