Pfam: multiple sequence alignments and HMM-profiles of protein domains.
Sonnhammer EL, Eddy SR, Birney E, Bateman A, Durbin R.
Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, Building 38A, Room 8N805, National Institutes of Health, Bethesda, MD 20894, USA. sonnhammer@ncbi.nlm.nih.gov
Pfam contains multiple alignments and hidden Markov model based profiles (HMM-profiles) of complete protein domains. The definition of domain boundaries, family members and alignment is done semi-automatically based on expert knowledge, sequence similarity, other protein family databases and the ability of HMM-profiles to correctly identify and align the members. Release 2.0 of Pfam contains 527 manually verified families which are available for browsing and on-line searching via the World Wide Web in the UK at http://www.sanger.ac.uk/Pfam/ and in the US at http://genome.wustl.edu/Pfam/. Pfam 2.0 matches one or more domains in 50% of Swissprot-34 sequences, and 25% of a large sample of predicted proteins from the Caenorhabditis elegans genome.
PMID: 9399864 [PubMed - indexed for MEDLINE]
PMCID: PMC147209