Box 1

Sources of standard disease phenotype terminology

International standards for describing disease phenotypes

The World Health Organization's International Classification of Diseases (ICD) is a widely used standard terminology for the classification of diseases and health disorders [46]. The current version is available in more than 30 languages, covers more than 14,000 medical terms and includes adaptations focused on specific health areas such as oncology, mental disorders or primary care.

The Unified Medical Language System Metathesaurus (UMLS) is also a well-known source of ontology standards, integrating more than 2 million medical terms and 12 million relationships between them [43]. UMLS-associated projects include the Medical Subject Headings (MeSH) thesaurus, a controlled vocabulary used for cataloging biomedical and health-related documents. Because MeSH terms are used to label Medline abstracts, the thesaurus provides one of the most popular search facilities. UMLS also contains the Logical Observation Identifiers Names and Codes (LOINC) [47], a catalogue of universal identifiers designed for the electronic exchange of laboratory and clinical test results [48].

Another source of standard terminology is the Systematized Nomenclature of Medicine-Clinical Terms (SNOMED-CT) [49], supported by the International Health Terminology Standards Development Organization [50]. This computer-readable collection of medical terms covers diverse clinical areas such as diseases, medical procedures and drugs. SNOMED-CT currently contains more than 310,000 concepts, each with a unique meaning and a formal logic-based definition, organized into hierarchies. SNOMED-CT has already been extended to Spanish, and translations into other languages such as Danish, French and Swedish are under way, addressing one of the pressing needs in the multilingual environment of medical records.
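As a rough illustration of what organizing concepts into hierarchies makes possible, the following Python sketch models a toy is-a hierarchy and answers subsumption queries such as "is concept X a kind of concept Y?". It is a minimal sketch under stated assumptions: the identifiers (C0 to C4), their labels and the helper names `ancestors` and `is_a` are invented for illustration and are not real SNOMED-CT codes or part of any SNOMED-CT software.

```python
from collections import deque

# Toy is-a hierarchy: child concept -> list of parent concepts.
# Identifiers and labels are hypothetical; they are NOT real SNOMED-CT codes.
PARENTS = {
    "C0": [],            # root, e.g. "clinical finding"
    "C1": ["C0"],        # e.g. "disease"
    "C2": ["C1"],        # e.g. "cardiovascular disease"
    "C3": ["C1"],        # e.g. "metabolic disease"
    "C4": ["C2", "C3"],  # a concept assumed to have two parents
}


def ancestors(concept, parents=PARENTS):
    """Return every concept reachable by following is-a links upward."""
    seen = set()
    queue = deque(parents[concept])
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.add(current)
            queue.extend(parents[current])
    return seen


def is_a(concept, candidate, parents=PARENTS):
    """True if `concept` equals `candidate` or is one of its descendants."""
    return concept == candidate or candidate in ancestors(concept, parents)
```

With such a structure, a query for all records annotated with a general concept can also retrieve records annotated with any of its more specific descendants, which is one practical reason for storing terms in a hierarchy rather than a flat list.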

Complementary disease-related ontologies are the Human Phenotype Ontology (HPO) [51], with more than 8,000 terms representing individual phenotypic abnormalities [52], and the Disease Ontology (DOID) [53], which is part of the Open Biological Ontologies Foundry (OBO) [54].

Information on disease phenotypes related to particular genes and proteins

The Online Mendelian Inheritance in Man (OMIM) database stores information such as gene descriptions, inheritance patterns, localization maps and polymorphisms for more than 12,500 gene loci and phenotypic descriptions [55].

Although not specifically dedicated to disease-related annotation, SwissProt, the key source of information about protein function, also includes information linking proteins and their associated mutations with pathologies. It provides a very useful link between MeSH disease terminology and specific proteins [56].

Standardization of disease descriptions is also fundamental for the exchange of electronic medical records and for their interoperability. Major efforts such as the Health Level Seven (HL7) [57] and Digital Imaging and Communications in Medicine (DICOM) [58] protocols provide standards for sharing and retrieving electronic health information and medical images. A more detailed description of standards for electronic medical charts is provided in specialized reviews [59].

Baudot et al. Genome Biology 2009 10:221   doi:10.1186/gb-2009-10-6-221