<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>gb-spotlight-20030515-01</ui>
   <ji>GBJ</ji>
   <fm>
      <dochead>Research news</dochead>
      <bibl>
         <title>
            <p>Many yeasts win the vote</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Holding</snm>
               <fnm>Cathy</fnm>
               <email>cholding@hgmp.mrc.ac.uk</email>
            </au>
         </aug>
         <source>Genome Biology</source>
         <issn>1465-6906</issn>
         <pubdate>2003</pubdate>
         <volume>4</volume>
         <fpage>spotlight-20030515-01</fpage>
         <xrefbib>
            <pubid idtype="doi">10.1186/gb-spotlight-20030515-01</pubid>
         </xrefbib>
      </bibl>
      <history>
         <pub>
            <date>
               <day>15</day>
               <month>5</month>
               <year>2003</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2003</year>
         <collab>BioMed Central Ltd</collab>
      </cpyrt>
      <shortabs>
         <p>Comparative genomics with two or more related species reveals the limitations of single-genome analysis</p>
      </shortabs>
   </fm>
   <meta>
      <classifications>
         <classification type="news" subtype="status">Live</classification>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p/>
         </st>
         <p>The <abbr bid="B1">Human Genome Project</abbr> set out to unravel the secrets of the genes by determining the primary sequence of the human genome, but it has become clear that this information alone is insufficient. <abbr bid="B2">Determination</abbr> of functional and coding sequences in a primary genome sequence depends on <it>a priori</it> knowledge of gene function and on <abbr bid="B3">statistics</abbr>, so the information obtained is incomplete and probabilistic. In the May 15 <abbr bid="B4"><it>Nature</it></abbr>, Manolis Kellis and colleagues at the <abbr bid="B5">Whitehead Institute/Massachusetts Institute of Technology Center for Genome Research</abbr> develop and apply a general approach for identifying significant regions in primary sequence by whole-genome comparison of several related species. They reasoned that evolution would conserve protein-coding and regulatory elements, and that comparing more than two genomes would increase the signal-to-noise ratio by highlighting changes that were not due to chance (<it>Nature</it> 2003, <b>423:</b>241-254).</p>
         <p>Kellis et al. compared the sequences of four related species of yeast (<it>Saccharomyces cerevisiae</it>, <it>S. paradoxus</it>, <it>S. mikatae</it>, and <it>S. bayanus</it>) and employed a "voting system" to judge the validity of theoretical open reading frames (ORFs) and the accuracy of proposed gene structures, such as promoters, translation start and stop sites, and intron/exon boundaries. They propose to reduce the number of genes in the yeast gene catalogue by eliminating 503 invalid ORFs and to redefine gene structure assignments in at least 300 cases. They identified 188 genes that encode small proteins of fewer than 100 amino acids, as well as many new genes and regulatory elements; they were also able to infer functions for more than half of their 42 newly discovered sequence motifs by categorizing the genes associated with them. In addition, they found evidence for rapid genome evolution at all of the telomeres.</p>
         <p>"The analyses will produce a substantial revision in our knowledge of the yeast genome and provide strategic directions for how we might select other sequencing targets to advance understanding of the human genome," writes Steven Salzberg of <abbr bid="B6">The Institute for Genomic Research</abbr> in an accompanying News and Views article. "This new study of yeast genomes makes it clear that comparative genome sequencing has tremendous analytical power," he concludes.</p>
      </sec>
   </bdy>
   <bm>
      <refgrp>
         <bibl id="B1">
            <url>http://www.genome.gov/</url>
            <note>Human Genome Project</note>
         </bibl>
         <bibl id="B2">
            <note>Computational identification of cis-regulatory elements associated with groups of functionally related genes in <it>Saccharomyces cerevisiae</it>.</note>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10698627</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <note>Greedy mixture learning for multiple motif discovery in biological sequences.</note>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12651719</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <url>http://www.nature.com</url>
            <note>
               <it>Nature</it>
            </note>
         </bibl>
         <bibl id="B5">
            <url>http://www-genome.wi.mit.edu/</url>
            <note>Whitehead Institute/Massachusetts Institute of Technology Center for Genome Research</note>
         </bibl>
         <bibl id="B6">
            <url>http://www.tigr.org</url>
            <note>The Institute for Genomic Research</note>
         </bibl>
      </refgrp>
   </bm>
</art>
