<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>gb-2008-9-3-r54</ui>
   <ji>GBJ</ji>
   <fm>
      <dochead>Research</dochead>
      <bibl>
         <title>
            <p>Functional diversification of duplicate genes through subcellular adaptation of encoded proteins</p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>Marques</snm>
               <mi>C</mi>
               <fnm>Ana</fnm>
               <insr iid="I1"/>
               <email>ana.marques@unil.ch</email>
            </au>
            <au id="A2">
               <snm>Vinckenbosch</snm>
               <fnm>Nicolas</fnm>
               <insr iid="I1"/>
               <email>nicolas.vinckenbosch@unil.ch</email>
            </au>
            <au id="A3">
               <snm>Brawand</snm>
               <fnm>David</fnm>
               <insr iid="I1"/>
               <email>david.brawand@unil.ch</email>
            </au>
            <au id="A4" ca="yes">
               <snm>Kaessmann</snm>
               <fnm>Henrik</fnm>
               <insr iid="I1"/>
               <email>henrik.kaessmann@unil.ch</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Center for Integrative Genomics, University of Lausanne, CH-1015 Lausanne, Switzerland</p>
            </ins>
         </insg>
         <source>Genome Biology</source>
         <issn>1465-6906</issn>
         <pubdate>2008</pubdate>
         <volume>9</volume>
         <issue>3</issue>
         <fpage>R54</fpage>
         <url>http://genomebiology.com/2008/9/3/R54</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">18336717</pubid>
               <pubid idtype="doi">10.1186/gb-2008-9-3-r54</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>25</day>
               <month>10</month>
               <year>2007</year>
            </date>
         </rec>
         <revrec>
            <date>
               <day>29</day>
               <month>1</month>
               <year>2008</year>
            </date>
         </revrec>
         <acc>
            <date>
               <day>12</day>
               <month>3</month>
               <year>2008</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>12</day>
               <month>03</month>
               <year>2008</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2008</year>
         <collab>Marques et al.; licensee BioMed Central Ltd.</collab>
         <note>This is an open access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <shorttitle>
         <p>Subcellular localization of duplicate genes</p>
      </shorttitle>
      <shortabs>
         <p>Analysis of the subcellular localization patterns of duplicate genes revealed that protein subcellular adaptation represents a common mechanism for the functional diversification of duplicate genes.</p>
      </shortabs>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Gene duplication is the primary source of new genes with novel or altered functions. It is known that duplicates may obtain these new functional roles by evolving divergent expression patterns and/or protein functions after the duplication event. Here, using yeast (<it>Saccharomyces cerevisiae</it>) as a model organism, we investigate a previously little considered mode for the functional diversification of duplicate genes: subcellular adaptation of encoded proteins.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>We show that for 24-37% of duplicate gene pairs derived from the <it>S. cerevisiae </it>whole-genome duplication event, the two members of the pair encode proteins that localize to distinct subcellular compartments. The propensity of yeast duplicate genes to evolve new localization patterns depends to a large extent on the biological function of their progenitor genes. Proteins involved in processes with a wider subcellular distribution (for example, catabolism) frequently evolved new protein localization patterns after duplication, whereas duplicate proteins limited to a smaller number of organelles (for example, highly expressed biosynthesis/housekeeping proteins with a slow rate of evolution) rarely relocate within the cell. Paralogous proteins evolved divergent localization patterns by partitioning of ancestral localizations ('sublocalization'), but probably more frequently by relocalization to new compartments ('neolocalization'). We show that such subcellular reprogramming may occur through selectively driven substitutions in protein targeting sequences. Notably, our data also reveal that relocated proteins functionally adapted to their new subcellular environments and evolved new functional roles through changes of their physico-chemical properties, expression levels, and interaction partners.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>We conclude that protein subcellular adaptation represents a common mechanism for the functional diversification of duplicate genes.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="BMC" subtype="man_spc_id" id="30010009">Genetics</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010010">Genome studies</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010008">Evolution</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010014">Microbiology and parasitology</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010015">Model organisms</classification>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Gene duplication is an important evolutionary mechanism, providing genomes with the genetic raw material for the emergence of genes with new or altered functions <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Several evolutionary fates of the two duplicate gene copies are possible and have been described. For instance, one of the two copies may be redundant and accumulate deleterious mutations that eventually render it a non-functional pseudogene <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Alternatively, both copies might be functionally preserved by natural selection if an increase in gene dosage of the ancestral gene is beneficial <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>, or if a change of the stoichiometry of proteins in complexes (for example, after whole genome duplication (WGD) events) would be deleterious <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>. Finally, if both gene copies are preserved after the duplication event, they may functionally diverge in two major ways.</p>
         <p>In the classic scenario termed neofunctionalization <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>, one of the duplicates evolves a new function (usually defined as a new biochemical function of the encoded protein), while the other retains the ancestral function of the progenitor gene. An alternative model - termed subfunctionalization - posits that the ancestral functions are partitioned between the two duplicates such that their joint levels and patterns of activity are equivalent to the single ancestral gene <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr></abbrgrp>. 'Gene function' in this model is defined as either a function of the encoded proteins <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B8">8</abbr></abbrgrp> or the expression pattern of the gene <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B9">9</abbr></abbrgrp>. In addition, a combination of these two scenarios ('subneofunctionalization') was recently proposed <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>.</p>
         <p>The subcellular localization of a protein is key to its function in the cell <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. In view of this, and prompted by individual reports of gene families whose encoded proteins differ in their subcellular localization (see, for example, <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>; for more individual examples, see also <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>), we set out to systematically investigate an as yet little-considered alternative mechanism for the functional diversification of duplicate genes, namely, the subcellular relocalization and adaptation of their encoded proteins <abbrgrp><abbr bid="B14">14</abbr></abbrgrp> (which may or may not be followed or accompanied by changes in gene expression patterns and/or functional/biochemical properties of the proteins).</p>
         <p>To this end, we used the yeast <it>Saccharomyces cerevisiae </it>as a model, for three reasons. First, the subcellular localization of a large proportion (approximately 75%) of its proteins was recently established <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. Second, in addition to other duplicates, the WGD event in this species, which occurred approximately 100 million years ago <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>, resulted in a large set of well-defined duplicate gene pairs with the same age (that is, they have the same divergence time). Finally, a wide range of genome- and proteome-wide functional data sets are available for this organism. Thus, the <it>S. cerevisiae </it>genome/proteome provides a unique opportunity to assess the extent and patterns of protein subcellular adaptation after gene duplication.</p>
      </sec>
      <sec>
         <st>
            <p>Results and discussion</p>
         </st>
         <sec>
            <st>
               <p>Subcellular divergence is common among yeast whole-genome duplicates</p>
            </st>
            <p>Using protein localization data (22 compartments; obtained by green fluorescent protein (GFP)-fusion analysis) covering 75% of the <it>S. cerevisiae </it>proteome <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>, we established the subcellular localization of proteins encoded by 900 yeast genes, forming 450 pairs of WGD-derived duplicate genes <abbrgrp><abbr bid="B17">17</abbr></abbrgrp> (see Materials and methods for details). Among these, we collected 238 pairs for which both paralogs are unambiguously assigned to at least one subcellular compartment (Table <tblr tid="T1">1</tblr>; see Materials and methods). For 88 of these protein pairs (approximately 37%), we found that the two duplicates are located in at least one different subcellular compartment (Table <tblr tid="T1">1</tblr> and Additional data file 1).</p>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Subcellular localization data for <it>S. cerevisiae </it>proteins in this study</p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Number of <it>S. cerevisiae </it>proteins*</p>
                     </c>
                     <c ca="center">
                        <p>Number of WGD duplicate pairs<sup>&#8224;</sup></p>
                     </c>
                     <c ca="center">
                        <p>Number of WGD pairs with distinct localizations for the two members</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>GFP tagging<sup>&#8225;</sup></p>
                     </c>
                     <c ca="center">
                        <p>3,919 (62.9%)</p>
                     </c>
                     <c ca="center">
                        <p>238 (52.8%)</p>
                     </c>
                     <c ca="center">
                        <p>88 (37.0%)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Epitope tagging<sup>&#167;</sup></p>
                     </c>
                     <c ca="center">
                        <p>2,745 (44.0%)</p>
                     </c>
                     <c ca="center">
                        <p>124 (27.5%)</p>
                     </c>
                     <c ca="center">
                        <p>53 (42.7%)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>GFP/epitope tagging overlap<sup>&#182;</sup></p>
                     </c>
                     <c ca="center">
                        <p>2,716 (43.5%)</p>
                     </c>
                     <c ca="center">
                        <p>75 (16.7%)</p>
                     </c>
                     <c ca="center">
                        <p>18 (24.0%)</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>*Out of 6,234 annotated yeast open reading frames [15]. <sup>&#8224;</sup>Out of 450 gene pairs [17]. <sup>&#8225;</sup>Based on the unambiguous protein localization assignments from [15]. <sup>&#167;</sup>Based on [19]. <sup>&#182;</sup>Gene pairs for which the subcellular localization is unambiguously assigned to both paralogs in both datasets.</p>
               </tblfn>
            </tbl>
            <p>The localization data we used were previously shown to be in 80% agreement with data (small and large-scale) from the <it>Saccharomyces </it>Genome Database <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B18">18</abbr></abbrgrp>, suggesting that the subcellular assignments are generally reliable. However, to assess to what extent experimental artifacts may have influenced the analysis of subcellular divergence between duplicates, we performed a second analysis using earlier <it>S. cerevisiae </it>localization data generated by epitope-tagging <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. The two localization analyses differ considerably in their experimental setup and in the number of cellular compartments covered. Thus, the error sources and any misassigned subcellular localizations are expected to differ between the two datasets (for details see <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>).</p>
            <p>We found no significant difference in the proportion of paralogous protein pairs showing distinct subcellular localizations between the GFP (88 of 238 pairs, approximately 37%) and epitope data (53 of 124 pairs, approximately 43%; two-tailed <it>P </it>= 0.31, Fisher's exact test; Table <tblr tid="T1">1</tblr>). We also considered 75 paralogous protein pairs for which localization was assigned in both the GFP and the epitope fusion analyses. Among these, 18 (24%) showed a distinct subcellular localization in both experimental sets (Table <tblr tid="T1">1</tblr>). Thus, we estimate that approximately 24-37% of the <it>S. cerevisiae </it>WGD pairs show protein localization differences, broadly consistent with a recent estimate (approximately 19%) based on Gene Ontology (GO) annotation <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. This suggests that a substantial proportion of yeast duplicates have diverged in terms of their subcellular localization.</p>
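            <p>As an illustration of the comparison between the two datasets, the following minimal sketch computes a two-sided Fisher's exact test from the counts in Table 1 using only the Python standard library. The function name and the tie-handling tolerance are assumptions made for this example; the exact software and settings used for the original analysis are not specified here.</p>
            <p><![CDATA[
```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # A small relative tolerance guards against floating-point ties.
    return sum(
        p for p in map(prob, range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-7)
    )


# Counts from Table 1: pairs with distinct localization and all assigned pairs.
gfp_distinct, gfp_total = 88, 238
epi_distinct, epi_total = 53, 124

p_value = fisher_exact_two_sided(
    gfp_distinct, gfp_total - gfp_distinct,
    epi_distinct, epi_total - epi_distinct,
)
print(f"two-sided P = {p_value:.2f}")
```
]]></p>
            <p>The rows of the contingency table are the two datasets and the columns are pairs with distinct versus identical localization, so the test asks whether the proportion of divergent pairs depends on the tagging method.</p>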
            <p>All following analyses are based on the GFP-fusion localization data <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>, since they represent the most extensive and reliable localization survey of the budding yeast proteome available. WGD-derived duplicates with distinct cellular localization will be referred to as D-pairs, and those with the same subcellular distribution as S-pairs.</p>
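            <p>The D-pair/S-pair distinction can be sketched as a comparison of compartment sets. The example below is illustrative only: the gene names and localization records are hypothetical, and treating any difference between the two compartment sets as a D-pair is an assumption based on the definition given in the text.</p>
            <p><![CDATA[
```python
def classify_pair(loc_a, loc_b):
    """Classify a duplicate pair by the compartments of its two proteins.

    Returns 'D' if the paralogs differ in at least one compartment,
    'S' if their compartment sets are identical, and None if either
    paralog lacks an unambiguous assignment.
    """
    set_a, set_b = frozenset(loc_a), frozenset(loc_b)
    if not set_a or not set_b:
        return None
    return "S" if set_a == set_b else "D"


# Hypothetical localization records (gene names and compartments are made up).
pairs = {
    ("geneA1", "geneA2"): (["nucleus"], ["nucleus"]),
    ("geneB1", "geneB2"): (["cytoplasm", "nucleus"], ["cytoplasm"]),
    ("geneC1", "geneC2"): (["mitochondrion"], []),
}

labels = {names: classify_pair(a, b) for names, (a, b) in pairs.items()}
assigned = [label for label in labels.values() if label is not None]
d_fraction = assigned.count("D") / len(assigned)

print(labels)
print(f"fraction of D-pairs among assigned pairs: {d_fraction:.2f}")
```
]]></p>
            <p>Pairs in which one paralog has no unambiguous assignment are excluded from the denominator, mirroring the restriction to pairs for which both paralogs are assigned to at least one compartment.</p>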
         </sec>
         <sec>
            <st>
               <p>Subcellular localization change and protein function</p>
            </st>
            <p>As some biological processes (for example, phosphorylation) are widespread in the cell, whereas others, such as transcription, are restricted to certain organelles (nucleus, mitochondria), one might expect ancestral functions to impose different constraints on the subcellular diversification potential of duplicates.</p>
            <p>To assess whether a gene's biological function indeed influences the subcellular localization fate of proteins after duplication, we tested for general functional differences between genes in S- and D-pairs using GO annotation. In this analysis, we assume that the current GO distribution of duplicates overall reflects that of their ancestors. Two GO categories stand out (Table <tblr tid="T2">2</tblr>). While S-pairs show a significant excess of genes involved in biosynthetic processes, D-pairs are significantly enriched with genes involved in catabolism (<it>P </it>&#8804; 0.01 after false discovery rate correction <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>). We note that, generally, <it>S. cerevisiae </it>proteins (excluding the WGD duplicates) involved in catabolism are located in 1.47 compartments on average, while those that contribute to biosynthesis localize to 1.35 compartments, a significantly different distribution (two-tailed <it>P </it>&lt; 0.01, Mann-Whitney <it>U </it>test). This suggests that the <it>a priori </it>wider subcellular distribution of proteins involved in catabolic pathways facilitates functional divergence through subcellular relocalization after gene duplication when compared to biosynthetic proteins, which show more restricted localization patterns.</p>
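            <p>To illustrate how a false discovery rate correction adjusts per-category <it>P</it>-values, the sketch below implements the Benjamini-Hochberg step-up procedure. This is an assumption for illustration: the text states only that a false discovery rate correction was applied, and the raw <it>P</it>-values used here are hypothetical, not values from this study.</p>
            <p><![CDATA[
```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P-values for five GO categories.
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)

for p, q in zip(raw, adj):
    flag = "significant" if q <= 0.01 else "not significant"
    print(f"raw P = {p:.3f}  adjusted P = {q:.5f}  ({flag} at 0.01)")
```
]]></p>
            <p>Each adjusted value is the smallest false discovery rate at which the corresponding category would be called significant, so categories can be filtered with a single threshold such as 0.01.</p>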
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Summary of GO analysis for D- and S-pair duplicates</p>
               </caption>
               <tblbdy cols="7">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="2" ca="center">
                        <p>D-pair</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>S-pair</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Biological process*</p>
                     </c>
                     <c ca="center">
                        <p>No.</p>
                     </c>
                     <c ca="center">
                        <p>Percentage</p>
                     </c>
                     <c ca="center">
                        <p>No.</p>
                     </c>
                     <c ca="center">
                        <p>Percentage</p>
                     </c>
                     <c ca="center">
                        <p><it>P</it>-value<sup>&#8224;</sup></p>
                     </c>
                     <c ca="center">
                        <p>Excess<sup>&#8225;</sup></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Biosynthetic process</p>
                     </c>
                     <c ca="center">
                        <p>32</p>
                     </c>
                     <c ca="center">
                        <p>21.1</p>
                     </c>
                     <c ca="center">
                        <p>111</p>
                     </c>
                     <c ca="center">
                        <p>42.9</p>
                     </c>
                     <c ca="center">
                        <p>0.0008</p>
                     </c>
                     <c ca="center">
                        <p>56</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Catabolic process</p>
                     </c>
                     <c ca="center">
                        <p>32</p>
                     </c>
                     <c ca="center">
                        <p>21.1</p>
                     </c>
                     <c ca="center">
                        <p>17</p>
                     </c>
                     <c ca="center">
                        <p>6.6</p>
                     </c>
                     <c ca="center">
                        <p>0.0024</p>
                     </c>
                     <c ca="center">
                        <p>22</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Regulation of biological process</p>
                     </c>
                     <c ca="center">
                        <p>59</p>
                     </c>
                     <c ca="center">
                        <p>38.8</p>
                     </c>
                     <c ca="center">
                        <p>56</p>
                     </c>
                     <c ca="center">
                        <p>21.6</p>
                     </c>
                     <c ca="center">
                        <p>0.0172</p>
                     </c>
                     <c ca="center">
                        <p>26</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Response to biotic stimulus</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0.0</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>4.6</p>
                     </c>
                     <c ca="center">
                        <p>0.1480</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Nitrogen compound metabolic process</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>2.6</p>
                     </c>
                     <c ca="center">
                        <p>21</p>
                     </c>
                     <c ca="center">
                        <p>8.1</p>
                     </c>
                     <c ca="center">
                        <p>0.3988</p>
                     </c>
                     <c ca="center">
                        <p>14</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Cell communication</p>
                     </c>
                     <c ca="center">
                        <p>18</p>
                     </c>
                     <c ca="center">
                        <p>11.8</p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                     <c ca="center">
                        <p>5.8</p>
                     </c>
                     <c ca="center">
                        <p>0.4587</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Primary metabolic process</p>
                     </c>
                     <c ca="center">
                        <p>110</p>
                     </c>
                     <c ca="center">
                        <p>72.4</p>
                     </c>
                     <c ca="center">
                        <p>210</p>
                     </c>
                     <c ca="center">
                        <p>81.1</p>
                     </c>
                     <c ca="center">
                        <p>0.5012</p>
                     </c>
                     <c ca="center">
                        <p>23</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Cell cycle</p>
                     </c>
                     <c ca="center">
                        <p>21</p>
                     </c>
                     <c ca="center">
                        <p>13.8</p>
                     </c>
                     <c ca="center">
                        <p>20</p>
                     </c>
                     <c ca="center">
                        <p>7.7</p>
                     </c>
                     <c ca="center">
                        <p>0.5012</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Response to endogenous stimulus</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>7.9</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>3.5</p>
                     </c>
                     <c ca="center">
                        <p>0.5012</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Chemical homeostasis</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0.0</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>2.3</p>
                     </c>
                     <c ca="center">
                        <p>0.5012</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Cellular developmental process</p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                     <c ca="center">
                        <p>7.2</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>3.5</p>
                     </c>
                     <c ca="center">
                        <p>0.5012</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Chromosome segregation</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>3.3</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>0.8</p>
                     </c>
                     <c ca="center">
                        <p>0.5012</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Cell division</p>
                     </c>
                     <c ca="center">
                        <p>21</p>
                     </c>
                     <c ca="center">
                        <p>13.8</p>
                     </c>
                     <c ca="center">
                        <p>25</p>
                     </c>
                     <c ca="center">
                        <p>9.7</p>
                     </c>
                     <c ca="center">
                        <p>0.6028</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Response to stress</p>
                     </c>
                     <c ca="center">
                        <p>24</p>
                     </c>
                     <c ca="center">
                        <p>15.8</p>
                     </c>
                     <c ca="center">
                        <p>29</p>
                     </c>
                     <c ca="center">
                        <p>11.2</p>
                     </c>
                     <c ca="center">
                        <p>0.6483</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Reproductive process</p>
                     </c>
                     <c ca="center">
                        <p>14</p>
                     </c>
                     <c ca="center">
                        <p>9.2</p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                     <c ca="center">
                        <p>5.8</p>
                     </c>
                     <c ca="center">
                        <p>0.6696</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Conjugation</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>4.6</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>2.3</p>
                     </c>
                     <c ca="center">
                        <p>0.6982</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Sexual reproduction</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>4.6</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>2.3</p>
                     </c>
                     <c ca="center">
                        <p>0.6982</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Aging</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0.7</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>2.3</p>
                     </c>
                     <c ca="center">
                        <p>0.7084</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Cellular metabolic process</p>
                     </c>
                     <c ca="center">
                        <p>120</p>
                     </c>
                     <c ca="center">
                        <p>79.0</p>
                     </c>
                     <c ca="center">
                        <p>214</p>
                     </c>
                     <c ca="center">
                        <p>82.6</p>
                     </c>
                     <c ca="center">
                        <p>0.7686</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Asexual reproduction</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>2.0</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>0.8</p>
                     </c>
                     <c ca="center">
                        <p>0.7686</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Nuclear division</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0.7</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0.0</p>
                     </c>
                     <c ca="center">
                        <p>0.7686</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Protein localization</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>6.6</p>
                     </c>
                     <c ca="center">
                        <p>23</p>
                     </c>
                     <c ca="center">
                        <p>8.9</p>
                     </c>
                     <c ca="center">
                        <p>0.7785</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Cell adhesion</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0.0</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>0.8</p>
                     </c>
                     <c ca="center">
                        <p>0.7785</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Response to chemical stimulus</p>
                     </c>
                     <c ca="center">
                        <p>17</p>
                     </c>
                     <c ca="center">
                        <p>11.2</p>
                     </c>
                     <c ca="center">
                        <p>25</p>
                     </c>
                     <c ca="center">
                        <p>9.7</p>
                     </c>
                     <c ca="center">
                        <p>0.8583</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>RNA localization</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>4.0</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>3.5</p>
                     </c>
                     <c ca="center">
                        <p>0.9959</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Anatomical structure development</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>7.9</p>
                     </c>
                     <c ca="center">
                        <p>19</p>
                     </c>
                     <c ca="center">
                        <p>7.3</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Establishment of localization</p>
                     </c>
                     <c ca="center">
                        <p>33</p>
                     </c>
                     <c ca="center">
                        <p>21.7</p>
                     </c>
                     <c ca="center">
                        <p>55</p>
                     </c>
                     <c ca="center">
                        <p>21.2</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Macromolecule metabolic process</p>
                     </c>
                     <c ca="center">
                        <p>98</p>
                     </c>
                     <c ca="center">
                        <p>64.5</p>
                     </c>
                     <c ca="center">
                        <p>169</p>
                     </c>
                     <c ca="center">
                        <p>65.3</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Response to external stimulus</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0.0</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0.4</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Maintenance of localization</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0.7</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0.4</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Response to abiotic stimulus</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>2.6</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>2.7</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Cell organization and biogenesis</p>
                     </c>
                     <c ca="center">
                        <p>57</p>
                     </c>
                     <c ca="center">
                        <p>37.5</p>
                     </c>
                     <c ca="center">
                        <p>97</p>
                     </c>
                     <c ca="center">
                        <p>37.5</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Regulation of biological quality</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>5.3</p>
                     </c>
                     <c ca="center">
                        <p>14</p>
                     </c>
                     <c ca="center">
                        <p>5.4</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Autophagy</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0.0</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0.4</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Filamentous growth</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>5.3</p>
                     </c>
                     <c ca="center">
                        <p>13</p>
                     </c>
                     <c ca="center">
                        <p>5.0</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Cell homeostasis</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>4.6</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>4.6</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Regulation of a molecular function</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0.7</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>1.2</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Cell proliferation</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0.0</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0.4</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Non-developmental growth</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0.7</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>0.8</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>*We selected GO level 3, since this constitutes a good compromise between the number of genes annotated and the depth of the information contained in each class [46]. <sup>&#8224;</sup>After false discovery rate correction. <sup>&#8225;</sup>Represents the difference between the observed number of genes in the over-represented set and what would be expected based on the observed genes in the other gene set.</p>
               </tblfn>
            </tbl>
            <p>Next, we analyzed the extent of amino acid divergence in D- and S-pair duplicates. To this end, we used a related yeast species, <it>Kluyveromyces waltii </it><abbrgrp><abbr bid="B17">17</abbr></abbrgrp>, which diverged from <it>S. cerevisiae </it>before the WGD event, as an outgroup, and estimated the non-synonymous substitution rate (that is, the number of non-synonymous substitutions per non-synonymous site, <it>d</it><sub>N</sub>) on the lineages leading to each one of the two <it>S. cerevisiae </it>duplicates using a maximum-likelihood approach <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> (see Materials and methods for details). This analysis revealed a difference in the <it>d</it><sub>N </sub>distribution between genes in S- and D-pairs (Figure <figr fid="F1">1</figr>; Additional data file 1; two-tailed <it>P </it>&lt; 10<sup>-5</sup>, Mann-Whitney <it>U </it>test); S-pair genes generally show lower non-synonymous substitution rates than those in D-pairs.</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Distribution of non-synonymous substitution rates (<it>d</it><sub>N</sub>) for duplicate genes in S- and D-pairs (estimated for the time since the whole-genome duplication event - see text for details)</p>
               </caption>
               <text>
                  <p>Distribution of non-synonymous substitution rates (<it>d</it><sub>N</sub>) for duplicate genes in S- and D-pairs (estimated for the time since the whole-genome duplication event - see text for details).</p>
               </text>
               <graphic file="gb-2008-9-3-r54-1"/>
            </fig>
            <p>Consistent with previous observations <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>, cases of extremely decelerated evolution (one of the duplicates has a <it>d</it><sub>N </sub>= 0) among S-pairs include protein-coding genes that are known to be highly constrained, such as ribosomal genes (28 pairs), histones (2 pairs) and elongation factors (2 pairs). Selection for increased gene dosage and/or decreased dosage imbalance may explain the intensity of purifying selection observed for these 'housekeeping' duplicates <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B3">3</abbr><abbr bid="B23">23</abbr></abbrgrp>. The fact that these duplicates did not change their subcellular localization is likely due to the specificity of their biological function, which is restricted to certain compartments and may generally preclude subcellular shifts, as suggested by our data above.</p>
            <p>We also found that S-pair genes show higher expression levels than D-pair genes (median = 1.3 copies per cell versus 0.8 copies per cell; two-tailed <it>P </it>&lt; 10<sup>-5</sup>, Mann-Whitney <it>U </it>test), consistent with the idea that many S-pairs represent duplications of housekeeping genes. This difference is also reflected at the protein level; S-pair genes (median = 5,436.9 pmol) express significantly more protein than D-pair genes (mean = 35,788.3 pmol, two-tailed <it>P </it>&lt; 0.01, Mann-Whitney <it>U </it>test).</p>
            <p>Thus, biological function generally appears to be a strong determinant of the propensity of duplicates to relocate in the cell. While duplicate proteins encoded by slowly evolving housekeeping genes with high expression levels (for example, genes involved in biosynthetic processes, such as ribosomal genes) tend to preserve ancestral localization patterns (and functions) after duplication, duplicates from other categories, such as those involved in catabolic processes, are much more likely to evolve divergent localization patterns.</p>
         </sec>
         <sec>
            <st>
               <p>Functional divergence of duplicates through neo- or sublocalization</p>
            </st>
            <p>But how do these divergent localization patterns emerge? Akin to concepts proposed for the functional divergence of duplicate genes through changes in expression and/or protein function, we hypothesized that duplicates may show two types of subcellular divergence (Figure <figr fid="F2">2</figr>). First, ancestral cellular compartments may be partitioned between them, or they may specifically localize to only part of the ancestral compartments, a process we term 'sublocalization'. Second, they may localize to new, previously unoccupied compartments ('neolocalization'). These two processes are analogous to the traditional neo/subfunctionalization concepts <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B9">9</abbr></abbrgrp>, but should be treated separately, as neo/subfunctionalization of the biochemical function and/or expression of the duplicate (that is, the previously considered fates of duplicates) may follow or accompany subcellular divergence (Figure <figr fid="F2">2</figr>).</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Illustration of the different evolutionary fates of (functional) duplicate genes</p>
               </caption>
               <text>
                  <p>Illustration of the different evolutionary fates of (functional) duplicate genes. Each gene/protein is represented in different colors: red, ancestral, 'A'; green, duplicate copy A1; and blue, duplicate copy A2. Different shapes of proteins (circle, square, and triangle) indicate different functions. Three different subcellular localizations (nucleus, cytoplasm, and cytoplasmic membrane) are indicated in a schematic cell. We note that only the major possible scenarios are illustrated here.</p>
               </text>
               <graphic file="gb-2008-9-3-r54-2"/>
            </fig>
            <p>If divergent subcellular localization between duplicates were a consequence of sublocalization alone, the joint number of different compartments per protein pair (that is, combining both duplicates) would be expected to be the same as that of the common ancestral protein. Conversely, the number of compartments per pair should be higher than that of the progenitor if neolocalization contributed to subcellular diversification.</p>
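            <p>The expectation stated above can be sketched in a few lines of Python. This is a minimal illustration only, not the analysis performed in this study: the function names and the example compartment sets are hypothetical, and the sketch assumes that the ancestral compartments are known, which is generally not the case for real data.</p>

```python
def joint_compartments(copy_a, copy_b):
    """Return the union of compartments occupied by either duplicate."""
    return set(copy_a) | set(copy_b)


def classify_divergence(ancestral, copy_a, copy_b):
    """Classify a duplicate pair by the count-based expectation.

    All inputs are hypothetical sets of compartment names; the ancestral
    set is assumed to be known, which is an assumption of this sketch.
    """
    if set(copy_a) == set(copy_b):
        return "no divergence"
    # More joint compartments than the progenitor implies a gain.
    if len(joint_compartments(copy_a, copy_b)) > len(set(ancestral)):
        return "neolocalization"
    # Otherwise the pair occupies no more compartments than the progenitor.
    return "sublocalization"


# Toy examples using the compartments drawn in Figure 2 (made-up data).
print(classify_divergence({"nucleus", "cytoplasm"}, {"nucleus"}, {"cytoplasm"}))
print(classify_divergence({"cytoplasm"}, {"cytoplasm"}, {"cytoplasmic membrane"}))
```

            <p>For these toy inputs, the first call returns 'sublocalization' and the second 'neolocalization'. Because only compartment counts are compared, a pair that gained one compartment while losing another would not be recognized as a case of neolocalization by this sketch.</p>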
            <p>To assess the contribution of neo- and sublocalization to the functional diversification of duplicates and given the lack of subcellular localization data for ancestral proteins, we used the average number of subcellular compartments of yeast singleton gene products (that is, genes that show no evidence of paralogs in the <it>S. cerevisiae </it>genome; see Materials and methods for details) as a proxy for the subcellular representation of WGD duplicate progenitors (akin to a previous analysis of yeast duplicates <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>).</p>
            <p>We observed that the joint number of distinct compartments per D-pair (mean = 2.31 &#177; 0.63, median = 2) is significantly higher than that observed for singleton proteins (mean = 1.30 &#177; 0.49, median = 1, two-tailed <it>P </it>&lt; 10<sup>-5</sup>, Mann-Whitney <it>U </it>test). In contrast, there is no significant difference between the distributions of the number of subcellular compartments for S-pairs (mean = 1.27 &#177; 0.42, median = 1) and singletons (two-tailed <it>P </it>= 0.2, Mann-Whitney <it>U </it>test), suggesting that the increase in the number of compartments observed for D-pairs is due to neolocalization events among D-pair proteins.</p>
            <p>A potential caveat of this analysis is that the types of proteins represented in D-pairs might generally and <it>a priori </it>be present in a larger number of compartments, as also indicated by the analysis of the number of compartments for catabolic/biosynthetic proteins discussed above. To control for this, we compared the number of distinct compartments per D-pair and per singleton for proteins within the same GO classes. To ensure adequate sample sizes, we focused on the eight GO categories that contain more than 30 proteins for both D-pairs and singletons (Table <tblr tid="T3">3</tblr>). This analysis shows that for all eight comparisons, the joint number of compartments per D-pair is significantly higher than that observed for singletons (Table <tblr tid="T3">3</tblr>; two-tailed <it>P </it>&lt; 10<sup>-4</sup>, Mann-Whitney <it>U </it>test). This suggests that the elevated number of compartments for D-pairs is indeed a result of neolocalization and not due to a wide cellular representation of ancestral progenitor proteins prior to duplication. Based on the observed excess of compartments (mean, approximately 0.98) in D-pairs relative to singletons from the same functional categories, we estimate that, on average, approximately one compartment is gained by neolocalization per duplication event between duplicates showing subcellular divergence. In addition to the elevated number of compartments per D-pair, we find that the average number of compartments per D-pair protein (approximately 1.53) is significantly higher than that of singletons in seven of eight comparisons (two-tailed <it>P </it>&lt; 0.05, Mann-Whitney <it>U </it>test). This result further underscores that neolocalization probably predominated over sublocalization during yeast duplicate evolution.</p>
            <tbl id="T3">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>Comparison of the number of different compartments per D-pair, per protein in D-pairs, and per singleton protein from the same GO categories</p>
               </caption>
               <tblbdy cols="9">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="2" ca="center">
                        <p>Singleton</p>
                     </c>
                     <c cspan="3" ca="center">
                        <p>D-protein</p>
                     </c>
                     <c cspan="3" ca="center">
                        <p>D-pair</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                     <c cspan="3">
                        <hr/>
                     </c>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Biological process*</p>
                     </c>
                     <c ca="center">
                        <p>Total no.</p>
                     </c>
                     <c ca="center">
                        <p>Average no. of compartments</p>
                     </c>
                     <c ca="center">
                        <p>Total no.</p>
                     </c>
                     <c ca="center">
                        <p>Average no. of compartments</p>
                     </c>
                     <c ca="center">
                        <p><it>P</it>-value<sup>&#8224;</sup></p>
                     </c>
                     <c ca="center">
                        <p>Total no.</p>
                     </c>
                     <c ca="center">
                        <p>Average no. of compartments</p>
                     </c>
                     <c ca="center">
                        <p><it>P</it>-value<sup>&#8225;</sup></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="9">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Regulation of biological process</p>
                     </c>
                     <c ca="center">
                        <p>64</p>
                     </c>
                     <c ca="center">
                        <p>1.30</p>
                     </c>
                     <c ca="center">
                        <p>59</p>
                     </c>
                     <c ca="center">
                        <p>1.56</p>
                     </c>
                     <c ca="center">
                        <p>0.020</p>
                     </c>
                     <c ca="center">
                        <p>32</p>
                     </c>
                     <c ca="center">
                        <p>2.31</p>
                     </c>
                     <c ca="center">
                        <p>1.92E-10</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Macromolecule metabolic process</p>
                     </c>
                     <c ca="center">
                        <p>247</p>
                     </c>
                     <c ca="center">
                        <p>1.28</p>
                     </c>
                     <c ca="center">
                        <p>98</p>
                     </c>
                     <c ca="center">
                        <p>1.54</p>
                     </c>
                     <c ca="center">
                        <p>0.001</p>
                     </c>
                     <c ca="center">
                        <p>52</p>
                     </c>
                     <c ca="center">
                        <p>2.31</p>
                     </c>
                     <c ca="center">
                        <p>1.57E-19</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Cell organization and biogenesis</p>
                     </c>
                     <c ca="center">
                        <p>160</p>
                     </c>
                     <c ca="center">
                        <p>1.28</p>
                     </c>
                     <c ca="center">
                        <p>57</p>
                     </c>
                     <c ca="center">
                        <p>1.60</p>
                     </c>
                     <c ca="center">
                        <p>0.021</p>
                     </c>
                     <c ca="center">
                        <p>37</p>
                     </c>
                     <c ca="center">
                        <p>2.38</p>
                     </c>
                     <c ca="center">
                        <p>2.27E-14</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Primary metabolic process</p>
                     </c>
                     <c ca="center">
                        <p>325</p>
                     </c>
                     <c ca="center">
                        <p>1.30</p>
                     </c>
                     <c ca="center">
                        <p>110</p>
                     </c>
                     <c ca="center">
                        <p>1.52</p>
                     </c>
                     <c ca="center">
                        <p>0.005</p>
                     </c>
                     <c ca="center">
                        <p>57</p>
                     </c>
                     <c ca="center">
                        <p>2.28</p>
                     </c>
                     <c ca="center">
                        <p>1.93E-20</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Catabolic process</p>
                     </c>
                     <c ca="center">
                        <p>53</p>
                     </c>
                     <c ca="center">
                        <p>1.47</p>
                     </c>
                     <c ca="center">
                        <p>32</p>
                     </c>
                     <c ca="center">
                        <p>1.41</p>
                     </c>
                     <c ca="center">
                        <p>0.370</p>
                     </c>
                     <c ca="center">
                        <p>16</p>
                     </c>
                     <c ca="center">
                        <p>2.19</p>
                     </c>
                     <c ca="center">
                        <p>1.19E-04</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Establishment of localization</p>
                     </c>
                     <c ca="center">
                        <p>81</p>
                     </c>
                     <c ca="center">
                        <p>1.17</p>
                     </c>
                     <c ca="center">
                        <p>33</p>
                     </c>
                     <c ca="center">
                        <p>1.55</p>
                     </c>
                     <c ca="center">
                        <p>0.013</p>
                     </c>
                     <c ca="center">
                        <p>19</p>
                     </c>
                     <c ca="center">
                        <p>2.37</p>
                     </c>
                     <c ca="center">
                        <p>1.82E-09</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Biosynthetic process</p>
                     </c>
                     <c ca="center">
                        <p>162</p>
                     </c>
                     <c ca="center">
                        <p>1.30</p>
                     </c>
                     <c ca="center">
                        <p>32</p>
                     </c>
                     <c ca="center">
                        <p>1.53</p>
                     </c>
                     <c ca="center">
                        <p>0.044</p>
                     </c>
                     <c ca="center">
                        <p>18</p>
                     </c>
                     <c ca="center">
                        <p>2.11</p>
                     </c>
                     <c ca="center">
                        <p>1.52E-07</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Cellular metabolic process</p>
                     </c>
                     <c ca="center">
                        <p>372</p>
                     </c>
                     <c ca="center">
                        <p>1.32</p>
                     </c>
                     <c ca="center">
                        <p>120</p>
                     </c>
                     <c ca="center">
                        <p>1.52</p>
                     </c>
                     <c ca="center">
                        <p>0.006</p>
                     </c>
                     <c ca="center">
                        <p>62</p>
                     </c>
                     <c ca="center">
                        <p>2.27</p>
                     </c>
                     <c ca="center">
                        <p>1.42E-21</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>The data are derived from 8 categories with at least 30 proteins per group - see text for details. *GO level 3. <sup>&#8224;</sup>Mann-Whitney <it>U </it>test comparing D-proteins and singletons. <sup>&#8225;</sup>Mann-Whitney <it>U </it>test comparing D-pairs and singletons.</p>
               </tblfn>
            </tbl>
            <p>To further assess and illustrate the types and extent of subcellular relocalizations in the evolution of yeast gene families, we used the <it>K. waltii </it>ortholog(s) as outgroups and reconstructed the phylogeny for 45 WGD yeast families (15 D- and 30 S-pair-containing gene families) with at least 1 additional member and mapped the subcellular localizations of these onto the phylogenies (Figure <figr fid="F3">3</figr> and Additional data file 2). In 16 families, the subcellular localization has remained completely preserved among the members (Additional data file 2). For the remaining 29 families, we analyzed changes in protein location, assuming that the scenario requiring the smallest number of subcellular changes, given the observed data (parsimony principle), reflects the true pattern of events.</p>
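            <p>The parsimony criterion described above can be illustrated with a minimal sketch. It assumes a rooted binary tree, treats each compartment as an independent presence/absence character, and applies Fitch's algorithm; these choices, the function names, and the toy family are illustrative assumptions rather than the procedure actually used for the 29 families.</p>

```python
def fitch(tree, present):
    """Fitch parsimony for one binary character on a rooted binary tree.

    tree is a leaf name (str) or a 2-tuple of subtrees; present maps each
    leaf name to True/False. Returns (candidate root states, minimum changes).
    """
    if isinstance(tree, str):
        return frozenset([present[tree]]), 0
    left_states, left_changes = fitch(tree[0], present)
    right_states, right_changes = fitch(tree[1], present)
    changes = left_changes + right_changes
    common = left_states.intersection(right_states)
    if common:
        return common, changes
    # Disjoint state sets: one change is required on this part of the tree.
    return left_states.union(right_states), changes + 1


def total_changes(tree, localizations, compartments):
    """Sum the minimum number of gains/losses over all compartments."""
    total = 0
    for compartment in compartments:
        present = {leaf: compartment in locs
                   for leaf, locs in localizations.items()}
        total += fitch(tree, present)[1]
    return total


# Hypothetical three-member family; names and localizations are invented.
toy_tree = (("member_A", "member_B"), "member_C")
toy_locs = {
    "member_A": {"cytoplasm", "nucleus", "ER"},
    "member_B": {"cytoplasm"},
    "member_C": {"cytoplasm", "nucleus"},
}
print(total_changes(toy_tree, toy_locs, ["cytoplasm", "nucleus", "ER"]))
```

            <p>In this toy family, the minimum count combines one change in nuclear localization and one in ER localization, while the cytoplasmic localization shared by all members requires none.</p>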
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Subcellular localizations of the <b>(a) </b>UBC and <b>(b) </b>AIR family members and subcellular localization changes inferred based on the phylogeny</p>
               </caption>
               <text>
                  <p>Subcellular localizations of the <b>(a) </b>UBC and <b>(b) </b>AIR family members and subcellular localization changes inferred based on the phylogeny. The common name and yeast protein identifier (in brackets) of the protein are indicated. The schematic representation of a yeast cell depicts three possible localizations: nucleus (small circle), endoplasmic reticulum (ellipse around nucleus), and cytoplasm (remainder of the cell). The co-localization of the protein with one of the yeast subcellular compartments is indicated by grey shading.</p>
               </text>
               <graphic file="gb-2008-9-3-r54-3"/>
            </fig>
            <p>For 16 of the 29 families, we could infer the most likely scenario of subcellular diversification. Eight families show instances of neolocalization (Additional data file 2). For example, members of the ubiquitin-conjugating enzyme family, involved in protein degradation <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>, are generally located in the cytoplasm and the nucleus (Figure <figr fid="F3">3a</figr>). However, UBC7p (also known as QRI8) neolocalized to the endoplasmic reticulum (ER; Figure <figr fid="F3">3a</figr>), where it became essential for the degradation of misfolded proteins <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>.</p>
            <p>Sublocalization events occurred in four protein families. For instance, GIS2 (cytoplasmic) and AIR2 (nuclear) of the AIR protein family partitioned their ancestral compartments (the cytoplasm and nucleus - still seen for their AIR1 paralog; Figure <figr fid="F3">3b</figr>). Consistent with their specific localizations, GIS2 specialized in a function in the RAS/cAMP signaling pathway <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>, whereas AIR2 became specifically involved in the processing and export of mRNAs from the nucleus <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>.</p>
            <p>In the remaining four families, both neo- and sublocalization events appear to have occurred (Additional data file 2). In addition to the neolocalization event described above, the UBC family also reveals an instance of sublocalization based on GFP data; UBC13 lost the ancestral nuclear localization (Figure <figr fid="F3">3a</figr>). Thus, the UBC family shows both neo- and sublocalization of family members.</p>
         </sec>
         <sec>
            <st>
               <p>Subcellular shifts and signal peptide evolution</p>
            </st>
            <p>The information required for sorting of proteins to different cellular compartments is encoded in their sequence, sometimes in distinct targeting motifs <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. Consequently, differences in the subcellular localization of paralogous proteins should be due to protein sequence changes and should, in principle, be detectable. However, the identification of protein targeting sequence determinants has proven to be a difficult task <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. To elucidate the molecular basis of subcellular relocalization of duplicates, we focused our analysis on the best characterized targeting sequences: amino-terminal signal peptides (SPs) that target proteins to the mitochondria or the ER. These types of SPs are typically 13-36 amino acids long and are usually cleaved from the mature peptide <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>.</p>
            <p>We estimated the amino acid divergence between WGD protein pairs with ER and/or mitochondrial localization (36 pairs in total, of which 21 are S-pairs and 15 are D-pairs; Table <tblr tid="T4">4</tblr>). We determined the amino acid divergence in the first 13 or 36 amino acids (the putative signal peptide region) and in the mature peptide (the protein sequence without the signal peptide), and then compared these values between the 21 S- and 15 D-pairs (Table <tblr tid="T4">4</tblr>).</p>
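            <p>As a rough illustration of this partitioning, the sketch below computes divergence separately for the putative signal peptide region and for the remainder of a pairwise alignment. It assumes that divergence is the uncorrected proportion of differing aligned positions and that the region boundary is counted in alignment columns; the divergence measure actually used, the function names, and the toy sequences are assumptions made for illustration only.</p>

```python
def p_distance(seq_a, seq_b):
    """Proportion of aligned, ungapped positions at which two sequences differ."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped aligned positions to compare")
    return sum(1 for a, b in pairs if a != b) / len(pairs)


def split_divergence(aln_a, aln_b, sp_length):
    """Divergence of the first sp_length alignment columns and of the remainder."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    sp = p_distance(aln_a[:sp_length], aln_b[:sp_length])
    mature = p_distance(aln_a[sp_length:], aln_b[sp_length:])
    return sp, mature


# Toy aligned sequences (invented for illustration, not real yeast proteins).
toy_a = "MLSRAKLTAGQWERTYIPAS"
toy_b = "MKTLPQLSAGQWERTYLPAS"
sp_div, mature_div = split_divergence(toy_a, toy_b, sp_length=8)
print(sp_div, mature_div)
```

            <p>In the analysis described above, the region length would be set to 13 or 36 rather than the shorter value used for the toy sequences.</p>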
            <tbl id="T4">
               <title>
                  <p>Table 4</p>
               </title>
               <caption>
                  <p>Amino acid divergence between WGD protein pairs</p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>D-pairs</p>
                     </c>
                     <c ca="center">
                        <p>S-pairs</p>
                     </c>
                     <c ca="center">
                        <p><it>P</it>-value*</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>Amino-terminus signal peptide</b>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>13 amino acids</p>
                     </c>
                     <c ca="center">
                        <p>0.92</p>
                     </c>
                     <c ca="center">
                        <p>0.69</p>
                     </c>
                     <c ca="center">
                        <p>0.019<sup>&#8224;</sup></p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>36 amino acids</p>
                     </c>
                     <c ca="center">
                        <p>0.86</p>
                     </c>
                     <c ca="center">
                        <p>0.67</p>
                     </c>
                     <c ca="center">
                        <p>0.017<sup>&#8224;</sup></p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>Mature peptide</b>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>13 amino acids</p>
                     </c>
                     <c ca="center">
                        <p>0.57</p>
                     </c>
                     <c ca="center">
                        <p>0.50</p>
                     </c>
                     <c ca="center">
                        <p>0.596</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>36 amino acids</p>
                     </c>
                     <c ca="center">
                        <p>0.57</p>
                     </c>
                     <c ca="center">
                        <p>0.48</p>
                     </c>
                     <c ca="center">
                        <p>0.785</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>*Two-tailed Mann-Whitney <it>U </it>test. <sup>&#8224;</sup>Significant at the 5% level.</p>
               </tblfn>
            </tbl>
            <p>The amino acid divergence in the SP is higher for protein pairs for which the ER/mitochondrial localization is not preserved (median divergence of 0.92 based on the 13 amino acid SP and 0.86 based on the 36 amino acid SP) than for pairs that maintained the same subcellular localization (13 amino acid SP, 0.69; 36 amino acid SP, 0.67), a significantly different distribution (two-tailed <it>P </it>&lt; 0.05, Mann-Whitney <it>U </it>test; Table <tblr tid="T4">4</tblr>). In contrast, we observed no significant difference in the accumulation of amino acid substitutions in the mature peptide (for either of the mature peptide sizes tested) between the two sets of proteins (two-tailed <it>P </it>&gt; 0.5, Mann-Whitney <it>U </it>test; Table <tblr tid="T4">4</tblr>). This excludes the possibility that proteins that changed their subcellular localization generally show a faster rate of protein evolution and, therefore, an elevated SP divergence. Thus, at least for proteins targeted to the mitochondria and to the ER, differences in subcellular localization between duplicates are associated with accelerated signal peptide sequence evolution. Conceivably, this acceleration may have been driven by positive Darwinian selection. A recent study demonstrating selectively driven optimization of a mitochondrial targeting signal of a protein from a young primate gene (L Rosso and colleagues, unpublished) suggests that this is a plausible scenario.</p>
            <p>The NTG1/2 base excision repair and TRR1/2 thioredoxin reductase WGD gene pairs provide striking examples of how subcellular reprogramming through changes in targeting sequences may occur (Figure <figr fid="F4">4</figr>). Through a comparison with the NTG orthologous protein from <it>K. waltii</it>, we determined that NTG1 gained an amino-terminal signal after the WGD event (mainly through a number of amino acid substitutions) that targets it to mitochondria, while NTG2 maintained the ancestral nuclear localization (Figure <figr fid="F4">4a</figr>). Conversely, TRR1 lost the ancestral capacity to localize to mitochondria, due to a deletion of its amino-terminal mitochondrial targeting sequence (Figure <figr fid="F4">4b</figr>). Thus, while keeping their ancestral enzymatic functions <abbrgrp><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr></abbrgrp>, both NTG1 and TRR1 obtained new functional roles through neolocalization changes, caused by gain and loss of (mitochondrial) targeting sequences, respectively.</p>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Subcellular relocalization and signal peptide evolution</p>
               </caption>
               <text>
                  <p>Subcellular relocalization and signal peptide evolution. Signal peptides (36 amino-terminal residues) and experimentally determined subcellular localizations of the <b>(a) </b>NTG1/NTG2 and <b>(b) </b>TRR1/TRR2 duplicate pairs (derived from the <it>S. cerevisiae </it>WGD event) are shown. <it>K. waltii </it>orthologous sequences are used as outgroups. Predotar [39,40] was used to predict subcellular localizations based on the protein sequences. The (predicted) subcellular localization of the <it>K. waltii </it>proteins was considered to represent the ancestral state. Identical residues in all peptide sequences are represented with (*) under the corresponding position in the protein alignment.</p>
               </text>
               <graphic file="gb-2008-9-3-r54-4"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Functional adaptation to new subcellular environments</p>
            </st>
            <p>Functional adaptation of duplicate proteins to new subcellular compartments may occur in several ways. Given that organelles generally display distinct physico-chemical properties that are reflected in the properties of their proteome and transcriptome <abbrgrp><abbr bid="B31">31</abbr><abbr bid="B32">32</abbr></abbrgrp>, relocalized duplicates may show physico-chemical adaptations that allow them to optimally function in their new (in the case of neolocalization) or more restricted (sublocalization) cellular environments.</p>
            <p>We first tested whether duplicate proteins reveal evidence for adaptation to the pH of the compartments to which they localize. To this end, we analyzed the pI (isoelectric point) of duplicates, since the pI distribution of proteins is specific to compartments and likely associated with the compartments' pH <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. We observed a significantly different distribution of fold differences in pI between D- and S-pair duplicates (two-tailed <it>P </it>&lt; 10<sup>-3</sup>, Mann-Whitney <it>U </it>test; see Additional data file 1 for individual values), with D-pairs displaying a higher median fold difference between the members of the pair (0.09) than S-pair genes (0.04). This result is in accordance with the notion that duplicates with different subcellular localization show pI adaptation, likely due to the pH of their new/altered cellular environments. Alternatively, the elevated pI divergence of D-pair proteins may simply reflect the generally higher amino acid divergence observed for D-pair relative to S-pair duplicates (see above).</p>
            <p>To distinguish between these two possibilities, we tested whether the observed substitutions between proteins are biased in terms of the pI of the accumulated amino acids. This analysis revealed that 24 of the 88 D-pairs display a significantly skewed accumulation of substitutions with respect to the pI of their amino acids (<it>P </it>&lt; 0.05, Pearson's chi-square test; Bonferroni-corrected for multiple (238) tests). In other words, for 24 pairs, the two paralogs have accumulated a significantly larger number of amino acids with a higher or lower pI, respectively, than expected by chance for such a pairwise comparison (50%). This is a significantly higher proportion of pairs (one-tailed <it>P </it>&lt; 0.05, Fisher's exact test) compared to that of S-pairs (26/150 pairs), for which the difference in pI cannot be explained by subcellular localization differences. These analyses suggest that D-pair proteins show adaptation to the pH/pI properties of new or altered cellular environments through the fixation of certain amino acids by natural selection.</p>
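            <p>The comparison of proportions between D- and S-pairs can be sketched as a one-tailed Fisher's exact test on a 2x2 table. The implementation below is a minimal illustration built on the hypergeometric distribution, using only the counts reported above (24 of 88 D-pairs and 26 of 150 S-pairs); the function name is hypothetical and the software actually used for the test is not stated here.</p>

```python
from math import comb


def fisher_one_tailed(a, b, c, d):
    """One-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a), where X is the top-left cell under the hypergeometric
    null distribution with all margins fixed.
    """
    row1 = a + b
    col1 = a + c
    total = a + b + c + d
    upper = min(row1, col1)
    denom = comb(total, row1)
    tail = sum(comb(col1, k) * comb(total - col1, row1 - k)
               for k in range(a, upper + 1))
    return tail / denom


# Rows: D-pairs, S-pairs. Columns: skewed, not skewed.
p_value = fisher_one_tailed(24, 88 - 24, 26, 150 - 26)
print(p_value)
```

            <p>Exact integer arithmetic is used for the binomial coefficients, so the only rounding occurs in the final division.</p>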
            <p>The expression level of a gene was reported to also be related - at least in part - to the subcellular localization of its product <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. This may be due to the different volumes of the various compartments (for example, larger compartments would require more protein molecules according to this hypothesis <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>). We computed the fold difference in mRNA transcript abundance <abbrgrp><abbr bid="B33">33</abbr></abbrgrp> for our set of yeast duplicates. The average expression difference between genes in D-pairs (mean, 0.88) is higher than that between S-pair genes (0.71), a significantly different distribution (two-tailed <it>P </it>&lt; 0.05, Mann-Whitney <it>U </it>test). The elevated expression divergence of D-pair duplicates may indicate that they generally adapted to the expression level requirements of their compartments, for example, through changes in their regulatory sequences.</p>
         </sec>
         <sec>
            <st>
               <p>Subcellular adaptation, protein-protein interactions, and the evolution of new functions</p>
            </st>
            <p>The subcellular localization of a protein determines its ability to interact with other proteins in its local environment. Therefore, subcellular diversification of duplicates should often entail changes in their interactions with other proteins. In the case of sublocalization, the descendant duplicate (assuming that it required protein partners for functioning) is bound to lose interaction partners that were specific to the lost compartment(s). Conversely, proteins that occupy new subcellular niches may obtain new interaction partners.</p>
            <p>Using a database containing extensive <it>S. cerevisiae </it>protein interaction data <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>, we observed that the two members in D-pairs share a significantly smaller fraction of interactors (median = 6.4%) than duplicates in S-pairs (median = 13.7%, two-tailed <it>P </it>&lt; 0.05, Mann-Whitney <it>U </it>test; Figure <figr fid="F5">5</figr>). This result is likely not due to a difference in the number of different interactors determined for the two sets of protein pairs (median = 9 and 8 interactors per S- and D-pairs, respectively; two-tailed <it>P </it>= 0.48, Mann-Whitney <it>U </it>test). Thus, as predicted, subcellular divergence of duplicates appears to lead to a pronounced divergence in terms of their interaction with other proteins.</p>
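            <p>One plausible way to compute the fraction of shared interactors for a duplicate pair is sketched below. It assumes the fraction is the number of interactors common to both paralogs divided by the number of distinct interactors of the pair; this definition, the function name, and the placeholder interactor names are assumptions, since the exact formula is not given in this section.</p>

```python
def shared_fraction(partners_a, partners_b):
    """Fraction of the pair's combined interactors that both paralogs share."""
    set_a, set_b = set(partners_a), set(partners_b)
    combined = set_a.union(set_b)
    if not combined:
        return None  # no interaction data for either paralog
    return len(set_a.intersection(set_b)) / len(combined)


# Hypothetical interactor lists (placeholder names, not database entries).
paralog_1 = ["P1", "P2", "P3", "P4"]
paralog_2 = ["P3", "P4", "P5", "P6", "P7", "P8"]
print(shared_fraction(paralog_1, paralog_2))
```

            <p>Pairs without any recorded interactors return no value and would be excluded before comparing the S- and D-pair distributions.</p>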
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>Distribution of the proportion of shared interactors for genes in S- and D-pairs</p>
               </caption>
               <text>
                  <p>Distribution of the proportion of shared interactors for genes in S- and D-pairs.</p>
               </text>
               <graphic file="gb-2008-9-3-r54-5"/>
            </fig>
            <p>Subcellular relocalization may allow for the possibility that duplicate proteins evolve new functions (in the case of neolocalization) or functionally specialize (in the case of sublocalization, where both duplicates localize to distinct compartments) by evolving interactions with proteins that are located in their own compartment(s) but not in that of their duplicate copies. To test this, we assessed how often an interactor is located in the same compartment as the D-pair duplicate with which it interacts. We then compared this value to the extent of co-localization of these interactors with the other protein of the pair (with which no interaction was found).</p>
            <p>For 1,270 interactions that are not shared between D-pair proteins (involving 955 interactors and 82/88 D-pair proteins), 684 show co-localization of the interactor and the duplicate with which it interacts. This represents a significantly larger overlap than that observed between these interactors and the non-interacting paralogs of the pairs (582/1,270, two-tailed <it>P </it>&lt; 10<sup>-4</sup>, Fisher's exact test). We note, however, that - as expected (given the shared history of the two duplicates of a D-pair) - this subcellular overlap is greater than that observed between random protein pairs (1,354/2,628, two-tailed <it>P </it>&lt; 10<sup>-4</sup>, Fisher's exact test). These results support the notion that subcellular diversification allowed duplicates to obtain new functions and/or functionally specialize by evolving interactions with proteins that are specific to their compartment(s). Given that neolocalization seems frequent (see above), duplicates appear to often have obtained novel functional roles by evolving interactions with compartment-specific proteins - unattainable to their single copy progenitors.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>In this study, we have begun to assess the role of subcellular relocalization and adaptation for the emergence of new or altered gene functions after duplication, using yeast as a model organism. Our work suggests that subcellular divergence has played a significant role for the functional divergence of duplicate genes. It has affected roughly one-third of yeast WGD duplicates, in particular those involved in biological processes with a wider subcellular distribution (for example, catabolism).</p>
         <p>Although subcellular redistribution of duplicate proteins involved repartitioning/loss of ancestral compartments, relocalization of proteins to previously unoccupied compartments (neolocalization) seems to have prevailed and led to an overall gain of compartments among duplicates. Thus, duplicate genes appear to frequently have obtained new functional roles through the process of subcellular relocalization. The finding that relocalized proteins have obtained new interaction partners and lost ancestral ones underscores this notion. Interestingly, we found that relocalized proteins show adaptations to the physico-chemical properties of their altered cellular environments through the selective fixation of amino acid substitutions.</p>
         <p>A number of individual reports have revealed differences in subcellular localization of paralogous proteins in humans and other mammals (for example, <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>; see also references in <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>). Our study motivates systematic surveys that address the role of subcellular adaptation in the functional diversification of mammalian (duplicate) genes. These surveys should also explore recent duplications (most duplications in the yeast genome, including those studied here, are old) in order to better understand the timing and selective pressures associated with this process. In fact, two recent individual cases from apes have shed initial light on the early stages of subcellular adaptation (L Rosso and colleagues, unpublished). These cases demonstrate that subcellular adaptation may indeed occur through both neolocalization and sublocalization, and that it may be accompanied or followed by adaptive changes in the biochemical function of the protein. Moreover, they show that subcellular shifts may be adaptive, driven by positive selection, and may occur through a few selected changes in specific (signal) sequences (consistent with our analysis of duplicated target sequences presented here), thus allowing for rapid retargeting of duplicate proteins during evolution.</p>
         <p>We conclude that, in addition to changes in expression and biochemical function, selectively driven subcellular adaptation has played an important role in the functional diversification of duplicate genes and the emergence of new gene functions in both uni- and multicellular organisms. Thus, investigating the subcellular phenotype of duplicate genes may generally provide valuable clues to their function and fate.</p>
      </sec>
      <sec>
         <st>
            <p>Materials and methods</p>
         </st>
         <sec>
            <st>
               <p><it>S. cerevisiae </it>WGD genes and other paralogs</p>
            </st>
            <p>We retrieved the gene IDs of 900 <it>S. cerevisiae </it>WGD paralogs (organized in 450 gene pairs), as well as the IDs and nucleotide/protein sequences of their <it>K. waltii </it>orthologs, from the supplemental data of <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. Nucleotide and amino acid sequences for all <it>S. cerevisiae </it>WGD gene pairs and non-WGD paralogs (as defined by Ensembl gene family annotations) were retrieved from the Ensembl database, release 45 <abbrgrp><abbr bid="B35">35</abbr><abbr bid="B36">36</abbr></abbrgrp>.</p>
         </sec>
         <sec>
            <st>
               <p>Subcellular localization data</p>
            </st>
            <p>Subcellular localization data were retrieved from <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>. Only proteins unambiguously assigned to at least one of the 22 analyzed subcellular compartments were used. Another global protein localization data set <abbrgrp><abbr bid="B19">19</abbr><abbr bid="B38">38</abbr></abbrgrp> was used for comparison. Subcellular localizations (Figure <figr fid="F4">4</figr>) were predicted using Predotar <abbrgrp><abbr bid="B39">39</abbr><abbr bid="B40">40</abbr></abbrgrp>.</p>
         </sec>
         <sec>
            <st>
               <p>Non-synonymous substitution rates</p>
            </st>
            <p>We used MUSCLE 3.6 <abbrgrp><abbr bid="B41">41</abbr></abbrgrp> to construct codon-based nucleotide alignments of <it>S. cerevisiae </it>WGD gene pairs and their corresponding <it>K. waltii </it>orthologous genes. To estimate the rate of non-synonymous changes, <it>d</it><sub>N</sub>, along the different branches of the <it>S. cerevisiae</it>/<it>K. waltii </it>gene trees, we used the CODEML free-ratio model as implemented in the PAML 3.15 package <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>.</p>
         </sec>
         <sec>
            <st>
               <p>Phylogenetic reconstructions</p>
            </st>
            <p>We used a maximum likelihood approach, PROML, as implemented in the PHYLIP 3.67 software package, to reconstruct the phylogeny of the protein families (protein sequence alignment infiles were generated using MUSCLE 3.6).</p>
         </sec>
         <sec>
            <st>
               <p><it>S. cerevisiae </it>singletons</p>
            </st>
            <p>To identify proteins without paralogs (singletons), an all-against-all BLASTP similarity search (E-value threshold = 0.1) was conducted. Proteins without hits against other proteins in this search were considered to be singletons.</p>
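            <p>The singleton filter described above can be sketched as follows. This is a minimal illustration, not the code used in the study: it assumes the search results are available as 12-column tab-separated lines (the default column order of BLAST+ tabular output, with the E-value in the eleventh column), treats 0.1 as an upper E-value limit, and uses the hypothetical names find_singletons and make_row and the made-up protein IDs P1 to P3.</p>

```python
def find_singletons(all_ids, blast_lines, max_evalue=0.1):
    """Return the IDs with no hit to any *other* protein at or below max_evalue.

    Assumption: each line is tab-separated with the query ID in column 1,
    the subject ID in column 2 and the E-value in column 11.
    """
    has_hit = set()
    for line in blast_lines:
        fields = line.rstrip("\n").split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        # Self-hits are ignored; only hits against other proteins count.
        if query != subject and max_evalue >= evalue:
            has_hit.add(query)
    return sorted(set(all_ids) - has_hit)


def make_row(query, subject, evalue):
    """Build a 12-column example line; unused columns are placeholders."""
    return "\t".join([query, subject] + ["0"] * 8 + [str(evalue), "0"])


# Hypothetical toy data: P1 and P2 hit each other, P3 only has a weak hit.
EXAMPLE_LINES = [
    make_row("P1", "P1", 1e-50),
    make_row("P1", "P2", 1e-20),
    make_row("P2", "P2", 1e-50),
    make_row("P2", "P1", 1e-20),
    make_row("P3", "P3", 1e-50),
    make_row("P3", "P1", 5.0),
]

if __name__ == "__main__":
    print(find_singletons(["P1", "P2", "P3"], EXAMPLE_LINES))
```

            <p>In this sketch a protein counts as having a paralog only when it appears as a query with a qualifying hit, which is one possible reading of the criterion; a symmetric variant would also mark the subject of each hit.</p>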
         </sec>
         <sec>
            <st>
               <p>Gene ontology analysis</p>
            </st>
            <p>GO analyses were conducted using FatiGO <abbrgrp><abbr bid="B42">42</abbr><abbr bid="B43">43</abbr></abbrgrp>.</p>
         </sec>
         <sec>
            <st>
               <p>Gene expression analyses</p>
            </st>
            <p><it>S. cerevisiae </it>gene expression levels - measured as the number of mRNA copies per cell - were retrieved from <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>. The average absolute protein abundance in pmol was retrieved from the literature (supplemental data of <abbrgrp><abbr bid="B44">44</abbr></abbrgrp>). For each gene pair, the fold difference in expression was calculated as the absolute difference between the two genes' mRNA copy numbers (or protein abundances), normalized by the average mRNA copy number (or protein abundance) of the pair.</p>
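            <p>As a worked illustration of this measure, the sketch below computes the absolute difference of two values divided by their mean. The function name fold_difference and the handling of pairs whose values are both zero are assumptions made for the example; the input values are invented and are not data from this study.</p>

```python
def fold_difference(value_a, value_b):
    """Absolute difference of two expression values, normalized by their mean.

    The values can be mRNA copies per cell or protein abundances, as long as
    both members of the pair use the same unit.
    """
    mean = (value_a + value_b) / 2.0
    if mean == 0:
        # Assumption: a pair with no detectable expression is left undefined.
        raise ValueError("fold difference is undefined when both values are zero")
    return abs(value_a - value_b) / mean


if __name__ == "__main__":
    # Invented example: 3 and 1 mRNA copies per cell.
    print(fold_difference(3.0, 1.0))
```

            <p>For non-negative inputs the measure ranges from 0 (identical values) to 2 (one member of the pair not expressed), which makes pairs with very different absolute expression levels comparable.</p>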
         </sec>
         <sec>
            <st>
               <p>Isoelectric point and hydropathy data</p>
            </st>
            <p>Hydropathy and p<it>I </it>data were collected from the literature (supplemental data of <abbrgrp><abbr bid="B44">44</abbr></abbrgrp>).</p>
         </sec>
         <sec>
            <st>
               <p>Protein-protein interactions</p>
            </st>
            <p>Protein interactors for all proteins in D- and S-pairs were collected from the BioGRID repository <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B45">45</abbr></abbrgrp>.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Abbreviations</p>
         </st>
         <p>ER, endoplasmic reticulum; GFP, green fluorescent protein; GO, Gene Ontology; SP, signal peptide; WGD, whole-genome duplication.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>ACM, NV and HK conceived and designed the experiments. ACM, NV and DB performed the analysis. ACM and HK wrote the paper. All authors read and approved the final manuscript.</p>
      </sec>
      <sec>
         <st>
            <p>Additional data files</p>
         </st>
         <p>The following additional data are available with the online version of this paper. Additional data file <supplr sid="S1">1</supplr> is a table listing all whole-genome duplicate pairs and relevant information for each member. Additional data file <supplr sid="S2">2</supplr> contains the subcellular localizations of members from 45 <it>S. cerevisiae </it>protein families (each containing one WGD protein pair) and the most parsimonious subcellular localization changes inferred from the phylogeny.</p>
         <suppl id="S1">
            <title>
               <p>Additional data file 1</p>
            </title>
            <caption>
               <p>WGD proteins (in pairs), their subcellular localizations and other properties</p>
            </caption>
            <text>
               <p>WGD proteins (in pairs), their subcellular localizations and other properties.</p>
            </text>
            <file name="gb-2008-9-3-r54-S1.xls">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S2">
            <title>
               <p>Additional data file 2</p>
            </title>
            <caption>
               <p>Subcellular localizations of members from 45 <it>S. cerevisiae </it>protein families (each containing one WGD protein pair) and most parsimonious subcellular localization changes inferred based on the phylogeny</p>
            </caption>
            <text>
               <p>The WGD pair is indicated in the header of each tree. The compartments inferred to have been gained by neolocalization are underlined in red; those inferred to have resulted from sublocalization are underlined in green. Trees where subcellular relocalization events could not be inferred are labeled accordingly ('not clear').</p>
            </text>
            <file name="gb-2008-9-3-r54-S2.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>This research was supported by funds available to HK from the European Union (STREP: PKB140404) and the Swiss National Science Foundation.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <aug>
               <au>
                  <snm>Ohno</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Evolution by Gene Duplication</source>
            <publisher>Berlin: Springer Verlag</publisher>
            <pubdate>1970</pubdate>
         </bibl>
         <bibl id="B2">
            <aug>
               <au>
                  <snm>Li</snm>
                  <fnm>WH</fnm>
               </au>
            </aug>
            <source>Molecular Evolution</source>
            <publisher>Sunderland MA: Sinauer Associates</publisher>
            <pubdate>1997</pubdate>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Global trends of whole-genome duplications revealed by the ciliate <it>Paramecium tetraurelia</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Aury</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Jaillon</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Duret</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Noel</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Jubin</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Porcel</snm>
                  <fnm>BM</fnm>
               </au>
               <au>
                  <snm>Segurens</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Daubin</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Anthouard</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Aiach</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Arnaiz</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Billaut</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Beisson</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Blanc</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Bouhouche</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>C&#226;mara</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Duharcourt</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Guigo</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Gogendeau</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Katinka</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Keller</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Kissmehl</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Klotz</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Koll</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Mou&#235;l</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Lep&#232;re</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Malinsky</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Nowacki</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Nowak</snm>
                  <fnm>JK</fnm>
               </au>
               <au>
                  <snm>Plattner</snm>
                  <fnm>H</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2006</pubdate>
            <volume>444</volume>
            <fpage>171</fpage>
            <lpage>178</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature05230</pubid>
                  <pubid idtype="pmpid" link="fulltext">17086204</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Dosage sensitivity and the evolution of gene families in yeast.</p>
            </title>
            <aug>
               <au>
                  <snm>Papp</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Pal</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Hurst</snm>
                  <fnm>LD</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2003</pubdate>
            <volume>424</volume>
            <fpage>194</fpage>
            <lpage>197</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature01771</pubid>
                  <pubid idtype="pmpid" link="fulltext">12853957</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Preservation of duplicate genes by complementary, degenerative mutations.</p>
            </title>
            <aug>
               <au>
                  <snm>Force</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Lynch</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Pickett</snm>
                  <fnm>FB</fnm>
               </au>
               <au>
                  <snm>Amores</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Yan</snm>
                  <fnm>YL</fnm>
               </au>
               <au>
                  <snm>Postlethwait</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>1999</pubdate>
            <volume>151</volume>
            <fpage>1531</fpage>
            <lpage>1545</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1460548</pubid>
                  <pubid idtype="pmpid" link="fulltext">10101175</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>The evolution of functionally novel proteins after gene duplication.</p>
            </title>
            <aug>
               <au>
                  <snm>Hughes</snm>
                  <fnm>AL</fnm>
               </au>
            </aug>
            <source>Proc Biol Sci</source>
            <pubdate>1994</pubdate>
            <volume>256</volume>
            <fpage>119</fpage>
            <lpage>124</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">8029240</pubid>
                  <pubid idtype="doi">10.1098/rspb.1994.0058</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>On the possibility of constructive neutral evolution.</p>
            </title>
            <aug>
               <au>
                  <snm>Stoltzfus</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>J Mol Evol</source>
            <pubdate>1999</pubdate>
            <volume>49</volume>
            <fpage>169</fpage>
            <lpage>181</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/PL00006540</pubid>
                  <pubid idtype="pmpid" link="fulltext">10441669</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Expression patterns of duplicate genes in the developing root in <it>Arabidopsis thaliana</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Hughes</snm>
                  <fnm>AL</fnm>
               </au>
               <au>
                  <snm>Friedman</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>J Mol Evol</source>
            <pubdate>2005</pubdate>
            <volume>60</volume>
            <fpage>247</fpage>
            <lpage>256</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s00239-004-0171-z</pubid>
                  <pubid idtype="pmpid" link="fulltext">15785853</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>The probability of duplicate gene preservation by subfunctionalization.</p>
            </title>
            <aug>
               <au>
                  <snm>Lynch</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Force</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>2000</pubdate>
            <volume>154</volume>
            <fpage>459</fpage>
            <lpage>473</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1460895</pubid>
                  <pubid idtype="pmpid" link="fulltext">10629003</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Rapid subfunctionalization accompanied by prolonged and substantial neofunctionalization in duplicate gene evolution.</p>
            </title>
            <aug>
               <au>
                  <snm>He</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>2005</pubdate>
            <volume>169</volume>
            <fpage>1157</fpage>
            <lpage>1164</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1449125</pubid>
                  <pubid idtype="pmpid" link="fulltext">15654095</pubid>
                  <pubid idtype="doi">10.1534/genetics.104.037051</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Prediction of organellar targeting signals.</p>
            </title>
            <aug>
               <au>
                  <snm>Emanuelsson</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>von Heijne</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Biochim Biophys Acta</source>
            <pubdate>2001</pubdate>
            <volume>1541</volume>
            <fpage>114</fpage>
            <lpage>119</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0167-4889(01)00145-8</pubid>
                  <pubid idtype="pmpid" link="fulltext">11750667</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>The tripartite motif family identifies cell compartments.</p>
            </title>
            <aug>
               <au>
                  <snm>Reymond</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Meroni</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Fantozzi</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Merla</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Cairo</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Luzi</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Riganelli</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Zanaria</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Messali</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Cainarca</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Guffanti</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Minucci</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Pelicci</snm>
                  <fnm>PG</fnm>
               </au>
               <au>
                  <snm>Ballabio</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>EMBO J</source>
            <pubdate>2001</pubdate>
            <volume>20</volume>
            <fpage>2140</fpage>
            <lpage>2151</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">125245</pubid>
                  <pubid idtype="pmpid" link="fulltext">11331580</pubid>
                  <pubid idtype="doi">10.1093/emboj/20.9.2140</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Retention of a duplicate gene through changes in subcellular targeting: an electron transport protein homologue localizes to the Golgi.</p>
            </title>
            <aug>
               <au>
                  <snm>Schmidt</snm>
                  <fnm>TR</fnm>
               </au>
               <au>
                  <snm>Doan</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Goodman</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Grossman</snm>
                  <fnm>LI</fnm>
               </au>
            </aug>
            <source>J Mol Evol</source>
            <pubdate>2003</pubdate>
            <volume>57</volume>
            <fpage>222</fpage>
            <lpage>228</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s00239-003-2468-8</pubid>
                  <pubid idtype="pmpid" link="fulltext">14562965</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Protein subcellular relocalization: a new perspective on the origin of novel genes.</p>
            </title>
            <aug>
               <au>
                  <snm>Byun-McKay</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Geeta</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Trends Ecol Evol</source>
            <pubdate>2007</pubdate>
            <volume>22</volume>
            <fpage>338</fpage>
            <lpage>344</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.tree.2007.05.002</pubid>
                  <pubid idtype="pmpid" link="fulltext">17507112</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Global analysis of protein localization in budding yeast.</p>
            </title>
            <aug>
               <au>
                  <snm>Huh</snm>
                  <fnm>WK</fnm>
               </au>
               <au>
                  <snm>Falvo</snm>
                  <fnm>JV</fnm>
               </au>
               <au>
                  <snm>Gerke</snm>
                  <fnm>LC</fnm>
               </au>
               <au>
                  <snm>Carroll</snm>
                  <fnm>AS</fnm>
               </au>
               <au>
                  <snm>Howson</snm>
                  <fnm>RW</fnm>
               </au>
               <au>
                  <snm>Weissman</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>O'Shea</snm>
                  <fnm>EK</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2003</pubdate>
            <volume>425</volume>
            <fpage>686</fpage>
            <lpage>691</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature02026</pubid>
                  <pubid idtype="pmpid" link="fulltext">14562095</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Molecular evidence for an ancient duplication of the entire yeast genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Wolfe</snm>
                  <fnm>KH</fnm>
               </au>
               <au>
                  <snm>Shields</snm>
                  <fnm>DC</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>1997</pubdate>
            <volume>387</volume>
            <fpage>708</fpage>
            <lpage>713</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/42711</pubid>
                  <pubid idtype="pmpid" link="fulltext">9192896</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Proof and evolutionary analysis of ancient genome duplication in the yeast <it>Saccharomyces cerevisiae</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Kellis</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Birren</snm>
                  <fnm>BW</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2004</pubdate>
            <volume>428</volume>
            <fpage>617</fpage>
            <lpage>624</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature02424</pubid>
                  <pubid idtype="pmpid" link="fulltext">15004568</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Saccharomyces Genome Database</p>
            </title>
            <url>http://www.yeastgenome.org/</url>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Subcellular localization of the yeast proteome.</p>
            </title>
            <aug>
               <au>
                  <snm>Kumar</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Agarwal</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Heyman</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Matson</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Heidtman</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Piccirillo</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Umansky</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Drawid</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Jansen</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Cheung</snm>
                  <fnm>KH</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Gerstein</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Roeder</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Snyder</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Genes Dev</source>
            <pubdate>2002</pubdate>
            <volume>16</volume>
            <fpage>707</fpage>
            <lpage>719</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">155358</pubid>
                  <pubid idtype="pmpid" link="fulltext">11914276</pubid>
                  <pubid idtype="doi">10.1101/gad.970902</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Natural history and evolutionary principles of gene duplication in fungi.</p>
            </title>
            <aug>
               <au>
                  <snm>Wapinski</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Pfeffer</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Friedman</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Regev</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2007</pubdate>
            <volume>449</volume>
            <fpage>54</fpage>
            <lpage>61</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature06107</pubid>
                  <pubid idtype="pmpid" link="fulltext">17805289</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Controlling the false discovery rate: a practical and powerful approach to multiple testing.</p>
            </title>
            <aug>
               <au>
                  <snm>Benjamini</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Hochberg</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>J R Stat Soc B</source>
            <pubdate>1995</pubdate>
            <volume>57</volume>
            <fpage>289</fpage>
            <lpage>300</lpage>
         </bibl>
         <bibl id="B22">
            <title>
               <p>PAML: a program package for phylogenetic analysis by maximum likelihood.</p>
            </title>
            <aug>
               <au>
                  <snm>Yang</snm>
                  <fnm>Z</fnm>
               </au>
            </aug>
            <source>Comput Appl Biosci</source>
            <pubdate>1997</pubdate>
            <volume>13</volume>
            <fpage>555</fpage>
            <lpage>556</lpage>
            <xrefbib>
               <pubid idtype="pmpid">9367129</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Mammalian housekeeping genes evolve more slowly than tissue-specific genes.</p>
            </title>
            <aug>
               <au>
                  <snm>Zhang</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>WH</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2004</pubdate>
            <volume>21</volume>
            <fpage>236</fpage>
            <lpage>239</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/molbev/msh010</pubid>
                  <pubid idtype="pmpid" link="fulltext">14595094</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Ubiquitin-conjugating enzymes: novel regulators of eukaryotic cells.</p>
            </title>
            <aug>
               <au>
                  <snm>Jentsch</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Seufert</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Sommer</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Reins</snm>
                  <fnm>HA</fnm>
               </au>
            </aug>
            <source>Trends Biochem Sci</source>
            <pubdate>1990</pubdate>
            <volume>15</volume>
            <fpage>195</fpage>
            <lpage>198</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0968-0004(90)90161-4</pubid>
                  <pubid idtype="pmpid">2193438</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <