<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>gb-2003-4-6-r40</ui>
   <ji>GBJ</ji>
   <fm>
      <dochead>Method</dochead>
      <bibl>
         <title>
            <p>A method to assess compositional bias in biological sequences and its application to prion-like glutamine/asparagine-rich domains in eukaryotic proteomes</p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>Harrison</snm>
               <mi>M</mi>
               <fnm>Paul</fnm>
               <insr iid="I1"/>
               <email>harrison@csb.yale.edu</email>
            </au>
            <au id="A2">
               <snm>Gerstein</snm>
               <fnm>Mark</fnm>
               <insr iid="I1"/>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Molecular Biophysics and Biochemistry, Yale University, 266 Whitney Avenue, New Haven, CT 06520-8114, USA</p>
            </ins>
         </insg>
         <source>Genome Biology</source>
         <issn>1465-6906</issn>
         <pubdate>2003</pubdate>
         <volume>4</volume>
         <issue>6</issue>
         <fpage>R40</fpage>
         <url>http://genomebiology.com/2003/4/6/R40</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="doi">10.1186/gb-2003-4-6-r40</pubid>
               <pubid idtype="pmpid">12801414</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>6</day>
               <month>2</month>
               <year>2003</year>
            </date>
         </rec>
         <revrec>
            <date>
               <day>8</day>
               <month>4</month>
               <year>2003</year>
            </date>
         </revrec>
         <acc>
            <date>
               <day>29</day>
               <month>4</month>
               <year>2003</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>30</day>
               <month>5</month>
               <year>2003</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2003</year>
         <collab>Harrison and Gerstein; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.</collab>
      </cpyrt>
      <shorttitle>
         <p>A method to assess compositional bias in biological sequences and its application to prion-like glutamine/asparagine-rich domains in eukaryotic proteomes</p>
      </shorttitle>
      <shortabs>
         <p>A novel method has been derived to assess compositional biases in biological sequences. It is based on finding the lowest-probability subsequences for a given residue-type set.</p>
      </shortabs>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <p>We have derived a novel method to assess compositional biases in biological sequences, which is based on finding the lowest-probability subsequences for a given residue-type set. As a case study, the distribution of prion-like glutamine/asparagine-rich ((Q+N)-rich) domains (which are linked to amyloidogenesis) was assessed for budding and fission yeasts and four other eukaryotes. We find more than 170 prion-like (Q+N)-rich regions in budding yeast, and, strikingly, many fewer in fission yeast. Also, some residues, such as tryptophan or isoleucine, are unlikely to form biased regions in any eukaryotic proteome.</p>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="BMC" subtype="man_spc_id" id="30010002">Bioinformatics</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010010">Genome studies</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010001">Biochemistry and structural biology</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010012">Medicine</classification>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Amyloidogenesis involving domains that contain glutamine and/or asparagine ((Q+N)-rich domains) is linked to prion phenomena in budding yeast, as well as a number of neurological disorders in humans, including Huntington's disease.</p>
         <p>A prion is an alternative conformation for a protein that can direct its own propagation <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr></abbrgrp>. In budding yeast, there are currently four identified prions: [PSI+], [URE3], [RNQ+] and [NU+] <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr></abbrgrp>. [PSI+] arises from the propagation of an alternatively folded amyloid-like form of Sup35p <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. Sup35p is part of the complex in budding yeast that controls translation termination and nonsense-codon readthrough <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr></abbrgrp>. [URE3] is caused by an alternatively folded form of Ure2p, a protein involved in nitrogen metabolism <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B10">10</abbr></abbrgrp>. The determinant sequences of [PSI+] and [URE3] are characterized chiefly by bias for glutamine (Q) and asparagine (N) residues (Table <tblr tid="T1">1</tblr>). [RNQ+] and [NU+] arise from alternative propagatable forms of parts of the Rnq1p and New1p sequences, and were found by searches for further sequences with Q/N compositional bias (Table <tblr tid="T1">1</tblr>) <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B11">11</abbr></abbrgrp>. Prokaryotic and eukaryotic proteomes were assessed for yeast-prion-like domains that comprised a total of 30 or more glutamines and asparagines in an 80-amino-acid stretch <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. [PIN+] is a non-Mendelian inherited trait that is required for the <it>de novo </it>appearance of [PSI+] <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B13">13</abbr></abbrgrp>. Eight candidate sequences for [PIN+], which tend to have a Q- and/or N-rich segment, were identified using a genetic screen and remain to be verified <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>.</p>
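         <p>The compositional screen mentioned above (30 or more glutamines and asparagines within an 80-amino-acid stretch) can be illustrated with a sliding-window count. The following Python sketch is only an illustration and not the code used in the cited survey: the function name qn_rich_windows, its parameters, and the choice to report 0-based window starts are assumptions made here.</p>
         <p>
```python
def qn_rich_windows(seq, window=80, min_count=30, residues="QN"):
    """Return 0-based start positions of windows holding at least min_count Q/N."""
    targets = set(residues)
    flags = [1 if aa in targets else 0 for aa in seq.upper()]
    if window > len(flags):
        return []
    count = sum(flags[:window])
    starts = [0] if count >= min_count else []
    for start in range(1, len(flags) - window + 1):
        # Slide by one residue: drop the residue leaving, add the one entering.
        count += flags[start + window - 1] - flags[start - 1]
        if count >= min_count:
            starts.append(start)
    return starts


# Toy sequence (not a real protein): 30 Q/N residues followed by 60 alanines.
toy = "QN" * 15 + "A" * 60
hits = qn_rich_windows(toy)
```
         </p>
         <p>A fixed window and count threshold of this kind is simple to apply, but it does not attach a probability to each region, which is the gap that the LPS formalism introduced in this article is meant to address.</p>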
         <tbl id="T1" hint_layout="double">
            <title>
               <p>Table 1</p>
            </title>
            <caption>
               <p>The four prion sequences*</p>
            </caption>
            <tblbdy cols="3">
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>Notable single-residue bias counts for prion determinant domain (P<sub>bias </sub>in brackets)<sup>&#8224;</sup></p>
                  </c>
                  <c ca="left">
                     <p>Notable LPSs (for whole sequence)<sup>&#8225;</sup></p>
                  </c>
               </r>
               <r>
                  <c cspan="3">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Prion sequence: Ure2p (YNL229C)</p>
                     <p>Prion determinant is residues 1-65</p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <b>MM</b>
                        <ul>
                           <b>NNNGNQVSNLSNALRQVNIGNRNSNTTTDQSNINFEFSTG</b>
                        </ul>
                     </p>
                     <p><ul><b>VNNNNNNNSSSNNNNVQNNNSGR</b>N</ul>GSQNNDNENNIKNTLEQH</p>
                     <p>RQQQQAFSDMSHVEYSRITKFFQEQPLEGYTLFSHRSAPNGFKV</p>
                     <p>AIVLSELGFHYNTIFLDFNLGEHRAPEFVSVNPNARVPALIDHGM</p>
                     <p>DNLSIWESGAILLHLVNKYYKETGNPLLWSDDLADQSQINAWLF</p>
                     <p>FQTSGHAPMIGQALHFRYFHSQKIASAVERYTDEVRRVYGVVE</p>
                     <p>MALAERREALVMELDTENAAAYSAGTTPMSQSRFFDYPVWLV</p>
                     <p>GDKLTIADLAFVPWNNVVDRIGINIKIEFPEVYKWTKHMMRRPA</p>
                     <p>VIKALRGE</p>
                  </c>
                  <c ca="left">
                     <p>N 27/65 (1.9 &#215; 10<sup>-16</sup>)</p>
                  </c>
                  <c ca="left">
                     <p>N, 33 in 2 to 78 (1.2 &#215; 10<sup>-16</sup>)</p>
                     <p>Q, 7 in 82 to 108 (5.5 &#215; 10<sup>-5</sup>)</p>
                     <p/>
                     <p>{QN}, 43 in 2 to 88 (1.2 &#215; 10<sup>-20</sup>)</p>
                     <p/>
                     <p>{DERK}, 10 in 2 to 88 (1.6 &#215; 10<sup>-3</sup>)</p>
                     <p>{VILM}, 4 in 2 to 88 (1.8 &#215; 10<sup>-3</sup>)</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Prion sequence: Sup35p (YDR172W)</p>
                     <p>Prion determinant is residues 1-123</p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>MSDSN<b><ul>QGNNQQNYQQYSQNGNQQQGNNRYQGYQAYNAQA</ul></b></p>
                     <p>
                        <b>
                           <ul>QPAGGYYQNYQGYSGYQQGGYQQYNPDAGYQQQYNPQGG</ul>
                        </b>
                     </p>
                     <p>
                        <b>
                           <ul>YQQYNPQGGYQQQFNPQGGRGNYKNFNYNNNLQGYQAGF</ul>
                        </b>
                     </p>
                     <p><b><ul>QPQSQ</ul>G</b>MSLNDFQKQQKQAAPKPKKTLKLVSSSGIKLANATKK</p>
                     <p>VGTKPAESDKKEEEKSAETKEPTKEPTKVEEPVKKEEKPVQTEE</p>
                     <p>KTEEKSELPKVEDLKISESTHNTNNANVTSADALIKEQEEEVDDE</p>
                     <p>VVNDMFGGKDHVSLIFMGHVDAGKSTMGGNLLYLTGSVDKRTI</p>
                     <p>EKYEREAKDAGRQGWYLSWVMDTNKEERNDGKTIEVGKAYFE</p>
                     <p>TEKRRYTILDAPGHKMYVSEMIGGASQADVGVLVISARKGEYET</p>
                     <p>GFERGGQTREHALLAKTQGVNKMVVVVNKMDDPTVNWSKER</p>
                     <p>YDQCVSNVSNFLRAIGYNIKTDVVFMPVSGYSGANLKDHVDPK</p>
                     <p>ECPWYTGPTLLEYLDTMNHVDRHINAPFMLPIAAKMKDLGTIVE</p>
                     <p>GKIESGHIKKGQSTLLMPNKTAVEIQNIYNETENEVDMAMCGEQ</p>
                     <p>VKLRIKGVEEEDISPGFVLTSPKNPIKSVTKFVAQIAIVELKSIIAA</p>
                     <p>GFSCVMHVHTAIEEVHIVKLLHKLEKGTNRKSKKPPAFAKKGM</p>
                     <p>KVIAVLETEAPVCVETYQDYPQLGRFTLRDQGTTIAIGKIVKIAE</p>
                  </c>
                  <c ca="left">
                     <p>Q 35/123 (9.6 &#215; 10<sup>-21</sup>)</p>
                     <p>Y 20/123 (4.9 &#215; 10<sup>-9</sup>)</p>
                     <p>G 21/123 (5.9 &#215; 10<sup>-7</sup>)</p>
                     <p>N 20/123 (3.7 &#215; 10<sup>-5</sup>)</p>
                  </c>
                  <c ca="left">
                     <p>Q, 39 in 5 to 134 (7.3 &#215; 10<sup>-24</sup>)</p>
                     <p>Y, 20 in 12 to 112 (1.4 &#215; 10<sup>-10</sup>)</p>
                     <p>E, 23 in 166 to 248 (1.7 &#215; 10<sup>-9</sup>)</p>
                     <p>K, 23 in 130 to 218 (5.5 &#215; 10<sup>-8</sup>)</p>
                     <p>G, 20 in 19 to 122 (1.5 &#215; 10<sup>-7</sup>)</p>
                     <p>N, 20 in 4 to 108 (3.6 &#215; 10<sup>-6</sup>)</p>
                     <p>V, 47 in 188 to 653 (4.5 &#215; 10<sup>-5</sup>)</p>
                     <p/>
                     <p>{QN}, 60 in 4 to 134 (6.1 &#215; 10<sup>-26</sup>)</p>
                     <p/>
                     <p>{DERK}, 6 in 4 to 134 (8.0 &#215; 10<sup>-9</sup>)</p>
                     <p>{VILM}, 2 in 4 to 134 (3.5 &#215; 10<sup>-12</sup>)</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Prion sequence: Rnq1p (YCL028W)</p>
                     <p>Prion determinant is residues 153-405</p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>MDTDKLISEAESHFSQGNHAEAVAKLTSAAQSNPNDEQMSTIES</p>
                     <p>LIQKIAGYVMDNRSGGSDASQDRAAGGGSSFMNTLMADSKGSS</p>
                     <p>QTQLGKLALLATVMTHSSNKGSSNRGFDVGTVMSMLSGSGGGS</p>
                     <p>QSMGASGLAALASQFFKSGNNSQ<b>G<ul>QGQGQGQGQGQGQGQG</ul></b></p>
                     <p>
                        <ul>
                           <b>QGSFTALASLASSFMNSNNNNQQGQNQSSGGSSFGALASMAS</b>
                        </ul>
                     </p>
                     <p>
                        <ul>
                           <b>SFMHSNNNQNSNNSQQGYNQSYQNGNQNSQGYNNQQYQGG</b>
                        </ul>
                     </p>
                     <p>
                        <ul>
                           <b>NGGYQQQQGQSGGAFSSLASMAQSYLGGGQTQSNQQQYNQ</b>
                        </ul>
                     </p>
                     <p>
                        <ul>
                           <b>QGQNNQQQYQQQGQNYQHQQQGQQQQQGHSSSFSALASM</b>
                        </ul>
                     </p>
                     <p>
                        <ul>
                           <b>ASSYLGNNSNSNSSYGGQQQANEYGRPQHNGQQQSNEYGRP</b>
                        </ul>
                     </p>
                     <p>
                        <ul>
                           <b>QYGGNQNSNGQHESFNFSGNFSQQNNNGNQ</b>
                        </ul>
                        <b>NRY</b>
                     </p>
                  </c>
                  <c ca="left">
                     <p>Q 66/253 (4.2 &#215; 10<sup>-35</sup>)</p>
                     <p>G 42/253 (6.6 &#215; 10<sup>-12</sup>)</p>
                     <p>N 41/253 (7.4 &#215; 10<sup>-9</sup>)</p>
                  </c>
                  <c ca="left">
                     <p>Q, 67 in 152 to 401 (2.1 &#215; 10<sup>-36</sup>)</p>
                     <p>G, 60 in 50 to 399 (3.1 &#215; 10<sup>-17</sup>)</p>
                     <p>N, 41 in 185 to 402 (8.3 &#215; 10<sup>-11</sup>)</p>
                     <p/>
                     <p>{QN}, 110 in 149 to 402 (3.2 &#215; 10<sup>-43</sup>)</p>
                     <p/>
                     <p>{DERK}, 5 in 149 to 402 (1.2 &#215; 10<sup>-23</sup>)</p>
                     <p>{VILM}, 12 in 149 to 402 (8.9 &#215; 10<sup>-17</sup>)</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Prion sequence: New1p (YPL226W)</p>
                     <p>Prion determinant is residues 1-153</p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <b>MPPKKFKDLNSFLDDQPKDPNLVASPFGGYFKNPAADAGSN</b>
                     </p>
                     <p>
                        <b>NASKKSSYQQQRNWKQGGNYQQGGYQSY</b>
                        <ul>
                           <b>NSNYNNYNNYNN</b>
                        </ul>
                     </p>
                     <p>
                        <ul>
                           <b>YNNYNNYNNYNKYN</b>
                        </ul>
                        <b>GQGYQKSTYKQSAVTPNQSGTPTPSAS</b>
                     </p>
                     <p><b>TTSLTSLNEKLSNLELTPISQFLSKIPECQS</b>ITDCKNQIKLIIEEF</p>
                     <p>GKEGNSTGEKIEEWKIVDVLSKFIKPKNPSLVRESAMLIISNIAQF</p>
                     <p>FSGKPPQEAYLLPFFNVALDCISDKENTVKRAAQHAIDSLLNCFP</p>
                     <p>MEALTCFVLPTILDYLSSGAKWQAKMAALSVVDRIREDSANDL</p>
                     <p>LELTFKDAVPVLTDVATDFKPELAKQGYKTLLDYVSILDNLDLS</p>
                     <p>PRYKLIVDTLQDPSKVPESVKSLSSVTFVAEVTEPSLSLLVPILNR</p>
                     <p>SLNLSSSSQEQLRQTVIVVENLTRLVNNRNEIESFIPLLLPGIQKV</p>
                     <p>VDTASLPEVRELAEKALNVLKEDDEADKENKFSGRLTLEEGRDF</p>
                     <p>LLDHLKDIKADDSCFVKPYMNDETVIKYMSKILTVDSNVNDWK</p>
                     <p>RLEDFLTAVFGGSDSQREFVKQDFIHNLRALFYQEKERADEDEGI</p>
                     <p>EIVNTDFSLAYGSRMLLNKTNLRLLKGHRYGLCGRNGAGKSTL</p>
                     <p>MRAIANGQLDGFPDKDTLRTCFVEHKLQGEEGDLDLVSFIALDE</p>
                     <p>ELQSTSREEIAAALESVGFDEERRAQTVGSLSGGWKMKLELARA</p>
                     <p>MLQKADILLLDEPTNHLDVSNVKWLEEYLLEHTDITSLIVSHDSG</p>
                     <p>FLDTVCTDIIHYENKKLAYYKGNLAAFVEQKPEAKSYYTLTDSN</p>
                     <p>AQMRFPPPGILTGVKSNTRAVAKMTDVTFSYPGAQKPSLSHVSC</p>
                     <p>SLSLSSRVACLGPNGAGKSTLIKLLTGELVPNEGKVEKHPNLRIG</p>
                     <p>YIAQHALQHVNEHKEKTANQYLQWRYQFGDDREVLLKESRKIS</p>
                     <p>EDEKEMMTKEIDIDDGRGKRAIEAIVGRQKLKKSFQYEVKWKY</p>
                     <p>WKPKYNSWVPKDVLVEHGFEKLVQKFDDHEASREGLGYRELIP</p>
                     <p>SVITKHFEDVGLDSEIANHTPLGSLSGGQLVKVVIAGAMWNNPH</p>
                     <p>LLVLDEPTNYLDRDSLGALAVAIRDWSGGVVMISHNNEFVGAL</p>
                     <p>CPEQWIVENGKMVQKGSAQVDQSKFEDGGNADAVGLKASNLA</p>
                     <p>KPSVDDDDSPANIKVKQRKKRLTRNEKKLQAERRRLRYIEWLSS</p>
                     <p>PKGTPKPVDTDDEED</p>
                  </c>
                  <c ca="left">
                     <p>N 26/153 (1.4 &#215; 10<sup>-6</sup>)</p>
                  </c>
                  <c ca="left">
                     <p>N, 16 in 69 to 94 (9.8 &#215; 10<sup>-14</sup>)</p>
                     <p>Y, 13 in 60 to 103 (1.2 &#215; 10<sup>-9</sup>)</p>
                     <p>L, 74 in 253 to 738 (2.4 &#215; 10<sup>-5</sup>)</p>
                     <p>Q, 11 in 49 to 112 (2.9 &#215; 10<sup>-5</sup>)</p>
                     <p>R, 7 in 1149 to 1173 (7.5 &#215; 10<sup>-5</sup>)</p>
                     <p/>
                     <p>{QN}, 27 in 49 to 99 (1.8 &#215; 10<sup>-14</sup>)</p>
                     <p/>
                     <p>{DERK}, 3 in 49 to 99 (5.3 &#215; 10<sup>-4</sup>)</p>
                     <p>{VILM}, 0 in 49 to 99 (8.1 &#215; 10<sup>-7</sup>)</p>
                  </c>
               </r>
            </tblbdy>
            <tblfn>
               <p>*The prion determinant regions (found from experiment) are in bold; the LPSs of the whole protein sequence for the most pronounced single-amino-acid bias are underlined. <sup>&#8224;</sup>All biases with a P<sub>bias </sub>&#8804; 1 &#215; 10<sup>-4 </sup>are listed for each prion sequence. <sup>&#8225;</sup>As examples, the counts for the residue sets {DERK} and {VILM} that correspond to the {QN} lowest-probability subsequence are also listed for each prion.</p>
            </tblfn>
         </tbl>
         <p>Expanded polyglutamine repeats underlie the pathology of neurodegenerative disorders in humans, the most common of which is Huntington's disease <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. This disorder is caused by inherited expansions of the polyglutamine region of the protein huntingtin to a length of 39 or more amino-acid residues <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. (Q+N)-rich regions, polyglutamine and polyasparagine are thought to oligomerize or polymerize through a 'polar zipper' of hydrogen bonds between the side chains <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr></abbrgrp>.</p>
         <p>Here, we have derived a method for identifying biased regions that relies on defining the lowest-probability subsequences (LPSs) for a given amino-acid composition. For six eukaryotic proteomes (budding yeast, fission yeast, nematode worm, fruit fly, human and <it>Arabidopsis</it>), we have used this formalism to analyze the prevalence of Q- and N-rich regions in the context of other biases. In general, N-rich regions are rarer than Q-rich regions in the eukaryotic proteomes, most notably so in the human proteome. We use the biases for the four known prions of budding yeast to survey comprehensively for (Q+N)-rich domains, and examine the diversity of their subsidiary amino-acid compositions, their functions and their cellular compartments. We find up to around 170 (Q+N)-rich regions in budding yeast, and a relative dearth of such regions in fission yeast. In addition, to provide more context, we discuss some overarching observations on biased regions of any sort.</p>
      </sec>
      <sec>
         <st>
            <p>Results and discussion</p>
         </st>
         <p>Our analysis proceeds as follows. First, we analyze the Q, N and (Q+N) biases in the four known prion sequences of budding yeast, as well as other subsidiary biases for and against certain residue types. We discuss how these biases relate to the prion-determinant domains (that is, the regions of the prion sequences that are necessary for the prion phenomenon). This analysis uses a simple algorithm that finds the lowest-probability subsequences (LPSs) for a given residue bias (see Materials and methods).</p>
         <p>Second, we ask how prevalent Q and N biases are in eukaryotes. Motivated by the fact that prion-determinant sequences correspond to LPSs, we examine LPSs for Q and N biases in the context of single-residue biases for all residue types, in all six proteomes. Focusing on the budding-yeast proteome, we also compare single-residue biases observed for known and hypothetical proteins, and for conceptually translated intergenic DNA (igDNA) in all six potential reading frames.</p>
         <p>Then, on the basis of biases for Q and N in combination, we examine the abundance and diversity of (Q+N)-rich regions in the six eukaryotic proteomes. We survey their subsidiary biases for and against certain residues and groups of residues, and their sizes, functional classes and cellular compartments. Finally, to provide more general context, we discuss some overarching perspectives on compositional bias in the eukaryotic proteomes.</p>
         <sec>
            <st>
               <p>Analysis of the identified yeast prion sequences for their biases</p>
            </st>
            <p>The four identified prion proteins of budding yeast are Sup35p, Ure2p, Rnq1p and New1p, which form the prions [PSI+], [URE3], [RNQ+] and [NU+], respectively. From each of these protein sequences, we extracted the domain that is determinant for prion formation, as found previously by experimental study (Table <tblr tid="T1">1</tblr>) <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr></abbrgrp>. Using the formalism described in Materials and methods, we determined the main biases for each prion-determinant domain and an associated probability for each bias. Results are shown for single-residue biases, with examples for the sets of residues {QN}, {DERK} and {VILM} (single-letter amino-acid code; Table <tblr tid="T1">1</tblr>). The groupings {DERK} and {VILM} are 'charged residues' and 'major hydrophobics', respectively <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. Charged residues and the major hydrophobics appear to be disfavored in the yeast prions (Table <tblr tid="T1">1</tblr>) <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>; mutation of Q or N to charged residues can lead to loss of prion-forming capability <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. We also derived the LPSs for a given bias for the whole protein sequences (not just the prion-determinant domains). In addition to the well-documented Q and N biases, three of the four budding-yeast prions have subsidiary biases for tyrosine, glycine and/or serine (Table <tblr tid="T1">1</tblr>). The mild bias for tyrosine is conserved in homologs of the Sup35 prion determinant in other fungi, although it is not clear how this is related to the prion phenomenon <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>. It has been suggested that &#960;-stacking of aromatic groups, as in tyrosine and phenylalanine, may play a part in stabilizing amyloid conformations <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>.</p>
            <p>Interestingly, the prion-determinant domains for three of the prion sequences (Sup35p, Ure2p and Rnq1p) are congruent with the top-ranking single-residue LPSs for the whole sequences (these are the underlined sequence regions in Table <tblr tid="T1">1</tblr>). That is, the most biased regions coincide with the experimentally derived prion-determinant domains. In each case, the top-ranking bias is for either Q or N (Table <tblr tid="T1">1</tblr>). However, for the fourth prion sequence (New1p), the prion-determinant domain is comparatively weakly biased for N, Q or {QN} (for example, for N, P<sub>bias </sub>= 1.4 &#215; 10<sup>-6</sup>, where P<sub>bias </sub>is the probability of bias). Its LPS for N bias does, however, coincide with a region derived from a repeat of the amino-acid triplet NYN that has been shown to be necessary for [NU+] prion propagation <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>.</p>
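            <p>One way to picture an LPS search is as a scan over every subsequence of a protein, keeping the one whose count for a chosen residue set is least probable given a background frequency. The Python sketch below is a minimal illustration under assumptions that are not taken from the text above: it models P<sub>bias </sub>as a binomial probability, searches exhaustively, considers only windows enriched for the residue set, and uses a hypothetical minimum window length (min_len) and a caller-supplied background frequency. The function names are also invented here; the procedure actually used is the one described in Materials and methods.</p>
            <p>
```python
import math


def log_binom_pmf(k, n, f):
    """Natural log of the binomial probability of k hits in n trials at rate f."""
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_comb + k * math.log(f) + (n - k) * math.log(1.0 - f)


def lowest_probability_subsequence(seq, residues, background, min_len=25):
    """Brute-force search (assumed scheme, not the published one).

    Returns (log_p, start, end) with 1-based inclusive coordinates,
    or None if no window is enriched for the residue set.
    background must lie strictly between 0 and 1.
    """
    targets = set(residues)
    # Prefix sums let each window count be read off in constant time.
    prefix = [0]
    for aa in seq.upper():
        prefix.append(prefix[-1] + (1 if aa in targets else 0))
    best = None
    n_total = len(seq)
    for start in range(n_total - min_len + 1):
        for end in range(start + min_len, n_total + 1):
            n = end - start
            k = prefix[end] - prefix[start]
            if k > background * n:  # keep only windows enriched for the set
                log_p = log_binom_pmf(k, n, background)
                if best is None or best[0] > log_p:
                    best = (log_p, start + 1, end)
    return best


# Toy sequence (not a real protein): a run of 20 Q flanked by alanines.
toy = "A" * 30 + "Q" * 20 + "A" * 30
result = lowest_probability_subsequence(toy, "Q", background=0.04, min_len=10)
```
            </p>
            <p>Coordinates are returned 1-based and inclusive, in the same style as the 'N, 33 in 2 to 78' entries of Table <tblr tid="T1">1</tblr>. Working in log space avoids numerical underflow for very small probabilities, and the quadratic search is adequate for a sketch on single proteins.</p>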
            <p>In the next section, we show how single-residue biases for Q and N rank in terms of their relative abundance in eukaryotic proteomes. After that, we use the Q- and N-bias levels of the LPSs of the four prions in combination to derive a refined set of (Q+N)-rich domains in the six eukaryotic proteomes (see below).</p>
         </sec>
         <sec>
            <st>
               <p>Abundance of Q and N biases in a proteomic context, for budding yeast and five other eukaryotic proteomes</p>
            </st>
            <p>How abundant are the biases for Q and N observed for the yeast prion domains compared to biases for all the other residue types? Are they noticeably more or less prevalent in budding yeast compared to other eukaryotic proteomes? We examined the most prevalent single-residue biases for the six eukaryotic proteomes at P<sub>bias </sub>values corresponding to the LPSs observed in the prion-determinant domains (Table <tblr tid="T1">1</tblr>). These data give us a perspective on the relative abundance of such biases (listed in Table <tblr tid="T2">2</tblr>; the exact threshold used to make this table is P<sub>bias </sub>&lt; 1 &#215; 10<sup>-13</sup>).</p>
            <tbl id="T2" hint_layout="double">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Abundance of biased regions that have biases at the same level as the Q and N biases in the four budding-yeast prions</p>
               </caption>
               <tblbdy cols="13">
                  <r>
                     <c cspan="13" ca="left">
                        <p><b>(a) </b>Biases in terms of total number of regions*</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="13">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Rank</p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>Budding yeast</p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>Fission yeast</p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>Fruit fly</p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>Nematode</p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>
                           <it>Arabidopsis</it>
                        </p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>Human</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="13">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>S</p>
                     </c>
                     <c ca="left">
                        <p>108 <abbrgrp><abbr bid="B1">1</abbr></abbrgrp><sup>&#8224;</sup></p>
                     </c>
                     <c ca="left">
                        <p>S</p>
                     </c>
                     <c ca="left">
                        <p>74</p>
                     </c>
                     <c ca="left">
                        <p>Q</p>
                     </c>
                     <c ca="left">
                        <p>725 (20.7)</p>
                     </c>
                     <c ca="left">
                        <p>P</p>
                     </c>
                     <c ca="left">
                        <p>494</p>
                     </c>
                     <c ca="left">
                        <p>P</p>
                     </c>
                     <c ca="left">
                        <p>345</p>
                     </c>
                     <c ca="left">
                        <p>P</p>
                     </c>
                     <c ca="left">
                        <p>549</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>Q</p>
                     </c>
                     <c ca="left">
                        <p>104 <abbrgrp><abbr bid="B8">8</abbr></abbrgrp> (17.8)</p>
                     </c>
                     <c ca="left">
                        <p>P</p>
                     </c>
                     <c ca="left">
                        <p>40</p>
                     </c>
                     <c ca="left">
                        <p>G</p>
                     </c>
                     <c ca="left">
                        <p>400</p>
                     </c>
                     <c ca="left">
                        <p>G</p>
                     </c>
                     <c ca="left">
                        <p>448</p>
                     </c>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>292</p>
                     </c>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>322</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>3</p>
                     </c>
                     <c ca="left">
                        <p>N</p>
                     </c>
                     <c ca="left">
                        <p>73 <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> (12.5)</p>
                     </c>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>37</p>
                     </c>
                     <c ca="left">
                        <p>P</p>
                     </c>
                     <c ca="left">
                        <p>359</p>
                     </c>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>286</p>
                     </c>
                     <c ca="left">
                        <p>G</p>
                     </c>
                     <c ca="left">
                        <p>242</p>
                     </c>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>302</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>4</p>
                     </c>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>68 <abbrgrp><abbr bid="B18">18</abbr></abbrgrp></p>
                     </c>
                     <c ca="left">
                        <p>T</p>
                     </c>
                     <c ca="left">
                        <p>32</p>
                     </c>
                     <c ca="left">
                        <p>S</p>
                     </c>
                     <c ca="left">
                        <p>327</p>
                     </c>
                     <c ca="left">
                        <p>Q</p>
                     </c>
                     <c ca="left">
                        <p>270 (8.9)</p>
                     </c>
                     <c ca="left">
                        <p>Q</p>
                     </c>
                     <c ca="left">
                        <p>153 (6.0)</p>
                     </c>
                     <c ca="left">
                        <p>G</p>
                     </c>
                     <c ca="left">
                        <p>294</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>5</p>
                     </c>
                     <c ca="left">
                        <p>T</p>
                     </c>
                     <c ca="left">
                        <p>58</p>
                     </c>
                     <c ca="left">
                        <p>Q</p>
                     </c>
                     <c ca="left">
                        <p>17 (5.6)</p>
                     </c>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>264</p>
                     </c>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>220</p>
                     </c>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>150</p>
                     </c>
                     <c ca="left">
                        <p>S</p>
                     </c>
                     <c ca="left">
                        <p>233</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>P</p>
                     </c>
                     <c ca="left">
                        <p>37 <abbrgrp><abbr bid="B1">1</abbr></abbrgrp></p>
                     </c>
                     <c ca="left">
                        <p>G</p>
                     </c>
                     <c ca="left">
                        <p>17</p>
                     </c>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>231</p>
                     </c>
                     <c ca="left">
                        <p>T</p>
                     </c>
                     <c ca="left">
                        <p>199</p>
                     </c>
                     <c ca="left">
                        <p>S</p>
                     </c>
                     <c ca="left">
                        <p>134</p>
                     </c>
                     <c ca="left">
                        <p>K</p>
                     </c>
                     <c ca="left">
                        <p>188</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>7</p>
                     </c>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="left">
                        <p>35</p>
                     </c>
                     <c ca="left">
                        <p>K</p>
                     </c>
                     <c ca="left">
                        <p>16</p>
                     </c>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>212</p>
                     </c>
                     <c ca="left">
                        <p>K</p>
                     </c>
                     <c ca="left">
                        <p>184</p>
                     </c>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="left">
                        <p>90</p>
                     </c>
                     <c ca="left">
                        <p>Q</p>
                     </c>
                     <c ca="left">
                        <p>176 (5.9)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>8</p>
                     </c>
                     <c ca="left">
                        <p>K</p>
                     </c>
                     <c ca="left">
                        <p>24 <abbrgrp><abbr bid="B9">9</abbr></abbrgrp></p>
                     </c>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>15</p>
                     </c>
                     <c ca="left">
                        <p>T</p>
                     </c>
                     <c ca="left">
                        <p>188</p>
                     </c>
                     <c ca="left">
                        <p>S</p>
                     </c>
                     <c ca="left">
                        <p>146</p>
                     </c>
                     <c ca="left">
                        <p>K</p>
                     </c>
                     <c ca="left">
                        <p>86</p>
                     </c>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>136</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>9</p>
                     </c>
                     <c ca="left">
                        <p>G</p>
                     </c>
                     <c ca="left">
                        <p>20</p>
                     </c>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>13</p>
                     </c>
                     <c ca="left">
                        <p>K</p>
                     </c>
                     <c ca="left">
                        <p>170</p>
                     </c>
                     <c ca="left">
                        <p>R</p>
                     </c>
                     <c ca="left">
                        <p>132</p>
                     </c>
                     <c ca="left">
                        <p>R</p>
                     </c>
                     <c ca="left">
                        <p>81</p>
                     </c>
                     <c ca="left">
                        <p>R</p>
                     </c>
                     <c ca="left">
                        <p>83</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>10</p>
                     </c>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>19 <abbrgrp><abbr bid="B1">1</abbr></abbrgrp></p>
                     </c>
                     <c ca="left">
                        <p>R</p>
                     </c>
                     <c ca="left">
                        <p>11</p>
                     </c>
                     <c ca="left">
                        <p>N</p>
                     </c>
                     <c ca="left">
                        <p>144 (4.1)</p>
                     </c>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>102</p>
                     </c>
                     <c ca="left">
                        <p>L</p>
                     </c>
                     <c ca="left">
                        <p>56</p>
                     </c>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>80</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>11</p>
                     </c>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>8</p>
                     </c>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>144</p>
                     </c>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="left">
                        <p>55</p>
                     </c>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>56</p>
                     </c>
                     <c ca="left">
                        <p>T</p>
                     </c>
                     <c ca="left">
                        <p>59</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>12</p>
                     </c>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>8</p>
                     </c>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>R</p>
                     </c>
                     <c ca="left">
                        <p>118</p>
                     </c>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>46</p>
                     </c>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>47</p>
                     </c>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="left">
                        <p>49</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>13</p>
                     </c>
                     <c ca="left">
                        <p>R</p>
                     </c>
                     <c ca="left">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>M</p>
                     </c>
                     <c ca="left">
                        <p>5</p>
                     </c>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="left">
                        <p>74</p>
                     </c>
                     <c ca="left">
                        <p>N</p>
                     </c>
                     <c ca="left">
                        <p>40 (1.3)</p>
                     </c>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="left">
                        <p>28</p>
                     </c>
                     <c ca="left">
                        <p>L</p>
                     </c>
                     <c ca="left">
                        <p>31</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>14</p>
                     </c>
                     <c ca="left">
                        <p>M</p>
                     </c>
                     <c ca="left">
                        <p>5</p>
                     </c>
                     <c ca="left">
                        <p>N</p>
                     </c>
                     <c ca="left">
                        <p>4 (1.3)</p>
                     </c>
                     <c ca="left">
                        <p>L</p>
                     </c>
                     <c ca="left">
                        <p>47</p>
                     </c>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>17</p>
                     </c>
                     <c ca="left">
                        <p>N</p>
                     </c>
                     <c ca="left">
                        <p>28 (1.1)</p>
                     </c>
                     <c ca="left">
                        <p>M</p>
                     </c>
                     <c ca="left">
                        <p>23</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>15</p>
                     </c>
                     <c ca="left">
                        <p>L</p>
                     </c>
                     <c ca="left">
                        <p>5</p>
                     </c>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>4</p>
                     </c>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="left">
                        <p>28</p>
                     </c>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="left">
                        <p>16</p>
                     </c>
                     <c ca="left">
                        <p>T</p>
                     </c>
                     <c ca="left">
                        <p>17</p>
                     </c>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>21</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>16</p>
                     </c>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>3</p>
                     </c>
                     <c ca="left">
                        <p>L</p>
                     </c>
                     <c ca="left">
                        <p>3</p>
                     </c>
                     <c ca="left">
                        <p>M</p>
                     </c>
                     <c ca="left">
                        <p>24</p>
                     </c>
                     <c ca="left">
                        <p>M</p>
                     </c>
                     <c ca="left">
                        <p>15</p>
                     </c>
                     <c ca="left">
                        <p>M</p>
                     </c>
                     <c ca="left">
                        <p>11</p>
                     </c>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>18</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>17</p>
                     </c>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="left">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="left">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>22</p>
                     </c>
                     <c ca="left">
                        <p>L</p>
                     </c>
                     <c ca="left">
                        <p>11</p>
                     </c>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>8</p>
                     </c>
                     <c ca="left">
                        <p>N</p>
                     </c>
                     <c ca="left">
                        <p>15 (0.6)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>18</p>
                     </c>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>8</p>
                     </c>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>W</p>
                     </c>
                     <c ca="left">
                        <p>5</p>
                     </c>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="left">
                        <p>13</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>19</p>
                     </c>
                     <c ca="left">
                        <p>W</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>W</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>W</p>
                     </c>
                     <c ca="left">
                        <p>5</p>
                     </c>
                     <c ca="left">
                        <p>I</p>
                     </c>
                     <c ca="left">
                        <p>5</p>
                     </c>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>3</p>
                     </c>
                     <c ca="left">
                        <p>W</p>
                     </c>
                     <c ca="left">
                        <p>11</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>20</p>
                     </c>
                     <c ca="left">
                        <p>I</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>I</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>I</p>
                     </c>
                     <c ca="left">
                        <p>5</p>
                     </c>
                     <c ca="left">
                        <p>W</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>I</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>I</p>
                     </c>
                     <c ca="left">
                        <p>10</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>585</p>
                     </c>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>303</p>
                     </c>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>3,495</p>
                     </c>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>2,693</p>
                     </c>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>1,833</p>
                     </c>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>2,613</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="13">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="13" ca="left">
                        <p><b>(b) </b>Biases in terms of total number of residues<sup>&#8225;</sup></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="13">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Rank</p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>Budding yeast</p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>Fission yeast</p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>Fruit fly</p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>Nematode</p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>
                           <it>Arabidopsis</it>
                        </p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>Human</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="13">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>S</p>
                     </c>
                     <c ca="left">
                        <p>10,630</p>
                     </c>
                     <c ca="left">
                        <p>S</p>
                     </c>
                     <c ca="left">
                        <p>9,035</p>
                     </c>
                     <c ca="left">
                        <p>Q</p>
                     </c>
                     <c ca="left">
                        <p>39,186 (16.3)</p>
                     </c>
                     <c ca="left">
                        <p>P</p>
                     </c>
                     <c ca="left">
                        <p>31,917</p>
                     </c>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>23,229</p>
                     </c>
                     <c ca="left">
                        <p>P</p>
                     </c>
                     <c ca="left">
                        <p>44,427</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>T</p>
                     </c>
                     <c ca="left">
                        <p>5,900</p>
                     </c>
                     <c ca="left">
                        <p>T</p>
                     </c>
                     <c ca="left">
                        <p>5,805</p>
                     </c>
                     <c ca="left">
                        <p>S</p>
                     </c>
                     <c ca="left">
                        <p>31,936</p>
                     </c>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>31,216</p>
                     </c>
                     <c ca="left">
                        <p>P</p>
                     </c>
                     <c ca="left">
                        <p>21,124</p>
                     </c>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>27,352</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>3</p>
                     </c>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>4,704</p>
                     </c>
                     <c ca="left">
                        <p>P</p>
                     </c>
                     <c ca="left">
                        <p>2,887</p>
                     </c>
                     <c ca="left">
                        <p>P</p>
                     </c>
                     <c ca="left">
                        <p>29,345</p>
                     </c>
                     <c ca="left">
                        <p>G</p>
                     </c>
                     <c ca="left">
                        <p>28,192</p>
                     </c>
                     <c ca="left">
                        <p>G</p>
                     </c>
                     <c ca="left">
                        <p>13,462</p>
                     </c>
                     <c ca="left">
                        <p>S</p>
                     </c>
                     <c ca="left">
                        <p>26,363</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>4</p>
                     </c>
                     <c ca="left">
                        <p>Q</p>
                     </c>
                     <c ca="left">
                        <p>3,924 (10.4)</p>
                     </c>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>2,657</p>
                     </c>
                     <c ca="left">
                        <p>G</p>
                     </c>
                     <c ca="left">
                        <p>24,320</p>
                     </c>
                     <c ca="left">
                        <p>Q</p>
                     </c>
                     <c ca="left">
                        <p>18,126 (8.9)</p>
                     </c>
                     <c ca="left">
                        <p>S</p>
                     </c>
                     <c ca="left">
                        <p>10,313</p>
                     </c>
                     <c ca="left">
                        <p>G</p>
                     </c>
                     <c ca="left">
                        <p>22,131</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>5</p>
                     </c>
                     <c ca="left">
                        <p>N</p>
                     </c>
                     <c ca="left">
                        <p>3,745 (10.0)</p>
                     </c>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>1,854</p>
                     </c>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>23,384</p>
                     </c>
                     <c ca="left">
                        <p>T</p>
                     </c>
                     <c ca="left">
                        <p>15,994</p>
                     </c>
                     <c ca="left">
                        <p>L</p>
                     </c>
                     <c ca="left">
                        <p>9,459</p>
                     </c>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>16,681</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>P</p>
                     </c>
                     <c ca="left">
                        <p>2,049</p>
                     </c>
                     <c ca="left">
                        <p>G</p>
                     </c>
                     <c ca="left">
                        <p>1,669</p>
                     </c>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>14,730</p>
                     </c>
                     <c ca="left">
                        <p>S</p>
                     </c>
                     <c ca="left">
                        <p>15,262</p>
                     </c>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>6,852</p>
                     </c>
                     <c ca="left">
                        <p>K</p>
                     </c>
                     <c ca="left">
                        <p>15,459</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>7</p>
                     </c>
                     <c ca="left">
                        <p>K</p>
                     </c>
                     <c ca="left">
                        <p>1,910</p>
                     </c>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>1,185</p>
                     </c>
                     <c ca="left">
                        <p>K</p>
                     </c>
                     <c ca="left">
                        <p>14,448</p>
                     </c>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>15,224</p>
                     </c>
                     <c ca="left">
                        <p>Q</p>
                     </c>
                     <c ca="left">
                        <p>6,835 (6.0)</p>
                     </c>
                     <c ca="left">
                        <p>Q</p>
                     </c>
                     <c ca="left">
                        <p>12,156 (5.9)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>8</p>
                     </c>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="left">
                        <p>1,292</p>
                     </c>
                     <c ca="left">
                        <p>Q</p>
                     </c>
                     <c ca="left">
                        <p>1,107 (3.6)</p>
                     </c>
                     <c ca="left">
                        <p>T</p>
                     </c>
                     <c ca="left">
                        <p>12,560</p>
                     </c>
                     <c ca="left">
                        <p>K</p>
                     </c>
                     <c ca="left">
                        <p>14,518</p>
                     </c>
                     <c ca="left">
                        <p>K</p>
                     </c>
                     <c ca="left">
                        <p>6,122</p>
                     </c>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>9,587</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>9</p>
                     </c>
                     <c ca="left">
                        <p>G</p>
                     </c>
                     <c ca="left">
                        <p>961</p>
                     </c>
                     <c ca="left">
                        <p>L</p>
                     </c>
                     <c ca="left">
                        <p>1,087</p>
                     </c>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>10,067</p>
                     </c>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>9,124</p>
                     </c>
                     <c ca="left">
                        <p>R</p>
                     </c>
                     <c ca="left">
                        <p>4,061</p>
                     </c>
                     <c ca="left">
                        <p>T</p>
                     </c>
                     <c ca="left">
                        <p>5,667</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>10</p>
                     </c>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>916</p>
                     </c>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>851</p>
                     </c>
                     <c ca="left">
                        <p>L</p>
                     </c>
                     <c ca="left">
                        <p>9,331</p>
                     </c>
                     <c ca="left">
                        <p>R</p>
                     </c>
                     <c ca="left">
                        <p>7,501</p>
                     </c>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>3,244</p>
                     </c>
                     <c ca="left">
                        <p>L</p>
                     </c>
                     <c ca="left">
                        <p>5,646</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>11</p>
                     </c>
                     <c ca="left">
                        <p>L</p>
                     </c>
                     <c ca="left">
                        <p>554</p>
                     </c>
                     <c ca="left">
                        <p>K</p>
                     </c>
                     <c ca="left">
                        <p>680</p>
                     </c>
                     <c ca="left">
                        <p>R</p>
                     </c>
                     <c ca="left">
                        <p>6,847</p>
                     </c>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="left">
                        <p>6,950</p>
                     </c>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="left">
                        <p>3,176</p>
                     </c>
                     <c ca="left">
                        <p>R</p>
                     </c>
                     <c ca="left">
                        <p>5,165</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>12</p>
                     </c>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>256</p>
                     </c>
                     <c ca="left">
                        <p>N</p>
                     </c>
                     <c ca="left">
                        <p>486 (1.6)</p>
                     </c>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>6,302</p>
                     </c>
                     <c ca="left">
                        <p>N</p>
                     </c>
                     <c ca="left">
                        <p>2,606 (1.3)</p>
                     </c>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="left">
                        <p>2,315</p>
                     </c>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>3,189</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>13</p>
                     </c>
                     <c ca="left">
                        <p>R</p>
                     </c>
                     <c ca="left">
                        <p>204</p>
                     </c>
                     <c ca="left">
                        <p>R</p>
                     </c>
                     <c ca="left">
                        <p>425</p>
                     </c>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="left">
                        <p>5,695</p>
                     </c>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>2,361</p>
                     </c>
                     <c ca="left">
                        <p>N</p>
                     </c>
                     <c ca="left">
                        <p>1,259 (1.1)</p>
                     </c>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>2,964</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>14</p>
                     </c>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>195</p>
                     </c>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>257</p>
                     </c>
                     <c ca="left">
                        <p>N</p>
                     </c>
                     <c ca="left">
                        <p>5,690 (2.4)</p>
                     </c>
                     <c ca="left">
                        <p>L</p>
                     </c>
                     <c ca="left">
                        <p>1,352</p>
                     </c>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>1,044</p>
                     </c>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="left">
                        <p>2,085</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>15</p>
                     </c>
                     <c ca="left">
                        <p>M</p>
                     </c>
                     <c ca="left">
                        <p>163</p>
                     </c>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>238</p>
                     </c>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>2,651</p>
                     </c>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>827</p>
                     </c>
                     <c ca="left">
                        <p>T</p>
                     </c>
                     <c ca="left">
                        <p>697</p>
                     </c>
                     <c ca="left">
                        <p>N</p>
                     </c>
                     <c ca="left">
                        <p>1,714 (0.8)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>16</p>
                     </c>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>94</p>
                     </c>
                     <c ca="left">
                        <p>M</p>
                     </c>
                     <c ca="left">
                        <p>217</p>
                     </c>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="left">
                        <p>1,179</p>
                     </c>
                     <c ca="left">
                        <p>M</p>
                     </c>
                     <c ca="left">
                        <p>746</p>
                     </c>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>549</p>
                     </c>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>1,433</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>17</p>
                     </c>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>90</p>
                     </c>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="left">
                        <p>127</p>
                     </c>
                     <c ca="left">
                        <p>M</p>
                     </c>
                     <c ca="left">
                        <p>915</p>
                     </c>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="left">
                        <p>692</p>
                     </c>
                     <c ca="left">
                        <p>M</p>
                     </c>
                     <c ca="left">
                        <p>287</p>
                     </c>
                     <c ca="left">
                        <p>M</p>
                     </c>
                     <c ca="left">
                        <p>1,081</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>18</p>
                     </c>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="left">
                        <p>33</p>
                     </c>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="left">
                        <p>60</p>
                     </c>
                     <c ca="left">
                        <p>I</p>
                     </c>
                     <c ca="left">
                        <p>798</p>
                     </c>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>608</p>
                     </c>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>221</p>
                     </c>
                     <c ca="left">
                        <p>I</p>
                     </c>
                     <c ca="left">
                        <p>924</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>19</p>
                     </c>
                     <c ca="left">
                        <p>W</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>I</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>667</p>
                     </c>
                     <c ca="left">
                        <p>I</p>
                     </c>
                     <c ca="left">
                        <p>404</p>
                     </c>
                     <c ca="left">
                        <p>W</p>
                     </c>
                     <c ca="left">
                        <p>162</p>
                     </c>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="left">
                        <p>617</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>20</p>
                     </c>
                     <c ca="left">
                        <p>I</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>W</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>W</p>
                     </c>
                     <c ca="left">
                        <p>147</p>
                     </c>
                     <c ca="left">
                        <p>W</p>
                     </c>
                     <c ca="left">
                        <p>42</p>
                     </c>
                     <c ca="left">
                        <p>I</p>
                     </c>
                     <c ca="left">
                        <p>16</p>
                     </c>
                     <c ca="left">
                        <p>W</p>
                     </c>
                     <c ca="left">
                        <p>541</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>37,620</p>
                     </c>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>30,627</p>
                     </c>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>240,198</p>
                     </c>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>203,662</p>
                     </c>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>114,427</p>
                     </c>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>205,182</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="13">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="13" ca="left">
                        <p><b>(c) </b>Ratios of numbers for Q-rich and N-rich biased regions compared to the ratios of their overall abundances as residues<sup>&#167;</sup></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="13">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="2" ca="left">
                        <p>Budding yeast</p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>Fission yeast</p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>Fruit fly</p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>Nematode</p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>
                           <it>Arabidopsis</it>
                        </p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>Human</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="13">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>R<sub>Q/N </sub>(total residues)</p>
                     </c>
                     <c ca="left">
                        <p>1.05</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>2.28</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>6.89</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>6.96</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>5.43</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>7.09</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>R<sub>Q/N </sub>(total regions)</p>
                     </c>
                     <c ca="left">
                        <p>1.42</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>4.25</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>5.03</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>6.75</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>5.46</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>11.73</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Q/N (composition)</p>
                     </c>
                     <c ca="left">
                        <p>(0.039/0.061) = 0.64</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>(0.038/0.052) = 0.73</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>(0.052/0.047) = 1.12</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>(0.041/0.049) = 0.83</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>(0.035/0.044) = 0.79</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>(0.047/0.037) = 1.28</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>*The total numbers of regions that have a compositionally biased LPS with P<sub>bias </sub>&lt; 1 &#215; 10<sup>-13</sup>. <sup>&#8224;</sup>The number of LPSs for a particular compositional bias in the budding yeast proteome that overlap a region assigned as coiled coil by the MULTICOIL program <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. <sup>&#8225;</sup>The total numbers of bias residues (for example, the total number of serines for a serine bias) for all of the regions tallied for part (a) of the table. <sup>&#167;</sup>R<sub>Q/N </sub>is the ratio of the number of Q-rich regions to the number of N-rich regions, as listed in parts (a) and (b) of the table. The overall abundance of each residue is simply the fraction of the total proteome that is Q or N, respectively.</p>
               </tblfn>
            </tbl>
            <p>Biases for Q and N are clearly more prevalent in the budding yeast proteome than in the other eukaryotic proteomes. Both Q and N are among the top six biases for this organism at this bias level (Table <tblr tid="T2">2</tblr>). This observation holds regardless of whether the biases are ranked by the total number of bias residues, by the total number of biased regions (Table <tblr tid="T3">3</tblr>), or by a weighted count in which the number of bias residues is multiplied by a factor derived from the amino-acid composition of the proteome (see Additional data file 1 or Supplementary Table A at <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>). In all the proteomes, N biases are less prevalent than Q biases. They are most disfavored in the human proteome, where N-rich regions are up to 12 times rarer than Q-rich regions (Table <tblr tid="T2">2c</tblr>). The small number of N-rich regions in human sequences is intriguing, and may be due to cellular toxicity of such regions in higher eukaryotes.</p>
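            <p>As an illustration of how the R<sub>Q/N </sub>(total residues) row of Table <tblr tid="T2">2c</tblr> follows from the bias-residue counts listed for Q and N in the same table, the ratio can be recomputed with a minimal Python sketch. The names <b>BIAS_RESIDUES</b> and <b>q_to_n_ratio</b> are hypothetical and introduced only for this example; the counts are copied from the table, and the fruit fly is omitted because its Q count is not reproduced in the rows shown here.</p>

```python
# Bias-residue counts for Q and N, copied from Table 2.
# Fruit fly is left out because its Q count is not among the rows shown here.
BIAS_RESIDUES = {
    "Budding yeast": {"Q": 3924, "N": 3745},
    "Fission yeast": {"Q": 1107, "N": 486},
    "Nematode": {"Q": 18126, "N": 2606},
    "Arabidopsis": {"Q": 6835, "N": 1259},
    "Human": {"Q": 12156, "N": 1714},
}


def q_to_n_ratio(q_count, n_count):
    """Return the Q/N ratio, or None when no N-bias residues were counted."""
    if n_count == 0:
        return None
    return q_count / n_count


# One ratio per proteome, keyed by the proteome name used in the table.
ratios = {
    name: q_to_n_ratio(counts["Q"], counts["N"])
    for name, counts in BIAS_RESIDUES.items()
}

for name, ratio in ratios.items():
    print(f"{name}: R_Q/N (total residues) = {ratio:.2f}")
```

            <p>Rounded to two decimal places, these ratios reproduce the tabulated values for the five proteomes included (1.05, 2.28, 6.96, 5.43 and 7.09). The same function could be applied to the region counts to obtain the R<sub>Q/N </sub>(total regions) row.</p>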
            <tbl id="T3" hint_layout="double">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>Comparison of prevalent compositionally biased regions for the whole proteome, translated intergenic DNA, known proteins, hypothetical proteins and dORFs in budding yeast</p>
               </caption>
               <tblbdy cols="6">
                  <r>
                     <c cspan="6" ca="left">
                        <p><b>(a) </b>Proteome</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2" ca="left">
                        <p>P<sub>bias </sub>&lt; 1 &#215; 10<sup>-5</sup></p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>P<sub>bias </sub>&lt; 1 &#215; 10<sup>-9</sup></p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>P<sub>bias </sub>&lt; 1 &#215; 10<sup>-13</sup></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>S</p>
                     </c>
                     <c ca="left">
                        <p>37,006</p>
                     </c>
                     <c ca="left">
                        <p>S</p>
                     </c>
                     <c ca="left">
                        <p>18,502</p>
                     </c>
                     <c ca="left">
                        <p>S</p>
                     </c>
                     <c ca="left">
                        <p>10,630</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>21,163</p>
                     </c>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>9,147</p>
                     </c>
                     <c ca="left">
                        <p>T</p>
                     </c>
                     <c ca="left">
                        <p>5,900</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>L</p>
                     </c>
                     <c ca="left">
                        <p>18,064</p>
                     </c>
                     <c ca="left">
                        <p>T</p>
                     </c>
                     <c ca="left">
                        <p>6,836</p>
                     </c>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>4,704</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>K</p>
                     </c>
                     <c ca="left">
                        <p>17,067</p>
                     </c>
                     <c ca="left">
                        <p>N</p>
                     </c>
                     <c ca="left">
                        <p>6,462 (9.3)</p>
                     </c>
                     <c ca="left">
                        <p>Q</p>
                     </c>
                     <c ca="left">
                        <p>3,924 (10.4)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>N</p>
                     </c>
                     <c ca="left">
                        <p>15,577 (7.4)</p>
                     </c>
                     <c ca="left">
                        <p>Q</p>
                     </c>
                     <c ca="left">
                        <p>5,212 (7.5)</p>
                     </c>
                     <c ca="left">
                        <p>N</p>
                     </c>
                     <c ca="left">
                        <p>3,745 (10.0)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>13,974</p>
                     </c>
                     <c ca="left">
                        <p>K</p>
                     </c>
                     <c ca="left">
                        <p>4,280</p>
                     </c>
                     <c ca="left">
                        <p>P</p>
                     </c>
                     <c ca="left">
                        <p>2,049</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>G</p>
                     </c>
                     <c ca="left">
                        <p>12,927</p>
                     </c>
                     <c ca="left">
                        <p>P</p>
                     </c>
                     <c ca="left">
                        <p>3,831</p>
                     </c>
                     <c ca="left">
                        <p>K</p>
                     </c>
                     <c ca="left">
                        <p>1,910</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="left">
                        <p>10,004</p>
                     </c>
                     <c ca="left">
                        <p>L</p>
                     </c>
                     <c ca="left">
                        <p>3,512</p>
                     </c>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="left">
                        <p>1,292</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>P</p>
                     </c>
                     <c ca="left">
                        <p>9,892</p>
                     </c>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="left">
                        <p>3,176</p>
                     </c>
                     <c ca="left">
                        <p>G</p>
                     </c>
                     <c ca="left">
                        <p>961</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>T</p>
                     </c>
                     <c ca="left">
                        <p>9,866</p>
                     </c>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>2,473</p>
                     </c>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>916</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>8,934</p>
                     </c>
                     <c ca="left">
                        <p>G</p>
                     </c>
                     <c ca="left">
                        <p>2,115</p>
                     </c>
                     <c ca="left">
                        <p>L</p>
                     </c>
                     <c ca="left">
                        <p>554</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Q</p>
                     </c>
                     <c ca="left">
                        <p>8,689 (4.1)</p>
                     </c>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>810</p>
                     </c>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>256</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>I</p>
                     </c>
                     <c ca="left">
                        <p>6,939</p>
                     </c>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>764</p>
                     </c>
                     <c ca="left">
                        <p>R</p>
                     </c>
                     <c ca="left">
                        <p>204</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>R</p>
                     </c>
                     <c ca="left">
                        <p>5,333</p>
                     </c>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>662</p>
                     </c>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>195</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>4,121</p>
                     </c>
                     <c ca="left">
                        <p>R</p>
                     </c>
                     <c ca="left">
                        <p>509</p>
                     </c>
                     <c ca="left">
                        <p>M</p>
                     </c>
                     <c ca="left">
                        <p>163</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>3,293</p>
                     </c>
                     <c ca="left">
                        <p>I</p>
                     </c>
                     <c ca="left">
                        <p>264</p>
                     </c>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>94</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="left">
                        <p>2,960</p>
                     </c>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="left">
                        <p>262</p>
                     </c>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>90</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>2,645</p>
                     </c>
                     <c ca="left">
                        <p>M</p>
                     </c>
                     <c ca="left">
                        <p>245</p>
                     </c>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="left">
                        <p>33</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>W</p>
                     </c>
                     <c ca="left">
                        <p>2,009</p>
                     </c>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>150</p>
                     </c>
                     <c ca="left">
                        <p>W</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>M</p>
                     </c>
                     <c ca="left">
                        <p>850</p>
                     </c>
                     <c ca="left">
                        <p>W</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>I</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>211,313</p>
                     </c>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>69,212</p>
                     </c>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>37,620</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="6" ca="left">
                        <p><b>(b) </b>Translated igDNA*</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2" ca="left">
                        <p>P<sub>bias </sub>&lt; 1 &#215; 10<sup>-5</sup></p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>P<sub>bias </sub>&lt; 1 &#215; 10<sup>-9</sup></p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>P<sub>bias </sub>&lt; 1 &#215; 10<sup>-13</sup></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>28,949</p>
                     </c>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>5,692</p>
                     </c>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>1,211</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>10,074</p>
                     </c>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>1,280</p>
                     </c>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>602</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>K</p>
                     </c>
                     <c ca="left">
                        <p>7,800</p>
                     </c>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>908</p>
                     </c>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>490</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>R</p>
                     </c>
                     <c ca="left">
                        <p>7,551</p>
                     </c>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>814</p>
                     </c>
                     <c ca="left">
                        <p>T</p>
                     </c>
                     <c ca="left">
                        <p>448</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="left">
                        <p>6,450</p>
                     </c>
                     <c ca="left">
                        <p>K</p>
                     </c>
                     <c ca="left">
                        <p>753</p>
                     </c>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>377</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>L</p>
                     </c>
                     <c ca="left">
                        <p>6,283</p>
                     </c>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="left">
                        <p>690</p>
                     </c>
                     <c ca="left">
                        <p>L</p>
                     </c>
                     <c ca="left">
                        <p>366</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>I</p>
                     </c>
                     <c ca="left">
                        <p>3,789</p>
                     </c>
                     <c ca="left">
                        <p>T</p>
                     </c>
                     <c ca="left">
                        <p>681</p>
                     </c>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="left">
                        <p>282</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>3,157</p>
                     </c>
                     <c ca="left">
                        <p>P</p>
                     </c>
                     <c ca="left">
                        <p>675</p>
                     </c>
                     <c ca="left">
                        <p>P</p>
                     </c>
                     <c ca="left">
                        <p>243</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>P</p>
                     </c>
                     <c ca="left">
                        <p>1,650</p>
                     </c>
                     <c ca="left">
                        <p>R</p>
                     </c>
                     <c ca="left">
                        <p>594</p>
                     </c>
                     <c ca="left">
                        <p>S</p>
                     </c>
                     <c ca="left">
                        <p>222</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>S</p>
                     </c>
                     <c ca="left">
                        <p>1,613</p>
                     </c>
                     <c ca="left">
                        <p>L</p>
                     </c>
                     <c ca="left">
                        <p>576</p>
                     </c>
                     <c ca="left">
                        <p>K</p>
                     </c>
                     <c ca="left">
                        <p>186</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>1,566</p>
                     </c>
                     <c ca="left">
                        <p>S</p>
                     </c>
                     <c ca="left">
                        <p>380</p>
                     </c>
                     <c ca="left">
                        <p>I</p>
                     </c>
                     <c ca="left">
                        <p>185</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>T</p>
                     </c>
                     <c ca="left">
                        <p>1,299</p>
                     </c>
                     <c ca="left">
                        <p>G</p>
                     </c>
                     <c ca="left">
                        <p>380</p>
                     </c>
                     <c ca="left">
                        <p>R</p>
                     </c>
                     <c ca="left">
                        <p>178</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>G</p>
                     </c>
                     <c ca="left">
                        <p>1,136</p>
                     </c>
                     <c ca="left">
                        <p>I</p>
                     </c>
                     <c ca="left">
                        <p>353</p>
                     </c>
                     <c ca="left">
                        <p>N</p>
                     </c>
                     <c ca="left">
                        <p>173 (3.2)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>N</p>
                     </c>
                     <c ca="left">
                        <p>798 (0.9)</p>
                     </c>
                     <c ca="left">
                        <p>W</p>
                     </c>
                     <c ca="left">
                        <p>299</p>
                     </c>
                     <c ca="left">
                        <p>G</p>
                     </c>
                     <c ca="left">
                        <p>166</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>W</p>
                     </c>
                     <c ca="left">
                        <p>746</p>
                     </c>
                     <c ca="left">
                        <p>N</p>
                     </c>
                     <c ca="left">
                        <p>242 (1.7)</p>
                     </c>
                     <c ca="left">
                        <p>W</p>
                     </c>
                     <c ca="left">
                        <p>98</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Q</p>
                     </c>
                     <c ca="left">
                        <p>498 (0.6)</p>
                     </c>
                     <c ca="left">
                        <p>Q</p>
                     </c>
                     <c ca="left">
                        <p>125 (0.9)</p>
                     </c>
                     <c ca="left">
                        <p>Q</p>
                     </c>
                     <c ca="left">
                        <p>51 (1.0)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>M</p>
                     </c>
                     <c ca="left">
                        <p>282</p>
                     </c>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>85</p>
                     </c>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>39</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>268</p>
                     </c>
                     <c ca="left">
                        <p>M</p>
                     </c>
                     <c ca="left">
                        <p>26</p>
                     </c>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="left">
                        <p>16</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>241</p>
                     </c>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="left">
                        <p>16</p>
                     </c>
                     <c ca="left">
                        <p>M</p>
                     </c>
                     <c ca="left">
                        <p>15</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="left">
                        <p>16</p>
                     </c>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>84,166</p>
                     </c>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>14,569</p>
                     </c>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>5,348</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="6" ca="left">
                        <p><b>(c) </b>Known yeast proteins<sup>&#8224;</sup></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2" ca="left">
                        <p>P<sub>bias </sub>&lt; 1 &#215; 10<sup>-5</sup></p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>P<sub>bias </sub>&lt; 1 &#215; 10<sup>-9</sup></p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>P<sub>bias </sub>&lt; 1 &#215; 10<sup>-13</sup></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>S</p>
                     </c>
                     <c ca="left">
                        <p>27,539</p>
                     </c>
                     <c ca="left">
                        <p>S</p>
                     </c>
                     <c ca="left">
                        <p>15,328</p>
                     </c>
                     <c ca="left">
                        <p>S</p>
                     </c>
                     <c ca="left">
                        <p>9,819</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>17,519</p>
                     </c>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>8,074</p>
                     </c>
                     <c ca="left">
                        <p>T</p>
                     </c>
                     <c ca="left">
                        <p>5,900</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>L</p>
                     </c>
                     <c ca="left">
                        <p>13,928</p>
                     </c>
                     <c ca="left">
                        <p>N</p>
                     </c>
                     <c ca="left">
                        <p>5,716 (9.9)</p>
                     </c>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>4,289</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>K</p>
                     </c>
                     <c ca="left">
                        <p>13,785</p>
                     </c>
                     <c ca="left">
                        <p>T</p>
                     </c>
                     <c ca="left">
                        <p>5,413</p>
                     </c>
                     <c ca="left">
                        <p>N</p>
                     </c>
                     <c ca="left">
                        <p>3,551 (11.9)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>N</p>
                     </c>
                     <c ca="left">
                        <p>12,854 (7.7)</p>
                     </c>
                     <c ca="left">
                        <p>Q</p>
                     </c>
                     <c ca="left">
                        <p>4,520 (7.8)</p>
                     </c>
                     <c ca="left">
                        <p>Q</p>
                     </c>
                     <c ca="left">
                        <p>3,348 (11.3)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>12,482</p>
                     </c>
                     <c ca="left">
                        <p>K</p>
                     </c>
                     <c ca="left">
                        <p>3,653</p>
                     </c>
                     <c ca="left">
                        <p>K</p>
                     </c>
                     <c ca="left">
                        <p>1,723</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>G</p>
                     </c>
                     <c ca="left">
                        <p>11,783</p>
                     </c>
                     <c ca="left">
                        <p>L</p>
                     </c>
                     <c ca="left">
                        <p>2,864</p>
                     </c>
                     <c ca="left">
                        <p>P</p>
                     </c>
                     <c ca="left">
                        <p>1,669</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Q</p>
                     </c>
                     <c ca="left">
                        <p>7,299 (4.4)</p>
                     </c>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>2,434</p>
                     </c>
                     <c ca="left">
                        <p>G</p>
                     </c>
                     <c ca="left">
                        <p>899</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>P</p>
                     </c>
                     <c ca="left">
                        <p>7,045</p>
                     </c>
                     <c ca="left">
                        <p>G</p>
                     </c>
                     <c ca="left">
                        <p>1,969</p>
                     </c>
                     <c ca="left">
                        <p>L</p>
                     </c>
                     <c ca="left">
                        <p>451</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>6,154</p>
                     </c>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>608</p>
                     </c>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>207</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>I</p>
                     </c>
                     <c ca="left">
                        <p>5,495</p>
                     </c>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>530</p>
                     </c>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>162</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>R</p>
                     </c>
                     <c ca="left">
                        <p>3,973</p>
                     </c>
                     <c ca="left">
                        <p>R</p>
                     </c>
                     <c ca="left">
                        <p>447</p>
                     </c>
                     <c ca="left">
                        <p>R</p>
                     </c>
                     <c ca="left">
                        <p>155</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>3,415</p>
                     </c>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>443</p>
                     </c>
                     <c ca="left">
                        <p>M</p>
                     </c>
                     <c ca="left">
                        <p>113</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>2,400</p>
                     </c>
                     <c ca="left">
                        <p>I</p>
                     </c>
                     <c ca="left">
                        <p>264</p>
                     </c>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>78</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="left">
                        <p>2,158</p>
                     </c>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="left">
                        <p>218</p>
                     </c>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>1,536</p>
                     </c>
                     <c ca="left">
                        <p>M</p>
                     </c>
                     <c ca="left">
                        <p>195</p>
                     </c>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>W</p>
                     </c>
                     <c ca="left">
                        <p>1,484</p>
                     </c>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>60</p>
                     </c>
                     <c ca="left">
                        <p>W</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>M</p>
                     </c>
                     <c ca="left">
                        <p>656</p>
                     </c>
                     <c ca="left">
                        <p>W</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>I</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>166,920 (13)</p>
                     </c>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>57,938 (38)</p>
                     </c>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>33,070 (19)</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="6" ca="left">
                        <p><b>(d) </b>Hypothetical yeast proteins<sup>&#8224;</sup></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2" ca="left">
                        <p>P<sub>bias </sub>&lt; 1 &#215; 10<sup>-5</sup></p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>P<sub>bias </sub>&lt; 1 &#215; 10<sup>-9</sup></p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>P<sub>bias </sub>&lt; 1 &#215; 10<sup>-13</sup></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>S</p>
                     </c>
                     <c ca="left">
                        <p>8,621</p>
                     </c>
                     <c ca="left">
                        <p>S</p>
                     </c>
                     <c ca="left">
                        <p>2,958</p>
                     </c>
                     <c ca="left">
                        <p>T</p>
                     </c>
                     <c ca="left">
                        <p>1,240</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>L</p>
                     </c>
                     <c ca="left">
                        <p>3,905</p>
                     </c>
                     <c ca="left">
                        <p>T</p>
                     </c>
                     <c ca="left">
                        <p>1,423</p>
                     </c>
                     <c ca="left">
                        <p>S</p>
                     </c>
                     <c ca="left">
                        <p>772</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>3,630</p>
                     </c>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>1,073</p>
                     </c>
                     <c ca="left">
                        <p>Q</p>
                     </c>
                     <c ca="left">
                        <p>576 (13.7)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>K</p>
                     </c>
                     <c ca="left">
                        <p>3,043</p>
                     </c>
                     <c ca="left">
                        <p>Q</p>
                     </c>
                     <c ca="left">
                        <p>680 (6.8)</p>
                     </c>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>415</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>2,747</p>
                     </c>
                     <c ca="left">
                        <p>N</p>
                     </c>
                     <c ca="left">
                        <p>664 (6.6)</p>
                     </c>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="left">
                        <p>262</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>N</p>
                     </c>
                     <c ca="left">
                        <p>2,506 (6.4)</p>
                     </c>
                     <c ca="left">
                        <p>K</p>
                     </c>
                     <c ca="left">
                        <p>602</p>
                     </c>
                     <c ca="left">
                        <p>N</p>
                     </c>
                     <c ca="left">
                        <p>194 (4.6)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>T</p>
                     </c>
                     <c ca="left">
                        <p>2,050</p>
                     </c>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="left">
                        <p>600</p>
                     </c>
                     <c ca="left">
                        <p>K</p>
                     </c>
                     <c ca="left">
                        <p>187</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="left">
                        <p>1,934</p>
                     </c>
                     <c ca="left">
                        <p>L</p>
                     </c>
                     <c ca="left">
                        <p>595</p>
                     </c>
                     <c ca="left">
                        <p>P</p>
                     </c>
                     <c ca="left">
                        <p>170</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>P</p>
                     </c>
                     <c ca="left">
                        <p>1,883</p>
                     </c>
                     <c ca="left">
                        <p>P</p>
                     </c>
                     <c ca="left">
                        <p>453</p>
                     </c>
                     <c ca="left">
                        <p>G</p>
                     </c>
                     <c ca="left">
                        <p>62</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>1,386</p>
                     </c>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>321</p>
                     </c>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>55</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>I</p>
                     </c>
                     <c ca="left">
                        <p>1,267</p>
                     </c>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>202</p>
                     </c>
                     <c ca="left">
                        <p>M</p>
                     </c>
                     <c ca="left">
                        <p>50</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>R</p>
                     </c>
                     <c ca="left">
                        <p>1,264</p>
                     </c>
                     <c ca="left">
                        <p>G</p>
                     </c>
                     <c ca="left">
                        <p>146</p>
                     </c>
                     <c ca="left">
                        <p>L</p>
                     </c>
                     <c ca="left">
                        <p>50</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Q</p>
                     </c>
                     <c ca="left">
                        <p>1,171 (3.0)</p>
                     </c>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>106</p>
                     </c>
                     <c ca="left">
                        <p>R</p>
                     </c>
                     <c ca="left">
                        <p>49</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>G</p>
                     </c>
                     <c ca="left">
                        <p>882</p>
                     </c>
                     <c ca="left">
                        <p>R</p>
                     </c>
                     <c ca="left">
                        <p>62</p>
                     </c>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>49</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>863</p>
                     </c>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>55</p>
                     </c>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="left">
                        <p>33</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>528</p>
                     </c>
                     <c ca="left">
                        <p>M</p>
                     </c>
                     <c ca="left">
                        <p>50</p>
                     </c>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>33</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>W</p>
                     </c>
                     <c ca="left">
                        <p>514</p>
                     </c>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="left">
                        <p>44</p>
                     </c>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>16</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="left">
                        <p>512</p>
                     </c>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>14</p>
                     </c>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>389</p>
                     </c>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>W</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>M</p>
                     </c>
                     <c ca="left">
                        <p>179</p>
                     </c>
                     <c ca="left">
                        <p>W</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>I</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>39,274 (16)</p>
                     </c>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>10,048 (221)</p>
                     </c>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>4,213 (150)</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="6" ca="left">
                        <p><b>(e) </b>dORFs</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2" ca="left">
                        <p>P<sub>bias </sub>&lt; 1 &#215; 10<sup>-5</sup></p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>P<sub>bias </sub>&lt; 1 &#215; 10<sup>-9</sup></p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>P<sub>bias </sub>&lt; 1 &#215; 10<sup>-13</sup></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>R</p>
                     </c>
                     <c ca="left">
                        <p>459</p>
                     </c>
                     <c ca="left">
                        <p>R</p>
                     </c>
                     <c ca="left">
                        <p>254</p>
                     </c>
                     <c ca="left">
                        <p>R</p>
                     </c>
                     <c ca="left">
                        <p>254</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>307</p>
                     </c>
                     <c ca="left">
                        <p>L</p>
                     </c>
                     <c ca="left">
                        <p>204</p>
                     </c>
                     <c ca="left">
                        <p>L</p>
                     </c>
                     <c ca="left">
                        <p>204</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>S</p>
                     </c>
                     <c ca="left">
                        <p>288</p>
                     </c>
                     <c ca="left">
                        <p>T</p>
                     </c>
                     <c ca="left">
                        <p>138</p>
                     </c>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>122</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>G</p>
                     </c>
                     <c ca="left">
                        <p>271</p>
                     </c>
                     <c ca="left">
                        <p>Q</p>
                     </c>
                     <c ca="left">
                        <p>129 (11.0)</p>
                     </c>
                     <c ca="left">
                        <p>T</p>
                     </c>
                     <c ca="left">
                        <p>120</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>L</p>
                     </c>
                     <c ca="left">
                        <p>248</p>
                     </c>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>122</p>
                     </c>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>99</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Q</p>
                     </c>
                     <c ca="left">
                        <p>225 (6.8)</p>
                     </c>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>99</p>
                     </c>
                     <c ca="left">
                        <p>Q</p>
                     </c>
                     <c ca="left">
                        <p>74 (8.3)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>T</p>
                     </c>
                     <c ca="left">
                        <p>208</p>
                     </c>
                     <c ca="left">
                        <p>S</p>
                     </c>
                     <c ca="left">
                        <p>82</p>
                     </c>
                     <c ca="left">
                        <p>N</p>
                     </c>
                     <c ca="left">
                        <p>23 (2.6)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>N</p>
                     </c>
                     <c ca="left">
                        <p>172 (5.2)</p>
                     </c>
                     <c ca="left">
                        <p>P</p>
                     </c>
                     <c ca="left">
                        <p>72</p>
                     </c>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>168</p>
                     </c>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="left">
                        <p>50</p>
                     </c>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>163</p>
                     </c>
                     <c ca="left">
                        <p>N</p>
                     </c>
                     <c ca="left">
                        <p>23 (2.0)</p>
                     </c>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>151</p>
                     </c>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>149</p>
                     </c>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>G</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="left">
                        <p>111</p>
                     </c>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>I</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>I</p>
                     </c>
                     <c ca="left">
                        <p>98</p>
                     </c>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>K</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>P</p>
                     </c>
                     <c ca="left">
                        <p>84</p>
                     </c>
                     <c ca="left">
                        <p>G</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>P</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="left">
                        <p>67</p>
                     </c>
                     <c ca="left">
                        <p>I</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>S</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>45</p>
                     </c>
                     <c ca="left">
                        <p>K</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>K</p>
                     </c>
                     <c ca="left">
                        <p>37</p>
                     </c>
                     <c ca="left">
                        <p>M</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>W</p>
                     </c>
                     <c ca="left">
                        <p>23</p>
                     </c>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>W</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>M</p>
                     </c>
                     <c ca="left">
                        <p>14</p>
                     </c>
                     <c ca="left">
                        <p>W</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>M</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>3,288</p>
                     </c>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>1,173</p>
                     </c>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>896</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>*Translated igDNA: intergenic DNA (igDNA) conceptually translated in all six reading frames. For the analysis of intergenic DNA in budding yeast, we used the 'Not Feature' file of sequences in FASTA format distributed by SGD (this contains all genomic DNA that does not overlap an annotated feature <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>). The amino-acid compositional biases of the translated sequences were tallied as for the annotated budding-yeast proteome. A dORF is an open reading frame that is disrupted by one or more frameshifts or premature stop codons, and which is likely to be a pseudogene. A data set of dORFs has been derived previously for the budding-yeast genome <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. <sup>&#8224;</sup>In the totals for known and hypothetical proteins, the number of bias residues per residue of protein is given in parentheses.</p>
               </tblfn>
            </tbl>
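            <p>The conceptual six-frame translation described in the footnote to Table <tblr tid="T3">3</tblr> can be sketched as follows. This is a minimal illustration that assumes the standard genetic code; the function names are hypothetical and are not taken from the software used in this study.</p>

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: no
# alternative code is needed for nuclear yeast DNA).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate one frame; codons with unknown bases (e.g. N) become 'X'."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translation(dna):
    """Return three forward-strand frames followed by three reverse-strand frames."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]
    return [translate(strand[offset:])
            for strand in (dna, reverse)
            for offset in range(3)]
```

            <p>For example, <it>six_frame_translation("ATGCAGAAC")</it> gives MQN for the first forward frame. Stop codons are kept as '*' so that each frame can be split into segments before compositional biases are tallied.</p>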
            <p>Interestingly, as noted in Table <tblr tid="T2">2</tblr>, eight predicted coiled-coil domains <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> are in our list of (Q+N)-rich domains. Coiled coils are alpha-helical, whereas the prions form beta-sheet-rich aggregates, so this overlap may be an artifact of the coiled-coil prediction program <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. There are, however, some known viral coiled coils that have short runs of up to five Q residues and a mild overall Q bias over their whole sequence <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>.</p>
            <p>The prevalent biases in the budding-yeast proteome were broken down into those for hypothetical and known proteins, and compared at three bias levels (Table <tblr tid="T3">3</tblr>). Known proteins are those in open reading frame (ORF) classes 1 through 3 in the MIPS database <abbrgrp><abbr bid="B24">24</abbr></abbrgrp> (these are either characterized proteins or sequences that have homology to a characterized sequence). 'Hypothetical' proteins are the remaining annotations (ORF classes 4 through 6 in the MIPS database). There is little difference in the rankings of biases among the whole proteome, the set of known proteins and the set of hypothetical proteins (Table <tblr tid="T3">3a,c,d</tblr>). Surprisingly, however, the total amounts of biased regions are substantially higher for known proteins (Table <tblr tid="T3">3</tblr>); for example, at P<sub>bias </sub>&lt; 1 &#215; 10<sup>-9</sup>, they are eight times as prevalent. Q and N biases both remain high-ranking in the 'known' and 'hypothetical' protein lists, but rank low for conceptually translated igDNA (Table <tblr tid="T3">3a,b,e</tblr>). In general, the prevalent biases observed for conceptually translated igDNA are very different from those for the annotated proteome (Table <tblr tid="T3">3a,b,e</tblr>). Notably, there is also very little implied bias for negatively charged residues (aspartic acid (D) and glutamic acid (E) combined), relative to positively charged residues (lysine (K) and arginine (R) combined), in the translated igDNA biases. This suggests that negatively charged biased regions in protein-coding sequences would take longer to evolve, or need much greater selective pressure, than positively charged biased regions, and that underlying replication 'slippage' tendencies <abbrgrp><abbr bid="B25">25</abbr></abbrgrp> and mutation biases for the formation of cryptically simple sequences <abbrgrp><abbr bid="B26">26</abbr></abbrgrp> may disfavor such regions.</p>
         </sec>
         <sec>
            <st>
               <p>(Q+N)-rich domains</p>
            </st>
            <p>We derived a list of (Q+N)-rich domains using Q, N, and {Q+N} compositional bias in combination (Table <tblr tid="T4">4</tblr>; see footnotes for details of the P<sub>bias </sub>thresholds used). Where any of the three LPSs overlap substantially, the longest LPS was chosen to define the domain (a threshold of 15 residues was found to be suitable). There are up to approximately 170 such (Q+N)-rich domains in budding yeast. Most strikingly, (Q+N)-rich domains are relatively rare in fission yeast and comparatively numerous in fruit fly (Table <tblr tid="T4">4</tblr>). The four known budding-yeast prions have biases against the major hydrophobics {VILM} and charged residues {DERK} (Table <tblr tid="T1">1</tblr>). When these negative biases are accounted for, the number of (Q+N)-rich domains in budding yeast falls by about half, to around 100 (Table <tblr tid="T4">4</tblr>). This may be due to selection against amyloidogenesis mechanisms, where such bias is used for a different reason (perhaps in some cases as part of a coiled coil, see above). Subsidiary biases for glycine (G), tyrosine (Y) and serine (S) occur for three of the four yeast prions (Table <tblr tid="T1">1</tblr>). When these are accounted for, a substantial number (30) still remains (Table <tblr tid="T4">4</tblr>). The thresholds used in Table <tblr tid="T4">4</tblr> are derived from the highest P<sub>bias </sub>values for the LPSs of any yeast prion sequence (rounded up to two significant figures) (Table <tblr tid="T4">4</tblr>). These observations on subsidiary biases demonstrate the diversity of (Q+N)-rich domains in eukaryotes, showing that about half of them have other biases that are predicted to be incompatible with prion-like amyloidogenesis mechanisms (Table <tblr tid="T4">4</tblr>).</p>
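            <p>The rule for combining overlapping Q, N and {Q+N} LPSs into a single domain can be sketched as follows. This is a minimal illustration, not the implementation used in this study. It assumes that each LPS is given as an inclusive (start, end) residue interval within one protein, and that 'overlap substantially' means sharing at least 15 residues. The function names are hypothetical.</p>

```python
def overlap(a, b):
    """Number of residues shared by two inclusive (start, end) intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def define_domains(lps_list, min_overlap=15):
    """Visit LPSs longest first and drop any that substantially overlaps a kept one."""
    kept = []
    by_length = sorted(lps_list, key=lambda s: s[1] - s[0], reverse=True)
    for lps in by_length:
        # Keep the LPS only if its overlap with every kept LPS is below the threshold.
        if all(min_overlap > overlap(lps, k) for k in kept):
            kept.append(lps)
    return sorted(kept)
```

            <p>With the hypothetical intervals (10, 80), (20, 60), (70, 100) and (200, 240), the second interval shares 41 residues with the first and is absorbed into it, whereas the third shares only 11 residues and is kept as a separate domain.</p>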
            <tbl id="T4" hint_layout="single">
               <title>
                  <p>Table 4</p>
               </title>
               <caption>
                  <p>Numbers of (Q+N)-rich domains for the six proteomes</p>
               </caption>
               <tblbdy cols="8">
                  <r>
                     <c cspan="2" ca="left">
                        <p>Category</p>
                     </c>
                     <c ca="left">
                        <p>Budding yeast</p>
                     </c>
                     <c ca="left">
                        <p>Fission yeast</p>
                     </c>
                     <c ca="left">
                        <p>Fruit fly</p>
                     </c>
                     <c ca="left">
                        <p>Nematode</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Arabidopsis</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Human</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="8">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>(Q+N)-rich domains according to a Q, N or Q+N bias*</p>
                     </c>
                     <c ca="left">
                        <p>172</p>
                     </c>
                     <c ca="left">
                        <p>22</p>
                     </c>
                     <c ca="left">
                        <p>853</p>
                     </c>
                     <c ca="left">
                        <p>315</p>
                     </c>
                     <c ca="left">
                        <p>213</p>
                     </c>
                     <c ca="left">
                        <p>194</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>(1) plus filter for charged and hydrophobic residues<sup>&#8224;</sup></p>
                     </c>
                     <c ca="left">
                        <p>96</p>
                     </c>
                     <c ca="left">
                        <p>14</p>
                     </c>
                     <c ca="left">
                        <p>473</p>
                     </c>
                     <c ca="left">
                        <p>216</p>
                     </c>
                     <c ca="left">
                        <p>125</p>
                     </c>
                     <c ca="left">
                        <p>69</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>3</p>
                     </c>
                     <c ca="left">
                        <p>(2) plus requirement for a subsidiary bias for G, Y or S<sup>&#8225;</sup></p>
                     </c>
                     <c ca="left">
                        <p>31</p>
                     </c>
                     <c ca="left">
                        <p>7</p>
                     </c>
                     <c ca="left">
                        <p>86</p>
                     </c>
                     <c ca="left">
                        <p>80</p>
                     </c>
                     <c ca="left">
                        <p>35</p>
                     </c>
                     <c ca="left">
                        <p>21</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>*The total number of (Q+N)-rich domains. These are all LPSs that have a Q or N bias with P<sub>bias </sub>&lt; 1 &#215; 10<sup>-13</sup> or a {QN} bias with P<sub>bias </sub>&lt; 1.8 &#215; 10<sup>-14</sup>. <sup>&#8224;</sup>A filter is used so that only LPSs that have a subsidiary bias against {DERK} with a P<sub>bias </sub>&lt; 6.5 &#215; 10<sup>-3</sup> and against {VILM} with a P<sub>bias </sub>&lt; 2 &#215; 10<sup>-2</sup> are considered. <sup>&#8225;</sup>A filter is used so that only LPSs that have a subsidiary bias for one of the residues G, Y or S with a P<sub>bias </sub>&lt; 5 &#215; 10<sup>-4</sup> are considered.</p>
               </tblfn>
            </tbl>
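            <p>The three nested categories of Table <tblr tid="T4">4</tblr> can be expressed as a cascade of threshold tests, sketched below. The thresholds are those given in the table footnotes. The dictionary layout and key names are hypothetical, and a missing entry is assumed to mean that no bias was detected (P = 1).</p>

```python
def passes(domain, key, cutoff):
    """True if the P_bias stored under `key` is below `cutoff` (missing means 1.0)."""
    return cutoff > domain.get(key, 1.0)


def table4_category(domain):
    """Return the highest Table 4 category (1-3) a domain satisfies, or 0 if none."""
    rich = (passes(domain, "Q", 1e-13)
            or passes(domain, "N", 1e-13)
            or passes(domain, "QN", 1.8e-14))
    if not rich:
        return 0
    # Category 2 requires biases against both charged and major hydrophobic residues.
    depleted = (passes(domain, "against_DERK", 6.5e-3)
                and passes(domain, "against_VILM", 2e-2))
    if not depleted:
        return 1
    # Category 3 additionally requires a subsidiary bias for G, Y or S.
    subsidiary = any(passes(domain, residue, 5e-4) for residue in "GYS")
    return 3 if subsidiary else 2
```

            <p>Because each category is a subset of the previous one, counting the domains whose returned value is at least 1, at least 2 and equal to 3 gives the three rows of the table.</p>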
            <p>[PIN+] is a non-Mendelian inherited trait required for the <it>de novo </it>appearance of the [PSI+] prion in budding yeast <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>. A recent study derived a list of nine candidate genes responsible for the [PIN+] phenomenon <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>. Seven of these nine are found in the (Q+N)-rich domain list here. Of the other two, one (<it>PIN2</it>, YOR104W) has a notable subsidiary bias for Y (11 in 51 residues, P<sub>bias </sub>= 9.0 &#215; 10<sup>-7</sup>), and the other (<it>STE18</it>, YJR086W) has a very short Q-rich region (12 in 25 residues, P<sub>bias </sub>= 3.9 &#215; 10<sup>-11</sup>).</p>
            <p>To characterize the (Q+N)-rich domains further, we examined their lengths (Figure <figr fid="F1">1</figr>) and the prevalent Gene Ontology (GO) annotations <abbrgrp><abbr bid="B27">27</abbr></abbrgrp> of the proteins that contain them (Table <tblr tid="T5">5</tblr>), focusing on budding yeast, fruit fly and human. The GO annotations can be considered as 'keywords' that give an indication of the biological role of the (Q+N)-rich domains (Table <tblr tid="T5">5</tblr>). The distribution of lengths for the regions with (Q+N) bias varies markedly from organism to organism, with human having the largest proportion of very long regions with (Q+N) bias (44% longer than 275 residues; see Figure <figr fid="F1">1</figr> legend). The fly (Q+N)-rich regions tend to be short, like those in budding yeast (see Figure <figr fid="F1">1</figr> legend). A large proportion of the fly regions (around 18%) localize to the nucleus, with some of these appearing to be related to transcription (Table <tblr tid="T5">5</tblr>). In budding yeast, the distribution of GO compartment annotations for proteins with (Q+N)-rich domains shows that these sequences occur most often in the nucleus (23 annotations), in preference to the cytoplasm (16) and the plasma membrane (9). Those that are placed in the nucleus tend to be transcription factors (see function categories in Table <tblr tid="T5">5</tblr>). Along with transcription, the preferred processes for proteins with (Q+N)-rich domains are 'endocytosis', 'pseudohyphal growth' and 'nuclear pore organization'.</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Histogram of the lengths of the (Q+N)-rich domains for budding yeast, fruit fly and human</p>
               </caption>
               <text>
                  <p>Histogram of the lengths of the (Q+N)-rich domains for budding yeast, fruit fly and human. The distributions of sequence lengths for the (Q+N)-rich domains are shown for budding yeast (top panel), fruit fly (middle panel) and human (bottom panel). The <it>y</it>-axis is the number of regions per bin, and the <it>x</it>-axis shows bins with labels <it>x </it>such that each bin contains all sequences with length <it>x </it>to <it>x </it>+ 24 inclusive. The mean and median lengths for each of these distributions are as follows (organism, mean (&#177; SD), median): budding yeast, 209 &#177; 209, 116; fruit fly, 236 &#177; 389, 89; human, 553 &#177; 730, 268. Only the distributions up to bin <it>x </it>= 275 are shown; a sizeable proportion of each distribution is longer than 275 residues (budding yeast 30% of sequences, fruit fly 22% and human 44%).</p>
               </text>
               <graphic file="gb-2003-4-6-r40-1"/>
            </fig>
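            <p>The binning used in Figure <figr fid="F1">1</figr> (each bin labeled <it>x </it>holds lengths <it>x </it>to <it>x </it>+ 24 inclusive) can be sketched as follows. This is an illustration only: it assumes that the first bin starts at 0, and the function names and the example lengths in the note below are hypothetical.</p>

```python
from collections import Counter

BIN_WIDTH = 25  # a bin labeled x covers lengths x to x + 24 inclusive


def bin_label(length):
    """Label of the bin that contains a domain of the given length."""
    return (length // BIN_WIDTH) * BIN_WIDTH


def length_histogram(lengths):
    """Count domains per bin, keyed by bin label."""
    return Counter(bin_label(n) for n in lengths)


def fraction_longer_than(lengths, cutoff=275):
    """Fraction of domains strictly longer than `cutoff` residues."""
    return sum(1 for n in lengths if n > cutoff) / len(lengths)
```

            <p>For the hypothetical lengths 30, 40, 116, 268, 300 and 900, two domains fall in the bin labeled 25 and one third of the domains are longer than 275 residues.</p>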
            <tbl id="T5" hint_layout="single">
               <title>
                  <p>Table 5</p>
               </title>
               <caption>
                  <p>Functional categories for the (Q+N)-rich domains for budding yeast, fruit fly and human</p>
               </caption>
               <tblbdy cols="3">
                  <r>
                     <c ca="left">
                        <p>Organism</p>
                     </c>
                     <c ca="left">
                        <p>GO ontology</p>
                     </c>
                     <c ca="left">
                        <p>Five most frequent category annotations*</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Budding yeast</p>
                     </c>
                     <c ca="left">
                        <p>Component</p>
                     </c>
                     <c ca="left">
                        <p>Nucleus GO:0005634 (23), Cytoplasm GO:0005737 (16), cellular_component_unknown GO:0008372 (14), Plasma_membrane GO:0005886 (9), actin_cortical_patch GO:0005857 (8), nuclear pore GO:0005643 (6)</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Function</p>
                     </c>
                     <c ca="left">
                        <p>Molecular_function_unknown GO:0005554 (59), transcription_factor GO:0003700 (19), cytoskeletal_adaptor GO:0008093 (7), general transcriptional repressor GO:0016565 (6), general_RNA_polymerase_II_transcription_factor GO:0016251 (6), structural molecule GO:0005198 (6)</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Process</p>
                     </c>
                     <c ca="left">
                        <p>Biological_process_unknown GO:0000004 (52), endocytosis GO:0006897 (10), pseudohyphal_growth GO:0007124 (9), transcription GO:0006350 (9), nuclear pore organization GO:0006999 (8), protein amino acid phosphorylation GO:0006468 (7), regulation of cell cycle GO:0000074 (7)</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Fly</p>
                     </c>
                     <c ca="left">
                        <p>Component</p>
                     </c>
                     <c ca="left">
                        <p>Nucleus GO:0005634 (157), TFIID_complex GO:0005669 (13), plasma_membrane GO:0005886 (19), cytoplasm GO:0005737 (23), microtubule_associated_protein GO:0005875 (9)</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Function</p>
                     </c>
                     <c ca="left">
                        <p>RNA_polymerase_II_transcription_factor GO:0003702 (52), transcription_factor GO:0003700 (39), specific_RNA_polymerase_II_transcription_factor GO:0003704 (36), RNA binding GO:0003723 (30), general RNA polymerase II transcription factor GO:0016251 (17)</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Process</p>
                     </c>
                     <c ca="left">
                        <p>Notch receptor signaling pathway GO:0007219 (18), protein amino acid phosphorylation GO:0006468 (18), transcription initiation GO:0006367 (13), gene silencing GO:0016458 (9), neuroblast determination GO:0004725 (9)</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Human</p>
                     </c>
                     <c ca="left">
                        <p>Component</p>
                     </c>
                     <c ca="left">
                        <p>Nucleus GO:0005634 (52), integral membrane protein GO:0016021 (9), extracellular space GO:0005615 (9), plasma membrane GO:0005887 (7), cytoskeleton GO:0005856 (7)</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Function</p>
                     </c>
                     <c ca="left">
                        <p>Transcription_factor GO:0003700 (22), DNA binding GO:0003677 (20), calcium ion binding GO:0005509 (11), ATP binding GO:0005524 (10), transcription coactivator GO:0003713 (10)</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Process</p>
                     </c>
                     <c ca="left">
                        <p>Regulation of transcription GO:0006355 (34), signal transduction GO:0007165 (15), protein amino acid phosphorylation GO:0006468 (7), transcription from PolII promoter GO:0006366 (7), oncogenesis GO:0005198 (7)</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>*A description of each GO category is followed by its identifier in the ontology and, in parentheses, the total number of such designations found.</p>
               </tblfn>
            </tbl>
         </sec>
         <sec>
            <st>
               <p>Some overarching perspectives on biased regions</p>
            </st>
            <p>To put our case study of Q- or N-rich regions in a general context, we discuss some overarching perspectives on compositional bias in the eukaryotic proteomes. We examined the behavior of all 20 single-residue biases as a function of decreasing P<sub>bias </sub>in the proteomes of budding yeast, fission yeast, nematode worm, fruit fly, <it>Arabidopsis </it>and human. The curves for seven selected residues are shown for the budding yeast, fission yeast, fruit fly and human proteomes (Figure <figr fid="F2">2</figr>). Each eukaryotic proteome has a characteristic profile of bias proportions (Figure <figr fid="F2">2</figr>). For budding yeast, serine (S) is an abundant bias regardless of the P<sub>bias </sub>threshold (Figure <figr fid="F2">2</figr>). For lower P<sub>bias </sub>values, less than 1 &#215; 10<sup>-15</sup>, these biases arise mainly from serine-rich mannoproteins that are involved in the cell wall (for example, FLO8 <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>). N and Q, however, are prevalent biases only for the lowest P<sub>bias </sub>levels (P<sub>bias </sub>&#8804; 1 &#215; 10<sup>-13</sup>). In all the proteomes, biases for individual hydrophobic residues (for example, isoleucine (I) and leucine (L)) fall off at much milder levels of probability, although less so for leucine because of its involvement in coiled-coil regions (Figure <figr fid="F2">2</figr>). There are no I or tryptophan (W) biases at P<sub>bias </sub>= 1 &#215; 10<sup>-10 </sup>or lower in any of the eukaryotes. Notably, cysteine (C) bias remains relatively abundant in the human proteome (Figure <figr fid="F2">2</figr>) down to much lower P<sub>bias </sub>levels than in the other eukaryotes studied; this arises from large tandem arrays of cysteine-rich domains that are disulfide-bridged (for example, epidermal growth factor-like domains <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>) and/or from metal-binding proteins (such as zinc fingers).</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Each proteome has a characteristic distribution of biases</p>
               </caption>
               <text>
                  <p>Each proteome has a characteristic distribution of biases. The proportion of bias residues (<it>y</it>-axis) counted up for each of the following seven residues (S, Q, N, L, I, D, C) is shown as a function of the bias probability (<it>x</it>-axis). The <it>x</it>-axis comprises bins labeled with -log(P) such that all regions with probabilities from -log(P) to 3.0 -log(P) are included. The end (right-most) bin includes all regions with log probability greater than -log(P). From left to right, the first set of panels is for budding yeast, the second set for fission yeast, the third set for fruit fly and the fourth for human. The rows of panels are labeled at the far right with the appropriate one-letter amino-acid symbol (S, Q, N, L, I, D and C).</p>
               </text>
               <graphic file="gb-2003-4-6-r40-2"/>
            </fig>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusions</p>
         </st>
         <p>We have carried out an analysis of (Q+N)-rich domains in the complete proteomes of six eukaryotes, using a simple formalism based on finding the LPSs for a given set of amino acids within a protein sequence. We were motivated to use LPSs by the fact that the four known (Q+N)-rich prion-determinant sequences in budding yeast (found previously by experiment) each correspond to an LPS.</p>
         <sec>
            <st>
               <p>Analysis of budding-yeast prion sequences</p>
            </st>
            <p>We have examined the characteristic biases of the four known budding-yeast prion sequences. In addition to the well-documented Q and N biases, there are also mild biases for Y, G and/or S in three of the prion-determinant sequences. A substantial fraction (30/172) of the (Q+N)-rich domains found in this survey for budding yeast have such a subsidiary bias. In particular, for Sup35p, the bias for Y is conserved in homologs in other fungi <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>, and could potentially contribute to amyloid formation through &#960;-stacking of aromatic groups <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. It is likely that some Q- or N-rich regions have subsidiary compositions that decrease the likelihood of prion-like amyloidogenesis in higher eukaryotes; this would explain the large number of (Q+N)-rich domains that are eliminated when a mild bias against charged and major hydrophobic residues is required. Interestingly, the prion-determinant domains of three of the four prions correspond closely with the LPSs for the single most abundant residue types. For the fourth, New1p, the LPS corresponds to a triplet repeat (NYN)<sub>n </sub>that appears necessary for prion propagation <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>.</p>
         </sec>
         <sec>
            <st>
               <p>Relative abundance of Q and N bias in eukaryotic proteomes</p>
            </st>
            <p>We examined the relative abundance of biases for all 20 residue types for the six different eukaryotic proteomes. When biases that are at least at the level of the Q and N biases observed for the yeast prion sequences are considered, regions with N bias are always less common than those with Q bias, and become substantially less favored for the human proteome (being 12 times rarer than Q-biased regions). Disfavoring of N-rich regions in the human proteome (and other mammalian proteomes) has also been observed for homopolymeric runs of sequence <abbrgrp><abbr bid="B30">30</abbr><abbr bid="B31">31</abbr></abbrgrp>.</p>
         </sec>
         <sec>
            <st>
               <p>Occurrence of (Q+N)-rich regions</p>
            </st>
            <p>As a suitable standard, we determined a refined list of domains that are at least as biased as the budding-yeast prion domains (either in terms of Q, N, or {Q+N} compositional bias). Fission yeast appears to have rather fewer (Q+N)-rich domains than budding yeast, which may indicate a relative intolerance to Q/N-based 'polar zipper' oligomerization/polymerization <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. In the fruit fly, the large number of apparent (Q+N)-rich domains tend to be as short as those in budding yeast, with about a fifth (around 18%) of them localizing to the nucleus, some of which are annotated as involved in transcription (by GO classification <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>).</p>
            <p>The analysis of Q/N can be of use to those studying the prion phenomenon in budding yeast and aggregation/amyloidogenesis in the eukaryotic cell. The data for (Q+N)-rich domains are available <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. We are not suggesting that these regions are indeed prion-like; on the contrary, we have shown the diversity and abundance of these domains in a genomic context, that they can have a variety of functions and compartments and, importantly, that they very often have subsidiary biases that would be disruptive to prion-like amyloidogenesis. The main results here on Q/N bias are robust to the underlying probability model, as a uniform frequency expectation for amino acids (all <it>f</it><sub><it>x </it></sub>= 0.05 in Equation 2, see Materials and methods) produces essentially the same trends (see, for example, Additional data file 2 or Supplementary Table B at <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>).</p>
         </sec>
         <sec>
            <st>
               <p>Some general perspectives on compositional bias</p>
            </st>
            <p>To put this 'case study' of Q/N bias in context, we have also presented some results that offer a more general perspective on the phenomenon of compositional bias. We found that the prevalence of different biases as a function of P<sub>bias </sub>is characteristic for each proteome (Figure <figr fid="F2">2</figr>); however, there are some common trends, such as a disfavoring of regions with pronounced biases for I or W. For budding yeast, compositional biases extracted from conceptually translated igDNA and from disabled ORFs (so-called dORFs) are very different from those for the annotated proteome. Also, there is surprisingly little difference between the prevalent biases observed for the approximately 2,000 annotated 'hypothetical' proteins and the approximately 4,000 known proteins in the budding-yeast proteome. However, biased regions are substantially more common (at some bias levels more than eight times as common) for known proteins than for hypothetical proteins. These observations may be applicable to gene prediction and verification.</p>
            <p>The algorithm presented here can be developed for other investigations of compositional bias, for structural genomics, and for topics in protein folding and design, and is also readily applicable to nucleic acid sequences.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Materials and methods</p>
         </st>
         <sec>
            <st>
               <p>Calculating regions that are Q+N-rich or have other biases</p>
            </st>
            <p>Six complete eukaryotic proteomes were downloaded from the web: budding yeast (<it>Saccharomyces cerevisiae </it>from the SGD <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>), nematode worm (<it>Caenorhabditis elegans </it>Wormpep25 <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>), fruit fly (<it>Drosophila melanogaster </it><abbrgrp><abbr bid="B34">34</abbr></abbrgrp>), mustard weed (<it>Arabidopsis thaliana </it><abbrgrp><abbr bid="B35">35</abbr></abbrgrp>) and human (the Ensembl data set <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>). In each protein sequence of these proteomes, we searched for biased regions for each of the 20 amino-acid types as follows. For each individual amino-acid type <it>x</it>, and for the range of window sizes (<it>w</it>) from 25 residues to 2,500 residues, we searched each protein sequence for segments that have compositional bias of the lowest probability (P<sub>bias,min</sub>):</p>
            <p>P<sub>bias,min </sub>= min P(<it>i</it>,<it>w</it>) for all <it>i </it>and <it>w </it>&#160;&#160;&#160; (1)</p>
            <p>where <it>i </it>is each possible start position for a window <it>w </it>in the sequence. The probability P(<it>i</it>,<it>w</it>) is given by a binomial distribution:</p>
            <p>P(<it>i</it>,<it>w</it>) = [{<it>w</it>!/[<it>n</it>!(<it>w </it>- <it>n</it>)!]}.(<it>f</it><sub><it>x</it></sub>)<sup><it>n</it></sup>.(1-<it>f</it><sub><it>x</it></sub>)<sup><it>w</it>-<it>n</it></sup>] &#160;&#160;&#160; (2)</p>
            <p>where <it>f</it><sub><it>x </it></sub>is the proportion of amino-acid type <it>x </it>in all of the sequences of the proteome taken together (or a uniform expected proportion for each amino acid = 0.05). The count for <it>x </it>is denoted <it>n </it>in the window <it>w </it>starting at position <it>i</it>. Such segments with P<sub>bias,min </sub>are termed LPSs. Once an initial LPS is found in a protein sequence, the remainder of the sequence is resubmitted to the procedure until no further LPSs can be found. This is somewhat similar to the procedure in the program SEG for assignment of low-complexity or compositionally biased regions (which is based on the calculation of sequence information entropy), and which also determines an LPS <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>. To save on computational time, an initial filtering is applied (before using the procedure described above) using pre-computed threshold tables for each window length for all residue types for a fixed relatively high probability value (P<sub>bias </sub>= 0.001 was found to be suitable).</p>
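            <p>As an illustration of the initial filtering step, the following Python sketch builds one possible threshold table. It is not the code used in this study: the function names are hypothetical, and two assumptions are made for the example, namely that the table stores, for each window length, the smallest residue count above the expected value at which the probability of Equation (2) falls to P<sub>bias </sub>= 0.001 or below, and that the uniform model (<it>f</it><sub><it>x </it></sub>= 0.05) applies. Logarithms are used to avoid numerical overflow for long windows.</p>
            <p><![CDATA[
```python
import math

def log_binom_pmf(n, w, f):
    """Natural log of Equation (2) for n residues of one type in a window of w."""
    log_comb = math.lgamma(w + 1) - math.lgamma(n + 1) - math.lgamma(w - n + 1)
    return log_comb + n * math.log(f) + (w - n) * math.log(1.0 - f)

def min_count_threshold(w, f, p_cut=0.001):
    """Smallest count above the expected value w*f with probability <= p_cut."""
    log_cut = math.log(p_cut)
    for n in range(math.floor(w * f) + 1, w + 1):
        if log_binom_pmf(n, w, f) <= log_cut:
            return n
    return None  # no count in a window of this length reaches the cut-off

# Table for one residue type under the uniform model (f = 0.05),
# covering the window sizes used in the search (25 to 2,500 residues).
thresholds = {w: min_count_threshold(w, 0.05) for w in range(25, 2501)}
```
]]></p>
            <p>Under these assumptions, a 25-residue window needs at least seven residues of the given type to pass the filter; windows with fewer can be skipped before the full search is run.</p>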
            <p>This procedure differs from those previously reported as it allows for calculation of biases both for and against amino-acid types, while allowing calculation of subsidiary biases for any predefined sequence or subsequence <abbrgrp><abbr bid="B37">37</abbr><abbr bid="B38">38</abbr><abbr bid="B39">39</abbr></abbrgrp>. We applied this formalism, because we noted that prion-determinant domains for the budding-yeast prions correspond closely to LPSs. The results and trends for Q/N biases reported in this paper use <it>f</it><sub><it>x </it></sub>values derived from the eukaryotic proteomes, but do not differ substantially if a uniform probability model for residue bias is used (all residues having <it>f</it><sub><it>x </it></sub>= 0.05; see <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>).</p>
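            <p>The search defined by Equations (1) and (2) can be sketched in Python for a single residue type as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: the function names are hypothetical, the scan is exhaustive (the pre-filtering step is omitted), only the first LPS is returned (the remainder of the sequence is not resubmitted), and log probabilities are used to avoid numerical underflow.</p>
            <p><![CDATA[
```python
import math

def log_binom_pmf(n, w, f):
    """Natural log of Equation (2): C(w, n) * f**n * (1 - f)**(w - n)."""
    log_comb = math.lgamma(w + 1) - math.lgamma(n + 1) - math.lgamma(w - n + 1)
    return log_comb + n * math.log(f) + (w - n) * math.log(1.0 - f)

def lowest_probability_segment(seq, residue, f, min_w=25, max_w=2500):
    """Return (log_p, start, w, n) for the window of lowest probability.

    start is a 0-based position; None is returned if seq is shorter than min_w.
    """
    # prefix[i] holds the number of `residue` occurrences in seq[:i],
    # so any window count is a single subtraction.
    prefix = [0]
    for aa in seq:
        prefix.append(prefix[-1] + (aa == residue))
    best = None
    for w in range(min_w, min(max_w, len(seq)) + 1):
        for i in range(len(seq) - w + 1):
            n = prefix[i + w] - prefix[i]
            log_p = log_binom_pmf(n, w, f)
            if best is None or log_p < best[0]:
                best = (log_p, i, w, n)
    return best

# Toy sequence: a run of 30 Q residues flanked by 30 A residues on each side.
toy = "A" * 30 + "Q" * 30 + "A" * 30
lps = lowest_probability_segment(toy, "Q", f=0.05)
```
]]></p>
            <p>For the toy sequence, the segment returned is the 30-residue Q run (start 30, <it>w </it>= 30, <it>n </it>= 30), because extending the window into the flanking residues raises the probability.</p>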
         </sec>
         <sec>
            <st>
               <p>Calculating biases for any set of amino acids</p>
            </st>
            <p>Equation (2) can be generalized to calculate a bias for any set of amino acids {<it>xyz</it>...}, by summing up the number of residues over the whole set. This is studied in particular for the sets {QN}, {DERK} (charged residues) and {VILM} (major hydrophobics). As for single-residue biases, the LPSs for a sequence are identified.</p>
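            <p>A short Python sketch of this generalization is given below. The residue counts are pooled over the set, as described above; pooling the expected proportions in the same way (<it>f </it>for the set taken as the sum of the individual <it>f</it><sub><it>x </it></sub>values) is an assumption made for the example, as are the function name and the uniform frequencies.</p>
            <p><![CDATA[
```python
import math

def set_bias_log_probability(window, residue_set, freqs):
    """Return (n, log P) from Equation (2) with n and f pooled over a residue set."""
    members = set(residue_set)
    w = len(window)
    n = sum(1 for aa in window if aa in members)
    # Assumption: the expected proportion of the set is the sum over its members.
    f = sum(freqs[aa] for aa in members)
    log_comb = math.lgamma(w + 1) - math.lgamma(n + 1) - math.lgamma(w - n + 1)
    return n, log_comb + n * math.log(f) + (w - n) * math.log(1.0 - f)

# Uniform model: every amino-acid type has an expected proportion of 0.05.
uniform = {aa: 0.05 for aa in "ACDEFGHIKLMNPQRSTVWY"}
n_qn, log_p_qn = set_bias_log_probability("QNQNQNQNQN" + "A" * 15, "QN", uniform)
```
]]></p>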
         </sec>
         <sec>
            <st>
               <p>GO annotations</p>
            </st>
            <p>Annotations for GO categories <abbrgrp><abbr bid="B27">27</abbr></abbrgrp> for the eukaryotic proteomes were downloaded from the Gene Ontology website <abbrgrp><abbr bid="B40">40</abbr></abbrgrp> and counted up as lists of keywords indicative of biological role.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Additional data files</p>
         </st>
         <p>The abundance of biases counted up in different ways for different bias probability thresholds is available in Additional data file <supplr sid="s1">1</supplr>. A table showing the number of biased regions for all the eukaryotes (for a uniform probability model) is available in Additional data file <supplr sid="s2">2</supplr>. The coordinates of (gln+asn)-rich domains are available for the following organisms: <it>S. cerevisiae </it>(Additional data file <supplr sid="s3">3</supplr>), <it>S. pombe </it>(Additional data file <supplr sid="s4">4</supplr>), <it>C. elegans </it>(Additional data file <supplr sid="s5">5</supplr>), <it>Arabidopsis </it>(Additional data file <supplr sid="s6">6</supplr>), <it>Drosophila </it>(Additional data file <supplr sid="s7">7</supplr>) and human (Additional data file <supplr sid="s8">8</supplr>). The format for each of these files is as follows: field #1 = name, field #2 = sequence length, field #3 = bias (Q or N or {QN}), field #4 = number of bias residues, field #5 = start of QN-rich region, field #6 = end of QN-rich region, field #7 = probability of bias (see manuscript for details).</p>
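         <p>The seven-field layout described above could be read with a few lines of Python, as sketched here. The tab delimiter is an assumption based on the .tab file extension, the class and function names are hypothetical, and the sample record is invented for illustration and is not taken from the data files.</p>
         <p><![CDATA[
```python
import csv
import io
from dataclasses import dataclass

@dataclass
class BiasedRegion:
    name: str
    sequence_length: int
    bias: str                # Q, N or QN
    bias_residue_count: int
    start: int
    end: int
    probability: float

def read_regions(handle):
    """Parse seven-field records; the tab delimiter is an assumption."""
    regions = []
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) != 7:
            continue  # skip blank lines and rows with the wrong field count
        name, length, bias, count, start, end, prob = row
        regions.append(BiasedRegion(name, int(length), bias, int(count),
                                    int(start), int(end), float(prob)))
    return regions

# Hypothetical record for illustration only; not taken from the data files.
sample = io.StringIO("EXAMPLE1\t685\tQN\t60\t5\t135\t1.0e-20\n")
regions = read_regions(sample)
```
]]></p>
         <p>A real file would be passed as an open file handle in place of the in-memory sample; a header row, if present, would need to be skipped before the numeric conversion.</p>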
         <p>The sequences of the proteomes are available for <it>S. cerevisiae </it>(Additional data file <supplr sid="s9">9</supplr>), <it>S. pombe </it>(Additional data file <supplr sid="s10">10</supplr>), <it>C. elegans </it>(Additional data file <supplr sid="s11">11</supplr>), <it>Arabidopsis </it>(Additional data file <supplr sid="s12">12</supplr>), <it>Drosophila </it>(Additional data file <supplr sid="s13">13</supplr>) and human (Additional data file <supplr sid="s14">14</supplr>). All Additional data files are also available at <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>.</p>
         <suppl id="s1">
            <title>
               <p>Additional data file 1</p>
            </title>
            <caption>
               <p>The abundance of biases counted up in different ways for different bias probability thresholds</p>
            </caption>
            <text>
               <p>The abundance of biases counted up in different ways for different bias probability thresholds</p>
            </text>
            <file name="gb-2003-4-6-r40-s1.txt">
               <p>Click here for additional data file</p>
            </file>
         </suppl>
         <suppl id="s2">
            <title>
               <p>Additional data file 2</p>
            </title>
            <caption>
               <p>A table showing the number of biased regions for all the eukaryotes (for a uniform probability model)</p>
            </caption>
            <text>
               <p>A table showing the number of biased regions for all the eukaryotes (for a uniform probability model)</p>
            </text>
            <file name="gb-2003-4-6-r40-s2.txt">
               <p>Click here for additional data file</p>
            </file>
         </suppl>
         <suppl id="s3">
            <title>
               <p>Additional data file 3</p>
            </title>
            <caption>
               <p>The coordinates of <it>S. cerevisiae</it> (gln+asn)-rich domains</p>
            </caption>
            <text>
               <p>The coordinates of <it>S. cerevisiae</it> (gln+asn)-rich domains</p>
            </text>
            <file name="gb-2003-4-6-r40-s3.tab">
               <p>Click here for additional data file</p>
            </file>
         </suppl>
         <suppl id="s4">
            <title>
               <p>Additional data file 4</p>
            </title>
            <caption>
               <p>The coordinates of <it>S. pombe</it> (gln+asn)-rich domains</p>
            </caption>
            <text>
               <p>The coordinates of <it>S. pombe</it> (gln+asn)-rich domains</p>
            </text>
            <file name="gb-2003-4-6-r40-s4.tab">
               <p>Click here for additional data file</p>
            </file>
         </suppl>
         <suppl id="s5">
            <title>
               <p>Additional data file 5</p>
            </title>
            <caption>
               <p>The coordinates of <it>C. elegans</it> (gln+asn)-rich domains</p>
            </caption>
            <text>
               <p>The coordinates of <it>C. elegans</it> (gln+asn)-rich domains</p>
            </text>
            <file name="gb-2003-4-6-r40-s5.tab">
               <p>Click here for additional data file</p>
            </file>
         </suppl>
         <suppl id="s6">
            <title>
               <p>Additional data file 6</p>
            </title>
            <caption>
               <p>The coordinates of <it>Arabidopsis</it> (gln+asn)-rich domains</p>
            </caption>
            <text>
               <p>The coordinates of <it>Arabidopsis</it> (gln+asn)-rich domains</p>
            </text>
            <file name="gb-2003-4-6-r40-s6.tab">
               <p>Click here for additional data file</p>
            </file>
         </suppl>
         <suppl id="s7">
            <title>
               <p>Additional data file 7</p>
            </title>
            <caption>
               <p>The coordinates of <it>Drosophila</it> (gln+asn)-rich domains</p>
            </caption>
            <text>
               <p>The coordinates of <it>Drosophila</it> (gln+asn)-rich domains</p>
            </text>
            <file name="gb-2003-4-6-r40-s7.tab">
               <p>Click here for additional data file</p>
            </file>
         </suppl>
         <suppl id="s8">
            <title>
               <p>Additional data file 8</p>
            </title>
            <caption>
               <p>The coordinates of human (gln+asn)-rich domains</p>
            </caption>
            <text>
               <p>The coordinates of human (gln+asn)-rich domains</p>
            </text>
            <file name="gb-2003-4-6-r40-s8.tab">
               <p>Click here for additional data file</p>
            </file>
         </suppl>
         <suppl id="s9">
            <title>
               <p>Additional data file 9</p>
            </title>
            <caption>
               <p>The sequence of the <it>S. cerevisiae</it> proteome</p>
            </caption>
            <text>
               <p>The sequence of the <it>S. cerevisiae</it> proteome</p>
            </text>
            <file name="gb-2003-4-6-r40-s9.fasta">
               <p>Click here for additional data file</p>
            </file>
         </suppl>
         <suppl id="s10">
            <title>
               <p>Additional data file 10</p>
            </title>
            <caption>
               <p>The sequence of the <it>S. pombe</it> proteome</p>
            </caption>
            <text>
               <p>The sequence of the <it>S. pombe</it> proteome</p>
            </text>
            <file name="gb-2003-4-6-r40-s10.fasta">
               <p>Click here for additional data file</p>
            </file>
         </suppl>
         <suppl id="s11">
            <title>
               <p>Additional data file 11</p>
            </title>
            <caption>
               <p>The sequence of the <it>C. elegans</it> proteome</p>
            </caption>
            <text>
               <p>The sequence of the <it>C. elegans</it> proteome</p>
            </text>
            <file name="gb-2003-4-6-r40-s11.fasta">
               <p>Click here for additional data file</p>
            </file>
         </suppl>
         <suppl id="s12">
            <title>
               <p>Additional data file 12</p>
            </title>
            <caption>
               <p>The sequence of the <it>Arabidopsis</it> proteome</p>
            </caption>
            <text>
               <p>The sequence of the <it>Arabidopsis</it> proteome</p>
            </text>
            <file name="gb-2003-4-6-r40-s12.fasta">
               <p>Click here for additional data file</p>
            </file>
         </suppl>
         <suppl id="s13">
            <title>
               <p>Additional data file 13</p>
            </title>
            <caption>
               <p>The sequence of the <it>Drosophila</it> proteome</p>
            </caption>
            <text>
               <p>The sequence of the <it>Drosophila</it> proteome</p>
            </text>
            <file name="gb-2003-4-6-r40-s13.fasta">
               <p>Click here for additional data file</p>
            </file>
         </suppl>
         <suppl id="s14">
            <title>
               <p>Additional data file 14</p>
            </title>
            <caption>
               <p>The sequence of the human proteome</p>
            </caption>
            <text>
               <p>The sequence of the human proteome</p>
            </text>
            <file name="gb-2003-4-6-r40-s14.fasta">
               <p>Click here for additional data file</p>
            </file>
         </suppl>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We thank Tricia Serio (Brown University) for discussions about budding-yeast prions and comments on the manuscript.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>The prion folding problem.</p>
            </title>
            <aug>
               <au>
                  <snm>Harrison</snm>
                  <fnm>PM</fnm>
               </au>
               <au>
                  <snm>Bamborough</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Daggett</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Prusiner</snm>
                  <fnm>SB</fnm>
               </au>
               <au>
                  <snm>Cohen</snm>
                  <fnm>FE</fnm>
               </au>
            </aug>
            <source>Curr Opin Struct Biol</source>
            <pubdate>1997</pubdate>
            <volume>7</volume>
            <fpage>53</fpage>
            <lpage>59</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0959-440X(97)80007-3</pubid>
                  <pubid idtype="pmpid">9032055</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Thermodynamics of model prions and its implications for the problem of prion protein folding.</p>
            </title>
            <aug>
               <au>
                  <snm>Harrison</snm>
                  <fnm>PM</fnm>
               </au>
               <au>
                  <snm>Chan</snm>
                  <fnm>HS</fnm>
               </au>
               <au>
                  <snm>Prusiner</snm>
                  <fnm>SB</fnm>
               </au>
               <au>
                  <snm>Cohen</snm>
                  <fnm>FE</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1999</pubdate>
            <volume>286</volume>
            <fpage>593</fpage>
            <lpage>606</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.1998.2497</pubid>
                  <pubid idtype="pmpid" link="fulltext">9973573</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Protein-only inheritance in yeast: something to get [PSI+]-ched about.</p>
            </title>
            <aug>
               <au>
                  <snm>Serio</snm>
                  <fnm>TR</fnm>
               </au>
               <au>
                  <snm>Lindquist</snm>
                  <fnm>SL</fnm>
               </au>
            </aug>
            <source>Trends Cell Biol</source>
            <pubdate>2000</pubdate>
            <volume>10</volume>
            <fpage>98</fpage>
            <lpage>105</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0962-8924(99)01711-0</pubid>
                  <pubid idtype="pmpid" link="fulltext">10675903</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Prions: portable prion domains.</p>
            </title>
            <aug>
               <au>
                  <snm>Wickner</snm>
                  <fnm>RB</fnm>
               </au>
               <au>
                  <snm>Taylor</snm>
                  <fnm>KL</fnm>
               </au>
               <au>
                  <snm>Edskes</snm>
                  <fnm>HK</fnm>
               </au>
               <au>
                  <snm>Maddelein</snm>
                  <fnm>ML</fnm>
               </au>
            </aug>
            <source>Curr Biol</source>
            <pubdate>2000</pubdate>
            <volume>10</volume>
            <fpage>R335</fpage>
            <lpage>R337</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0960-9822(00)00460-7</pubid>
                  <pubid idtype="pmpid" link="fulltext">10801430</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Prions of yeast as heritable amyloidoses.</p>
            </title>
            <aug>
               <au>
                  <snm>Wickner</snm>
                  <fnm>RB</fnm>
               </au>
               <au>
                  <snm>Taylor</snm>
                  <fnm>KL</fnm>
               </au>
               <au>
                  <snm>Edskes</snm>
                  <fnm>HK</fnm>
               </au>
               <au>
                  <snm>Maddelein</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Moriyama</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Roberts</snm>
                  <fnm>BT</fnm>
               </au>
            </aug>
            <source>J Struct Biol</source>
            <pubdate>2000</pubdate>
            <volume>130</volume>
            <fpage>310</fpage>
            <lpage>322</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jsbi.2000.4250</pubid>
                  <pubid idtype="pmpid" link="fulltext">10940235</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>The role of Sis1 in the maintenance of the [RNQ+] prion.</p>
            </title>
            <aug>
               <au>
                  <snm>Sondheimer</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Lopez</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Craig</snm>
                  <fnm>EA</fnm>
               </au>
               <au>
                  <snm>Lindquist</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>EMBO J</source>
            <pubdate>2001</pubdate>
            <volume>20</volume>
            <fpage>2435</fpage>
            <lpage>2442</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">125465</pubid>
                  <pubid idtype="pmpid" link="fulltext">11350932</pubid>
                  <pubid idtype="doi">10.1093/emboj/20.10.2435</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Multiple Gln/Asn-rich prion domains confer susceptibility to induction of the yeast [PSI(+)] prion.</p>
            </title>
            <aug>
               <au>
                  <snm>Osherovich</snm>
                  <fnm>LZ</fnm>
               </au>
               <au>
                  <snm>Weissman</snm>
                  <fnm>JS</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>2001</pubdate>
            <volume>106</volume>
            <fpage>183</fpage>
            <lpage>194</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11511346</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>[PSI+]: an epigenetic modulator of translation termination efficiency.</p>
            </title>
            <aug>
               <au>
                  <snm>Serio</snm>
                  <fnm>TR</fnm>
               </au>
               <au>
                  <snm>Lindquist</snm>
                  <fnm>SL</fnm>
               </au>
            </aug>
            <source>Annu Rev Cell Dev Biol</source>
            <pubdate>1999</pubdate>
            <volume>15</volume>
            <fpage>661</fpage>
            <lpage>703</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1146/annurev.cellbio.15.1.661</pubid>
                  <pubid idtype="pmpid" link="fulltext">10611975</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>A small reservoir of disabled ORFs in the yeast genome and its implications for the dynamics of proteome evolution.</p>
            </title>
            <aug>
               <au>
                  <snm>Harrison</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Kumar</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Lan</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Echols</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Snyder</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Gerstein</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2002</pubdate>
            <volume>316</volume>
            <fpage>409</fpage>
            <lpage>419</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.2001.5343</pubid>
                  <pubid idtype="pmpid" link="fulltext">11866506</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Prion filament networks in [URE3] cells of <it>Saccharomyces cerevisiae</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Speransky</snm>
                  <fnm>VV</fnm>
               </au>
               <au>
                  <snm>Taylor</snm>
                  <fnm>KL</fnm>
               </au>
               <au>
                  <snm>Edskes</snm>
                  <fnm>HK</fnm>
               </au>
               <au>
                  <snm>Wickner</snm>
                  <fnm>RB</fnm>
               </au>
               <au>
                  <snm>Steven</snm>
                  <fnm>AC</fnm>
               </au>
            </aug>
            <source>J Cell Biol</source>
            <pubdate>2001</pubdate>
            <volume>153</volume>
            <fpage>1327</fpage>
            <lpage>1336</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1083/jcb.153.6.1327</pubid>
                  <pubid idtype="pmpid" link="fulltext">11402074</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Rnq1: an epigenetic modifier of protein function in yeast.</p>
            </title>
            <aug>
               <au>
                  <snm>Sondheimer</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Lindquist</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Mol Cell</source>
            <pubdate>2000</pubdate>
            <volume>5</volume>
            <fpage>163</fpage>
            <lpage>172</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10678178</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>A census of glutamine/asparagine-rich regions: implications for their conserved function and the prediction of novel prions.</p>
            </title>
            <aug>
               <au>
                  <snm>Michelitsch</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Weissman</snm>
                  <fnm>JS</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2000</pubdate>
            <volume>97</volume>
            <fpage>11910</fpage>
            <lpage>11915</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">17268</pubid>
                  <pubid idtype="pmpid" link="fulltext">11050225</pubid>
                  <pubid idtype="doi">10.1073/pnas.97.22.11910</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Prions affect the appearance of other prions: the story of [PIN(+)].</p>
            </title>
            <aug>
               <au>
                  <snm>Derkatch</snm>
                  <fnm>IL</fnm>
               </au>
               <au>
                  <snm>Bradley</snm>
                  <fnm>ME</fnm>
               </au>
               <au>
                  <snm>Hong</snm>
                  <fnm>JY</fnm>
               </au>
               <au>
                  <snm>Liebman</snm>
                  <fnm>SW</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>2001</pubdate>
            <volume>106</volume>
            <fpage>171</fpage>
            <lpage>182</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11511345</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Amyloid fibers are water-filled nanotubes.</p>
            </title>
            <aug>
               <au>
                  <snm>Perutz</snm>
                  <fnm>MF</fnm>
               </au>
               <au>
                  <snm>Finch</snm>
                  <fnm>JT</fnm>
               </au>
               <au>
                  <snm>Berriman</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Lesk</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2002</pubdate>
            <volume>99</volume>
            <fpage>5591</fpage>
            <lpage>5595</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">122814</pubid>
                  <pubid idtype="pmpid" link="fulltext">11960014</pubid>
                  <pubid idtype="doi">10.1073/pnas.042681399</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Cause of neural death in neurodegenerative diseases attributable to expansion of glutamine repeats.</p>
            </title>
            <aug>
               <au>
                  <snm>Perutz</snm>
                  <fnm>MF</fnm>
               </au>
               <au>
                  <snm>Windle</snm>
                  <fnm>AH</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2001</pubdate>
            <volume>412</volume>
            <fpage>143</fpage>
            <lpage>144</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35084141</pubid>
                  <pubid idtype="pmpid" link="fulltext">11449262</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Polar zippers: their role in human disease.</p>
            </title>
            <aug>
               <au>
                  <snm>Perutz</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Protein Sci</source>
            <pubdate>1994</pubdate>
            <volume>3</volume>
            <fpage>1629</fpage>
            <lpage>1637</lpage>
            <xrefbib>
               <pubid idtype="pmpid">7849580</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>The classification of amino acid conservation.</p>
            </title>
            <aug>
               <au>
                  <snm>Taylor</snm>
                  <fnm>WR</fnm>
               </au>
            </aug>
            <source>J Theor Biol</source>
            <pubdate>1986</pubdate>
            <volume>119</volume>
            <fpage>205</fpage>
            <lpage>218</lpage>
            <xrefbib>
               <pubid idtype="pmpid">3461222</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Molecular basis of a yeast prion species barrier.</p>
            </title>
            <aug>
               <au>
                  <snm>Santoso</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Chien</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Osherovich</snm>
                  <fnm>LZ</fnm>
               </au>
               <au>
                  <snm>Weissman</snm>
                  <fnm>JS</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>2000</pubdate>
            <volume>100</volume>
            <fpage>277</fpage>
            <lpage>288</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10660050</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>A critical role for amino-terminal glutamine/asparagine repeats in the formation and propagation of a yeast prion.</p>
            </title>
            <aug>
               <au>
                  <snm>DePace</snm>
                  <fnm>AH</fnm>
               </au>
               <au>
                  <snm>Santoso</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Hillner</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Weissman</snm>
                  <fnm>JS</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>1998</pubdate>
            <volume>93</volume>
            <fpage>1241</fpage>
            <lpage>1252</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9657156</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>A possible role for pi-stacking in the self-assembly of amyloid fibrils.</p>
            </title>
            <aug>
               <au>
                  <snm>Gazit</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>FASEB J</source>
            <pubdate>2002</pubdate>
            <volume>16</volume>
            <fpage>77</fpage>
            <lpage>83</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1096/fj.01-0442hyp</pubid>
                  <pubid idtype="pmpid" link="fulltext">11772939</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Additional data</p>
            </title>
            <url>http://bioinfo.mbb.yale.edu/~harrison/qn</url>
         </bibl>
         <bibl id="B22">
            <title>
               <p>MultiCoil: a program for predicting two- and three-stranded coiled coils.</p>
            </title>
            <aug>
               <au>
                  <snm>Wolf</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Kim</snm>
                  <fnm>PS</fnm>
               </au>
               <au>
                  <snm>Berger</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Protein Sci</source>
            <pubdate>1997</pubdate>
            <volume>6</volume>
            <fpage>1179</fpage>
            <lpage>1189</lpage>
            <xrefbib>
               <pubid idtype="pmpid">9194178</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>LearnCoil-VMF: computational evidence for coiled-coil-like motifs in many viral membrane-fusion proteins.</p>
            </title>
            <aug>
               <au>
                  <snm>Singh</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Berger</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Kim</snm>
                  <fnm>PS</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1999</pubdate>
            <volume>290</volume>
            <fpage>1031</fpage>
            <lpage>1041</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.1999.2796</pubid>
                  <pubid idtype="pmpid" link="fulltext">10438601</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>MIPS: a database for genomes and protein sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Mewes</snm>
                  <fnm>HW</fnm>
               </au>
               <au>
                  <snm>Frishman</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Gruber</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Geier</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Haase</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Kaps</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Lemcke</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Mannhaupt</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Pfeiffer</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Schuller</snm>
                  <fnm>C</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2000</pubdate>
            <volume>28</volume>
            <fpage>37</fpage>
            <lpage>40</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">102494</pubid>
                  <pubid idtype="pmpid" link="fulltext">10592176</pubid>
                  <pubid idtype="doi">10.1093/nar/28.1.37</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Amino acid reiterations in yeast are overrepresented in particular classes of proteins and show evidence of a slippage-like mutational process.</p>
            </title>
            <aug>
               <au>
                  <snm>Alba</snm>
                  <fnm>MM</fnm>
               </au>
               <au>
                  <snm>Santibanez-Koref</snm>
                  <fnm>MF</fnm>
               </au>
               <au>
                  <snm>Hancock</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>J Mol Evol</source>
            <pubdate>1999</pubdate>
            <volume>49</volume>
            <fpage>789</fpage>
            <lpage>797</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10594180</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Cryptic simplicity in DNA is a major source of genetic variation.</p>
            </title>
            <aug>
               <au>
                  <snm>Tautz</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Trick</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Dover</snm>
                  <fnm>GA</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>1986</pubdate>
            <volume>322</volume>
            <fpage>652</fpage>
            <lpage>656</lpage>
            <xrefbib>
               <pubid idtype="pmpid">3748144</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.</p>
            </title>
            <aug>
               <au>
                  <snm>Ashburner</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ball</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Blake</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Butler</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Cherry</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Davis</snm>
                  <fnm>AP</fnm>
               </au>
               <au>
                  <snm>Dolinski</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Dwight</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Eppig</snm>
                  <fnm>JT</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2000</pubdate>
            <volume>25</volume>
            <fpage>25</fpage>
            <lpage>29</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/75556</pubid>
                  <pubid idtype="pmpid" link="fulltext">10802651</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p><it>Saccharomyces cerevisiae</it> S288C has a mutation in <it>FLO8</it>, a gene required for filamentous growth.</p>
            </title>
            <aug>
               <au>
                  <snm>Liu</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Styles</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Fink</snm>
                  <fnm>GR</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>1996</pubdate>
            <volume>144</volume>
            <fpage>967</fpage>
            <lpage>978</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8913742</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>The disulphide beta-cross: from cystine geometry and clustering to classification of small disulphide-rich protein folds.</p>
            </title>
            <aug>
               <au>
                  <snm>Harrison</snm>
                  <fnm>PM</fnm>
               </au>
               <au>
                  <snm>Sternberg</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1996</pubdate>
            <volume>264</volume>
            <fpage>603</fpage>
            <lpage>623</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.1996.0664</pubid>
                  <pubid idtype="pmpid" link="fulltext">8969308</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Amino acid runs in eukaryotic proteomes and disease associations.</p>
            </title>
            <aug>
               <au>
                  <snm>Karlin</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Brocchieri</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Bergman</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Mrazek</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Gentles</snm>
                  <fnm>AJ</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2002</pubdate>
            <volume>99</volume>
            <fpage>333</fpage>
            <lpage>338</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">117561</pubid>
                  <pubid idtype="pmpid" link="fulltext">11782551</pubid>
                  <pubid idtype="doi">10.1073/pnas.012608599</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Asparagine repeats are rare in mammalian proteins.</p>
            </title>
            <aug>
               <au>
                  <snm>Kreil</snm>
                  <fnm>DP</fnm>
               </au>
               <au>
                  <snm>Kreil</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Trends Biochem Sci</source>
            <pubdate>2000</pubdate>
            <volume>25</volume>
            <fpage>270</fpage>
            <lpage>271</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0968-0004(00)01594-2</pubid>
                  <pubid idtype="pmpid" link="fulltext">10838564</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Using the <it>Saccharomyces</it> Genome Database (SGD) for analysis of protein similarities and structure.</p>
            </title>
            <aug>
               <au>
                  <snm>Chervitz</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Hester</snm>
                  <fnm>ET</fnm>
               </au>
               <au>
                  <snm>Ball</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Dolinski</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Dwight</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Harris</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Juvik</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Malekian</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Roberts</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Roe</snm>
                  <fnm>T</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1999</pubdate>
            <volume>27</volume>
            <fpage>74</fpage>
            <lpage>78</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">148101</pubid>
                  <pubid idtype="pmpid" link="fulltext">9847146</pubid>
                  <pubid idtype="doi">10.1093/nar/27.1.74</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Genome sequence of the nematode <it>C. elegans</it>: a platform for investigating biology.</p>
            </title>
            <aug>
               <au>
                  <cnm><it>C. elegans</it> Sequencing Consortium</cnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1998</pubdate>
            <volume>282</volume>
            <fpage>2012</fpage>
            <lpage>2018</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.282.5396.2012</pubid>
                  <pubid idtype="pmpid" link="fulltext">9851916</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>The genome sequence of <it>Drosophila melanogaster</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Adams</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Celniker</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Holt</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Evans</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Gocayne</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Amanatides</snm>
                  <fnm>PG</fnm>
               </au>
               <au>
                  <snm>Scherer</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>PW</fnm>
               </au>
               <au>
                  <snm>Hoskins</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Galle</snm>
                  <fnm>RF</fnm>
               </au>
               <au>
                  <snm>Venter</snm>
                  <fnm>JC</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2000</pubdate>
            <volume>287</volume>
            <fpage>2185</fpage>
            <lpage>2195</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.287.5461.2185</pubid>
                  <pubid idtype="pmpid" link="fulltext">10731132</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Analysis of the genome sequence of the flowering plant <it>Arabidopsis thaliana</it>.</p>
            </title>
            <aug>
               <au>
                  <cnm><it>Arabidopsis</it> Genome Initiative</cnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2000</pubdate>
            <volume>408</volume>
            <fpage>796</fpage>
            <lpage>815</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35048692</pubid>
                  <pubid idtype="pmpid" link="fulltext">11130711</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Initial sequencing and analysis of the human genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
               <au>
                  <snm>Linton</snm>
                  <fnm>LM</fnm>
               </au>
               <au>
                  <snm>Birren</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Nusbaum</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Zody</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Baldwin</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Devon</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Dewar</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Doyle</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>FitzHugh</snm>
                  <fnm>W</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2001</pubdate>
            <volume>409</volume>
            <fpage>860</fpage>
            <lpage>921</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35057062</pubid>
                  <pubid idtype="pmpid" link="fulltext">11237011</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>Analysis of compositionally biased regions in sequence databases.</p>
            </title>
            <aug>
               <au>
                  <snm>Wootton</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Federhen</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Methods Enzymol</source>
            <pubdate>1996</pubdate>
            <volume>266</volume>
            <fpage>554</fpage>
            <lpage>571</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8743706</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>Methods and algorithms for statistical analysis of protein sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Brendel</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Bucher</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Nourbakhsh</snm>
                  <fnm>IR</fnm>
               </au>
               <au>
                  <snm>Blaisdell</snm>
                  <fnm>BE</fnm>
               </au>
               <au>
                  <snm>Karlin</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1992</pubdate>
            <volume>89</volume>
            <fpage>2002</fpage>
            <lpage>2006</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">48584</pubid>
                  <pubid idtype="pmpid" link="fulltext">1549558</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>CAST: an iterative algorithm for the complexity analysis of sequence tracts.</p>
            </title>
            <aug>
               <au>
                  <snm>Promponas</snm>
                  <fnm>VJ</fnm>
               </au>
               <au>
                  <snm>Enright</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Tsoka</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kreil</snm>
                  <fnm>DP</fnm>
               </au>
               <au>
                  <snm>Leroy</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Hamodrakas</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Sander</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Ouzounis</snm>
                  <fnm>CA</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2000</pubdate>
            <volume>16</volume>
            <fpage>915</fpage>
            <lpage>922</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/16.10.915</pubid>
                  <pubid idtype="pmpid" link="fulltext">11120681</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>Gene Ontology Consortium</p>
            </title>
            <url>http://www.geneontology.org</url>
         </bibl>
      </refgrp>
   </bm>
</art>
