<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>gb-2008-9-2-r29</ui>
   <ji>GBJ</ji>
   <fm>
      <dochead>Research</dochead>
      <bibl>
         <title>
            <p>Finding exonic islands in a sea of non-coding sequence: splicing related constraints on protein composition and evolution are common in intron-rich genomes</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Warnecke</snm>
               <fnm>Tobias</fnm>
               <insr iid="I1"/>
               <email>tw233@bath.ac.uk</email>
            </au>
            <au id="A2">
               <snm>Parmley</snm>
               <mi>L</mi>
               <fnm>Joanna</fnm>
               <insr iid="I1"/>
               <email>j.parmley@cmbi.ru.nl</email>
            </au>
            <au id="A3" ca="yes">
               <snm>Hurst</snm>
               <mi>D</mi>
               <fnm>Laurence</fnm>
               <insr iid="I1"/>
               <email>l.d.hurst@bath.ac.uk</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Biology and Biochemistry, University of Bath, Claverton Down, Bath, BA2 7AY, UK</p>
            </ins>
         </insg>
         <source>Genome Biology</source>
         <issn>1465-6906</issn>
         <pubdate>2008</pubdate>
         <volume>9</volume>
         <issue>2</issue>
         <fpage>R29</fpage>
         <url>http://genomebiology.com/2008/9/2/R29</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">18257921</pubid>
               <pubid idtype="doi">10.1186/gb-2008-9-2-r29</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>5</day>
               <month>9</month>
               <year>2007</year>
            </date>
         </rec>
         <revrec>
            <date>
               <day>23</day>
               <month>11</month>
               <year>2007</year>
            </date>
         </revrec>
         <acc>
            <date>
               <day>7</day>
               <month>2</month>
               <year>2008</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>07</day>
               <month>02</month>
               <year>2008</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2008</year>
         <collab>Warnecke et al.; licensee BioMed Central Ltd.</collab>
         <note>This is an open access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <shorttitle>
         <p>Splicing constraints on sequence</p>
      </shorttitle>
      <shortabs>
         <p>Biased usage of amino acids near exon-intron boundaries is phylogenetically widespread and characteristic of species for which there are expected to be problems defining exons.</p>
      </shortabs>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>In mammals, splice-regulatory domains impose marked trends on the relative abundance of certain amino acids near exon-intron boundaries. Is this a mammalian particularity or symptomatic of exonic splicing regulation across taxa? Are such trends more common in species that <it>a priori </it>have a harder time identifying exon ends, that is, those with pre-mRNA rich in intronic sequence? We address these questions by surveying exon composition in a sample of phylogenetically diverse genomes.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>Biased amino acid usage near exon-intron boundaries is common throughout, but not restricted to, the metazoa. There is extensive cross-species concordance as to which amino acids are affected, and reduced/elevated abundances are well predicted by knowledge of splice enhancers. Species expected to rely on exon definition for splicing, that is, those with a higher ratio of intronic to coding sequence, more introns per gene and longer introns, exhibit more amino acid skews. Notably, this includes the intron-rich basidiomycete <it>Cryptococcus neoformans</it>, which, unlike intron-poor ascomycetes (<it>Schizosaccharomyces pombe</it>, <it>Saccharomyces cerevisiae</it>), exhibits compositional biases reminiscent of the metazoa. Strikingly, 5' ends of nematode exons deviate radically from normality: amino acids strongly preferred near boundaries are strongly avoided in other species, and vice versa. This, we suggest, is a measure to avoid attracting <it>trans</it>-splicing machinery.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>Constraints on amino acid composition near exon-intron boundaries are phylogenetically widespread and characteristic of species where exon localization should be problematic. That compositional biases accord with sequence preferences of splice-regulatory proteins and are absent in ascomycetes is consistent with selection on exonic splicing regulation.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="BMC" subtype="man_spc_id" id="30010008">Evolution</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010016">Molecular biology</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010002">Bioinformatics</classification>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>The maxim that 'form follows function', dogmatically adhered to in some early 20th century design and architecture, refers to the idea that the final function of a product should be the only determinant of its design. Phenotypic products of evolutionary processes have also frequently been analyzed in this seductively simple framework.</p>
         <p>However, costs of production, the availability of raw materials, and other factors regularly lead to marketable goods being suboptimally designed as far as their immediate function is concerned. Likewise, protein composition is shaped by factors other than protein function: in mammals, for example, the amino acid content of a protein reflects local GC content <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>.</p>
         <p>The need to encode, in exonic sequence, information relevant for correct splicing is another factor with the potential to influence protein composition <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. Located in the exonic parts of primary mRNA transcripts, exonic splicing enhancers (ESEs) are short (6-8 nucleotide) motifs that have been established as a core component of the pre-mRNA splicing mechanism in metazoans <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. Playing a critical role in constitutive as well as alternative splicing <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>, they function at multiple stages of spliceosome assembly by interacting with corresponding RNA recognition motifs in a number of different <it>trans</it>-factors <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. We will primarily focus on SR (serine-arginine) proteins because their binding specificities and functions in splicing regulation have been most extensively characterized. SR proteins appear critical for establishing, in conjunction with other proteins, cross-exon complexes that enable faithful communication between splice sites <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>.</p>
         <p>Recognition of exonic alongside intronic sequence motifs has been proposed to be pivotal in organisms where a majority of exons are flanked by much larger introns, allowing exons to be efficiently identified and not lost in a sea of intronic sequence <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. Furthermore, whereas in <it>Saccharomyces cerevisiae </it>splice sites and branch point sequences show a high degree of conservation to ensure the intron is correctly targeted by the splicing machinery, these recognition motifs tend to be less well conserved in multicellular organisms <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and intron-rich fungal genomes <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>.</p>
         <p>Experimentally raising the number of natural exonic enhancer sites leads to an additive increase in splicing activity <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>. Importantly, ESEs function in a position-dependent manner, their efficiency in catalyzing splicing decreasing with increasing distance from the splice site <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr></abbrgrp>. The significant enrichment near exon-intron boundaries for GAA (a codon known to be overrepresented in ESEs) compared with the synonymous GAG is consistent with this finding <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp>. More generally, in mammals codons enriched in ESEs are more common near intron-exon boundaries <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>.</p>
         <p>A recent study by Parmley <it>et al</it>. <abbrgrp><abbr bid="B2">2</abbr></abbrgrp> suggests ESEs have also left an imprint on the amino acid composition of proteins. Exploring exonic sequences adjacent to exon-intron boundaries in human and mouse, the authors reported marked trends in the relative abundance of certain amino acids with increasing distance from the boundary. Some amino acids, such as lysine (K) and isoleucine (I), are strongly preferred near boundaries whereas others, such as proline (P) and alanine (A), are significantly avoided (for a full list see Tables <tblr tid="T1">1</tblr> and <tblr tid="T2">2</tblr>). This is the case for both 5' and 3' ends of exons. Considering separately the two-fold and four-fold blocks of the six-fold degenerate amino acids, the authors also showed that these trends are due to avoidances/preferences at the nucleotide level and that there is a high degree of correspondence between the codons preferred and their involvement in computationally predicted and experimentally verified ESEs.</p>
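         <p>The notion of a trend in relative abundance with distance from the boundary can be made concrete with a minimal sketch. This is not the procedure used by Parmley <it>et al</it>.; the function names (positional_frequency, trend_slope), the toy peptides, the convention that index 0 is the residue nearest the boundary, and the use of an ordinary least-squares slope are all assumptions made purely for illustration. Under these assumptions, a negative slope corresponds to a residue that is more common near the boundary (preferred) and a positive slope to one that is rarer there (avoided).</p>

```python
from collections import Counter


def positional_frequency(peptides, residue, window):
    """Return (distance, frequency) pairs for `residue`.

    Hypothetical input convention: each peptide is written so that
    index 0 is the residue closest to the exon-intron boundary.
    """
    counts = Counter()
    totals = Counter()
    for peptide in peptides:
        for distance, amino_acid in enumerate(peptide[:window]):
            totals[distance] += 1
            if amino_acid == residue:
                counts[distance] += 1
    # Only report distances that were actually observed.
    return [(d, counts[d] / totals[d]) for d in sorted(totals)]


def trend_slope(points):
    """Ordinary least-squares slope of frequency against distance."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


# Toy data (invented for illustration, not taken from any genome).
toy_peptides = ["KKAP", "KEAP", "KAPP", "EKPA"]

slope_k = trend_slope(positional_frequency(toy_peptides, "K", 4))
slope_p = trend_slope(positional_frequency(toy_peptides, "P", 4))

print("K slope:", slope_k)  # negative: K is enriched near the boundary
print("P slope:", slope_p)  # positive: P is depleted near the boundary
```

         <p>In this toy example lysine yields a negative slope and proline a positive one, mirroring the sign convention of the tables. A real analysis would additionally need a significance test and a correction for multiple comparisons, neither of which is attempted here.</p>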
         <tbl id="T1">
            <title>
               <p>Table 1</p>
            </title>
            <caption>
               <p>Amino acids significantly preferred (-) or avoided (+) at 3' ends of exons across species</p>
            </caption>
            <tblbdy cols="24">
               <r>
                  <c cspan="23" ca="left">
                     <p>Amino acids<sup>*&#8224;</sup></p>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c cspan="23">
                     <hr/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>A</p>
                  </c>
                  <c ca="left">
                     <p>C</p>
                  </c>
                  <c ca="left">
                     <p>D</p>
                  </c>
                  <c ca="left">
                     <p>E</p>
                  </c>
                  <c ca="left">
                     <p>F</p>
                  </c>
                  <c ca="left">
                     <p>G</p>
                  </c>
                  <c ca="left">
                     <p>H</p>
                  </c>
                  <c ca="left">
                     <p>I</p>
                  </c>
                  <c ca="left">
                     <p>K</p>
                  </c>
                  <c ca="left">
                     <p>L4</p>
                  </c>
                  <c ca="left">
                     <p>L2</p>
                  </c>
                  <c ca="left">
                     <p>M</p>
                  </c>
                  <c ca="left">
                     <p>N</p>
                  </c>
                  <c ca="left">
                     <p>P</p>
                  </c>
                  <c ca="left">
                     <p>Q</p>
                  </c>
                  <c ca="left">
                     <p>R4</p>
                  </c>
                  <c ca="left">
                     <p>R2</p>
                  </c>
                  <c ca="left">
                     <p>S4</p>
                  </c>
                  <c ca="left">
                     <p>S2</p>
                  </c>
                  <c ca="left">
                     <p>T</p>
                  </c>
                  <c ca="left">
                     <p>V</p>
                  </c>
                  <c ca="left">
                     <p>W</p>
                  </c>
                  <c ca="left">
                     <p>Y</p>
                  </c>
                  <c ca="left">
                     <p>Species (number of exons)<sup>&#8225;</sup></p>
                  </c>
               </r>
               <r>
                  <c cspan="24">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>+<sub>3</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>7</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>3</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>2</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>1</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>5</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>6</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>2</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>1</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>4</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>4</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>Human (178,438)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>+<sub>3</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>6</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>3</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>2</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>1</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>5</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>4</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>1</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>2</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>7</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>5</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>4</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>Mouse (126,268)</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>4</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>5</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>3</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>1</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>2</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>6</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>2</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>1</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>3</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>4</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p><it>D. rerio</it> (41,264)</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>4</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>1</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>3</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>6</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>2</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>5</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>3</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>1</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>4</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>2</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>5</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p><it>C. elegans </it>(79,958)</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>6</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>3</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>2</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>4</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>8</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>3</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>5</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>5</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>1</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>6</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>2</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>7</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>1</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>4</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p><it>C. briggsae </it>(74,178)</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>1</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>3</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>2</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>2</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>1</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>4</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p><it>A. gambiae </it>(7,930)</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>2</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>1</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>1</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>3</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>2</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p><it>D. melanogaster </it>(48,933)</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>2</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>1</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>5</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>1</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>4</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>5</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>3</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>2</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>6</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>4</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>3</sub></p>
                  </c>
                  <c ca="left">
                     <p><it>A. mellifera </it>(45,426)</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>2</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>2</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>1</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>3</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>3</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>1</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p><it>A. thaliana </it>(109,900)</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p><it>S. pombe </it>(2,403)</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>1</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p><it>S. cerevisiae </it>(417)</p>
                  </c>
               </r>
            </tblbdy>
            <tblfn>
               <p>*Indices signify rank order of slope coefficients, separately for negative and positive trends. <sup>&#8224;</sup>L2, R2, S2 and L4, R4, S4 signify the two-fold and four-fold degenerate blocks of leucine, arginine, and serine, respectively. <sup>&#8225;</sup><it>S. cerevisiae </it>terminal exons were retained given the small number of genes with more than one intron (eight).</p>
            </tblfn>
         </tbl>
         <tbl id="T2">
            <title>
               <p>Table 2</p>
            </title>
            <caption>
               <p>Amino acids significantly preferred (-) or avoided (+) at 5' ends of exons across species</p>
            </caption>
            <tblbdy cols="24">
               <r>
                  <c cspan="23" ca="left">
                     <p>Amino acids<sup>*&#8224;</sup></p>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c cspan="23">
                     <hr/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>A</p>
                  </c>
                  <c ca="left">
                     <p>C</p>
                  </c>
                  <c ca="left">
                     <p>D</p>
                  </c>
                  <c ca="left">
                     <p>E</p>
                  </c>
                  <c ca="left">
                     <p>F</p>
                  </c>
                  <c ca="left">
                     <p>G</p>
                  </c>
                  <c ca="left">
                     <p>H</p>
                  </c>
                  <c ca="left">
                     <p>I</p>
                  </c>
                  <c ca="left">
                     <p>K</p>
                  </c>
                  <c ca="left">
                     <p>L4</p>
                  </c>
                  <c ca="left">
                     <p>L2</p>
                  </c>
                  <c ca="left">
                     <p>M</p>
                  </c>
                  <c ca="left">
                     <p>N</p>
                  </c>
                  <c ca="left">
                     <p>P</p>
                  </c>
                  <c ca="left">
                     <p>Q</p>
                  </c>
                  <c ca="left">
                     <p>R4</p>
                  </c>
                  <c ca="left">
                     <p>R2</p>
                  </c>
                  <c ca="left">
                     <p>S4</p>
                  </c>
                  <c ca="left">
                     <p>S2</p>
                  </c>
                  <c ca="left">
                     <p>T</p>
                  </c>
                  <c ca="left">
                     <p>V</p>
                  </c>
                  <c ca="left">
                     <p>W</p>
                  </c>
                  <c ca="left">
                     <p>Y</p>
                  </c>
                  <c ca="left">
                     <p>Species (number of exons)<sup>&#8225;</sup></p>
                  </c>
               </r>
               <r>
                  <c cspan="24">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>+<sub>2</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>4</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>5</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>7</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>3</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>1</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>2</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>8</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>6</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>1</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>4</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>3</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>7</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>5</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>6</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>Human (178,438)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>+<sub>2</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>4</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>5</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>7</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>3</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>1</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>2</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>7</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>6</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>1</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>4</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>3</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>5</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>6</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>Mouse (126,268)</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>2</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>1</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>2</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>3</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>1</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>5</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>4</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>3</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p><it>D. rerio</it> (41,264)</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>3</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>2</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>4</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>1</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>5</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>1</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>3</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>2</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>4</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>5</sub></p>
                  </c>
                  <c ca="left">
                     <p><it>C. elegans </it>(79,958)</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>5</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>4</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>3</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>2</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>2</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>5</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>1</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>1</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>3</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>4</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>6</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p><it>C. briggsae </it>(74,178)</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>1</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p><it>A. gambiae </it>(7,930)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>+<sub>1</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>3</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>1</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>3</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>2</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>2</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>4</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p><it>D. melanogaster </it>(48,933)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>+<sub>1</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>3</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>2</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>4</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>1</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>4</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>3</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>2</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>-<sub>5</sub></p>
                  </c>
                  <c ca="left">
                     <p>-<sub>6</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p><it>A. mellifera </it>(45,426)</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>+<sub>1</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>3</sub></p>
                  </c>
                  <c ca="left">
                     <p>+<sub>2</sub></p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p><it>A. thaliana </it>(109,900)</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p><it>S. pombe </it>(2,403)</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p><it>S. cerevisiae </it>(417)</p>
                  </c>
               </r>
            </tblbdy>
            <tblfn>
               <p>*Indices signify rank order of slope coefficients, separately for negative and positive trends. <sup>&#8224;</sup>L2, R2, S2 and L4, R4, S4 signify the two-fold and four-fold degenerate blocks of leucine, arginine, and serine, respectively. <sup>&#8225;</sup><it>S. cerevisiae </it>terminal exons were retained given the small number of genes with more than one intron (eight).</p>
            </tblfn>
         </tbl>
         <p>But are these trends a peculiarity of mammals or common in other taxa? Does the presence or absence of trends correspond to what is known about the significance of exonic splicing regulation in each species? For example, a recent survey of several eukaryote genomes showed the SR protein family to be greatly expanded in metazoans but scarcely represented in unicellular genomes <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>. A failure to find preference trends in <it>S. cerevisiae</it>, an organism lacking SR proteins <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>, might corroborate the hypothesis that preference patterns are indeed caused by ESEs. Moreover, if there are discernible trends in other species, do we repeatedly see the same amino acids avoided or preferred, or are trends largely unique to each species? Also, are mammals unusual in showing a tight correlation between 5' and 3' trends, and might divergent results have implications for the workings of the splicing machinery? Finally, do we find more skews in species that <it>a priori </it>are expected to have a harder time identifying exons, that is, those in which exons are relatively small islands in a sea of intronic sequence? Here we examine these issues with exon data from a diverse set of species.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>Preference trends are widespread in multicellular species</p>
            </st>
            <p>Exons from eight metazoan species (human (Hs), mouse (Mm), <it>Danio rerio </it>(Dr), <it>Caenorhabditis elegans </it>(Ce), <it>Caenorhabditis briggsae </it>(Cb), <it>Anopheles gambiae </it>(Ag), <it>Drosophila melanogaster </it>(Dm), <it>Apis mellifera </it>(Am)), one plant (<it>Arabidopsis thaliana</it> (At)) and two ascomycetous fungi (<it>S. cerevisiae </it>(Sc), <it>Schizosaccharomyces pombe </it>(Sp)) were examined for trends in amino acid composition as one approaches the exon-intron boundary. Species were chosen from among a relatively small set of organisms for which high quality comparative data on splice-regulatory proteins have recently become available <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>. As splice site signals can extend into exons and our focus is on exonic splicing regulation, we removed the first full codon at the exon-intron boundary (see Materials and methods). Thereafter, rank correlations (rho) between distance from the boundary (34 codons into the exon; see Materials and methods) and proportional usage of the amino acid were computed independently for 5' and 3' regions of exons. Further, for each amino acid independently, we fitted a linear regression and extracted the slope of the line as a crude diagnostic for the strength of amino acid preference/avoidance. Figure <figr fid="F1">1</figr> illustrates the different types of relationship observed.</p>
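            <p>The two diagnostics described above (a rank correlation and a regression slope of proportional usage against distance from the boundary) can be sketched as follows. This is a minimal illustration, not the code used for the analyses: the function names and the usage values are hypothetical, and the sketch assumes that proportional usage has already been tabulated for each of the 34 codon positions.</p>
            <p><![CDATA[
```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start != len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 != len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Hypothetical input: codon positions 1..34 counted from the boundary and
# invented proportional usage that rises with distance (an "avoided" residue).
distance = list(range(1, 35))
usage = [0.030 + 0.0005 * d for d in distance]

rho = spearman_rho(distance, usage)
slope = ols_slope(distance, usage)
print(f"rho = {rho:.2f}, slope = {slope:.4f}")
```
]]></p>
            <p>Under this convention, a positive rho or slope corresponds to avoidance near the boundary and a negative value to preference, since abundance is expressed as a function of distance from the boundary.</p>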
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Nature and diversity of amino acid abundance trends near exon-intron boundaries</p>
               </caption>
               <text>
                  <p>Nature and diversity of amino acid abundance trends near exon-intron boundaries. Relative abundance of glutamine (Q), methionine (M), and lysine (K) as a function of distance from the boundary across 5' ends of <it>D. melanogaster </it>exons is shown. Glutamine is significantly avoided near the boundary (rho = 0.86, <it>P</it> &lt; 1.84E-7), lysine is preferred (rho = -0.65, <it>P </it>&lt; 6.2E-5), whilst no significant trend is evident for methionine (rho = 0.096, <it>P</it> = 0.59). Note that a negative slope/rho value indicates a preference near the exon-intron boundary. Typically, where patterns of preference/avoidance are evident, we observe quasi-monotonic decreases/increases in relative abundance across the sequence range analyzed.</p>
               </text>
               <graphic file="gb-2008-9-2-r29-1"/>
            </fig>
            <p>Two-fold and four-fold blocks of the six-fold degenerate amino acids were considered as distinct groupings so that a total of 46 tests (23 amino acid groups 5' and 3') were carried out for each species. Tables <tblr tid="T1">1</tblr> and <tblr tid="T2">2</tblr> give a comprehensive by-species overview of amino acid preferences/avoidances, significant after Bonferroni correction (N = 46 comparisons, <it>P </it>&lt; 0.0011). Additional data file 1 contains the complete set of rank correlations for all 11 species.</p>
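            <p>The test count and significance threshold quoted above can be reproduced with a short calculation. The sketch below is illustrative only: it assumes a family-wise error rate of 0.05, which is not stated explicitly but is consistent with the quoted threshold, and it treats the <it>P </it>value bounds from the Figure 1 legend as point values.</p>
            <p><![CDATA[
```python
# The 23 amino acid groupings: six-fold degenerate L, R and S are split into
# their two-fold (L2, R2, S2) and four-fold (L4, R4, S4) codon blocks.
GROUPS = ["A", "C", "D", "E", "F", "G", "H", "I", "K", "L4", "L2", "M",
          "N", "P", "Q", "R4", "R2", "S4", "S2", "T", "V", "W", "Y"]
EXON_ENDS = ["5'", "3'"]

ALPHA = 0.05  # assumed family-wise error rate (not stated in the text)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


n_tests = len(GROUPS) * len(EXON_ENDS)
threshold = bonferroni_threshold(ALPHA, n_tests)

# Illustrative P values: the bounds quoted in the Figure 1 legend for the
# 5' ends of D. melanogaster exons, treated here as point values.
p_values = {"Q": 1.84e-7, "K": 6.2e-5, "M": 0.59}
significant = sorted(aa for aa, p in p_values.items() if threshold > p)

print(n_tests, round(threshold, 4), significant)
```
]]></p>
            <p>With these inputs the calculation gives 46 tests and a per-test threshold of approximately 0.0011, under which glutamine and lysine pass and methionine does not.</p>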
            <p>The most conspicuous feature of Tables <tblr tid="T1">1</tblr> and <tblr tid="T2">2</tblr> is arguably the commonality of trends in the metazoa and the scarcity of trends in the ascomycetous yeast species. The two-fold block of leucine (L2) in <it>S. cerevisiae </it>is the only amino acid grouping exhibiting a significant preference trend (rho = -0.4482, <it>P </it>&lt; 0.0003). This is in stark contrast to the suite of multicellular eukaryotes where an extensive range of avoidance and preference trends is observed. Only three multicellular species display fewer than 13 significant trends (Dm, Ag, At) whereas five (Hs, Mm, Ce, Cb, Am) display more than 20. For <it>D. melanogaster </it>and <it>C. elegans</it>, we tested whether the results might be biased as a result of exon homology, but in either case found amino acid abundance patterns at exon ends to be virtually identical in a set of homology-reduced genes (Dm, N = 8,840; Ce, N = 11,790; Additional data files 2 and 3).</p>
            <p>The role of exonic guidance in splicing organization has been linked to multiple aspects of genome composition and pre-mRNA structure, including intron/exon length <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr></abbrgrp>, intron number <abbrgrp><abbr bid="B7">7</abbr></abbrgrp> and density <abbrgrp><abbr bid="B17">17</abbr></abbrgrp> and splice site information content <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr></abbrgrp>. The number of significant amino acid trends per species tightly covaries with some of these factors, notably the mean number of introns per gene (rho = 0.95, <it>P </it>&lt; 0.0001), median coding sequence (median CDS) per gene (rho = -0.97, <it>P </it>&lt; 0.0007), genomic number of introns (rho = 0.86, <it>P </it>&lt; 0.003), and intron length (log10(mean length): rho = 0.83, <it>P </it>&lt; 0.006) as expected under a model where complex transcripts with multiple long introns elicit increasing reliance on exon definition <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. On the other hand, neither SR protein family size (rho = 0.59, <it>P </it>= 0.09) nor splice site information content (5', rho = -0.26, <it>P </it>= 0.50; 3', rho = 0.43, <it>P </it>= 0.25) shows a significant relationship with the number of amino acid skews near intron-exon boundaries. The latter observation is perhaps the more interesting as it suggests that there is no straightforward compensatory relationship between splice site information content and the need for exonic regulation across species.</p>
            <p>Finally, the number of exons from which amino acid trends were derived, although correlated with the number of trends (rho = 0.86, <it>P </it>&lt; 0.003), does not feature among the top predictors when multicollinearity is controlled for (Additional data files 4-6). Together with the observation that we find relatively few trends in <it>Arabidopsis</it>, despite the substantial number of exons sampled, this suggests that sample size is not the critical factor in detecting different numbers of trends across species. We must stress, however, that the above results should be regarded as strictly exploratory given the small number of observations (Additional data file 4). A greater number of species with more comprehensive phylogenetic sampling will be required to validate the results in the future.</p>
            <p>The preeminence of exon-intron structure in predicting the number of amino acid trends suggests that the intron-poor ascomycetous fungi analyzed here might not be representative of their kingdom. We therefore analyzed the composition of exon ends in <it>Cryptococcus neoformans </it>(Cn), an intron-rich basidiomycete. Strikingly, we find a large number (26) of preference and avoidance trends in this species (Table <tblr tid="T3">3</tblr> and Additional data file 1), with some marked similarities to metazoan trends, particularly at 5' ends. Furthermore, the inclusion of <it>C. neoformans </it>data in the analysis of potential predictor variables does not substantially change previous results: the mean number of introns per gene (rho = 0.91, <it>P </it>&lt; 0.0002), median CDS per gene (rho = -0.68, <it>P </it>&lt; 0.032) and the genomic number of introns (rho = 0.72, <it>P </it>&lt; 0.02) remain strong predictors (Additional data file 6).</p>
            <tbl id="T3">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>Amino acids significantly preferred (-) or avoided (+) at 3' (top rows) and 5' (bottom rows) exon ends of <it>C. neoformans </it>compared to human</p>
               </caption>
               <tblbdy cols="24">
                  <r>
                     <c cspan="23" ca="left">
                        <p>Amino acids<sup>*&#8224;</sup></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="23">
                        <hr/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>G</p>
                     </c>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>I</p>
                     </c>
                     <c ca="left">
                        <p>K</p>
                     </c>
                     <c ca="left">
                        <p>L4</p>
                     </c>
                     <c ca="left">
                        <p>L2</p>
                     </c>
                     <c ca="left">
                        <p>M</p>
                     </c>
                     <c ca="left">
                        <p>N</p>
                     </c>
                     <c ca="left">
                        <p>P</p>
                     </c>
                     <c ca="left">
                        <p>Q</p>
                     </c>
                     <c ca="left">
                        <p>R4</p>
                     </c>
                     <c ca="left">
                        <p>R2</p>
                     </c>
                     <c ca="left">
                        <p>S4</p>
                     </c>
                     <c ca="left">
                        <p>S2</p>
                     </c>
                     <c ca="left">
                        <p>T</p>
                     </c>
                     <c ca="left">
                        <p>V</p>
                     </c>
                     <c ca="left">
                        <p>W</p>
                     </c>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="left">
                        <p>Species (number of exons)</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="24">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>+<sub>3</sub></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>-<sub>7</sub></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>-<sub>3</sub></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>-<sub>2</sub></p>
                     </c>
                     <c ca="left">
                        <p>-<sub>1</sub></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>-<sub>5</sub></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>-<sub>6</sub></p>
                     </c>
                     <c ca="left">
                        <p>+<sub>2</sub></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>+<sub>1</sub></p>
                     </c>
                     <c ca="left">
                        <p>-<sub>4</sub></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>+<sub>4</sub></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Human (178,438): 3'</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>-<sub>6</sub></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>+<sub>1</sub></p>
                     </c>
                     <c ca="left">
                        <p>-<sub>2</sub></p>
                     </c>
                     <c ca="left">
                        <p>+<sub>4</sub></p>
                     </c>
                     <c ca="left">
                        <p>-<sub>7</sub></p>
                     </c>
                     <c ca="left">
                        <p>-<sub>1</sub></p>
                     </c>
                     <c ca="left">
                        <p>+<sub>3</sub></p>
                     </c>
                     <c ca="left">
                        <p>-<sub>3</sub></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>+<sub>6</sub></p>
                     </c>
                     <c ca="left">
                        <p>-<sub>5</sub></p>
                     </c>
                     <c ca="left">
                        <p>+<sub>2</sub></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>+<sub>5</sub></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>-<sub>4</sub></p>
                     </c>
                     <c ca="left">
                        <p><it>C. neoformans </it>(28,446): 3'</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>+<sub>2</sub></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>-<sub>4</sub></p>
                     </c>
                     <c ca="left">
                        <p>-<sub>5</sub></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>+<sub>7</sub></p>
                     </c>
                     <c ca="left">
                        <p>-<sub>3</sub></p>
                     </c>
                     <c ca="left">
                        <p>-<sub>1</sub></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>-<sub>2</sub></p>
                     </c>
                     <c ca="left">
                        <p>-<sub>8</sub></p>
                     </c>
                     <c ca="left">
                        <p>-<sub>6</sub></p>
                     </c>
                     <c ca="left">
                        <p>+<sub>1</sub></p>
                     </c>
                     <c ca="left">
                        <p>+<sub>4</sub></p>
                     </c>
                     <c ca="left">
                        <p>+<sub>3</sub></p>
                     </c>
                     <c ca="left">
                        <p>-<sub>7</sub></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>+<sub>5</sub></p>
                     </c>
                     <c ca="left">
                        <p>+<sub>6</sub></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Human (178,438): 5'</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>-<sub>9</sub></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>-<sub>5</sub></p>
                     </c>
                     <c ca="left">
                        <p>-<sub>7</sub></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>-<sub>3</sub></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>-<sub>1</sub></p>
                     </c>
                     <c ca="left">
                        <p>-<sub>6</sub></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>+<sub>2</sub></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>+<sub>3</sub></p>
                     </c>
                     <c ca="left">
                        <p>-<sub>4</sub></p>
                     </c>
                     <c ca="left">
                        <p>+<sub>1</sub></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>-<sub>8</sub></p>
                     </c>
                     <c ca="left">
                        <p>-<sub>10</sub></p>
                     </c>
                     <c ca="left">
                        <p>-<sub>2</sub></p>
                     </c>
                     <c ca="left">
                        <p><it>C. neoformans </it>(28,446): 5'</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>*Indices signify rank order of slope coefficients, separately for negative and positive trends. <sup>&#8224;</sup>L2, R2, S2 and L4, R4, S4 signify the two-fold and four-fold degenerate blocks of leucine, arginine, and serine, respectively.</p>
               </tblfn>
            </tbl>
            <p>Virtually nothing is known about the splicing mechanism in <it>C. neoformans </it>but the demonstration of alternative splicing pathways in this species <abbrgrp><abbr bid="B20">20</abbr></abbrgrp> as well as low splice site information content (Additional data file 5) <abbrgrp><abbr bid="B7">7</abbr></abbrgrp> make the presence of exonic splicing regulation a credible possibility. Consistent with this, the predicted <it>C. neoformans </it>proteome contains multiple proteins resembling known eukaryotic SR proteins, particularly in that they harbor RNA recognition domains (Additional data file 7). This is suggestive of involvement in splicing, albeit evidently insufficient to reach conclusions about specific functional roles of these proteins.</p>
         </sec>
         <sec>
            <st>
               <p>Cross-species patterns</p>
            </st>
            <p>Whilst the spectra of amino acids preferred/avoided by individual species are ultimately unique in breadth (how many trends) and composition (which amino acids are affected), there is considerable cross-specific overlap in terms of whether a particular trend is present at all, its direction, and relative strength (as measured by the slope of the line of best fit). Tables <tblr tid="T1">1</tblr> and <tblr tid="T2">2</tblr> illustrate that this particular agreement is virtually perfect between human and mouse <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>, with marginal differences in the relative strength of individual trends, and that directionality is conserved throughout. Considering zebrafish (Dr) as the only other vertebrate in our sample alongside these species, we notice that its spectrum is slightly diminished in breadth and contains a few trends not seen in the two mammals (G (3'), V (5',3')). However, overall concordance in composition and strength is still remarkably good, and the 'mammalian pattern of directionality' is perfectly adhered to. The nematode pair almost matches the human-mouse dyad in terms of overall concordance of preference patterns, with directionality perfectly conserved.</p>
            <p>For the most part, the patterns of preference/avoidance are repeatable across species. Table <tblr tid="T4">4</tblr> shows pairwise comparisons between species giving rank correlations (rho) for the slopes derived from all 23 amino acid groupings. For the vertebrate group, both 5' and 3' correlations are very high (all rho > 0.9, all <it>P </it>&lt; 1.81E-06; 90 tests, significance threshold, <it>P </it>&lt; 5.56E-04), with human and mouse in almost perfect agreement. More remarkably, however, some strong correlations also exist 3' between the vertebrates and, for example, <it>Anopheles </it>(all rho > 0.87, all <it>P </it>&lt; 2.94E-06) and <it>Drosophila </it>(all rho > 0.75, all <it>P </it>&lt; 2.9E-05). The 3' correlations are less impressive for the remaining species (Am, At, Cn) but <it>Apis </it>(all rho > 0.75, all <it>P </it>&lt; 4.11E-05) and even <it>Cryptococcus </it>(all rho > 0.69, all <it>P </it>&lt; 5.56E-04) boast remarkably strong 5' correlations with the vertebrates. Focusing on specific amino acid trends, isoleucine (I) stands out in that it is strongly preferred near 3' boundaries across all species; other trends are well represented, albeit not universal, throughout the phylogeny - for example, 5' avoidance of glutamine (Q), and 3' preference for phenylalanine (F).</p>
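            <p>The figure of 90 tests and the associated threshold follow from the number of species pairs compared at each exon end. The short sketch below reconstructs that arithmetic; it assumes a family-wise error rate of 0.05 (not stated explicitly) and uses the ten species abbreviations listed in Table 4.</p>
            <p><![CDATA[
```python
from itertools import combinations

# Species abbreviations as they appear in Table 4.
SPECIES = ["Hs", "Mm", "Dr", "Ce", "Cb", "Ag", "Dm", "Am", "At", "Cn"]
ALPHA = 0.05  # assumed family-wise error rate (not stated in the text)

pairs = list(combinations(SPECIES, 2))   # unordered species pairs
n_tests = len(pairs) * 2                 # each pair is tested 5' and 3'
threshold = ALPHA / n_tests

print(len(pairs), n_tests, f"{threshold:.2e}")
```
]]></p>
            <p>Ten species give 45 unordered pairs, hence 90 tests across both exon ends and a Bonferroni threshold of about 5.56E-04.</p>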
            <tbl id="T4">
               <title>
                  <p>Table 4</p>
               </title>
               <caption>
                  <p>Cross-species correlations of preference slope coefficients considering all 23 amino acid groupings, 5' (bottom-left) and 3' (top-right)<sup>*&#8224;</sup></p>
               </caption>
               <tblbdy cols="11">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Hs</p>
                     </c>
                     <c ca="left">
                        <p>Mm</p>
                     </c>
                     <c ca="left">
                        <p>Dr</p>
                     </c>
                     <c ca="left">
                        <p>Ce</p>
                     </c>
                     <c ca="left">
                        <p>Cb</p>
                     </c>
                     <c ca="left">
                        <p>Ag</p>
                     </c>
                     <c ca="left">
                        <p>Dm</p>
                     </c>
                     <c ca="left">
                        <p>Am</p>
                     </c>
                     <c ca="left">
                        <p>At</p>
                     </c>
                     <c ca="left">
                        <p>Cn</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Hs</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>0.99<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.93<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.71<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.67<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.88<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.84<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.53<sup>+</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.11</p>
                     </c>
                     <c ca="left">
                        <p>0.08</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mm</p>
                     </c>
                     <c ca="left">
                        <p>0.99<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>0.92<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.69<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.67<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.88<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.85<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.60<sup>+</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.20</p>
                     </c>
                     <c ca="left">
                        <p>0.15</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Dr</p>
                     </c>
                     <c ca="left">
                        <p>0.92<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.90<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>0.74<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.71<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.87<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.77<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.48<sup>+</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.16</p>
                     </c>
                     <c ca="left">
                        <p>0.14</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Ce</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>-0.43<sup>+</sup></b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>-0.39</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>-0.40</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>0.98<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.84<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.72<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.37</p>
                     </c>
                     <c ca="left">
                        <p>0.24</p>
                     </c>
                     <c ca="left">
                        <p>0.16</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Cb</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>-0.60<sup>+</sup></b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>-0.56</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>-0.65<sup>+</sup></b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>0.78<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>0.82<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.71<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.34</p>
                     </c>
                     <c ca="left">
                        <p>0.21</p>
                     </c>
                     <c ca="left">
                        <p>0.17</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Ag</p>
                     </c>
                     <c ca="left">
                        <p>0.62<sup>+</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.60<sup>+</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.61<sup>+</sup></p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>-0.26</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>0.89<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.50<sup>+</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.18</p>
                     </c>
                     <c ca="left">
                        <p>0.18</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Dm</p>
                     </c>
                     <c ca="left">
                        <p>0.64<sup>+</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.61<sup>+</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.51<sup>+</sup></p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>-0.04</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>-0.14</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>0.64<sup>+</sup></p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>0.57<sup>+</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.21</p>
                     </c>
                     <c ca="left">
                        <p>0.15</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Am</p>
                     </c>
                     <c ca="left">
                        <p>0.76<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.79<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.77<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>-0.32</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>-0.41</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>0.48<sup>+</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.46<sup>+</sup></p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>0.66<sup>+</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.55<sup>+</sup></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>At</p>
                     </c>
                     <c ca="left">
                        <p>0.44<sup>+</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.44<sup>+</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.50<sup>+</sup></p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>-0.36</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>-0.36</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>0.06</p>
                     </c>
                     <c ca="left">
                        <p>0.19</p>
                     </c>
                     <c ca="left">
                        <p>0.40</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>0.75<sup>++</sup></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Cn</p>
                     </c>
                     <c ca="left">
                        <p>0.72<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.69<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.75<sup>++</sup></p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>-0.31</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>-0.53<sup>+</sup></b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>0.39</p>
                     </c>
                     <c ca="left">
                        <p>0.21</p>
                     </c>
                     <c ca="left">
                        <p>0.54<sup>+</sup></p>
                     </c>
                     <c ca="left">
                        <p>0.52<sup>+</sup></p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>*<it>S. pombe </it>and <it>S. cerevisiae </it>omitted for clarity given the absence of significant correlations. <sup>&#8224;</sup>Negative correlations in bold. +, significant at <it>P </it>= 0.05; ++, significant at <it>P </it>= 0.05/90 = 5.56E-04 (N = 90 tests).</p>
               </tblfn>
            </tbl>
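            <p>The two significance tiers in the footnote of the preceding table follow from a Bonferroni correction and can be reproduced with a short Python sketch. The function names are hypothetical, and treating the 90 tests as the 10 x 9 off-diagonal cells of the correlation matrix is an assumption that merely matches the stated N; it is not taken from the analysis code used for this study.</p>

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def significance_marker(p_value, alpha=0.05, n_tests=90):
    """Return '++', '+' or '' following the footnote convention of the table."""
    if bonferroni_threshold(alpha, n_tests) >= p_value:
        return "++"   # significant after correction for multiple testing
    if alpha >= p_value:
        return "+"    # significant at the nominal level only
    return ""


# Assumption: one test per off-diagonal cell of a 10-species matrix.
n_species = 10
n_tests = n_species * (n_species - 1)
threshold = bonferroni_threshold(0.05, n_tests)
print(n_tests, f"{threshold:.2E}", significance_marker(0.01))
```

            <p>Whether the boundary value itself counts as significant is not specified in the footnote; the sketch treats it as significant.</p>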
         </sec>
         <sec>
            <st>
               <p>Deviant nematodes</p>
            </st>
            <p>The strong cross-species concordance in preference patterns makes one observation all the more striking. The nematode 5' spectra behave in a highly counterintuitive manner in that the 'mammalian pattern of directionality' is violated on several occasions: among the amino acids with significant trends in both nematodes and other species (E, K, L2, Q, R4, R2, T), all but glutamine (Q) show discrepant directionality (Table <tblr tid="T2">2</tblr>). For example, whereas lysine (K) is strongly preferred near boundaries in vertebrates and some insects (Dm, Am), it appears to be strongly avoided in the 5' region of nematode exons (Figure <figr fid="F2">2</figr>). Table <tblr tid="T4">4</tblr> also underlines the exceptional position of nematodes: 5' correlations between nematodes and any other species are pervasively negative. No single correlation across all amino acids is significantly different from zero when the adjusted significance threshold is applied (<it>P </it>&lt; 5.56E-04), owing to several trends collapsing into insignificance rather than fully reversing sign. The pervasiveness of this pattern is nonetheless noteworthy, especially considering that the same is not the case for the 3' spectra, where we find coherent agreement between nematodes and vertebrates (minimum rho > 0.65, all significant at <it>P </it>&lt; 5.92E-04) and where only the two-fold block of serine (S2) shows a reverse pattern of directionality among the significant trends for individual amino acids.</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Relative amino acid abundance of lysine (K) at 5' ends of exons in six species</p>
               </caption>
               <text>
                  <p>Relative amino acid abundance of lysine (K) at 5' ends of exons in six species. Proportional usage of lysine <it>vis-&#224;-vis </it>all other amino acids is plotted against distance from the exon-intron boundary measured in amino acids. Variable degrees of preference for lysine near the boundary are evident for non-nematode species (Am, rho = -0.67, <it>P </it>= 2.71E-05, &#946;(slope) = -0.017; Dr, rho = -0.79, <it>P </it>= 6.51E-07, &#946; = -0.035; Dm, rho = -0.65, <it>P </it>= 6.11E-05, &#946; = -0.020; Hs, rho = -0.90, <it>P </it>= 3.67E-09, &#946; = -0.041) whereas nematodes show strong avoidance trends (Ce, rho = 0.89, <it>P </it>= 5.26E-08, &#946; = 0.030; Cb, rho = 0.92, <it>P </it>= 0, &#946; = 0.033).</p>
               </text>
               <graphic file="gb-2008-9-2-r29-2"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Many species obey an approximately symmetric pattern of preference trends 5' and 3'</p>
            </st>
            <p>This curious discrepancy between 5' and 3' spectra of amino acid trends in nematodes led us to investigate further the relationship of 5' and 3' patterns across species. Considering all amino acid trends simultaneously, we computed rank correlations between slope coefficients (5'~3'). Furthermore, we wanted to test explicitly the hypothesis that preference trends show a 'symmetric' behavior, that is, that individual amino acids exhibit preference trends of similar strength and direction at 5' and 3' ends. To this end, we carried out standardized major axis regressions (SMA; see Materials and methods) <abbrgrp><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr></abbrgrp> for 5' versus 3' trends in each species and compared the resulting regression line with the one expected under perfect symmetry (y = x). The results are given in Table <tblr tid="T5">5</tblr> and graphically represented in Figure <figr fid="F3">3</figr>. Human and mouse show very substantial positive correlations between 5' and 3' preference trends (Hs, rho = 0.8528, <it>P </it>= 1.96E-06; Mm, rho = 0.8626, <it>P </it>= 2.28E-06). Although diminished in strength, the correlations for <it>Drosophila </it>and <it>Danio </it>are also significant. As expected from the previous analysis, correlations for nematodes are negative, albeit not significantly so after correction for multiple testing (Ce, rho = -0.1413, <it>P </it>= 0.5185; Cb, rho = -0.4358, <it>P </it>= 0.0388). However, the SMA results allow us to reject any notion of <it>C. elegans </it>or <it>C. briggsae </it>adhering to a symmetric pattern of amino acid usage, the respective 95% confidence intervals (CIs) ruling out a symmetry slope of &#946; = 1 (CI (Ce), [-1.69; -0.73]; CI (Cb), [-1.09; -0.51]). No other species for which an SMA could be carried out (Table <tblr tid="T5">5</tblr>; Materials and methods) deviates significantly from a symmetric model, although symmetry of amino acid trends varies greatly and can only really be called a defining characteristic of exon ends in vertebrates.</p>
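            <p>The symmetry test can be illustrated with a minimal Python sketch. It assumes that the SMA slope of a line forced through the origin is obtained from uncentred sums of squares, and it substitutes a percentile bootstrap for the confidence interval, which is a stand-in chosen for self-containedness and not necessarily the interval method behind Table 5. All function names and the toy slopes are hypothetical.</p>

```python
import math
import random


def sma_slope_through_origin(x, y):
    """SMA slope with the line forced through the origin.

    Uses uncentred sums: sign(sum(x*y)) * sqrt(sum(y^2) / sum(x^2)).
    """
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    return math.copysign(math.sqrt(syy / sxx), sxy)


def bootstrap_ci(x, y, n_boot=2000, level=0.95, seed=1):
    """Percentile bootstrap interval for the slope (illustrative stand-in)."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        slopes.append(sma_slope_through_origin([x[i] for i in idx],
                                               [y[i] for i in idx]))
    slopes.sort()
    lower = slopes[int((1 - level) / 2 * n_boot)]
    upper = slopes[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


def consistent_with_symmetry(ci, expected=1.0):
    """True when the interval contains the slope expected under y = x."""
    lower, upper = ci
    return upper >= expected >= lower


# Hypothetical 5' and 3' preference slopes for six groupings (not real data).
slopes_5p = [0.041, -0.030, 0.022, -0.015, 0.035, -0.026]
slopes_3p = [0.038, -0.033, 0.019, -0.011, 0.040, -0.024]
beta = sma_slope_through_origin(slopes_5p, slopes_3p)
ci = bootstrap_ci(slopes_5p, slopes_3p)
print(round(beta, 2), consistent_with_symmetry(ci))
```

            <p>Applied to the intervals reported in Table 5, <it>consistent_with_symmetry((0.83, 1.29))</it> holds for human, whereas <it>consistent_with_symmetry((-1.69, -0.73))</it> does not for <it>C. elegans</it>.</p>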
            <tbl id="T5">
               <title>
                  <p>Table 5</p>
               </title>
               <caption>
                  <p>Intraspecific 5'~3' correlations of preference slopes for all 23 amino acid groupings</p>
               </caption>
               <tblbdy cols="6">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c cspan="3" ca="center">
                        <p>SMA</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Rho</p>
                     </c>
                     <c ca="center">
                        <p><it>P</it>-value*</p>
                     </c>
                     <c ca="center">
                        <p>Slope (&#946;)</p>
                     </c>
                     <c ca="center">
                        <p>Lower CI<sup>&#8224;</sup></p>
                     </c>
                     <c ca="center">
                        <p>Upper CI<sup>&#8224;</sup></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Human</p>
                     </c>
                     <c ca="center">
                        <p>0.85</p>
                     </c>
                     <c ca="center">
                        <p>1.96E-06</p>
                     </c>
                     <c ca="center">
                        <p>1.04</p>
                     </c>
                     <c ca="center">
                        <p>0.83</p>
                     </c>
                     <c ca="center">
                        <p>1.29</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mouse</p>
                     </c>
                     <c ca="center">
                        <p>0.86</p>
                     </c>
                     <c ca="center">
                        <p>2.28E-06</p>
                     </c>
                     <c ca="center">
                        <p>0.99</p>
                     </c>
                     <c ca="center">
                        <p>0.80</p>
                     </c>
                     <c ca="center">
                        <p>1.23</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>D. rerio</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0.66</p>
                     </c>
                     <c ca="center">
                        <p>8.3E-04</p>
                     </c>
                     <c ca="center">
                        <p>1.04</p>
                     </c>
                     <c ca="center">
                        <p>0.78</p>
                     </c>
                     <c ca="center">
                        <p>1.40</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>C. elegans</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>-0.14</p>
                     </c>
                     <c ca="center">
                        <p>0.52</p>
                     </c>
                     <c ca="center">
                        <p>-1.11</p>
                     </c>
                     <c ca="center">
                        <p>-0.73</p>
                     </c>
                     <c ca="center">
                        <p>-1.69</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>C. briggsae</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>-0.44</p>
                     </c>
                     <c ca="center">
                        <p>0.04</p>
                     </c>
                     <c ca="center">
                        <p>-0.75</p>
                     </c>
                     <c ca="center">
                        <p>-0.51</p>
                     </c>
                     <c ca="center">
                        <p>-1.09</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>A. gambiae</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0.57</p>
                     </c>
                     <c ca="center">
                        <p>5.16E-03</p>
                     </c>
                     <c ca="center">
                        <p>1.08</p>
                     </c>
                     <c ca="center">
                        <p>0.79</p>
                     </c>
                     <c ca="center">
                        <p>1.48</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>D. melanogaster</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0.61</p>
                     </c>
                     <c ca="center">
                        <p>2.49E-03</p>
                     </c>
                     <c ca="center">
                        <p>1.15</p>
                     </c>
                     <c ca="center">
                        <p>0.82</p>
                     </c>
                     <c ca="center">
                        <p>1.62</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>A. mellifera</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0.39</p>
                     </c>
                     <c ca="center">
                        <p>0.06</p>
                     </c>
                     <c ca="center">
                        <p>1.32</p>
                     </c>
                     <c ca="center">
                        <p>0.88</p>
                     </c>
                     <c ca="center">
                        <p>1.96</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>A. thaliana</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>-0.22</p>
                     </c>
                     <c ca="center">
                        <p>0.30</p>
                     </c>
                     <c ca="center">
                        <p>NA<sup>&#8225;</sup></p>
                     </c>
                     <c ca="center">
                        <p>NA<sup>&#8225;</sup></p>
                     </c>
                     <c ca="center">
                        <p>NA<sup>&#8225;</sup></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>S. pombe</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0.22</p>
                     </c>
                     <c ca="center">
                        <p>0.31</p>
                     </c>
                     <c ca="center">
                        <p>0.77</p>
                     </c>
                     <c ca="center">
                        <p>0.50</p>
                     </c>
                     <c ca="center">
                        <p>1.17</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>S. cerevisiae</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0.16</p>
                     </c>
                     <c ca="center">
                        <p>0.46</p>
                     </c>
                     <c ca="center">
                        <p>2.42<sup>&#167;</sup></p>
                     </c>
                     <c ca="center">
                        <p>1.58</p>
                     </c>
                     <c ca="center">
                        <p>3.70</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>C. neoformans</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0.02</p>
                     </c>
                     <c ca="center">
                        <p>0.92</p>
                     </c>
                     <c ca="center">
                        <p>NA<sup>&#8225;</sup></p>
                     </c>
                     <c ca="center">
                        <p>NA<sup>&#8225;</sup></p>
                     </c>
                     <c ca="center">
                        <p>NA<sup>&#8225;</sup></p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>*With 12 species, significance is indicated by <it>P </it>= 0.05/12 = 4.17E-03. <sup>&#8224;</sup>95% confidence interval (CI); the regression line was forced through the origin. <sup>&#8225;</sup>See Materials and methods. <sup>&#167;</sup>Adequacy of SMA regression analysis is seriously in doubt for <it>S. cerevisiae </it>because the assumption of normally distributed residuals is strongly violated. NA, not available.</p>
               </tblfn>
            </tbl>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Variable symmetry in amino acid abundance trends comparing 5' and 3' exon ends within species</p>
               </caption>
               <text>
                  <p>Variable symmetry in amino acid abundance trends comparing 5' and 3' exon ends within species. Intraspecific correlations between the 5' (x-axis) and 3' (y-axis) slopes as extracted from individually fitted linear models considering all 23 amino acid groupings are shown. Approximately symmetric arrangements are particularly evident for some species (notably vertebrates) whereas nematode arrangements (Ce, Cb) are not symmetric. Further notable is the higher variability of slope coefficients in some species (vertebrates and nematodes) <it>vis-&#224;-vis </it>others (Am, At). Amino acids are represented by their one letter code (two-fold blocks are denoted by '2'). The regression lines are from SMA regressions. Lines were not fitted for <it>Arabidopsis</it>, <it>Cryptococcus </it>and <it>S. cerevisiae </it>given concerns about the adequacy of this technique for these datasets (see Materials and methods). For associated statistics consult Table 5.</p>
               </text>
               <graphic file="gb-2008-9-2-r29-3"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Amino acid trends are largely consistent with participation in ESE motifs</p>
            </st>
            <p>Intriguingly, asymmetries in the amino acid composition of nematode exon ends appear to be mirrored by a corresponding asymmetry of regulatory motifs. Robinson <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>, using a computational approach to characterize candidate ESEs in <it>C. elegans</it>, found that 5' and 3' ends were distinguished by different classes of consensus motifs. Crucially, he found purine-rich human-like candidate motifs to be associated with 3' ends but not 5' ends of nematode exons, which is broadly consistent with our observation that amino acids encoded by purine-rich codons tend to be, in contrast to other animals, disfavored at 5' ends (Table <tblr tid="T2">2</tblr> and Figure <figr fid="F3">3</figr>).</p>
            <p>For mammals, the prediction that amino acids preferred near boundaries should correspond to those favored in ESEs was tested by Parmley <it>et al</it>. <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. The authors defined a metric that quantifies the involvement of amino acids in splice enhancer hexamers relative to the null expectation that every codon is represented in ESEs at approximately its genomic frequency. As predicted, these hexamer preference indices (HPIs), computed for each amino acid grouping, were found to correlate with preference trends, with strongly preferred amino acids associated, on average, with higher HPI values.</p>
            <p>This relationship holds true for human as well as murine ESE sets and amino acid trends, considering either rank correlation coefficients (rho<sub>x</sub>; Hs HPI~rho<sub>x</sub>, rho = -0.54, <it>P </it>&lt; 0.00001, N = 46; Mm HPI~rho<sub>x</sub>, rho = -0.49, <it>P </it>= 0.0005, N = 46) or the slope (&#946;) of the fitted linear model (Hs HPI~&#946;, rho = -0.57, <it>P </it>&lt; 0.0001, N = 46; Mm HPI~&#946;, rho = -0.52, <it>P </it>= 0.0002, N = 46).</p>
            <p>As expected from the demonstration that ESEs can act at varying distances from the splice site <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>, human ESEs do not exhibit a reading frame bias beyond what is expected from the genomic frequencies of the underlying codons (Additional data file 8). They can also, in principle, incorporate most codons (Additional data file 8). In consequence, the defined set of amino acids that we find avoided or preferred probably reflects not the outright exclusion of certain codons but differences in efficacy and specificity across ESEs, which mean that often only a well-defined subset of codons can be used to specify the desired ESE.</p>
            <p>Unexpectedly, when we derived HPIs for zebrafish amino acids, using a set of ESEs obtained from the same source <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>, we found a significant correlation of reverse sign (Dr HPI~rho<sub>x </sub>(5'), rho = 0.6, <it>P </it>&lt; 0.003, N = 46; Dr HPI~rho<sub>x </sub>(3'), rho = 0.59, <it>P </it>&lt; 0.0033, N = 46). Many experimentally verified ESEs have been characterized as A-rich and C-poor relative to the background frequency of these nucleotides in coding sequence. Whilst we found this to be the case for putative human ESE motifs not shared with zebrafish (A, 47.38% (ESE) versus 25.57% (exonic); C, 15.28% versus 25.99%, N(ESE) = 204), and for ESEs present in both species (A, 50% versus 25.57%; C, 6.37% versus 25.99%, N = 34), unique zebrafish ESEs (that is, ESEs not present in human) from this dataset were unusually enriched in C (39.47% versus 25.99%, N = 288) and relatively poor in A (18.40% versus 25.57%). Although one would expect ESE motifs to vary across taxa, the discrepancies are so pronounced as to sit awkwardly next to the substantial similarities in amino acid trends (Tables <tblr tid="T1">1</tblr> and <tblr tid="T2">2</tblr>). One criterion used by the Burge group <abbrgrp><abbr bid="B25">25</abbr></abbrgrp> to identify candidate ESE motifs was for such motifs to be more common near weak than near strong splice sites. Therefore, one possible explanation is that C-richness is a characteristic of zebrafish ESEs near weak splice sites but not generally, so that the predicted ESEs are not representative of ESEs across the zebrafish genome. Alternatively, the comparatively low quality of the then-recent zebrafish genome build might be responsible for the divergent results. A re-examination of these putative zebrafish ESEs with an updated genome build may be worthwhile.</p>
         </sec>
         <sec>
            <st>
               <p>Reduced rates of evolution near the exon-intron boundary in species where ESEs are essential components of the splicing machinery</p>
            </st>
            <p>To further advance the hypothesis that gradients in amino acid abundance near exon-intron boundaries are a critical feature of exon ends in metazoans, we examined the degree of amino acid conservation as a function of distance from the boundary. For three pairs of species (<it>S. cerevisiae</it>-<it>Saccharomyces castellii</it>; <it>D. melanogaster</it>-<it>Drosophila pseudoobscura </it>(Dps); <it>C. elegans</it>-<it>C. briggsae</it>), sets of orthologous internal exons were derived from various sources and aligned at the amino acid level (see Materials and methods). Mirroring results from a comparison of human-mouse orthologues <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>, we found strong and highly significant positive correlations of strikingly linear character (Figure <figr fid="F4">4</figr>) between distance from the boundary and amino acid substitution rate for the <it>Drosophila </it>and <it>Caenorhabditis </it>pairs, whilst proximity to the boundary did not appear to confer a higher level of amino acid conservation in the <it>Saccharomyces </it>comparison. Restricting the analysis to exons of at least 70 codons in length, we obtained qualitatively equivalent results (Drosophilae 5', rho = 0.53, <it>P </it>&lt; 0.002, N = 3,690; Drosophilae 3', rho = 0.77, <it>P </it>= 9.70E-07, N = 3,690; Caenorhabdites 5', rho = 0.74, <it>P </it>= 2.33E-06, N = 6,273; Caenorhabdites 3', rho = 0.58, <it>P </it>= 4.5E-04, N = 6,273). This restriction ensures that all exons contribute an approximately equal share of information to each codon position from the boundary. It also eliminates a potential confounder: short exons might, for reasons unrelated to splicing, feature more frequently in highly conserved genes and, by contributing disproportionately to the substitution rate information close to the boundary, create misleading trends.</p>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Frequency of nonsynonymous change as a function of distance from the exon-intron boundary</p>
               </caption>
               <text>
                  <p>Frequency of nonsynonymous change as a function of distance from the exon-intron boundary. Amino acids are significantly more likely to be conserved near the exon-intron boundary comparing <b>(a) </b><it>C. elegans</it>-<it>C. briggsae </it>(5', rho = 0.957, <it>P </it>= 0; 3', rho = 0.96, <it>P </it>= 0; N = 19,347 exons