<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>gb-2008-9-5-r79</ui>
   <ji>GBJ</ji>
   <fm>
      <dochead>Research</dochead>
      <bibl>
         <title>
            <p>CpG island density and its correlations with genomic features in mammalian genomes</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Han</snm>
               <fnm>Leng</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <insr iid="I3"/>
               <email>hanl2@vcu.edu</email>
            </au>
            <au id="A2">
               <snm>Su</snm>
               <fnm>Bing</fnm>
               <insr iid="I2"/>
               <insr iid="I4"/>
               <email>sub@mail.kiz.ac.cn</email>
            </au>
            <au id="A3">
               <snm>Li</snm>
               <fnm>Wen-Hsiung</fnm>
               <insr iid="I5"/>
               <email>whli@uchicago.edu</email>
            </au>
            <au id="A4" ca="yes">
               <snm>Zhao</snm>
               <fnm>Zhongming</fnm>
               <insr iid="I1"/>
               <insr iid="I6"/>
               <email>zzhao@vcu.edu</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Psychiatry, Virginia Commonwealth University, Richmond, VA 23298, USA</p>
            </ins>
            <ins id="I2">
               <p>State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, China</p>
            </ins>
            <ins id="I3">
               <p>Graduate School, Chinese Academy of Sciences, Beijing 100039, China</p>
            </ins>
            <ins id="I4">
               <p>Kunming Primate Research Center, Chinese Academy of Sciences, Kunming, Yunnan 650223, China</p>
            </ins>
            <ins id="I5">
               <p>Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA</p>
            </ins>
            <ins id="I6">
               <p>Department of Human Genetics and Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, VA 23284, USA</p>
            </ins>
         </insg>
         <source>Genome Biology</source>
         <issn>1465-6906</issn>
         <pubdate>2008</pubdate>
         <volume>9</volume>
         <issue>5</issue>
         <fpage>R79</fpage>
         <url>http://genomebiology.com/2008/9/5/R79</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">18477403</pubid>
               <pubid idtype="doi">10.1186/gb-2008-9-5-r79</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>7</day>
               <month>4</month>
               <year>2008</year>
            </date>
         </rec>
         <revrec>
            <date>
               <day>8</day>
               <month>4</month>
               <year>2008</year>
            </date>
         </revrec>
         <acc>
            <date>
               <day>13</day>
               <month>5</month>
               <year>2008</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>13</day>
               <month>05</month>
               <year>2008</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2008</year>
         <collab>Han et al.; licensee BioMed Central Ltd.</collab>
         <note>This is an open access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <shorttitle>
         <p>CpG island density</p>
      </shorttitle>
      <shortabs>
         <p>A systematic analysis of CpG islands in ten mammalian genomes suggests that an increase in chromosome number elevates GC content and prevents loss of CpG islands.</p>
      </shortabs>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>CpG islands, which are clusters of CpG dinucleotides in GC-rich regions, are considered gene markers and represent an important feature of mammalian genomes. Previous studies of CpG islands have largely focused on specific loci or on a single genome. To date, there appears to be no comparative analysis among mammalian genomes of CpG islands and their density at the DNA sequence level, or of their correlations with other genomic features.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>In this study, we performed a systematic analysis of CpG islands in ten mammalian genomes. We found that both the number of CpG islands and their density vary greatly among genomes, though many of these genomes encode similar numbers of genes. We observed significant correlations between CpG island density and genomic features such as number of chromosomes, chromosome size, and recombination rate. We also observed a trend of higher CpG island density in telomeric regions. Furthermore, we evaluated the performance of three computational algorithms for CpG island identification. Finally, we compared our observations in mammals with those in non-mammalian vertebrates.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>Our study revealed that CpG islands vary greatly among mammalian genomes. Some factors such as recombination rate and chromosome size might have influenced the evolution of CpG islands in the course of mammalian evolution. Our results suggest a scenario in which an increase in chromosome number increases the rate of recombination, which in turn elevates GC content to help prevent loss of CpG islands and maintain their density. These findings should be useful for studying mammalian genomes, the role of CpG islands in gene function, and molecular evolution.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="BMC" subtype="man_spc_id" id="30010016">Molecular biology</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010010">Genome studies</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010008">Evolution</classification>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>CpG islands (CGIs) are clusters of CpG dinucleotides in GC-rich regions and represent an important feature of mammalian genomes <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Mammalian genomic DNA generally shows a great deficit of CpG dinucleotides; for example, the ratio of the observed over the expected CpGs (Obs<sub>CpG</sub>/Exp<sub>CpG</sub>) is approximately 0.20-0.25 in the human and mouse genomes <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>. This deficit is largely attributed to the hypermutability of methylated CpGs to TpGs (or CpAs in the complementary strand) <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr></abbrgrp>. In comparison, CpGs in CGIs are often unmethylated and their frequencies are close to random expectation (for example, Obs<sub>CpG</sub>/Exp<sub>CpG </sub>= ~0.8 in the promoter-associated CGIs <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>). CGIs are often associated with the 5' end of genes and are considered gene markers <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr></abbrgrp>. However, a comparison of the human, mouse, and rat genomes indicated that, although these three genomes encode similar numbers of genes, the number of CGIs in the mouse (15,500) or rat (15,975) genome is far lower than that (27,000) identified in the non-repetitive portions of the human genome <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr></abbrgrp>. The difference is probably due to a faster rate of loss of CGIs in the rodent lineage, rather than faster gains of CGIs in the human lineage <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B9">9</abbr></abbrgrp>. However, it remains unclear whether the loss-of-CGI model holds for other mammalian genomes. Furthermore, to the best of our knowledge, there has been no comprehensive analysis of CGIs and their density at the DNA sequence level in mammals.</p>
         <p>There are three major algorithms for identifying CGIs in a genomic sequence. The original algorithm was proposed by Gardiner-Garden and Frommer <abbrgrp><abbr bid="B13">13</abbr></abbrgrp> in 1987; the three parameters are GC content >50%, Obs<sub>CpG</sub>/Exp<sub>CpG </sub>>0.60, and length >200 bp. This algorithm, often with some modifications, has been widely applied in the analysis of CGIs in single genes, small sets of genomic sequences, or single genomes. However, many repeats (for example, <it>Alu</it>), which are abundant in the vertebrate genome, also meet the criteria, so this algorithm has usually been used to scan CGIs only in non-repeat portions of the genome <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr></abbrgrp>. Second, Takai and Jones <abbrgrp><abbr bid="B14">14</abbr></abbrgrp> evaluated the three parameters in Gardiner-Garden and Frommer's algorithm using human gene data and suggested an optimal set of parameters (GC content &#8805;55%, Obs<sub>CpG</sub>/Exp<sub>CpG </sub>&#8805;0.65, and length &#8805;500 bp). This algorithm effectively excludes false-positive CGIs arising from repeats and is more likely to identify CGIs associated with the 5' end of human genes; it also appears to be suitable for other genomes <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. Third, more recently, Hackenberg <it>et al</it>. <abbrgrp><abbr bid="B15">15</abbr></abbrgrp> developed a new algorithm, namely CpGcluster, that depends entirely on the statistical significance of a CpG cluster relative to random sequences in the same chromosome. Because CpGcluster does not require a minimum length (for example, it identified CpG clusters as short as 8 bp) <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>, it likely identifies many more CGIs (for example, 197,727 in the human genome) than other algorithms.
In particular, CpGcluster may exaggerate the number of CGIs (that is, CpG clusters) in low GC-content chromosomes, which often have low gene density, because its CpG clusters are identified relative to the background (random) CpG property. Another similar CpG cluster algorithm identifies CpG clusters by requiring a minimum number of CpGs in each sequence fragment <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. Since loss of CGIs is likely an evolutionary trend in at least some genomes <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B9">9</abbr><abbr bid="B17">17</abbr></abbrgrp>, CpGcluster may be able to identify those CGIs that have undergone degradation and thus cannot meet the criteria of Takai and Jones' or Gardiner-Garden and Frommer's algorithms.</p>
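         <p>As an illustration of the three threshold criteria described above, the following is a minimal sketch, not the implementation of any published program. It assumes the commonly used definition Obs<sub>CpG</sub>/Exp<sub>CpG</sub> = (number of CpGs x sequence length)/(number of Cs x number of Gs) and evaluates a single candidate region, whereas actual CGI finders scan a sequence with sliding windows. The function names and the default thresholds (set here to the Takai and Jones values) are illustrative assumptions.</p>

```python
def gc_content(seq):
    """Fraction of bases in a non-empty sequence that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def obs_exp_cpg(seq):
    """Observed/expected CpG ratio: (CpG count x length) / (C count x G count)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0  # no CpG is possible without both C and G
    return seq.count("CG") * len(seq) / (c * g)


def meets_cgi_criteria(seq, min_gc=0.55, min_oe=0.65, min_len=500):
    """Check one candidate region against minimum length, GC content, and Obs/Exp."""
    if not seq:
        return False
    return (len(seq) >= min_len
            and gc_content(seq) >= min_gc
            and obs_exp_cpg(seq) >= min_oe)
```

         <p>Under these assumptions, a 600 bp run of alternating C and G passes the default thresholds, whereas a 200 bp run of the same composition fails on length alone. Note that the Gardiner-Garden and Frommer criteria are strict inequalities, so passing 0.50, 0.60, and 200 to this function only approximates them.</p>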
         <p>Our major aim is to survey extant CGIs (that is, CGIs that meet the three typical criteria: length, GC content, and Obs<sub>CpG</sub>/Exp<sub>CpG</sub>) and their distribution in today's genomes, rather than to identify regions that might originally have been CGIs but no longer meet the three typical criteria. A comparative study of the features of such CGIs will be helpful for studying the evolution of CGIs and sequence composition changes in the course of genome evolution. Recent genome sequencing projects have released a number of mammalian genomes with good quality annotations, but only a few non-mammalian vertebrate genomes. Thus, in this study we focused on the analysis and comparison of CGIs and their correlations with genomic features in mammalian genomes. For this aim, it is appropriate to apply the same CGI detection algorithm to screen CGIs in multiple genomes for comparison. Based on the properties of the three algorithms described above, we selected Takai and Jones' algorithm as the primary algorithm in this study.</p>
         <p>We conducted a systematic survey of CGIs in ten sequenced mammalian genomes: eight completely sequenced eutherian genomes (human (<it>Homo sapiens</it>), chimpanzee (<it>Pan troglodytes</it>), macaque (<it>Macaca mulatta</it>), mouse (<it>Mus musculus</it>), rat (<it>Rattus norvegicus</it>), dog (<it>Canis familiaris</it>), cow (<it>Bos taurus</it>), and horse (<it>Equus caballus</it>)); one completely sequenced metatherian genome (opossum (<it>Monodelphis domestica</it>)); and one prototherian genome (platypus (<it>Ornithorhynchus anatinus</it>)) whose sequence was completed with a 6&#215; coverage, though it has not been completely assembled. We also compared the observations from these mammals with those from seven non-mammalian vertebrates.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>CGIs and CGI density in ten mammalian genomes</p>
            </st>
            <p>We first present our analysis of CGIs identified by Takai and Jones' algorithm <abbrgrp><abbr bid="B14">14</abbr></abbrgrp> in ten mammalian genome sequences. The conclusions are essentially the same when we used the popular algorithm by Gardiner-Garden and Frommer <abbrgrp><abbr bid="B13">13</abbr></abbrgrp> or the recently developed algorithm CpGcluster <abbrgrp><abbr bid="B15">15</abbr></abbrgrp> (see Discussion). The species names and the sources of genome sequences are shown in the Materials and methods. Table <tblr tid="T1">1</tblr> summarizes the genome information and statistics of CGIs. Except for the platypus, these genomes had similar sizes (2.0-3.3 Gb) and similar numbers of annotated genes (20,000-30,000; Additional data file 1). However, both the number of CGIs and the CGI density (measured by the average number of CGIs per Mb) vary greatly among genomes. The dog genome has the largest number of CGIs (58,327) and the platypus genome has the highest CGI density (35.9 CGIs/Mb). Remarkably, the number of CGIs in the dog genome is nearly three times that in the rat (19,568) or mouse (20,458) genome, even though the estimated number of dog genes is smaller than that of human or mouse genes (dog, 19,300 <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>; human, 20,000-25,000 <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>; mouse, approximately 30,000 <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>). The CGI density (per Mb) ranges from 7.5 (opossum) to 35.9 (platypus) in the 10 genomes investigated. These results suggest that, although genes are often associated with CGIs, the extant CGIs are distributed very differently among genomic regions (for example, genes versus non-coding regions) in mammalian genomes.</p>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>CpG islands and other genomic features in ten mammalian genomes</p>
               </caption>
               <tblbdy cols="11">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="5" ca="center">
                        <p>Genome</p>
                     </c>
                     <c cspan="5" ca="center">
                        <p>CpG islands</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="5">
                        <hr/>
                     </c>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Species</p>
                     </c>
                     <c ca="center">
                        <p>Size (Gb)*</p>
                     </c>
                     <c ca="center">
                        <p>Number of chromosome pairs</p>
                     </c>
                     <c ca="center">
                        <p>Number of arms<sup>&#8224;</sup></p>
                     </c>
                     <c ca="center">
                        <p>GC content (%)</p>
                     </c>
                     <c ca="center">
                        <p>Obs<sub>CpG</sub>/Exp<sub>CpG</sub></p>
                     </c>
                     <c ca="center">
                        <p>Number of CGIs</p>
                     </c>
                     <c ca="center">
                        <p>CGI density (/Mb)</p>
                     </c>
                     <c ca="center">
                        <p>Average length (bp)</p>
                     </c>
                     <c ca="center">
                        <p>GC content (%)</p>
                     </c>
                     <c ca="center">
                        <p>Obs<sub>CpG</sub>/Exp<sub>CpG</sub></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Human</p>
                     </c>
                     <c ca="center">
                        <p>2.85</p>
                     </c>
                     <c ca="center">
                        <p>23</p>
                     </c>
                     <c ca="center">
                        <p>82</p>
                     </c>
                     <c ca="center">
                        <p>40.9</p>
                     </c>
                     <c ca="center">
                        <p>0.236</p>
                     </c>
                     <c ca="center">
                        <p>37,531</p>
                     </c>
                     <c ca="center">
                        <p>13.2</p>
                     </c>
                     <c ca="center">
                        <p>1,089</p>
                     </c>
                     <c ca="center">
                        <p>62.0</p>
                     </c>
                     <c ca="center">
                        <p>0.743</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Chimpanzee</p>
                     </c>
                     <c ca="center">
                        <p>2.75</p>
                     </c>
                     <c ca="center">
                        <p>24</p>
                     </c>
                     <c ca="center">
                        <p>84</p>
                     </c>
                     <c ca="center">
                        <p>40.7</p>
                     </c>
                     <c ca="center">
                        <p>0.233</p>
                     </c>
                     <c ca="center">
                        <p>35,845</p>
                     </c>
                     <c ca="center">
                        <p>13.0</p>
                     </c>
                     <c ca="center">
                        <p>1,011</p>
                     </c>
                     <c ca="center">
                        <p>60.3</p>
                     </c>
                     <c ca="center">
                        <p>0.761</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Macaque</p>
                     </c>
                     <c ca="center">
                        <p>2.65</p>
                     </c>
                     <c ca="center">
                        <p>21</p>
                     </c>
                     <c ca="center">
                        <p>84</p>
                     </c>
                     <c ca="center">
                        <p>40.7</p>
                     </c>
                     <c ca="center">
                        <p>0.245</p>
                     </c>
                     <c ca="center">
                        <p>39,498</p>
                     </c>
                     <c ca="center">
                        <p>14.9</p>
                     </c>
                     <c ca="center">
                        <p>957</p>
                     </c>
                     <c ca="center">
                        <p>60.8</p>
                     </c>
                     <c ca="center">
                        <p>0.749</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mouse</p>
                     </c>
                     <c ca="center">
                        <p>2.48</p>
                     </c>
                     <c ca="center">
                        <p>20</p>
                     </c>
                     <c ca="center">
                        <p>40</p>
                     </c>
                     <c ca="center">
                        <p>41.7</p>
                     </c>
                     <c ca="center">
                        <p>0.192</p>
                     </c>
                     <c ca="center">
                        <p>20,458</p>
                     </c>
                     <c ca="center">
                        <p>8.2</p>
                     </c>
                     <c ca="center">
                        <p>1,043</p>
                     </c>
                     <c ca="center">
                        <p>60.6</p>
                     </c>
                     <c ca="center">
                        <p>0.756</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Rat</p>
                     </c>
                     <c ca="center">
                        <p>2.48</p>
                     </c>
                     <c ca="center">
                        <p>21</p>
                     </c>
                     <c ca="center">
                        <p>64</p>
                     </c>
                     <c ca="center">
                        <p>41.9</p>
                     </c>
                     <c ca="center">
                        <p>0.220</p>
                     </c>
                     <c ca="center">
                        <p>19,568</p>
                     </c>
                     <c ca="center">
                        <p>7.9</p>
                     </c>
                     <c ca="center">
                        <p>1,004</p>
                     </c>
                     <c ca="center">
                        <p>59.7</p>
                     </c>
                     <c ca="center">
                        <p>0.758</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Dog</p>
                     </c>
                     <c ca="center">
                        <p>2.31</p>
                     </c>
                     <c ca="center">
                        <p>39</p>
                     </c>
                     <c ca="center">
                        <p>80</p>
                     </c>
                     <c ca="center">
                        <p>41.0</p>
                     </c>
                     <c ca="center">
                        <p>0.244</p>
                     </c>
                     <c ca="center">
                        <p>58,327</p>
                     </c>
                     <c ca="center">
                        <p>25.3</p>
                     </c>
                     <c ca="center">
                        <p>1,102</p>
                     </c>
                     <c ca="center">
                        <p>62.2</p>
                     </c>
                     <c ca="center">
                        <p>0.753</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Cow</p>
                     </c>
                     <c ca="center">
                        <p>2.29</p>
                     </c>
                     <c ca="center">
                        <p>30</p>
                     </c>
                     <c ca="center">
                        <p>62</p>
                     </c>
                     <c ca="center">
                        <p>41.9</p>
                     </c>
                     <c ca="center">
                        <p>0.236</p>
                     </c>
                     <c ca="center">
                        <p>36,729</p>
                     </c>
                     <c ca="center">
                        <p>16.0</p>
                     </c>
                     <c ca="center">
                        <p>1,023</p>
                     </c>
                     <c ca="center">
                        <p>61.2</p>
                     </c>
                     <c ca="center">
                        <p>0.740</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Horse</p>
                     </c>
                     <c ca="center">
                        <p>2.03</p>
                     </c>
                     <c ca="center">
                        <p>32</p>
                     </c>
                     <c ca="center">
                        <p>92</p>
                     </c>
                     <c ca="center">
                        <p>41.0</p>
                     </c>
                     <c ca="center">
                        <p>0.285</p>
                     </c>
                     <c ca="center">
                        <p>33,135</p>
                     </c>
                     <c ca="center">
                        <p>16.3</p>
                     </c>
                     <c ca="center">
                        <p>937</p>
                     </c>
                     <c ca="center">
                        <p>59.2</p>
                     </c>
                     <c ca="center">
                        <p>0.749</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Opossum</p>
                     </c>
                     <c ca="center">
                        <p>3.34</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>24</p>
                     </c>
                     <c ca="center">
                        <p>37.6</p>
                     </c>
                     <c ca="center">
                        <p>0.129</p>
                     </c>
                     <c ca="center">
                        <p>24,938</p>
                     </c>
                     <c ca="center">
                        <p>7.5</p>
                     </c>
                     <c ca="center">
                        <p>919</p>
                     </c>
                     <c ca="center">
                        <p>60.8</p>
                     </c>
                     <c ca="center">
                        <p>0.698</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Platypus<sup>&#8225;</sup></p>
                     </c>
                     <c ca="center">
                        <p>0.41</p>
                     </c>
                     <c ca="center">
                        <p>26</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>43.3</p>
                     </c>
                     <c ca="center">
                        <p>0.296</p>
                     </c>
                     <c ca="center">
                        <p>14,686</p>
                     </c>
                     <c ca="center">
                        <p>35.9</p>
                     </c>
                     <c ca="center">
                        <p>929</p>
                     </c>
                     <c ca="center">
                        <p>56.8</p>
                     </c>
                     <c ca="center">
                        <p>0.785</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>*The nucleotides marked as 'N' were not included in the analysis. <sup>&#8224;</sup>Number of arms in a female. <sup>&#8225;</sup>Incomplete genome sequences (only 19 partially assembled chromosomes). NA, not available.</p>
               </tblfn>
            </tbl>
         </sec>
         <sec>
            <st>
               <p>Correlations between CGI density and other genomic features</p>
            </st>
            <p>We examined the correlations between CGI density and other genomic features. Because of incomplete genome sequence and lack of some chromosome data in platypus, we present the correlation results only for the other nine genomes; the conclusion will likely be the same when the platypus data become available (Additional data file 2). We found a highly significant positive correlation between CGI density and number of chromosome pairs in a genome (<it>r </it>= 0.88, <it>P </it>= 7.9 &#215; 10<sup>-4</sup>; Figure <figr fid="F1">1a</figr>) and a significant correlation between CGI density and number of chromosome arms (<it>r </it>= 0.62, <it>P </it>= 0.037). As expected, there was a significant positive correlation between CGI density and Obs<sub>CpG</sub>/Exp<sub>CpG </sub>(<it>r </it>= 0.63, <it>P </it>= 0.035). No significant correlation was found between CGI density and genome size (<it>r </it>= -0.53, <it>P </it>= 0.073) or genome GC content (<it>r </it>= 0.24, <it>P </it>= 0.27).</p>
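            <p>The genome-level correlation between CGI density and number of chromosome pairs can be rechecked from the values in Table <tblr tid="T1">1</tblr>. The sketch below assumes a Pearson correlation coefficient, which the text does not state explicitly; the variable and function names are illustrative, and no <it>P </it>value is computed.</p>

```python
import math

# Values copied from Table 1 in species order (human, chimpanzee, macaque,
# mouse, rat, dog, cow, horse, opossum); platypus is excluded, as in the text.
chromosome_pairs = [23, 24, 21, 20, 21, 39, 30, 32, 9]
cgi_density = [13.2, 13.0, 14.9, 8.2, 7.9, 25.3, 16.0, 16.3, 7.5]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(chromosome_pairs, cgi_density)
print(round(r, 2))  # prints 0.88
```

            <p>With the nine rows of Table <tblr tid="T1">1</tblr>, this reproduces the reported coefficient for number of chromosome pairs to two decimal places.</p>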
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Correlations between CGI density and genomic features in nine mammalian genomes</p>
               </caption>
               <text>
                  <p>Correlations between CGI density and genomic features in nine mammalian genomes. The platypus chromosomes were excluded because of incomplete genome sequence data and chromosome data. <b>(a) </b>CGI density (per Mb) versus number of chromosome pairs. <b>(b) </b>CGI density (per Mb) versus log<sub>10</sub>(chromosome size). The Y chromosomes were excluded because of insufficient data. <b>(c) </b>CGI density (per Mb) versus chromosome GC content (%). <b>(d)</b> CGI density (per Mb) versus chromosome Obs<sub>CpG</sub>/Exp<sub>CpG</sub>.</p>
               </text>
               <graphic file="gb-2008-9-5-r79-1"/>
            </fig>
            <p>There were a total of 219 chromosomes available in these 9 genomes after excluding the Y chromosomes. We found a highly significant negative correlation between CGI density and log<sub>10</sub>(chromosome size) (<it>r </it>= -0.51, <it>P </it>= 2.6 &#215; 10<sup>-16</sup>; Figure <figr fid="F1">1b</figr>), a highly significant positive correlation between CGI density and GC content of the chromosome (<it>r </it>= 0.65, <it>P </it>= 3.5 &#215; 10<sup>-28</sup>; Figure <figr fid="F1">1c</figr>), and a highly significant positive correlation between CGI density and Obs<sub>CpG</sub>/Exp<sub>CpG </sub>(<it>r </it>= 0.75, <it>P </it>= 2.8 &#215; 10<sup>-41</sup>; Figure <figr fid="F1">1d</figr>). We further separated the chromosomes into different groups by their sizes (&lt;25, 25-50, 50-75, 75-100, 100-150, 150-200, and >200 Mb). Interestingly, as the average size of a chromosome group increases, the CGI density decreases (Table <tblr tid="T2">2</tblr>). Indeed, the CGI density in small mammalian chromosomes (size &lt;25 Mb) is, on average, about three times that in large chromosomes (size >200 Mb). We noted that the platypus (2n = 52), which has six pairs of large chromosomes but many small chromosomes <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>, has a much higher CGI density than the other nine mammalian genomes (Table <tblr tid="T1">1</tblr>). These results are consistent with the previous observation that CGIs are highly concentrated on the microchromosomes in chickens <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>.</p>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>CGI densities in chromosomes of different sizes in nine mammalian genomes</p>
               </caption>
               <tblbdy cols="3">
                  <r>
                     <c ca="left">
                        <p>Chromosome size (Mb)</p>
                     </c>
                     <c ca="center">
                        <p>Number of chromosomes</p>
                     </c>
                     <c ca="center">
                        <p>CGI density/Mb &#177; SD</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>&lt;25</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>29.7 &#177; 17.7</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>25-50</p>
                     </c>
                     <c ca="center">
                        <p>35</p>
                     </c>
                     <c ca="center">
                        <p>24.0 &#177; 13.2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>50-75</p>
                     </c>
                     <c ca="center">
                        <p>47</p>
                     </c>
                     <c ca="center">
                        <p>21.7 &#177; 11.3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>75-100</p>
                     </c>
                     <c ca="center">
                        <p>43</p>
                     </c>
                     <c ca="center">
                        <p>14.7 &#177; 7.4</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>100-150</p>
                     </c>
                     <c ca="center">
                        <p>49</p>
                     </c>
                     <c ca="center">
                        <p>11.7 &#177; 4.6</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>150-200</p>
                     </c>
                     <c ca="center">
                        <p>26</p>
                     </c>
                     <c ca="center">
                        <p>9.7 &#177; 2.6</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>>200</p>
                     </c>
                     <c ca="center">
                        <p>14</p>
                     </c>
                     <c ca="center">
                        <p>9.4 &#177; 3.6</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="center">
                        <p>219</p>
                     </c>
                     <c ca="center">
                        <p>16.4 &#177; 10.5</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>SD, standard deviation.</p>
               </tblfn>
            </tbl>
            <p>The dog has generally small chromosomes and a high CGI density, whereas the opossum has a few large chromosomes and a low CGI density. To check whether our correlations were largely driven by these two species, we repeated the analysis after excluding the dog and opossum data. The conclusions held: we still found a significant correlation between CGI density and number of chromosome pairs (<it>r </it>= 0.75, <it>P </it>= 0.026) and a significant correlation between CGI density and log<sub>10</sub>(chromosome size) (<it>r </it>= -0.49, <it>P </it>= 5.9 &#215; 10<sup>-12</sup>).</p>
            <p>CGIs are considered gene markers, so they are expected to correlate strongly with gene density <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B22">22</abbr></abbrgrp>. We therefore investigated whether the above correlations still hold when gene regions are excluded. We identified CGIs in the intergenic regions of nine mammalian genomes and found significant correlations between intergenic CGI density and log<sub>10</sub>(chromosome size) (<it>r </it>= -0.55, <it>P </it>= 7.3 &#215; 10<sup>-19</sup>), GC content of the chromosome (<it>r </it>= 0.39, <it>P </it>= 8.6 &#215; 10<sup>-10</sup>), and Obs<sub>CpG</sub>/Exp<sub>CpG </sub>(<it>r </it>= 0.67, <it>P </it>= 3.7 &#215; 10<sup>-30</sup>). Details are shown in Additional data file 3.</p>
            <p>We also examined whether the correlations between CGI density and other genomic factors hold in different genomic regions. We used human data because of their high-quality annotations. According to gene annotations in the NCBI database, we identified 24,228 CGIs that overlapped or lay within genes (gene-associated CGIs), 13,026 CGIs whose whole sequences were within intergenic regions (intergenic CGIs), 12,136 CGIs whose whole sequences were within gene regions (intragenic CGIs), and 11,192 CGIs that overlapped transcriptional start sites (TSS CGIs) in the human genome. Table <tblr tid="T3">3</tblr> shows significant correlations between CGI density and genomic features (log<sub>10</sub>(chromosome size), GC content, and Obs<sub>CpG</sub>/Exp<sub>CpG</sub>) in all genomic regions when we compare the data at the chromosome level.</p>
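            <p>The four CGI categories can be read as interval tests against gene annotations. The following sketch is one possible reading, not the procedure used in the study: the coordinate convention (closed intervals), the tuple layout, and the function names are assumptions. Because the categories are not mutually exclusive (an intragenic or TSS CGI is also gene-associated), the function returns a set of labels.</p>
```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if closed intervals [a_start, a_end] and [b_start, b_end] share a base."""
    return min(a_end, b_end) >= max(a_start, b_start)


def within(a_start, a_end, b_start, b_end):
    """True if [a_start, a_end] lies entirely inside [b_start, b_end]."""
    return a_start >= b_start and b_end >= a_end


def classify_cgi(cgi, genes):
    """Return the set of region labels for one CGI.

    cgi   : (start, end) on one chromosome
    genes : list of (gene_start, gene_end, tss) on the same chromosome
            (hypothetical layout chosen for this sketch)
    """
    start, end = cgi
    labels = set()
    for gene_start, gene_end, tss in genes:
        if overlaps(start, end, gene_start, gene_end):
            labels.add("gene-associated")
        if within(start, end, gene_start, gene_end):
            labels.add("intragenic")
        if overlaps(start, end, tss, tss):
            labels.add("TSS")
    if not labels:
        # No gene touches this CGI, so its whole sequence is intergenic.
        labels.add("intergenic")
    return labels


# Hypothetical annotation: one gene spanning 1000-5000 with its TSS at 1000.
toy_genes = [(1000, 5000, 1000)]
upstream_edge = classify_cgi((900, 1200), toy_genes)
gene_body = classify_cgi((2000, 2500), toy_genes)
far_away = classify_cgi((6000, 6500), toy_genes)
```
            <p>Under these assumed definitions every CGI is either gene-associated or intergenic, while the intragenic and TSS labels mark subsets of the gene-associated CGIs.</p>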
            <tbl id="T3">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>Correlation between CGI density and genomic features in different human genomic regions</p>
               </caption>
               <tblbdy cols="9">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="2" ca="center">
                        <p>Gene-associated CGIs (24,228)</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>Intergenic CGIs (13,026)</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>Intragenic CGIs (12,136)</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>TSS CGIs (11,192)</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>r</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>P</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>r</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>P</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>r</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>P</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>r</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>P</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="9">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Log<sub>10</sub>(chromosome size)</p>
                     </c>
                     <c ca="center">
                        <p>-0.54</p>
                     </c>
                     <c ca="center">
                        <p>3.9 &#215; 10<sup>-3</sup></p>
                     </c>
                     <c ca="center">
                        <p>-0.55</p>
                     </c>
                     <c ca="center">
                        <p>3.4 &#215; 10<sup>-3</sup></p>
                     </c>
                     <c ca="center">
                        <p>-0.55</p>
                     </c>
                     <c ca="center">
                        <p>3.1 &#215; 10<sup>-3</sup></p>
                     </c>
                     <c ca="center">
                        <p>-0.51</p>
                     </c>
                     <c ca="center">
                        <p>7.0 &#215; 10<sup>-3</sup></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>GC content</p>
                     </c>
                     <c ca="center">
                        <p>0.88</p>
                     </c>
                     <c ca="center">
                        <p>1.7 &#215; 10<sup>-8</sup></p>
                     </c>
                     <c ca="center">
                        <p>0.87</p>
                     </c>
                     <c ca="center">
                        <p>2.9 &#215; 10<sup>-8</sup></p>
                     </c>
                     <c ca="center">
                        <p>0.85</p>
                     </c>
                     <c ca="center">
                        <p>1.9 &#215; 10<sup>-7</sup></p>
                     </c>
                     <c ca="center">
                        <p>0.91</p>
                     </c>
                     <c ca="center">
                        <p>5.4 &#215; 10<sup>-10</sup></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Obs<sub>CpG</sub>/Exp<sub>CpG</sub></p>
                     </c>
                     <c ca="center">
                        <p>0.92</p>
                     </c>
                     <c ca="center">
                        <p>1.5 &#215; 10<sup>-10</sup></p>
                     </c>
                     <c ca="center">
                        <p>0.91</p>
                     </c>
                     <c ca="center">
                        <p>8.3 &#215; 10<sup>-10</sup></p>
                     </c>
                     <c ca="center">
                        <p>0.92</p>
                     </c>
                     <c ca="center">
                        <p>2.5 &#215; 10<sup>-10</sup></p>
                     </c>
                     <c ca="center">
                        <p>0.91</p>
                     </c>
                     <c ca="center">
                        <p>1.0 &#215; 10<sup>-9</sup></p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <p>Table <tblr tid="T4">4</tblr> summarizes the correlations between CGIs and genomic features based on nine or ten genomes using three CGI identification algorithms.</p>
            <tbl id="T4">
               <title>
                  <p>Table 4</p>
               </title>
               <caption>
                  <p>Summary of correlations between CGI density and genomic features</p>
               </caption>
               <tblbdy cols="5">
                  <r>
                     <c ca="left">
                        <p>Algorithm</p>
                     </c>
                     <c ca="left">
                        <p>Genomic features</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>r</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>P</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>Shown in figure</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>TJ (9 genomes)</p>
                     </c>
                     <c ca="left">
                        <p>Chromosome pairs</p>
                     </c>
                     <c ca="center">
                        <p>0.88</p>
                     </c>
                     <c ca="center">
                        <p>7.9 &#215; 10<sup>-4</sup></p>
                     </c>
                     <c ca="center">
                        <p>1a</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Log<sub>10</sub>(chromosome size)</p>
                     </c>
                     <c ca="center">
                        <p>-0.51</p>
                     </c>
                     <c ca="center">
                        <p>2.6 &#215; 10<sup>-16</sup></p>
                     </c>
                     <c ca="center">
                        <p>1b</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Chromosome GC content</p>
                     </c>
                     <c ca="center">
                        <p>0.65</p>
                     </c>
                     <c ca="center">
                        <p>3.5 &#215; 10<sup>-28</sup></p>
                     </c>
                     <c ca="center">
                        <p>1c</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Chromosome Obs<sub>CpG</sub>/Exp<sub>CpG</sub></p>
                     </c>
                     <c ca="center">
                        <p>0.75</p>
                     </c>
                     <c ca="center">
                        <p>2.8 &#215; 10<sup>-41</sup></p>
                     </c>
                     <c ca="center">
                        <p>1d</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Chromosome arms</p>
                     </c>
                     <c ca="center">
                        <p>0.62</p>
                     </c>
                     <c ca="center">
                        <p>0.037</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Genome size</p>
                     </c>
                     <c ca="center">
                        <p>-0.53</p>
                     </c>
                     <c ca="center">
                        <p>0.073*</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Genomic GC content</p>
                     </c>
                     <c ca="center">
                        <p>0.24</p>
                     </c>
                     <c ca="center">
                        <p>0.27*</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Genomic Obs<sub>CpG</sub>/Exp<sub>CpG</sub></p>
                     </c>
                     <c ca="center">
                        <p>0.63</p>
                     </c>
                     <c ca="center">
                        <p>0.035</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>TJ (9 genomes, intergenic CGIs)</p>
                     </c>
                     <c ca="left">
                        <p>Chromosome pairs</p>
                     </c>
                     <c ca="center">
                        <p>0.79</p>
                     </c>
                     <c ca="center">
                        <p>0.005</p>
                     </c>
                     <c ca="center">
                        <p>S2a</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Log<sub>10</sub>(chromosome size)</p>
                     </c>
                     <c ca="center">
                        <p>-0.55</p>
                     </c>
                     <c ca="center">
                        <p>7.3 &#215; 10<sup>-19</sup></p>
                     </c>
                     <c ca="center">
                        <p>S2b</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Chromosome GC content</p>
                     </c>
                     <c ca="center">
                        <p>0.39</p>
                     </c>
                     <c ca="center">
                        <p>8.6 &#215; 10<sup>-10</sup></p>
                     </c>
                     <c ca="center">
                        <p>S2c</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Chromosome Obs<sub>CpG</sub>/Exp<sub>CpG</sub></p>
                     </c>
                     <c ca="center">
                        <p>0.67</p>
                     </c>
                     <c ca="center">
                        <p>3.7 &#215; 10<sup>-30</sup></p>
                     </c>
                     <c ca="center">
                        <p>S2d</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>TJ (10 genomes)</p>
                     </c>
                     <c ca="left">
                        <p>Chromosome pairs</p>
                     </c>
                     <c ca="center">
                        <p>0.58</p>
                     </c>
                     <c ca="center">
                        <p>0.039</p>
                     </c>
                     <c ca="center">
                        <p>S1a</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Log<sub>10</sub>(chromosome size)</p>
                     </c>
                     <c ca="center">
                        <p>-0.70</p>
                     </c>
                     <c ca="center">
                        <p>2.6 &#215; 10<sup>-37</sup></p>
                     </c>
                     <c ca="center">
                        <p>S1b</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Chromosome GC content</p>
                     </c>
                     <c ca="center">
                        <p>0.64</p>
                     </c>
                     <c ca="center">
                        <p>3.7 &#215; 10<sup>-29</sup></p>
                     </c>
                     <c ca="center">
                        <p>S1c</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Chromosome Obs<sub>CpG</sub>/Exp<sub>CpG</sub></p>
                     </c>
                     <c ca="center">
                        <p>0.89</p>
                     </c>
                     <c ca="center">
                        <p>1.5 &#215; 10<sup>-81</sup></p>
                     </c>
                     <c ca="center">
                        <p>S1d</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>GF (9 genomes)</p>
                     </c>
                     <c ca="left">
                        <p>Chromosome pairs</p>
                     </c>
                     <c ca="center">
                        <p>0.92</p>
                     </c>
                     <c ca="center">
                        <p>2.0 &#215; 10<sup>-4</sup></p>
                     </c>
                     <c ca="center">
                        <p>S5a</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Log<sub>10</sub>(chromosome size)</p>
                     </c>
                     <c ca="center">
                        <p>-0.63</p>
                     </c>
                     <c ca="center">
                        <p>1.3 &#215; 10<sup>-25</sup></p>
                     </c>
                     <c ca="center">
                        <p>S5b</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Chromosome GC content</p>
                     </c>
                     <c ca="center">
                        <p>0.72</p>
                     </c>
                     <c ca="center">
                        <p>3.2 &#215; 10<sup>-37</sup></p>
                     </c>
                     <c ca="center">
                        <p>S5c</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Chromosome Obs<sub>CpG</sub>/Exp<sub>CpG</sub></p>
                     </c>
                     <c ca="center">
                        <p>0.81</p>
                     </c>
                     <c ca="center">
                        <p>2.4 &#215; 10<sup>-53</sup></p>
                     </c>
                     <c ca="center">
                        <p>S5d</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CpGcluster (9 genomes)</p>
                     </c>
                     <c ca="left">
                        <p>Chromosome pairs</p>
                     </c>
                     <c ca="center">
                        <p>0.81</p>
                     </c>
                     <c ca="center">
                        <p>0.004</p>
                     </c>
                     <c ca="center">
                        <p>S6a</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Log<sub>10</sub>(chromosome size)</p>
                     </c>
                     <c ca="center">
                        <p>-0.52</p>
                     </c>
                     <c ca="center">
                        <p>1.6 &#215; 10<sup>-16</sup></p>
                     </c>
                     <c ca="center">
                        <p>S6b</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Chromosome GC content</p>
                     </c>
                     <c ca="center">
                        <p>0.21</p>
                     </c>
                     <c ca="center">
                        <p>0.001</p>
                     </c>
                     <c ca="center">
                        <p>S6c</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Chromosome Obs<sub>CpG</sub>/Exp<sub>CpG</sub></p>
                     </c>
                     <c ca="center">
                        <p>0.61</p>
                     </c>
                     <c ca="center">
                        <p>5.5 &#215; 10<sup>-24</sup></p>
                     </c>
                     <c ca="center">
                        <p>S6d</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>*Correlation not statistically significant. GF, Gardiner-Garden and Frommer's algorithm; TJ, Takai and Jones' algorithm.</p>
               </tblfn>
            </tbl>
         </sec>
         <sec>
            <st>
               <p>CGI density and recombination rate</p>
            </st>
            <p>Recombination rate correlates with both the number of chromosomes and the number of chromosome arms, and recombination elevates GC content, probably via biased gene conversion <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr></abbrgrp>. Fine-scale recombination rates vary extensively among populations <abbrgrp><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr></abbrgrp>, among genomic regions <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>, and between homologous regions of two closely related organisms (human and chimpanzee) <abbrgrp><abbr bid="B28">28</abbr><abbr bid="B29">29</abbr></abbrgrp>, suggesting rapid evolution of local recombination rate patterns. Many genomic features, including CpG dinucleotide frequencies (but not CGIs or CGI density) in genomic sequences, have been used to analyze the pattern of recombination rate. Here we specifically examined the relationship between CGI density and recombination rate at the genome level. We retrieved human recombination rate data (window size, 1 Mb; 2,772 windows) from the UCSC Genome Browser <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. We found a significant positive correlation between CGI density and recombination rate (<it>r </it>= 0.18, <it>P </it>= 1.1 &#215; 10<sup>-22</sup>).</p>
            <p>We obtained a second set of recombination rate data (in 5 Mb and 10 Mb windows) for the human, mouse and rat from Jensen-Seaman <it>et al</it>. <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. We discarded regions that had more than 50% 'N's ('N' denotes an uncertain nucleotide in the sequence) or whose recombination rate was 0. A rate of 0 was likely due to insufficient genetic markers or to the small number of meioses used to construct the genetic maps <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. Again, we found a significant correlation between CGI density and recombination rate, regardless of window size (5 Mb or 10 Mb; Table <tblr tid="T5">5</tblr> and Additional data file 4). For example, the correlation coefficient was 0.33 (<it>P </it>= 5.9 &#215; 10<sup>-16</sup>) for human recombination rates measured in a 5 Mb window (Figure <figr fid="F2">2</figr>). The correlation became stronger as the window size increased. Furthermore, the strength of the correlation differed among the three genomes: with the 5 Mb window, the coefficients were 0.33 (human), 0.24 (mouse), and 0.17 (rat).</p>
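            <p>The window filtering and correlation step can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the dictionary keys, the function names, and the toy windows are hypothetical. Windows with more than 50% 'N's or a recombination rate of 0 are dropped before the Pearson correlation coefficient is computed.</p>
```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def usable(window, max_n_fraction=0.5):
    """Keep a window unless more than half of its bases are N or its rate is 0."""
    return max_n_fraction >= window["n_fraction"] and window["recomb_rate"] != 0


def cgi_recombination_r(windows):
    """Correlation between CGI density and recombination rate over usable windows."""
    kept = [w for w in windows if usable(w)]
    return pearson_r([w["cgi_density"] for w in kept],
                     [w["recomb_rate"] for w in kept])


# Hypothetical toy windows, not data from the study.
toy_windows = [
    {"cgi_density": 5.0, "recomb_rate": 0.8, "n_fraction": 0.0},
    {"cgi_density": 12.0, "recomb_rate": 1.5, "n_fraction": 0.1},
    {"cgi_density": 20.0, "recomb_rate": 1.9, "n_fraction": 0.2},
    {"cgi_density": 30.0, "recomb_rate": 0.0, "n_fraction": 0.0},  # dropped: rate is 0
    {"cgi_density": 2.0, "recomb_rate": 3.0, "n_fraction": 0.7},   # dropped: mostly N
]
kept_windows = [w for w in toy_windows if usable(w)]
toy_r = cgi_recombination_r(toy_windows)
```
            <p>The same routine would be run separately for each species and window size to produce coefficients comparable to those reported in Table <tblr tid="T5">5</tblr>.</p>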
            <tbl id="T5">
               <title>
                  <p>Table 5</p>
               </title>
               <caption>
                  <p>Correlation between CGI density and recombination rate in human, mouse and rat</p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Window size (Mb)</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>r</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>P</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Human</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0.18</p>
                     </c>
                     <c ca="center">
                        <p>1.1 &#215; 10<sup>-22</sup></p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>0.33</p>
                     </c>
                     <c ca="center">
                        <p>5.9 &#215; 10<sup>-16</sup></p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>0.40</p>
                     </c>
                     <c ca="center">
                        <p>1.7 &#215; 10<sup>-12</sup></p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mouse</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>0.24</p>
                     </c>
                     <c ca="center">
                        <p>3.6 &#215; 10<sup>-7</sup></p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>0.33</p>
                     </c>
                     <c ca="center">
                        <p>8.0 &#215; 10<sup>-8</sup></p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Rat</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>0.17</p>
                     </c>
                     <c ca="center">
                        <p>8.1 &#215; 10<sup>-5</sup></p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>0.26</p>
                     </c>
                     <c ca="center">
                        <p>1.7 &#215; 10<sup>-5</sup></p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>The detailed distributions are shown in Additional data file 4. Human recombination rate data measured with a 1 Mb window were based on the deCODE genetic map and downloaded from the UCSC Genome Browser [30]. Recombination rate data measured with 5 Mb and 10 Mb windows were prepared by Jensen-Seaman <it>et al</it>. [31] and downloaded from the associated supplementary material website.</p>
               </tblfn>
            </tbl>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Correlation between CGI density and recombination rate (cM/Mb) in the human genome; a 5 Mb window was used</p>
               </caption>
               <text>
                  <p>Correlation between CGI density and recombination rate (cM/Mb) in the human genome; a 5 Mb window was used.</p>
               </text>
               <graphic file="gb-2008-9-5-r79-2"/>
            </fig>
            <p>Recombination rates were found to increase from the centromeric towards telomeric regions <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. Interestingly, we observed a trend of higher CGI density in the telomeric regions of many chromosomes (Figure <figr fid="F3">3</figr>). This trend is consistent with a positive correlation between CGI density and recombination rate. However, it contrasts with a previous report, based on a small gene dataset, of no correlation between CGI features and chromosomal telomere position <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>.</p>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Distribution of CGI density (per Mb) on human chromosome 8</p>
               </caption>
               <text>
                  <p>Distribution of CGI density (per Mb) on human chromosome 8. The data indicate a trend of higher CGI density in telomeric regions.</p>
               </text>
               <graphic file="gb-2008-9-5-r79-3"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Comparison of CGIs in non-mammalian vertebrate genomes</p>
            </st>
            <p>To obtain information on CGIs in other vertebrate genomes, we scanned seven non-mammalian vertebrate genomes for CGIs: the chicken, lizard and five fish (tetraodon, medaka, zebrafish, stickleback and fugu) genomes. Except for lizard and fugu, all these genomes had assembled chromosomes.</p>
            <p>Table <tblr tid="T6">6</tblr> shows the CGIs and other genome information for the seven non-mammalian vertebrates. CGI density had a much wider range (14.7-161.6 per Mb) among these genomes than among the mammalian genomes. The CGI densities in the chicken (23.0 per Mb) and green anole lizard (25.9 per Mb) were similar to that in the dog (25.3 per Mb), higher than that in the other eight therians, but lower than that in the platypus, a prototherian (35.9 per Mb; Table <tblr tid="T1">1</tblr>). It is worth noting that both the chicken and platypus have many small chromosomes. The chicken karyotype consists of 39 chromosomes, of which 33 are classified as microchromosomes <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. At the DNA sequence level, chicken chromosomes were separated into three groups (large macrochromosomes, intermediate chromosomes and microchromosomes) by the International Chicken Genome Sequencing Consortium <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>. Using this classification, we found that CGI density in the 20 chicken microchromosomes (51.7 per Mb) was much higher than that in the 6 large macrochromosomes (15.0 per Mb; Table <tblr tid="T6">6</tblr>), consistent with an earlier report <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. We did not estimate the CGI density in the large or small chromosomes of platypus because the available assembled genome sequences (410 Mb) represent only a small portion of the genome, which is expected to be about the same size as the human genome <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>.</p>
            <tbl id="T6">
               <title>
                  <p>Table 6</p>
               </title>
               <caption>
                  <p>CpG islands and other genomic features in non-mammalian genomes</p>
               </caption>
               <tblbdy cols="10">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="4" ca="center">
                        <p>Genome</p>
                     </c>
                     <c cspan="5" ca="center">
                        <p>CpG islands</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="4">
                        <hr/>
                     </c>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Species</p>
                     </c>
                     <c ca="center">
                        <p>Length (Mb)*</p>
                     </c>
                     <c ca="center">
                        <p>Number of chromosome pairs</p>
                     </c>
                     <c ca="center">
                        <p>GC content (%)</p>
                     </c>
                     <c ca="center">
                        <p>Obs<sub>CpG</sub>/Exp<sub>CpG</sub></p>
                     </c>
                     <c ca="center">
                        <p>Number of CGIs</p>
                     </c>
                     <c ca="center">
                        <p>CGI density (/Mb)</p>
                     </c>
                     <c ca="center">
                        <p>Average length (bp)</p>
                     </c>
                     <c ca="center">
                        <p>GC content (%)</p>
                     </c>
                     <c ca="center">
                        <p>Obs<sub>CpG</sub>/Exp<sub>CpG</sub></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="10">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Chicken<sup>&#8224;</sup></p>
                     </c>
                     <c ca="center">
                        <p>985</p>
                     </c>
                     <c ca="center">
                        <p>39</p>
                     </c>
                     <c ca="center">
                        <p>41.4</p>
                     </c>
                     <c ca="center">
                        <p>0.248</p>
                     </c>
                     <c ca="center">
                        <p>22,623</p>
                     </c>
                     <c ca="center">
                        <p>23.0</p>
                     </c>
                     <c ca="center">
                        <p>1,098</p>
                     </c>
                     <c ca="center">
                        <p>60.0</p>
                     </c>
                     <c ca="center">
                        <p>0.844</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>Microchromosome</p>
                     </c>
                     <c ca="center">
                        <p>167</p>
                     </c>
                     <c ca="center">
                        <p>20</p>
                     </c>
                     <c ca="center">
                        <p>45.7</p>
                     </c>
                     <c ca="center">
                        <p>0.305</p>
                     </c>
                     <c ca="center">
                        <p>8,634</p>
                     </c>
                     <c ca="center">
                        <p>51.7</p>
                     </c>
                     <c ca="center">
                        <p>1,040</p>
                     </c>
                     <c ca="center">
                        <p>60.4</p>
                     </c>
                     <c ca="center">
                        <p>0.810</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>Macrochromosome</p>
                     </c>
                     <c ca="center">
                        <p>674</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>40.0</p>
                     </c>
                     <c ca="center">
                        <p>0.219</p>
                     </c>
                     <c ca="center">
                        <p>10,125</p>
                     </c>
                     <c ca="center">
                        <p>15.0</p>
                     </c>
                     <c ca="center">
                        <p>1,138</p>
                     </c>
                     <c ca="center">
                        <p>59.6</p>
                     </c>
                     <c ca="center">
                        <p>0.863</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Lizard</p>
                     </c>
                     <c ca="center">
                        <p>1,742</p>
                     </c>
                     <c ca="center">
                        <p>18</p>
                     </c>
                     <c ca="center">
                        <p>40.4</p>
                     </c>
                     <c ca="center">
                        <p>0.296</p>
                     </c>
                     <c ca="center">
                        <p>45,171</p>
                     </c>
                     <c ca="center">
                        <p>25.9</p>
                     </c>
                     <c ca="center">
                        <p>899</p>
                     </c>
                     <c ca="center">
                        <p>56.8</p>
                     </c>
                     <c ca="center">
                        <p>0.728</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Tetraodon</p>
                     </c>
                     <c ca="center">
                        <p>187</p>
                     </c>
                     <c ca="center">
                        <p>21</p>
                     </c>
                     <c ca="center">
                        <p>45.9</p>
                     </c>
                     <c ca="center">
                        <p>0.601</p>
                     </c>
                     <c ca="center">
                        <p>30,175</p>
                     </c>
                     <c ca="center">
                        <p>161.6</p>
                     </c>
                     <c ca="center">
                        <p>1,013</p>
                     </c>
                     <c ca="center">
                        <p>56.7</p>
                     </c>
                     <c ca="center">
                        <p>0.782</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Stickleback</p>
                     </c>
                     <c ca="center">
                        <p>391</p>
                     </c>
                     <c ca="center">
                        <p>21</p>
                     </c>
                     <c ca="center">
                        <p>44.5</p>
                     </c>
                     <c ca="center">
                        <p>0.662</p>
                     </c>
                     <c ca="center">
                        <p>61,768</p>
                     </c>
                     <c ca="center">
                        <p>157.8</p>
                     </c>
                     <c ca="center">
                        <p>824</p>
                     </c>
                     <c ca="center">
                        <p>55.8</p>
                     </c>
                     <c ca="center">
                        <p>0.842</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Medaka</p>
                     </c>
                     <c ca="center">
                        <p>582</p>
                     </c>
                     <c ca="center">
                        <p>24</p>
                     </c>
                     <c ca="center">
                        <p>40.1</p>
                     </c>
                     <c ca="center">
                        <p>0.479</p>
                     </c>
                     <c ca="center">
                        <p>21,522</p>
                     </c>
                     <c ca="center">
                        <p>37.0</p>
                     </c>
                     <c ca="center">
                        <p>746</p>
                     </c>
                     <c ca="center">
                        <p>55.8</p>
                     </c>
                     <c ca="center">
                        <p>0.784</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Zebrafish</p>
                     </c>
                     <c ca="center">
                        <p>1,524</p>
                     </c>
                     <c ca="center">
                        <p>25</p>
                     </c>
                     <c ca="center">
                        <p>36.5</p>
                     </c>
                     <c ca="center">
                        <p>0.531</p>
                     </c>
                     <c ca="center">
                        <p>22,392</p>
                     </c>
                     <c ca="center">
                        <p>14.7</p>
                     </c>
                     <c ca="center">
                        <p>1,162</p>
                     </c>
                     <c ca="center">
                        <p>57.0</p>
                     </c>
                     <c ca="center">
                        <p>0.869</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Fugu</p>
                     </c>
                     <c ca="center">
                        <p>351</p>
                     </c>
                     <c ca="center">
                        <p>22</p>
                     </c>
                     <c ca="center">
                        <p>45.5</p>
                     </c>
                     <c ca="center">
                        <p>0.565</p>
                     </c>
                     <c ca="center">
                        <p>47,251</p>
                     </c>
                     <c ca="center">
                        <p>134.5</p>
                     </c>
                     <c ca="center">
                        <p>872</p>
                     </c>
                     <c ca="center">
                        <p>56.0</p>
                     </c>
                     <c ca="center">
                        <p>0.808</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>*The nucleotides marked as 'N' were not included in the analysis. <sup>&#8224;</sup>Only 30 chromosomes were used in the analysis because chromosomes 29-31 and 33-38 were too small to assemble [39]. The microchromosomes included chromosomes GGA11-28, 32 and W and the macrochromosomes included chromosomes GGA1-5 and Z.</p>
               </tblfn>
            </tbl>
            <p>CGI densities in the five fish genomes varied to a much greater extent than in the mammalian genomes. The CGI densities in tetraodon (161.6 per Mb) and stickleback (157.8 per Mb) were about 11 times that in zebrafish (14.7 per Mb). The Obs<sub>CpG</sub>/Exp<sub>CpG </sub>ratios in the fish genomes (0.479-0.662) were also much higher than those (0.129-0.296) in the mammalian, chicken (0.248) and lizard (0.296) genomes. Fishes are cold-blooded vertebrates and lack GC-rich isochores <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. An early study found that certain fish did not have elevated GC content in nonmethylated CGIs <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>, so our comparison of CGIs in fishes should be interpreted with caution.</p>
            <p>In contrast to the observation in mammalian genomes, the correlation between CGI density and number of chromosome pairs in the seven non-mammals was not significant (<it>r </it>= -0.42, <it>P </it>= 0.17). We further examined CGI density at the chromosome level in the five non-mammalian genomes for which assembled chromosomes are available (chicken, tetraodon, stickleback, medaka and zebrafish), and compared it with that in the nine mammalian genomes. To distinguish the features of CGIs among different genomes, we separated them into groups: primates (human, chimpanzee and macaque), rodents (mouse and rat), dog-horse-cow, opossum, chicken and fish (tetraodon, stickleback, medaka and zebrafish). Figure <figr fid="F4">4</figr> shows plots of CGI density against chromosome GC content. Although there is an overall trend of increasing CGI density with chromosome GC content in both mammals and non-mammals, the distributions of CGI density over chromosome GC content differ between them. In mammals, CGI density is high in dog-horse-cow and low in rodents, but extensive overlaps are seen among different groups, especially between primates and other groups (Figure <figr fid="F4">4a</figr>). This pattern is more evident in the plots of CGI density versus log<sub>10</sub>(chromosome size) or versus chromosome Obs<sub>CpG</sub>/Exp<sub>CpG </sub>ratios (Additional data file 5). Interestingly, we found an overall distinct distribution pattern among non-mammalian genomes, especially among the fish genomes (Figure <figr fid="F4">4b</figr>). The chromosomes from each fish genome clustered together but were separated from those of the other fish genomes (Figure <figr fid="F4">4b</figr>, Additional data file 5). Finally, when all species were plotted together, there were overlaps between mammals and non-mammals, but overall, fish chromosomes and chicken microchromosomes could be separated from the mammalian chromosomes (Figure <figr fid="F4">4b</figr>, Additional data file 5).</p>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>CGI density comparison between mammals and non-mammals</p>
               </caption>
               <text>
                  <p>CGI density comparison between mammals and non-mammals. This figure shows the distribution of CGI density (per Mb) versus chromosome GC content (%). <b>(a) </b>Comparison of four groups in mammals. <b>(b) </b>Comparison of mammals, chicken and fish.</p>
               </text>
               <graphic file="gb-2008-9-5-r79-4"/>
            </fig>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <sec>
            <st>
               <p>Influence of CGI identification algorithms</p>
            </st>
            <p>There are three major algorithms for identifying CGIs in a genomic sequence (reviewed in the Background). The main aim of this study is to investigate and compare the CGIs in present-day mammalian genomes, rather than to identify CGIs in the mammalian ancestral sequences. Thus, our analysis may provide insights into how CGIs have evolved and into their association with gene function and other genomic factors. Since CGIs have been widely documented to be approximately 1 kb long <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B6">6</abbr></abbrgrp>, Takai and Jones' stringent criteria seem to be the most appropriate for our analysis. To ensure the reliability of our analysis, we performed similar analyses using Gardiner-Garden and Frommer's algorithm (only on the non-repeat portions of the genomes) and CpGcluster with the ten mammalian genomes and seven other vertebrate genomes under study. The conclusions were the same; see detailed results in Table <tblr tid="T4">4</tblr> and Additional data files 6 and 7. For example, there was a significant positive correlation between CGI density and chromosome number, using Gardiner-Garden and Frommer's algorithm (<it>r </it>= 0.92, <it>P </it>= 2.0 &#215; 10<sup>-4</sup>; Additional data file 6) or CpGcluster (<it>r </it>= 0.81, <it>P </it>= 0.004; Additional data file 7).</p>
            <p>However, we found that the number of CGIs identified by CpGcluster or Gardiner-Garden and Frommer's algorithm was markedly larger than that identified by Takai and Jones' algorithm (Additional data file 8); for example, the numbers of CGIs identified in the human genome were 37,531 (Takai and Jones), 76,678 (Gardiner-Garden and Frommer), and 197,727 (CpGcluster). The number of genes was estimated to be approximately 20,000-30,000 in mammalian genomes (Additional data file 1). Since CGIs have been widely considered as gene markers, both the Gardiner-Garden and Frommer algorithm and CpGcluster likely identified either many CGIs that are not associated with genes or multiple CGIs that share one gene. To address the latter case, we evaluated the length distribution of CGIs identified by the three algorithms. Among all these vertebrate genomes, the majority of CGIs identified by CpGcluster were shorter than 500 bp (Additional data file 8), which is the minimum length in Takai and Jones' algorithm. For example, the proportions of human CGIs identified by CpGcluster were 44.3% (&lt;200 bp), 45.9% (200-500 bp), 7.3% (500-1,000 bp), 1.9% (1,000-1,500 bp), 0.4% (1,500-2,000 bp), and 0.2% (&#8805;2,000 bp). For Gardiner-Garden and Frommer's algorithm, the proportion of CGIs shorter than 500 bp was also large, for example, 65.8% of the human CGIs and 64.8% of the opossum CGIs (Additional data file 8). Based on the evaluation above, we consider our analysis using Takai and Jones' algorithm to be the most reliable and appropriate, though further evaluation of species-specific algorithms may enhance our results.</p>
         </sec>
         <sec>
            <st>
               <p>Evolution of CGIs</p>
            </st>
            <p>It was hypothesized that CGIs arose once at the dawn of vertebrate evolution and that vertebrate ancestral genes were embedded in entirely non-methylated DNA during the divergence of vertebrates <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. Genome-wide methylation has been found to be common in vertebrates (except for promoter-associated CGIs) and fractional methylation common in invertebrates. The transition from fractional to global methylation likely occurred around the origin of vertebrates <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>. Many CGIs might have lost their typical features due to <it>de novo </it>methylation at their CpG sites and subsequent high deamination rates at the newly methylated CpG sites, leading to TpG and CpA dinucleotides. An excess of TpGs and CpAs, as well as other vanishing CGI features (decreasing length, Obs<sub>CpG</sub>/Exp<sub>CpG </sub>ratio and GC content), has been found in homologous gene regions, providing evidence of frequent CGI losses in mouse and human genes and of a faster loss rate in mice <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B9">9</abbr><abbr bid="B17">17</abbr></abbrgrp>. Recent methylation studies revealed that weak CGIs in promoter regions (promoters with intermediate CpG content, ICPs), most of which were not found in the CGI library, had a faster loss rate of CpGs than stronger CGIs (promoters with high CpG content, HCPs), suggesting that strong CGIs might be protected from methylation and are thus better conserved during evolution <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B37">37</abbr><abbr bid="B38">38</abbr></abbrgrp>. Using the data in Weber <it>et al</it>. <abbrgrp><abbr bid="B37">37</abbr></abbrgrp> and Mikkelsen <it>et al</it>. <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>, we found that HCP density has stronger correlations with genomic features than ICP density in both the human and mouse genomes. The CGIs identified by the Takai-Jones algorithm are different from HCPs or ICPs. 
However, when we separated the promoter-associated CGIs identified by the Takai-Jones algorithm into HCGIs (those that satisfied the HCP criteria) and non-HCGIs, we also found that HCGIs had stronger correlations with genomic features than non-HCGIs. This supports the observations from the methylation studies mentioned above. Although loss of CGIs is likely a major evolutionary scenario in mammals, little comparative analysis at the DNA sequence level has yet been performed, because CGIs have been thought to be poorly conserved between species <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B9">9</abbr></abbrgrp>. Our CGI analysis indicated that rodents have the lowest CGI density and most other eutherians have moderate CGI density when compared to platypus (Table <tblr tid="T1">1</tblr>). Platypus is one of only three extant monotremes and has a fascinating mixture of features typical of mammals and of reptiles and birds. Monotremes (mammalian subclass Prototheria) are the oldest branch of the mammalian tree, having diverged from the therian mammals 210 million years ago <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. Although the platypus genome is incomplete, its higher CGI density is likely genuine because high frequencies of GC and CG dinucleotides and high GC content have been reported <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. Further, our analysis of the genomic sequences of the chicken (bird) and the green anole lizard, the only reptilian genome available at present, showed higher CGI density than in most of the therians we examined (except the dog). These data support an overall decrease in CGIs in mammalian genomes.</p>
            <p>Below we discuss specific CGI features of a few species. The low number of CGIs in the rodent genomes is likely due to a much higher rate of CGI loss and a weaker selective constraint in the rodent lineage <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B17">17</abbr></abbrgrp>. Interestingly, the dog has a notably large number of CGIs and a high CGI density among the nine therians investigated. Our further analysis revealed that the difference is due to a substantial enrichment of CGIs in the dog's intergenic and intronic regions, while the number of CGIs associated with the 5' end of genes is similar to that in the human and the mouse (data not shown). Whether and how CGIs have accumulated in the dog requires further investigation. It is also worth noting that opossum, which belongs to Metatheria, represents another evolutionarily ancient lineage of mammals. Its CGI density is very low (7.5 per Mb), which is likely attributable to its large chromosomes (Table <tblr tid="T1">1</tblr>), as large chromosomes are correlated with low CGI density (Figure <figr fid="F1">1</figr>). Large chromosomes reduce recombination rate, which has a positive correlation with CGI density (Figure <figr fid="F2">2</figr>).</p>
         </sec>
         <sec>
            <st>
               <p>Other possible factors that might influence CGI density</p>
            </st>
            <p>It is interesting to examine whether species traits such as lifespan, body temperature and body mass are related to CGI density. The small body size and short lifespan of mice were speculated to allow for their tolerance towards leaky control of gene activity, including erosion of CGIs <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. A previous study also revealed that methylation status is correlated with body temperature in fish and is affected by the local environment <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>. It was also proposed that the GC content of isochores is driven upward by increasing body temperature, which confers a selective advantage because regions of higher GC content are more thermally stable <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>. Our correlation analysis found a significant correlation between CGI density and body temperature in eight eutherians (<it>r </it>= 0.67, <it>P </it>= 0.035) and nine therians (<it>r </it>= 0.63, <it>P </it>= 0.034; Figure <figr fid="F5">5a</figr>). However, when platypus and/or chicken were added, the correlation was no longer significant. Furthermore, we did not find a significant correlation between CGI density and lifespan in the eight eutherians (<it>r </it>= 0.14, <it>P </it>= 0.38) or nine therians (<it>r </it>= 0.26, <it>P </it>= 0.25; Figure <figr fid="F5">5b</figr>). Several factors might have affected the estimation of lifespan, making the analysis unreliable. First, living environments differ greatly between domesticated and wild animals, and modern medical treatment has increased human longevity. Second, lifespan within the same species may differ according to factors such as sex <abbrgrp><abbr bid="B41">41</abbr></abbrgrp> and hormonal regulation <abbrgrp><abbr bid="B42">42</abbr><abbr bid="B43">43</abbr></abbrgrp>. Third, the divergence among mammals is low when compared to that among other vertebrates. In summary, our analysis of these species traits should be considered preliminary.</p>
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>Correlation between CGI density and species traits</p>
               </caption>
               <text>
                  <p>Correlation between CGI density and species traits. <b>(a) </b>Significant correlation between CGI density and body temperature. <b>(b) </b>Non-significant correlation between CGI density and lifespan.</p>
               </text>
               <graphic file="gb-2008-9-5-r79-5"/>
            </fig>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>This study represents a systematic comparative genomic analysis of CGIs and CGI density at the DNA sequence level in mammals. It reveals significant correlations between CGI density and genomic features such as number of chromosome pairs, chromosome size, and recombination rate. Our results suggest a genome evolution scenario in which an increase in chromosome number increases the rate of recombination, which in turn elevates GC content and thereby helps to prevent loss of CGIs and maintain CGI density. We also compared CGI features in mammals with those in non-mammalian vertebrates and discussed other factors, such as body temperature and lifespan, that have previously been speculated to influence the evolution of sequence composition.</p>
      </sec>
      <sec>
         <st>
            <p>Materials and methods</p>
         </st>
         <sec>
            <st>
               <p>Genome sequences and genome information</p>
            </st>
            <p>We downloaded the assembled genome sequences (ten mammalian genomes and seven non-mammalian vertebrate genomes) from the National Center for Biotechnology Information (NCBI) <abbrgrp><abbr bid="B44">44</abbr></abbrgrp> and the UCSC Genome Browser <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. The species names and data sources are provided in Table <tblr tid="T7">7</tblr>. The repeat-masked sequences of these genomes were downloaded from the UCSC Genome Browser <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. We used the EMBOSS package <abbrgrp><abbr bid="B45">45</abbr></abbrgrp> to calculate the genome size, the GC content and the Obs<sub>CpG</sub>/Exp<sub>CpG </sub>ratios. Gene numbers were based on the annotations in Ensembl <abbrgrp><abbr bid="B46">46</abbr></abbrgrp> and on the literature (details are shown in Additional data file 1). At present, it remains a great challenge to obtain an accurate estimate of the number of genes in a genome, but we suspect that the actual gene numbers in these genomes fall within a narrower range than the 20,000-30,000 shown in Additional data file 1.</p>
            <tbl id="T7">
               <title>
                  <p>Table 7</p>
               </title>
               <caption>
                  <p>Names and sequence information of ten mammals and other vertebrates</p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c ca="left">
                        <p>Common name</p>
                     </c>
                     <c ca="left">
                        <p>Species name</p>
                     </c>
                     <c ca="center">
                        <p>Sequence build</p>
                     </c>
                     <c ca="center">
                        <p>Data source</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mammal</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>Human</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Homo sapiens</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>35.1</p>
                     </c>
                     <c ca="center">
                        <p>NCBI [44]</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>Chimpanzee</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Pan troglodytes</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2.1</p>
                     </c>
                     <c ca="center">
                        <p>NCBI [44]</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>Macaque</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Macaca mulatta</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1.1</p>
                     </c>
                     <c ca="center">
                        <p>NCBI [44]</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>Mouse</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Mus musculus</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>34.1</p>
                     </c>
                     <c ca="center">
                        <p>NCBI [44]</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>Rat</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Rattus norvegicus</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>4.1</p>
                     </c>
                     <c ca="center">
                        <p>NCBI [44]</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>Dog</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Canis familiaris</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2.1</p>
                     </c>
                     <c ca="center">
                        <p>NCBI [44]</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>Cow</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Bos taurus</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>3.1</p>
                     </c>
                     <c ca="center">
                        <p>NCBI [44]</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>Horse</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Equus caballus</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1.1</p>
                     </c>
                     <c ca="center">
                        <p>NCBI [44]</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>Opossum</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Monodelphis domestica</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2.1</p>
                     </c>
                     <c ca="center">
                        <p>NCBI [44]</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Platypus*</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Ornithorhynchus anatinus</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1.1</p>
                     </c>
                     <c ca="center">
                        <p>NCBI [44]</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Non-mammalian vertebrate</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>Chicken<sup>&#8224;</sup></p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Gallus gallus</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2.1</p>
                     </c>
                     <c ca="center">
                        <p>NCBI [44]</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>Green anole lizard<sup>&#8225;</sup></p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Anolis carolinensis</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>anoCar1</p>
                     </c>
                     <c ca="center">
                        <p>UCSC [30]</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>Tetraodon</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Tetraodon nigroviridis</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>tetNig1</p>
                     </c>
                     <c ca="center">
                        <p>UCSC [30]</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>Stickleback</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Gasterosteus aculeatus</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>gasAcu1</p>
                     </c>
                     <c ca="center">
                        <p>UCSC [30]</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>Medaka</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Oryzias latipes</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>oryLat1</p>
                     </c>
                     <c ca="center">
                        <p>UCSC [30]</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>Zebrafish</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Danio rerio</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>danRer5</p>
                     </c>
                     <c ca="center">
                        <p>UCSC [30]</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Fugu<sup>&#8225;</sup></p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Takifugu rubripes</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>fr2</p>
                     </c>
                     <c ca="center">
                        <p>UCSC [30]</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>*The platypus genome was partially assembled. Only chromosomes 1-7, 10-12, 14, 15, 17, 18, 20, X1-X3, and X5 were available. <sup>&#8224;</sup>Only chromosomes 1-28, 32, W, and Z were available. <sup>&#8225;</sup>No assembled chromosomes.</p>
               </tblfn>
            </tbl>
         </sec>
         <sec>
            <st>
               <p>Identification of CpG islands</p>
            </st>
            <p>We used three algorithms to identify CGIs. First, we used the stringent search criteria of the Takai and Jones algorithm <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>: GC content &#8805;55%, Obs<sub>CpG</sub>/Exp<sub>CpG </sub>&#8805;0.65, and length &#8805;500 bp. Second, we used the algorithm originally developed by Gardiner-Garden and Frommer <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>: GC content >50%, Obs<sub>CpG</sub>/Exp<sub>CpG </sub>>0.60, and length >200 bp. Because some repeats (for example, <it>Alu</it>) meet these criteria, we scanned for CGIs only in the non-repeat portions of these genomes, as was done in other genome-wide identification studies <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B11">11</abbr></abbrgrp>. For both the Takai and Jones and the Gardiner-Garden and Frommer algorithms, we used the CpG island searcher program (CpGi130) available at <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. Third, we used CpGcluster, developed by Hackenberg <it>et al. </it><abbrgrp><abbr bid="B15">15</abbr></abbrgrp>, to scan for CGIs in the whole genome.</p>
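            <p>As an illustration only, the two sets of threshold criteria above can be expressed as a check on a single candidate region. The sketch below is not the CpGi130 program: it does not slide or merge windows, the function names are hypothetical, and it assumes the commonly used definition Obs/Exp = (number of CpG x length) / (number of C x number of G).</p>

```python
def gc_content(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def obs_exp_cpg(seq):
    """Obs/Exp CpG ratio, assuming (CpG count * length) / (C count * G count)."""
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0  # ratio is undefined without both C and G; treated as 0 here
    return seq.count("CG") * len(seq) / (c_count * g_count)


def meets_takai_jones(seq):
    """Stringent criteria: length at least 500 bp, GC at least 55%, Obs/Exp at least 0.65."""
    return len(seq) >= 500 and gc_content(seq) >= 0.55 and obs_exp_cpg(seq) >= 0.65


def meets_gardiner_garden_frommer(seq):
    """Original criteria: length over 200 bp, GC over 50%, Obs/Exp over 0.60."""
    return len(seq) > 200 and gc_content(seq) > 0.50 and obs_exp_cpg(seq) > 0.60


if __name__ == "__main__":
    # Toy sequences, not real genomic data.
    long_cg = "CG" * 300   # 600 bp
    short_cg = "CG" * 150  # 300 bp
    print(meets_takai_jones(long_cg), meets_takai_jones(short_cg))
    print(meets_gardiner_garden_frommer(short_cg))
```

            <p>In this toy example, the 300 bp region satisfies the Gardiner-Garden and Frommer thresholds but not the Takai and Jones thresholds, because it is shorter than 500 bp.</p>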
            <p>We used the method of Jiang and Zhao <abbrgrp><abbr bid="B48">48</abbr></abbrgrp> to identify CGIs in different genomic regions (genes, intergenic regions, intragenic regions, and TSS regions). Briefly, we compared the locations of CGIs with the coordinates of genic, intergenic, and intragenic regions and TSSs based on the human gene annotation information from the NCBI database (build 35.1) <abbrgrp><abbr bid="B44">44</abbr><abbr bid="B49">49</abbr></abbrgrp>. CGIs that overlapped any gene were classified as gene-associated CGIs; CGIs whose whole sequences were in intergenic regions were classified as intergenic CGIs; CGIs whose sequences were within gene regions were classified as intragenic CGIs; and CGIs that overlapped TSSs were classified as TSS CGIs.</p>
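            <p>The coordinate comparison described above can be sketched as a simple interval test. This is a minimal illustration, not the implementation of Jiang and Zhao: the function names and data layout are hypothetical, coordinates are assumed to be closed intervals on one chromosome, "intragenic" is assumed to mean fully contained within a gene, and the labels are allowed to co-occur.</p>

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two closed intervals share at least one position."""
    return a_end >= b_start and b_end >= a_start


def classify_cgi(cgi, genes, tss_positions):
    """Return the set of region labels for one CGI.

    cgi: (start, end); genes: list of (start, end); tss_positions: list of ints.
    All coordinates are assumed to lie on the same chromosome.
    """
    start, end = cgi
    labels = set()
    hits = [g for g in genes if overlaps(start, end, g[0], g[1])]
    if hits:
        labels.add("gene-associated")
        # Assumption: intragenic means the CGI lies entirely inside a gene.
        if any(start >= g[0] and g[1] >= end for g in hits):
            labels.add("intragenic")
    else:
        # No overlap with any gene: the whole CGI is in an intergenic region.
        labels.add("intergenic")
    if any(t >= start and end >= t for t in tss_positions):
        labels.add("TSS")
    return labels


if __name__ == "__main__":
    # Hypothetical coordinates for illustration only.
    genes = [(1000, 5000)]
    tss_positions = [1000]
    for cgi in [(800, 1200), (2000, 2500), (6000, 6500)]:
        print(cgi, sorted(classify_cgi(cgi, genes, tss_positions)))
```

            <p>With these hypothetical coordinates, the first CGI is labelled gene-associated and TSS, the second gene-associated and intragenic, and the third intergenic.</p>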
         </sec>
         <sec>
            <st>
               <p>Recombination rate and CGI density</p>
            </st>
            <p>We retrieved human recombination rate data based on the deCODE genetic map <abbrgrp><abbr bid="B50">50</abbr></abbrgrp> from the UCSC Genome Browser <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. The recombination rates were measured in 1 Mb windows. We obtained another set of recombination rates from Jensen-Seaman <it>et al</it>. <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. These data were measured in 5 Mb and 10 Mb windows for the human, mouse and rat and are available in the supplementary material for Jensen-Seaman <it>et al</it>. <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. For both datasets, we discarded regions in which more than 50% of the bases were 'N' <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. We also discarded regions whose recombination rates were 0, because too few genetic markers were found in these regions <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>.</p>
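            <p>The two window filters described above can be sketched as follows. The function and parameter names are hypothetical, each window is assumed to be supplied as its sequence string together with its recombination rate in cM/Mb, and an empty window is treated as uninformative; none of these details come from the original pipeline.</p>

```python
def n_fraction(seq):
    """Fraction of 'N' bases in a window (1.0 for an empty window, by assumption)."""
    seq = seq.upper()
    if not seq:
        return 1.0
    return seq.count("N") / len(seq)


def keep_window(seq, recombination_rate, max_n_fraction=0.5):
    """Apply the two filters: drop windows that are more than 50% 'N'
    and windows whose recombination rate is 0."""
    if n_fraction(seq) > max_n_fraction:
        return False
    if recombination_rate == 0:
        return False
    return True


if __name__ == "__main__":
    # Toy windows (sequence, rate in cM/Mb); values are illustrative only.
    windows = [
        ("ACGT" * 10, 1.2),             # kept
        ("N" * 30 + "ACGT" * 5, 0.8),   # 60% N, discarded
        ("ACGT" * 10, 0.0),             # zero rate, discarded
    ]
    kept = [w for w in windows if keep_window(w[0], w[1])]
    print(len(kept))
```

            <p>Only the first toy window passes both filters; CGI density and recombination rate would then be compared across the retained windows.</p>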
         </sec>
         <sec>
            <st>
               <p>Body temperature and lifespan in mammals</p>
            </st>
            <p>Records of body temperature for a species may vary to some extent in the literature because measurements might have been taken under different conditions (for example, time of day, season, or geographical location) or at different sites on the body. The body temperatures of the ten mammals in this study were obtained from the literature (details are shown in Additional data file 9). When a range of body temperatures was reported for a species in the literature, the average was used as the representative temperature. There are several measures of lifespan, such as maximum lifespan, average lifespan, and lifespan of each sex. We used maximum lifespan, which was based on reports in the literature and on the AnAge database <abbrgrp><abbr bid="B51">51</abbr></abbrgrp> (Additional data file 9).</p>
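            <p>A minimal sketch of how a representative temperature and a correlation coefficient might be computed is given below. It assumes that the average of a reported range is the midpoint of its endpoints and that the correlation coefficient is Pearson's <it>r</it>; neither assumption is stated in the text, the function names are hypothetical, and the numbers are placeholders rather than the values in Additional data file 9.</p>

```python
import math


def representative_temperature(low, high):
    """Midpoint of a reported temperature range (assumed meaning of 'average')."""
    return (low + high) / 2.0


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or 2 > n:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Placeholder values for illustration only; not data from this study.
    temperature_ranges = [(36.0, 38.0), (37.5, 38.5), (38.0, 39.0)]
    cgi_density = [10.0, 12.0, 15.0]
    temperatures = [representative_temperature(lo, hi) for lo, hi in temperature_ranges]
    print(round(pearson_r(temperatures, cgi_density), 3))
```

            <p>The same function could be applied to maximum lifespan in place of body temperature; significance testing is not covered by this sketch.</p>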
         </sec>
      </sec>
      <sec>
         <st>
            <p>Abbreviations</p>
         </st>
         <p>CGI, CpG island; HCGI, CGI satisfying the HCP criteria; HCP, high CpG content promoter; ICP, intermediate CpG content promoter; TSS, transcriptional start site.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>LH prepared the data, carried out the data analysis, and contributed to the writing of the manuscript. BS participated in study design and coordination. WHL participated in study design and contributed to the writing of the manuscript. ZZ conceived of the study, participated in the data analysis and interpretation, and contributed to the writing of the manuscript. All authors read and approved the final manuscript.</p>
      </sec>
      <sec>
         <st>
            <p>Additional data files</p>
         </st>
         <p>The following additional data are available. Additional data file <supplr sid="S1">1</supplr> is a table that lists the numbers of genes estimated in mammalian genomes. Additional data file <supplr sid="S2">2</supplr> shows the correlations between CGI density and genomic features in ten mammalian genomes (including platypus). Additional data file <supplr sid="S3">3</supplr> shows the correlations between intergenic CGI density and genomic features in nine mammalian genomes. Additional data file <supplr sid="S4">4</supplr> shows the correlations between CGI density and average recombination rate (cM/Mb) in the human, mouse and rat genomes. Additional data file <supplr sid="S5">5</supplr> provides the comparison of CpG islands and other genomic features between mammalian and non-mammalian genomes. Additional data file <supplr sid="S6">6</supplr> shows the correlations between CGI density and genomic features in mammalian genomes using the Gardiner-Garden and Frommer algorithm in the non-repeat portions of genomes. Additional data file <supplr sid="S7">7</supplr> shows the correlations between CGI density and genomic features in mammalian genomes using the CpGcluster algorithm. Additional data file <supplr sid="S8">8</supplr> lists the numbers of CGIs in each genome identified by the three algorithms and shows their length distribution. Additional data file <supplr sid="S9">9</supplr> lists the body temperature and lifespan for each species.</p>
         <suppl id="S1">
            <title>
               <p>Additional file 1</p>
            </title>
            <text>
               <p>Numbers of genes estimated in mammalian genomes.</p>
            </text>
            <file name="gb-2008-9-5-r79-S1.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S2">
            <title>
               <p>Additional file 2</p>
            </title>
            <text>
               <p>Correlations between CGI density and genomic features in ten mammalian genomes (including platypus).</p>
            </text>
            <file name="gb-2008-9-5-r79-S2.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S3">
            <title>
               <p>Additional file 3</p>
            </title>
            <text>
               <p>Correlations between intergenic CGI density and genomic features in nine mammalian genomes.</p>
            </text>
            <file name="gb-2008-9-5-r79-S3.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S4">
            <title>
               <p>Additional file 4</p>
            </title>
            <text>
               <p>Correlations between CGI density and average recombination rate (cM/Mb) in the human, mouse and rat genomes.</p>
            </text>
            <file name="gb-2008-9-5-r79-S4.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S5">
            <title>
               <p>Additional file 5</p>
            </title>
            <text>
               <p>Comparison of CpG islands and other genomic features between mammalian and non-mammalian genomes.</p>
            </text>
            <file name="gb-2008-9-5-r79-S5.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S6">
            <title>
               <p>Additional file 6</p>
            </title>
            <text>
               <p>Correlations between CGI density and genomic features in mammalian genomes using the Gardiner-Garden and Frommer algorithm in the non-repeat portions of genomes. In both Additional data files 6 and 7, the platypus chromosomes were excluded because of incomplete genome sequence data and chromosome data. The conclusion would be the same when the platypus data were included.</p>
            </text>
            <file name="gb-2008-9-5-r79-S6.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S7">
            <title>
               <p>Additional file 7</p>
            </title>
            <text>
               <p>Correlations between CGI density and genomic features in mammalian genomes using the CpGcluster algorithm. In both Additional data files 6 and 7, the platypus chromosomes were excluded because of incomplete genome sequence data and chromosome data. The conclusion would be the same when the platypus data were included.</p>
            </text>
            <file name="gb-2008-9-5-r79-S7.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S8">
            <title>
               <p>Additional file 8</p>
            </title>
            <text>
               <p>The first sheet ('overview') summarizes the total number of CGIs in each genome identified by each algorithm. The length distribution of CGIs in each genome is shown in each additional sheet.</p>
            </text>
            <file name="gb-2008-9-5-r79-S8.xls">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S9">
            <title>
               <p>Additional file 9</p>
            </title>
            <text>
               <p>Body temperature and lifespan for each species.</p>
            </text>
            <file name="gb-2008-9-5-r79-S9.xls">
               <p>Click here for file</p>
            </file>
         </suppl>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We thank the two anonymous reviewers for valuable comments. We are grateful to Dr John Speakman for suggestions on estimating lifespan and body temperature. This project was supported by the Thomas F and Kate Miller Jeffress Memorial Trust Fund and a NARSAD Young Investigator Award to Z Zhao and NIH grants to WH Li.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>CpG-rich islands and the function of DNA methylation.</p>
            </title>
            <aug>
               <au>
                  <snm>Bird</snm>
                  <fnm>AP</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>1986</pubdate>
            <volume>321</volume>
            <fpage>209</fpage>
            <lpage>213</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/321209a0</pubid>
                  <pubid idtype="pmpid" link="fulltext">2423876</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Initial sequencing and analysis of the human genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
               <au>
                  <snm>Linton</snm>
                  <fnm>LM</fnm>
               </au>
               <au>
                  <snm>Birren</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Nusbaum</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Zody</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Baldwin</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Devon</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Dewar</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Doyle</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>FitzHugh</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Funke</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Gage</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Harris</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Heaford</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Howland</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Kann</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Lehoczky</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>LeVine</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>McEwan</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>McKernan</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Meldrim</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Mesirov</snm>
   