<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>gb-2002-3-10-research0058</ui>
   <ji>GBJ</ji>
   <fm>
      <dochead>Research</dochead>
      <bibl>
         <title>
            <p>Asymmetric directional mutation pressures in bacteria</p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>Lobry</snm>
               <mi>R</mi>
               <fnm>Jean</fnm>
               <insr iid="I1"/>
               <email>lobry@biomserv.univ-lyon1.fr</email>
            </au>
            <au id="A2">
               <snm>Sueoka</snm>
               <fnm>Noboru</fnm>
               <insr iid="I2"/>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Laboratoire BBE CNRS UMR 5558, Universit&#233; Claude Bernard, 43 Bd du 11 Novembre 1918, F-69622 Villeurbanne cedex, France</p>
            </ins>
            <ins id="I2">
               <p>Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, CO 80309-0347, USA</p>
            </ins>
         </insg>
         <source>Genome Biology</source>
         <issn>1465-6906</issn>
         <pubdate>2002</pubdate>
         <volume>3</volume>
         <issue>10</issue>
         <fpage>research0058.1</fpage>
         <lpage>research0058.14</lpage>
         <url>http://genomebiology.com/2002/3/10/research/0058</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="doi">10.1186/gb-2002-3-10-research0058</pubid>
               <pubid idtype="pmpid">12372146</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>13</day>
               <month>11</month>
               <year>2001</year>
            </date>
         </rec>
         <revrec>
            <date>
               <day>18</day>
               <month>6</month>
               <year>2002</year>
            </date>
         </revrec>
         <acc>
            <date>
               <day>15</day>
               <month>8</month>
               <year>2002</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>26</day>
               <month>9</month>
               <year>2002</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2002</year>
         <collab>Lobry and Sueoka, licensee BioMed Central Ltd</collab>
      </cpyrt>
      <shorttitle>
         <p>Asymmetric directional mutation pressures in bacteria</p>
      </shorttitle>
      <shortabs>
         <p>When there are no strand-specific biases in mutation and selection rates between the two strands of DNA, the average nucleotide composition is theoretically expected to be A = T and G = C within each strand. By focusing on weakly selected regions that could be oriented with respect to replication in 43 out of 51 completely sequenced bacterial chromosomes, asymmetric directional mutation pressures have been detected.</p>
      </shortabs>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>When there are no strand-specific biases in mutation and selection rates (that is, in the substitution rates) between the two strands of DNA, the average nucleotide composition is theoretically expected to be A = T and G = C within each strand. Deviations from these equalities are therefore evidence for an asymmetry in selection and/or mutation between the two strands. By focusing on weakly selected regions that could be oriented with respect to replication in 43 out of 51 completely sequenced bacterial chromosomes, we have been able to detect asymmetric directional mutation pressures.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>Most of the 43 chromosomes were found to be relatively enriched in G over C and T over A, and slightly depleted in G+C, in their weakly selected positions (intergenic regions and third codon positions) in the leading strand compared with the lagging strand. Deviations from A = T and G = C were highly correlated between third codon positions and intergenic regions, with a lower degree of deviation in intergenic regions, and were not correlated with overall genomic G+C content.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusions</p>
               </st>
               <p>During the course of bacterial chromosome evolution, the effects of asymmetric directional mutation pressures are commonly observed in weakly selected positions. The degree of deviation from equality is highly variable among species, and within species is higher in third codon positions than in intergenic regions. The orientation of these effects is almost universal and is compatible in most cases with the hypothesis of an excess of cytosine deamination in the single-stranded state during DNA replication. However, the variation in G+C content between species is influenced by factors other than asymmetric mutation pressure.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="BMC" subtype="man_spc_id" id="30010008">Evolution</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010014">Microbiology and parasitology</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010002">Bioinformatics</classification>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>The G+C content of bacterial DNA ranges from around 25% to around 75% [<abbr bid="B1">1</abbr>,<abbr bid="B2">2</abbr>]. In contrast to the nuclear genomes of some vertebrates, the intragenomic heterogeneity of G+C content is generally small in bacteria [<abbr bid="B3">3</abbr>,<abbr bid="B4">4</abbr>], although there are some exceptions [<abbr bid="B5">5</abbr>,<abbr bid="B6">6</abbr>,<abbr bid="B7">7</abbr>]. The wide interspecies variation and narrow intraspecies heterogeneity of the DNA G+C content have been interpreted as the result of differences in the bidirectional mutation rates between AT and GC pairs [<abbr bid="B8">8</abbr>]. This kind of mutation pressure is called G+C pressure. From Watson-Crick base-pairing rules [<abbr bid="B9">9</abbr>], the G+C content is exactly the same for the two complementary strands in a DNA duplex, so the effects of G+C pressure are symmetric with respect to the two strands. These effects are dramatically increased in regions where selective pressure is weak: for instance, in the third codon positions of protein-coding sequences, where most substitutions do not alter the encoded amino acid, the average G+C content spans at least 7% to 95% between bacteria of different species [<abbr bid="B7">7</abbr>,<abbr bid="B10">10</abbr>,<abbr bid="B11">11</abbr>]. It should be stressed that symmetric directional mutation effects are not restricted to purely neutral regions, and strongly influence amino-acid frequencies in proteins [<abbr bid="B12">12</abbr>,<abbr bid="B13">13</abbr>,<abbr bid="B14">14</abbr>,<abbr bid="B15">15</abbr>,<abbr bid="B16">16</abbr>].</p>
         <p>The two strands of a DNA duplex must have the same G+C content, but the different bases can still vary in frequency. For instance, one strand may have more G than C, say G/(G+C) = 0.8, and by complementarity the other one will have more C than G, G/(G+C) = 0.2 in this case. The strand-specific asymmetries in DNA have been the subject of considerable research [<abbr bid="B17">17</abbr>,<abbr bid="B18">18</abbr>,<abbr bid="B19">19</abbr>,<abbr bid="B20">20</abbr>,<abbr bid="B21">21</abbr>,<abbr bid="B22">22</abbr>,<abbr bid="B23">23</abbr>]. Briefly, the theoretical predictions are as follows: when mutation and selection are assumed to be symmetric with respect to the two strands of DNA, parity rule 1 (PR1) holds where the following pairs of substitution rates are equal: r<sub>AT</sub> = r<sub>TA</sub>, r<sub>GC</sub> = r<sub>CG</sub>, r<sub>AG</sub> = r<sub>TC</sub>, r<sub>GA</sub> = r<sub>CT</sub>, r<sub>AC</sub> = r<sub>TG</sub>, r<sub>CA</sub> = r<sub>GT</sub> [<abbr bid="B20">20</abbr>]. Here, r<sub>AT</sub>, and so on, are substitution rates of A &#8594; T, and so on, in a specific strand. Under the PR1 hypothesis, the intrastrand base composition at equilibrium is expected to be always in a particular state, the parity rule 2 (PR2) state, such that A = T and G = C within each strand [<abbr bid="B20">20</abbr>,<abbr bid="B21">21</abbr>,<abbr bid="B24">24</abbr>]. This PR2 state is more than just an equilibrium point, because convergence towards PR2 continues even if the substitution rates are modified: there is no way to escape from the PR2 state under the PR1 hypothesis [<abbr bid="B25">25</abbr>]. However, a bias in mutation or selection between the two strands of DNA can generate deviations from PR2 (that is, A &#8800; T, or G &#8800; C, or both). In this situation, the underlying cause, regardless of its mutational or selective nature, is asymmetric between the two strands.</p>
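         <p>The convergence towards PR2 under PR1 can be illustrated numerically. The following is a minimal sketch, not the authors' method: the function names, the six rate values, the starting composition and the discrete-step iteration are all assumptions chosen for illustration. It expands six free substitution rates into twelve by imposing the PR1 equalities, then iterates the intrastrand composition from a deliberately skewed starting point.</p>

```python
BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def pr1_rates(r_at, r_gc, r_ag, r_ga, r_ac, r_ca):
    """Expand six free rates into all twelve, imposing the PR1 equalities."""
    free = {
        ("A", "T"): r_at, ("G", "C"): r_gc,
        ("A", "G"): r_ag, ("G", "A"): r_ga,
        ("A", "C"): r_ac, ("C", "A"): r_ca,
    }
    rates = dict(free)
    for (src, dst), rate in free.items():
        # PR1: the complementary substitution occurs at the same rate,
        # for example r_TC is set equal to r_AG.
        rates[(COMPLEMENT[src], COMPLEMENT[dst])] = rate
    return rates


def equilibrium(rates, start, steps=5000):
    """Iterate the intrastrand base frequencies in discrete steps."""
    freq = dict(start)
    for _ in range(steps):
        new = {}
        for b in BASES:
            gain = sum(freq[o] * rates[(o, b)] for o in BASES if o != b)
            loss = freq[b] * sum(rates[(b, o)] for o in BASES if o != b)
            new[b] = freq[b] + gain - loss
        freq = new
    return freq


# Illustrative (hypothetical) per-step rates and a skewed starting composition.
rates = pr1_rates(r_at=0.01, r_gc=0.02, r_ag=0.03, r_ga=0.05,
                  r_ac=0.015, r_ca=0.025)
skewed_start = {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1}
eq = equilibrium(rates, skewed_start)
print({b: round(eq[b], 4) for b in BASES})
```

         <p>With these illustrative rates the composition settles at A = T and G = C while the G+C content (0.375 here) stays away from 0.5, so PR2 is reached without requiring A = T = G = C.</p>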
         <p>Note that the literature seriously confuses two meanings of 'symmetric' for substitution matrices: symmetric in the sense that the two strands evolve with the same pattern of substitutions (the PR1 hypothesis), and symmetric in the sense that the terms above and below the diagonal are the same (the SYM model, where r<sub>ij</sub> = r<sub>ji</sub>). The SYM model, adopted previously by others, especially in the field of molecular phylogeny [<abbr bid="B26">26</abbr>], assumes that, within a strand, the forward and reverse mutation rates between any two bases are equal: in contrast to the PR1 model, the SYM model does not incorporate DNA base-pairing rules. The SYM model has long been recognized as unrealistic because it predicts an unbiased equilibrium composition of A = T = G = C. The PR1 model is not a special case of the SYM model; rejection of the SYM model therefore does not lead to rejection of the PR1 model.</p>
         <p>The effect of asymmetric mutation pressure on the amino-acid content of proteins has been reported [<abbr bid="B27">27</abbr>,<abbr bid="B28">28</abbr>,<abbr bid="B29">29</abbr>]. In terms of amino-acid composition, the diversifying selection on integral membrane proteins (enriched in hydrophobic amino acids) and on cytoplasmic proteins (enriched in hydrophilic amino acids) is generally the most important factor in within-proteome variability [<abbr bid="B30">30</abbr>]. In <it>Borrelia burgdorferi</it>, however, the effects of this selective pressure are quantitatively less important than those resulting from asymmetric mutation pressure [<abbr bid="B31">31</abbr>], stressing again that mutation pressure is far from being a negligible phenomenon.</p>
         <p>Deviations from PR2 have been studied using many different approaches, including correspondence discriminant analysis [<abbr bid="B32">32</abbr>], linear discriminant analysis with rotating windows [<abbr bid="B28">28</abbr>], correspondence analysis applied to relative synonymous codon usage [<abbr bid="B27">27</abbr>,<abbr bid="B33">33</abbr>,<abbr bid="B34">34</abbr>], analysis of variance [<abbr bid="B35">35</abbr>], a comparative genomics approach [<abbr bid="B22">22</abbr>,<abbr bid="B36">36</abbr>,<abbr bid="B37">37</abbr>,<abbr bid="B38">38</abbr>], simple DNA walks [<abbr bid="B39">39</abbr>], detrended DNA walks split by codon positions [<abbr bid="B40">40</abbr>,<abbr bid="B41">41</abbr>,<abbr bid="B42">42</abbr>], skewed oligomer distributions [<abbr bid="B43">43</abbr>], and moving windows plots with direct [<abbr bid="B44">44</abbr>,<abbr bid="B45">45</abbr>] or cumulated [<abbr bid="B46">46</abbr>,<abbr bid="B47">47</abbr>] or codon position-dependent [<abbr bid="B48">48</abbr>] skew indices. Here we have used a simple graphical representation, very close to the raw data, that provides a straightforward visualization and interpretation of results.</p>
         <p>Consider a complete annotated bacterial chromosome in which origin and terminus of replication have been located so that we know whether a subsequence is on the leading strand or the lagging strand at replication (Figure <figr fid="F1">1</figr>). Let A<sub>3</sub>, C<sub>3</sub>, G<sub>3</sub>, and T<sub>3</sub> denote the base counts in the sense strand of a given coding sequence when only the weakly selected third codon positions are considered. Deviations from PR2 in third codon positions are characterized by G<sub>3</sub>/(G<sub>3</sub>+C<sub>3</sub>) and A<sub>3</sub>/(A<sub>3</sub>+T<sub>3</sub>) to draw PR2-plots [<abbr bid="B7">7</abbr>].</p>
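         <p>As a hedged illustration of how one point of a PR2-plot could be obtained, the sketch below counts bases at third codon positions of a sense-strand coding sequence and returns the two ratios defined above. The function names and the toy sequence are hypothetical; the toy sequence keeps its stop codon, whereas a real analysis might treat start and stop codons differently.</p>

```python
def third_position_counts(cds):
    """Count A, C, G, T at third codon positions of a sense-strand sequence."""
    counts = {"A": 0, "C": 0, "G": 0, "T": 0}
    for base in cds.upper()[2::3]:
        if base in counts:  # skip ambiguous symbols such as N
            counts[base] += 1
    return counts


def pr2_point(counts):
    """Return (G/(G+C), A/(A+T)), the x and y coordinates of a PR2-plot."""
    gc = counts["G"] + counts["C"]
    at = counts["A"] + counts["T"]
    if gc == 0 or at == 0:
        raise ValueError("ratio undefined: no G+C or no A+T at these positions")
    return counts["G"] / gc, counts["A"] / at


# Toy coding sequence (hypothetical): ATG GCT AAG GGT TTC GAA TAA
cds = "ATGGCTAAGGGTTTCGAATAA"
counts = third_position_counts(cds)
x, y = pr2_point(counts)
print(counts, x, y)
```

         <p>Averaging such points separately over the leading-strand and lagging-strand coding sequences gives the two group averages discussed in the text.</p>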
         <fig id="F1">
            <title>
               <p>Figure 1</p>
            </title>
            <caption>
               <p>Representation of the strands used for analysis in this paper</p>
            </caption>
            <text>
               <p>Representation of the strands used for analysis in this paper. The leading strand for replication is black and the lagging one is shaded gray; note the switch at the origin of replication. The sense strand of coding sequences is represented by white arrows. The sense sequence of a coding region that is transcribed in the same direction as the motion of the replication fork is in the leading strand, whereas a sense sequence that is transcribed in the opposite direction is in the lagging strand. The strand used for large intergenic spaces, represented by large boxes, is always the published strand from 5' to 3'. The strand used for potentially transcribed untranslated spaces, that is, small intergenic regions among co-oriented genes, is represented by small boxes.</p>
            </text>
            <graphic file="gb-2002-3-10-research0058-1"/>
         </fig>
         <p>Figure <figr fid="F2">2a</figr> shows diagrammatically what would be expected under the null hypothesis PR1: the PR2 state should be observed. Although there will be statistical fluctuations, spreading genes across the space indicated by each circle, the circles are centered at the midpoint of the plot (0.5, 0.5), which corresponds to the PR2 state. The range of dispersion is greater for the leading-strand group, because we have assumed here (for clarity, to avoid overlap in the figures) that there is an excess of coding sequences in the leading strand, as is usually observed in bacteria [<abbr bid="B48">48</abbr>]. Figure <figr fid="F2">2b</figr> shows diagrammatically what would be expected when replication-associated mutation pressure enriches the leading strand in G and T. The leading-strand coding sequences, whose sense strand is, by definition, the leading strand for replication (Figure <figr fid="F1">1</figr>), are therefore also enriched in G and T. Thus, <it>x</it><sub>1</sub>, their average G<sub>3</sub>/(G<sub>3</sub>+C<sub>3</sub>) value, is greater than 0.5, and <it>y</it><sub>1</sub>, their average A<sub>3</sub>/(A<sub>3</sub>+T<sub>3</sub>) value, is less than 0.5. From DNA duplex base-pairing rules, if the leading strand for replication is enriched in G and T, its complementary lagging strand is conversely enriched in C and A. Therefore, we have a perfectly symmetric situation for the lagging-strand coding sequences, whose sense strand is, by definition, the lagging strand for replication (Figure <figr fid="F1">1</figr>), with <it>x</it><sub>2</sub> = 1 - <it>x</it><sub>1</sub> and <it>y</it><sub>2</sub> = 1 - <it>y</it><sub>1</sub>. 
Note that the overall average is biased towards the leading-strand group because of the excess of leading-strand coding sequences, but that the midpoint (<it>x</it><sub>c</sub> = (<it>x</it><sub>1</sub> + <it>x</it><sub>2</sub>)/2, <it>y</it><sub>c</sub> = (<it>y</it><sub>1</sub> + <it>y</it><sub>2</sub>)/2) is not affected by this uneven distribution. Figure <figr fid="F2">2c</figr> shows diagrammatically what would be expected with asymmetric forces, such as codon usage that optimizes translation efficiency [<abbr bid="B49">49</abbr>,<abbr bid="B50">50</abbr>] or transcription-induced mutation pressure [<abbr bid="B51">51</abbr>]. These affect the sense strand of both the leading and lagging coding sequences in the same direction, because the analyzed strand is always the sense strand. Note that the deviation from the null hypothesis could be higher [<abbr bid="B22">22</abbr>] for the leading group because a greater number of highly expressed genes are found in this group [<abbr bid="B48">48</abbr>], so that both transcription-induced mutation and translation-induced selection [<abbr bid="B52">52</abbr>,<abbr bid="B53">53</abbr>] could be higher in these genes. However, this effect was found to be negligible [<abbr bid="B35">35</abbr>], and never produces a symmetric distribution around the center of the two groups, as in the previous case (Figure <figr fid="F2">2b</figr>), because the two groups are expected to be moved in the same direction by transcription- and translation-induced biases, instead of in opposite directions as for replication-induced biases. Figure <figr fid="F2">2d</figr> shows diagrammatically what would be expected when all phenomena are combined. The distance between the averages of the two groups,</p>
         <fig id="F2">
            <title>
               <p>Figure 2</p>
            </title>
            <caption>
               <p>Expected deviations in PR2 plots in coding sequences and small intergenic regions among co-oriented genes</p>
            </caption>
            <text>
               <p>Expected deviations in PR2 plots in coding sequences and small intergenic regions among co-oriented genes. <b>(a)</b> The null hypothesis: mutation and selection are symmetric with respect to the two DNA strands; <b>(b)</b> replication-induced mutation pressure is asymmetric; <b>(c)</b> transcription- or translation-associated mutation or selection pressures are asymmetric; <b>(d)</b> all forces in (b) and (c) combined are asymmetric.</p>
            </text>
            <graphic file="gb-2002-3-10-research0058-2"/>
         </fig>
         <p>
            <graphic file="gb-2002-3-10-research0058-i1.gif"/>
         </p>
         <p>is the contribution of replication-associated effects to PR2 deviations. The distance from the center of the plot to the midpoint of <it>B</it><sub><it>I</it></sub> (<it>x</it><sub>c</sub>; <it>y</it><sub>c</sub>),</p>
         <p>
            <graphic file="gb-2002-3-10-research0058-i2.gif"/>
         </p>
         <p>is the overall contribution of transcription- and translation-associated effects to PR2 deviations, where <it>B</it><sub><it>I</it></sub> is the replication-associated bias.</p>
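         <p>The two distances can be sketched in a few lines. This is an illustration under a stated assumption: the formulas themselves are given above only as images, so both quantities are taken here to be plain Euclidean distances in the PR2-plot, consistent with the verbal definitions. The function names and the example averages are hypothetical.</p>

```python
import math


def replication_bias(leading, lagging):
    """B_I: distance between the leading and lagging group averages."""
    (x1, y1), (x2, y2) = leading, lagging
    return math.hypot(x1 - x2, y1 - y2)


def sense_strand_bias(leading, lagging):
    """B_II: distance from the plot center (0.5, 0.5) to the midpoint (xc, yc)."""
    (x1, y1), (x2, y2) = leading, lagging
    xc, yc = (x1 + x2) / 2, (y1 + y2) / 2
    return math.hypot(xc - 0.5, yc - 0.5)


# Hypothetical group averages for a pure replication effect:
# the lagging group mirrors the leading group, x2 = 1 - x1 and y2 = 1 - y1.
leading = (0.56, 0.47)
lagging = (0.44, 0.53)
b1 = replication_bias(leading, lagging)
b2 = sense_strand_bias(leading, lagging)
print(b1, b2)
```

         <p>In this mirrored configuration <it>B</it><sub><it>II</it></sub> vanishes (up to rounding) while <it>B</it><sub><it>I</it></sub> does not, which is the pattern sketched in Figure <figr fid="F2">2b</figr>.</p>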
         <p>Consider now a small transcribed untranslated intergenic region between two co-oriented coding sequences. Let A, C, G, T denote its base counts in the <it>cis</it>-sense strand, that is, in the same strand as the sense strand of the two flanking coding sequences (Figure <figr fid="F1">1</figr>). As previously described, deviations from PR2 are characterized by using G/(G+C) and A/(A+T) values to draw PR2-plots. Expected effects in the <it>cis</it>-sense strand are the same as in the sense strand (Figure <figr fid="F2">2</figr>) except for translation-induced selection, which is absent in untranslated regions. Therefore, a difference in <it>B</it><sub><it>II</it></sub> when compared with the analysis of coding sequences is evidence of translation-associated effects on PR2 deviations in coding sequences. Finally, consider a large untranscribed intergenic region and let A, C, G, T denote its base counts in the published strand of the chromosome (Figure <figr fid="F1">1</figr>).</p>
         <p>Figure <figr fid="F3">3a</figr> shows what would be expected in spacers under replication-induced effects alone when the leading strand is enriched in G and T. This is essentially the same as the situation for third codon positions when there is only replication-induced mutation pressure (Figure <figr fid="F2">2b</figr>), because all bases are involved, not just those in protein-coding genes. The sole difference is that the leading and lagging groups are now of similar size, because when working with the published strand, there is approximately the same number of intergenic regions in the two groups (Figure <figr fid="F1">1</figr>). Therefore, when <it>B</it><sub><it>I</it></sub> is similarly oriented in the two analyses, this is an indication of a common underlying replication-induced mutation pressure, the difference in intensity being controlled by the difference in selective pressure between the two kinds of region.</p>
         <fig id="F3">
            <title>
               <p>Figure 3</p>
            </title>
            <caption>
               <p>Expected deviations in PR2 plots in large intergenic spaces</p>
            </caption>
            <text>
               <p>Expected deviations in PR2 plots in large intergenic spaces <b>(a)</b> when replication-induced mutation pressure is asymmetric and <b>(b)</b> in an extreme case of asymmetric transcription-induced forces (see text for details).</p>
            </text>
            <graphic file="gb-2002-3-10-research0058-3"/>
         </fig>
         <p>In practice, however, it is impossible to verify from DNA database annotations whether large intergenic regions are completely free from transcription. Consider an extreme theoretical case in which all large intergenic regions are transcribed (Figure <figr fid="F3">3b</figr>). For simplicity, the whole intergenic region is assumed to be transcribed from the same strand: the whole intergenic region is either the template strand or the anti-template strand, but not a mixture of both. (The anti-template strand has the same nucleotide sequence as the transcribed RNA because the complementary strand is the template strand for transcription.) This simplification does not alter the reasoning, because a proper partition of intergenic regions would remove the mixed cases. In the leading-strand group, positioned to the right of the origin (Figure <figr fid="F1">1</figr>), there is an excess of anti-template strand for transcription because of the selection against head-on collisions between RNA and DNA polymerase complexes [<abbr bid="B54">54</abbr>,<abbr bid="B55">55</abbr>,<abbr bid="B56">56</abbr>]. Therefore, the anti-template strand subgroup is bigger than the template one in the leading-strand group and, by symmetrical reasoning, to the left of the origin, the template strand subgroup is bigger than the anti-template one in the lagging-strand group. Figure <figr fid="F3">3b</figr> shows what is expected if transcription leads to an excess of G and T bases in the anti-template strand. Note that in this extreme case transcription effects yield <it>B</it><sub><it>I</it></sub> &#8800; 0, but the general pattern (Figure <figr fid="F3">3b</figr>) is completely different from that of replication-induced mutation pressure effects (Figure <figr fid="F2">2b</figr>), making it easy to discriminate between the two effects.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>Distribution of G+C content</p>
            </st>
            <p>Thanks to the availability of an increasing number of complete bacterial genomes with a high G+C content, there was a representative distribution of G+C content in the dataset (Table <tblr tid="T1">1</tblr>), and the values of G+C content in third codon positions (<it>P</it><sub>3</sub>, as previously defined in [<abbr bid="B11">11</abbr>]; see also Materials and methods), in first and second codon positions averaged (<it>P</it><sub>12</sub>), and in intergenic spaces (<it>GC</it><sub><it>IGR</it></sub>), were found to be consistent with previously known results [<abbr bid="B10">10</abbr>,<abbr bid="B11">11</abbr>] both in terms of their range of distribution and the correlation between them (Figure <figr fid="F4">4</figr>). Intragenomic <it>P</it><sub>3</sub> distributions of all species examined (all histograms are given in the tables in Additional data files under the 'GCorder' flag) are unimodal and narrow for a major class of genes comprising more than 90% of the total genes, although average <it>P</it><sub>3</sub> values differ widely between species. Some species show scattering of <it>P</it><sub>3</sub> distribution. In particular, a previous study of 254 genes in <it>Pseudomonas</it> [<abbr bid="B7">7</abbr>] revealed a sizeable minor class of genes (28%) forming a trail toward lower G+C content (genes with <it>P</it><sub>3</sub> &lt; 0.75). In our present study of 5,255 genes, the minor class of genes (with <it>P</it><sub>3</sub> &lt; 0.75) is only 8% of the total, indicating that the previous sample was biased towards the minor class. Moreover, the width of the distribution (standard deviation) of the majority of genes is narrower in species with both extreme ranges of <it>P</it><sub>3</sub> than in those with middle ranges. These features of the <it>P</it><sub>3</sub> distribution are expected from the theory of bidirectional mutation rates and their equilibrium [<abbr bid="B8">8</abbr>]. 
In this theory, mutation rates, <it>u</it> (GC &#8594; AT) and <it>v</it> (AT &#8594; GC), will lead to a G+C content (<it>P</it><sub>3</sub>) at equilibrium of <it>v</it>/(<it>u</it>+<it>v</it>). The expected standard deviation is</p>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Distribution of G+C content in the dataset</p>
               </caption>
               <text>
                  <p>Distribution of G+C content in the dataset. The G+C content of the 51 bacterial chromosomes under analysis was highly variable, from 25.5% in the <it>Ureaplasma urealyticum</it> chromosome to 67.9% in <it>Halobacterium</it> sp. The range was widest in third codon positions (<it>x</it>-axis, <it>P</it><sub>3</sub> from 11.2% to 88.0%), narrower in intergenic spaces (<it>y</it>-axis, <it>GC</it><sub>IGR</sub> from 15.5% to 63.4%), and narrowest in first and second codon positions (<it>y</it>-axis, <it>P</it><sub>12</sub> from 33.4% to 62.2%), as expected [<abbr bid="B10">10</abbr>,<abbr bid="B11">11</abbr>]. Regression slopes (or &#949; values) and their standard deviations for <it>P</it><sub>12</sub> and <it>GC</it><sub>IGR</sub> were 0.343 &#177; 0.021 and 0.586 &#177; 0.024, respectively.</p>
               </text>
               <graphic file="gb-2002-3-10-research0058-4"/>
            </fig>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>List of 51 chromosomes under study</p>
               </caption>
               <tblbdy cols="10">
                  <r>
                     <c ca="left">
                        <p>Chromosome</p>
                     </c>
                     <c ca="left">
                        <p>Strain</p>
                     </c>
                     <c ca="left">
                        <p>ACC</p>
                     </c>
                     <c ca="left">
                        <p>EMGLib</p>
                     </c>
                     <c ca="left">
                        <p>Genome (bp)</p>
                     </c>
                     <c ca="left">
                        <p>Chir</p>
                     </c>
                     <c ca="center">
                        <p>G+C (%)</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>P</it>
                           <sub>12</sub>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>P</it>
                           <sub>3</sub>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>GC</it>
                           <sub>IGR</sub>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="10">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Aeropyrum pernix</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>K1</p>
                     </c>
                     <c ca="left">
                        <p>AP000058-AP000064</p>
                     </c>
                     <c ca="left">
                        <p>CG0043</p>
                     </c>
                     <c ca="left">
                        <p>1,669,695</p>
                     </c>
                     <c ca="left">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>56.3</p>
                     </c>
                     <c ca="center">
                        <p>0.567</p>
                     </c>
                     <c ca="center">
                        <p>0.662</p>
                     </c>
                     <c ca="center">
                        <p>0.506</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Aquifex aeolicus</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>VF5</p>
                     </c>
                     <c ca="left">
                        <p>AE000657</p>
                     </c>
                     <c ca="left">
                        <p>CG0034</p>
                     </c>
                     <c ca="left">
                        <p>1,551,335</p>
                     </c>
                     <c ca="left">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>43.5</p>
                     </c>
                     <c ca="center">
                        <p>0.442</p>
                     </c>
                     <c ca="center">
                        <p>0.488</p>
                     </c>
                     <c ca="center">
                        <p>0.386</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Archaeoglobus fulgidus</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>DSM 4304</p>
                     </c>
                     <c ca="left">
                        <p>AE000782</p>
                     </c>
                     <c ca="left">
                        <p>CG0032</p>
                     </c>
                     <c ca="left">
                        <p>2,178,400</p>
                     </c>
                     <c ca="left">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>48.6</p>
                     </c>
                     <c ca="center">
                        <p>0.507</p>
                     </c>
                     <c ca="center">
                        <p>0.489</p>
                     </c>
                     <c ca="center">
                        <p>0.377</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Bacillus halodurans</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>C-125</p>
                     </c>
                     <c ca="left">
                        <p>BA000004</p>
                     </c>
                     <c ca="left">
                        <p>CG0057</p>
                     </c>
                     <c ca="left">
                        <p>4,202,353</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>43.7</p>
                     </c>
                     <c ca="center">
                        <p>0.466</p>
                     </c>
                     <c ca="center">
                        <p>0.403</p>
                     </c>
                     <c ca="center">
                        <p>0.384</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Bacillus subtilis</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>168</p>
                     </c>
                     <c ca="left">
                        <p>AL009126</p>
                     </c>
                     <c ca="left">
                        <p>CG0031</p>
                     </c>
                     <c ca="left">
                        <p>4,214,814</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>43.5</p>
                     </c>
                     <c ca="center">
                        <p>0.455</p>
                     </c>
                     <c ca="center">
                        <p>0.424</p>
                     </c>
                     <c ca="center">
                        <p>0.364</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Borrelia burgdorferi</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>B31</p>
                     </c>
                     <c ca="left">
                        <p>AE000783</p>
                     </c>
                     <c ca="left">
                        <p>CG0033</p>
                     </c>
                     <c ca="left">
                        <p>910,724</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>28.6</p>
                     </c>
                     <c ca="center">
                        <p>0.347</p>
                     </c>
                     <c ca="center">
                        <p>0.197</p>
                     </c>
                     <c ca="center">
                        <p>0.212</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p><it>Buchnera</it> sp.</p>
                     </c>
                     <c ca="left">
                        <p>APS</p>
                     </c>
                     <c ca="left">
                        <p>AP000398</p>
                     </c>
                     <c ca="left">
                        <p>CG0058</p>
                     </c>
                     <c ca="left">
                        <p>640,681</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>26.3</p>
                     </c>
                     <c ca="center">
                        <p>0.360</p>
                     </c>
                     <c ca="center">
                        <p>0.122</p>
                     </c>
                     <c ca="center">
                        <p>0.155</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Caulobacter crescentus</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>CB15</p>
                     </c>
                     <c ca="left">
                        <p>AE005673</p>
                     </c>
                     <c ca="left">
                        <p>CG0068</p>
                     </c>
                     <c ca="left">
                        <p>4,016,947</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>67.2</p>
                     </c>
                     <c ca="center">
                        <p>0.600</p>
                     </c>
                     <c ca="center">
                        <p>0.856</p>
                     </c>
                     <c ca="center">
                        <p>0.625</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Campylobacter jejuni</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>NCTC11168</p>
                     </c>
                     <c ca="left">
                        <p>AL111168</p>
                     </c>
                     <c ca="left">
                        <p>CG0047</p>
                     </c>
                     <c ca="left">
                        <p>1,641,481</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>30.6</p>
                     </c>
                     <c ca="center">
                        <p>0.381</p>
                     </c>
                     <c ca="center">
                        <p>0.175</p>
                     </c>
                     <c ca="center">
                        <p>0.205</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Chlamydia muridarum</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>MoPn</p>
                     </c>
                     <c ca="left">
                        <p>AE002160</p>
                     </c>
                     <c ca="left">
                        <p>CG0054</p>
                     </c>
                     <c ca="left">
                        <p>1,069,412</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>40.3</p>
                     </c>
                     <c ca="center">
                        <p>0.462</p>
                     </c>
                     <c ca="center">
                        <p>0.317</p>
                     </c>
                     <c ca="center">
                        <p>0.360</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Chlamydia trachomatis</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>D/U W-3/Cx</p>
                     </c>
                     <c ca="left">
                        <p>AE001273</p>
                     </c>
                     <c ca="left">
                        <p>CG0037</p>
                     </c>
                     <c ca="left">
                        <p>1,042,519</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>41.3</p>
                     </c>
                     <c ca="center">
                        <p>0.468</p>
                     </c>
                     <c ca="center">
                        <p>0.330</p>
                     </c>
                     <c ca="center">
                        <p>0.360</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Chlamydophila pneumoniae</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>CWL029</p>
                     </c>
                     <c ca="left">
                        <p>AE001363</p>
                     </c>
                     <c ca="left">
                        <p>CG0041</p>
                     </c>
                     <c ca="left">
                        <p>1,230,230</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>40.6</p>
                     </c>
                     <c ca="center">
                        <p>0.459</p>
                     </c>
                     <c ca="center">
                        <p>0.332</p>
                     </c>
                     <c ca="center">
                        <p>0.323</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Chlamydophila pneumoniae</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>AR39</p>
                     </c>
                     <c ca="left">
                        <p>AE002161</p>
                     </c>
                     <c ca="left">
                        <p>CG0053</p>
                     </c>
                     <c ca="left">
                        <p>1,229,853</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>40.6</p>
                     </c>
                     <c ca="center">
                        <p>0.460</p>
                     </c>
                     <c ca="center">
                        <p>0.271</p>
                     </c>
                     <c ca="center">
                        <p>0.340</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Chlamydophila pneumoniae</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>J138</p>
                     </c>
                     <c ca="left">
                        <p>BA000008</p>
                     </c>
                     <c ca="left">
                        <p>CG0062</p>
                     </c>
                     <c ca="left">
                        <p>1,228,267</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>40.6</p>
                     </c>
                     <c ca="center">
                        <p>0.459</p>
                     </c>
                     <c ca="center">
                        <p>0.332</p>
                     </c>
                     <c ca="center">
                        <p>0.323</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p><it>Deinococcus radiodurans</it> chromosome 1</p>
                     </c>
                     <c ca="left">
                        <p>R1</p>
                     </c>
                     <c ca="left">
                        <p>AE000513</p>
                     </c>
                     <c ca="left">
                        <p>CG0049</p>
                     </c>
                     <c ca="left">
                        <p>2,648,638</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>67.0</p>
                     </c>
                     <c ca="center">
                        <p>0.622</p>
                     </c>
                     <c ca="center">
                        <p>0.801</p>
                     </c>
                     <c ca="center">
                        <p>0.623</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p><it>Deinococcus radiodurans</it> chromosome 2</p>
                     </c>
                     <c ca="left">
                        <p>R1</p>
                     </c>
                     <c ca="left">
                        <p>AE001825</p>
                     </c>
                     <c ca="left">
                        <p>CG0050</p>
                     </c>
                     <c ca="left">
                        <p>412,348</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>66.7</p>
                     </c>
                     <c ca="center">
                        <p>0.605</p>
                     </c>
                     <c ca="center">
                        <p>0.824</p>
                     </c>
                     <c ca="center">
                        <p>0.593</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Escherichia coli</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>K-12</p>
                     </c>
                     <c ca="left">
                        <p>U00096</p>
                     </c>
                     <c ca="left">
                        <p>CG0028</p>
                     </c>
                     <c ca="left">
                        <p>4,639,221</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>50.8</p>
                     </c>
                     <c ca="center">
                        <p>0.514</p>
                     </c>
                     <c ca="center">
                        <p>0.534</p>
                     </c>
                     <c ca="center">
                        <p>0.425</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Escherichia coli</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>EDL933</p>
                     </c>
                     <c ca="left">
                        <p>AE005174</p>
                     </c>
                     <c ca="left">
                        <p>CG0069</p>
                     </c>
                     <c ca="left">
                        <p>5,528,970</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>50.4</p>
                     </c>
                     <c ca="center">
                        <p>0.513</p>
                     </c>
                     <c ca="center">
                        <p>0.527</p>
                     </c>
                     <c ca="center">
                        <p>0.425</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Escherichia coli</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>RIMD 0509952</p>
                     </c>
                     <c ca="left">
                        <p>BA000007</p>
                     </c>
                     <c ca="left">
                        <p>CG0070</p>
                     </c>
                     <c ca="left">
                        <p>5,498,450</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>50.5</p>
                     </c>
                     <c ca="center">
                        <p>0.514</p>
                     </c>
                     <c ca="center">
                        <p>0.528</p>
                     </c>
                     <c ca="center">
                        <p>0.424</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p><it>Halobacterium</it> sp.</p>
                     </c>
                     <c ca="left">
                        <p>NRC-1</p>
                     </c>
                     <c ca="left">
                        <p>AE004437</p>
                     </c>
                     <c ca="left">
                        <p>CG0065</p>
                     </c>
                     <c ca="left">
                        <p>2,014,239</p>
                     </c>
                     <c ca="left">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>67.9</p>
                     </c>
                     <c ca="center">
                        <p>0.600</p>
                     </c>
                     <c ca="center">
                        <p>0.880</p>
                     </c>
                     <c ca="center">
                        <p>0.634</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Haemophilus influenzae</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>KW20</p>
                     </c>
                     <c ca="left">
                        <p>L42023</p>
                     </c>
                     <c ca="left">
                        <p>CG0001</p>
                     </c>
                     <c ca="left">
                        <p>1,830,140</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>38.2</p>
                     </c>
                     <c ca="center">
                        <p>0.447</p>
                     </c>
                     <c ca="center">
                        <p>0.265</p>
                     </c>
                     <c ca="center">
                        <p>0.312</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Helicobacter pylori</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>26695</p>
                     </c>
                     <c ca="left">
                        <p>AE000511</p>
                     </c>
                     <c ca="left">
                        <p>CG0001</p>
                     </c>
                     <c ca="left">
                        <p>1,667,877</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>38.2</p>
                     </c>
                     <c ca="center">
                        <p>0.396</p>
                     </c>
                     <c ca="center">
                        <p>0.404</p>
                     </c>
                     <c ca="center">
                        <p>0.302</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Helicobacter pylori</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>J99</p>
                     </c>
                     <c ca="left">
                        <p>AE001439</p>
                     </c>
                     <c ca="left">
                        <p>CG0042</p>
                     </c>
                     <c ca="left">
                        <p>1,643,831</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>39.2</p>
                     </c>
                     <c ca="center">
                        <p>0.397</p>
                     </c>
                     <c ca="center">
                        <p>0.411</p>
                     </c>
                     <c ca="center">
                        <p>0.308</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Lactococcus lactis</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>IL1403</p>
                     </c>
                     <c ca="left">
                        <p>AE005176</p>
                     </c>
                     <c ca="left">
                        <p>CG0074</p>
                     </c>
                     <c ca="left">
                        <p>2,365,589</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>35.5</p>
                     </c>
                     <c ca="center">
                        <p>0.425</p>
                     </c>
                     <c ca="center">
                        <p>0.229</p>
                     </c>
                     <c ca="center">
                        <p>0.276</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Methanobacterium thermoautotrophicum</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>H</p>
                     </c>
                     <c ca="left">
                        <p>AE000666</p>
                     </c>
                     <c ca="left">
                        <p>CG0030</p>
                     </c>
                     <c ca="left">
                        <p>1,751,377</p>
                     </c>
                     <c ca="left">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>49.5</p>
                     </c>
                     <c ca="center">
                        <p>0.525</p>
                     </c>
                     <c ca="center">
                        <p>0.500</p>
                     </c>
                     <c ca="center">
                        <p>0.383</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Methanococcus jannaschii</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>DSM 2661</p>
                     </c>
                     <c ca="left">
                        <p>L77117</p>
                     </c>
                     <c ca="left">
                        <p>CG0003</p>
                     </c>
                     <c ca="left">
                        <p>1,664,977</p>
                     </c>
                     <c ca="left">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>31.4</p>
                     </c>
                     <c ca="center">
                        <p>0.347</p>
                     </c>
                     <c ca="center">
                        <p>0.307</p>
                     </c>
                     <c ca="center">
                        <p>0.252</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Mycobacterium leprae</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>TN</p>
                     </c>
                     <c ca="left">
                        <p>AL450380</p>
                     </c>
                     <c ca="left">
                        <p>CG0071</p>
                     </c>
                     <c ca="left">
                        <p>3,268,203</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>57.8</p>
                     </c>
                     <c ca="center">
                        <p>0.576</p>
                     </c>
                     <c ca="center">
                        <p>0.638</p>
                     </c>
                     <c ca="center">
                        <p>0.546</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Mycobacterium tuberculosis</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>H37Rv</p>
                     </c>
                     <c ca="left">
                        <p>AL123456</p>
                     </c>
                     <c ca="left">
                        <p>CG0035</p>
                     </c>
                     <c ca="left">
                        <p>4,411,529</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>65.6</p>
                     </c>
                     <c ca="center">
                        <p>0.604</p>
                     </c>
                     <c ca="center">
                        <p>0.787</p>
                     </c>
                     <c ca="center">
                        <p>0.625</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Mycobacterium tuberculosis</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>CDC1551</p>
                     </c>
                     <c ca="left">
                        <p>AE000516</p>
                     </c>
                     <c ca="left">
                        <p>CG0073</p>
                     </c>
                     <c ca="left">
                        <p>4,403,836</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>65.6</p>
                     </c>
                     <c ca="center">
                        <p>0.605</p>
                     </c>
                     <c ca="center">
                        <p>0.783</p>
                     </c>
                     <c ca="center">
                        <p>0.627</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Mycoplasma genitalium</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>G-37</p>
                     </c>
                     <c ca="left">
                        <p>L43967</p>
                     </c>
                     <c ca="left">
                        <p>CG0002</p>
                     </c>
                     <c ca="left">
                        <p>580,073</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>31.7</p>
                     </c>
                     <c ca="center">
                        <p>0.367</p>
                     </c>
                     <c ca="center">
                        <p>0.221</p>
                     </c>
                     <c ca="center">
                        <p>0.272</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Mycoplasma pneumoniae</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>M 129</p>
                     </c>
                     <c ca="left">
                        <p>U00089</p>
                     </c>
                     <c ca="left">
                        <p>CG0011</p>
                     </c>
                     <c ca="left">
                        <p>816,394</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>40.0</p>
                     </c>
                     <c ca="center">
                        <p>0.409</p>
                     </c>
                     <c ca="center">
                        <p>0.403</p>
                     </c>
                     <c ca="center">
                        <p>0.339</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Mycoplasma pulmonis</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>UAB CTIP</p>
                     </c>
                     <c ca="left">
                        <p>AL445566</p>
                     </c>
                     <c ca="left">
                        <p>CG0072</p>
                     </c>
                     <c ca="left">
                        <p>963,879</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>26.6</p>
                     </c>
                     <c ca="center">
                        <p>0.343</p>
                     </c>
                     <c ca="center">
                        <p>0.140</p>
                     </c>
                     <c ca="center">
                        <p>0.182</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Neisseria meningitidis</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Z2491 (A)</p>
                     </c>
                     <c ca="left">
                        <p>AL162759</p>
                     </c>
                     <c ca="left">
                        <p>CG0056</p>
                     </c>
                     <c ca="left">
                        <p>2,184,406</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>51.8</p>
                     </c>
                     <c ca="center">
                        <p>0.503</p>
                     </c>
                     <c ca="center">
                        <p>0.598</p>
                     </c>
                     <c ca="center">
                        <p>0.444</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Neisseria meningitidis</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>MC58 (B)</p>
                     </c>
                     <c ca="left">
                        <p>AE002098</p>
                     </c>
                     <c ca="left">
                        <p>CG0055</p>
                     </c>
                     <c ca="left">
                        <p>2,272,351</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>51.5</p>
                     </c>
                     <c ca="center">
                        <p>0.502</p>
                     </c>
                     <c ca="center">
                        <p>0.593</p>
                     </c>
                     <c ca="center">
                        <p>0.447</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Pasteurella multocida</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>PM70</p>
                     </c>
                     <c ca="left">
                        <p>AE004439</p>
                     </c>
                     <c ca="left">
                        <p>CG0075</p>
                     </c>
                     <c ca="left">
                        <p>2,257,487</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>40.4</p>
                     </c>
                     <c ca="center">
                        <p>0.453</p>
                     </c>
                     <c ca="center">
                        <p>0.323</p>
                     </c>
                     <c ca="center">
                        <p>0.329</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Pseudomonas aeruginosa</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>PA01</p>
                     </c>
                     <c ca="left">
                        <p>AE004091</p>
                     </c>
                     <c ca="left">
                        <p>CG0059</p>
                     </c>
                     <c ca="left">
                        <p>6,264,403</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>66.6</p>
                     </c>
                     <c ca="center">
                        <p>0.583</p>
                     </c>
                     <c ca="center">
                        <p>0.870</p>
                     </c>
                     <c ca="center">
                        <p>0.616</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Pyrococcus abyssi</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>GE5</p>
                     </c>
                     <c ca="left">
                        <p>AL096836</p>
                     </c>
                     <c ca="left">
                        <p>CG0045</p>
                     </c>
                     <c ca="left">
                        <p>1,765,118</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>44.7</p>
                     </c>
                     <c ca="center">
                        <p>0.457</p>
                     </c>
                     <c ca="center">
                        <p>0.511</p>
                     </c>
                     <c ca="center">
                        <p>0.379</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Pyrococcus horikoshii</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>OT3</p>
                     </c>
                     <c ca="left">
                        <p>AP000001-AP000007</p>
                     </c>
                     <c ca="left">
                        <p>CG0038</p>
                     </c>
                     <c ca="left">
                        <p>1,738,505</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>41.9</p>
                     </c>
                     <c ca="center">
                        <p>0.449</p>
                     </c>
                     <c ca="center">
                        <p>0.428</p>
                     </c>
                     <c ca="center">
                        <p>0.376</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Rickettsia prowazekii</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Madrid E</p>
                     </c>
                     <c ca="left">
                        <p>AJ235269</p>
                     </c>
                     <c ca="left">
                        <p>CG0040</p>
                     </c>
                     <c ca="left">
                        <p>1,111,523</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>29.0</p>
                     </c>
                     <c ca="center">
                        <p>0.387</p>
                     </c>
                     <c ca="center">
                        <p>0.168</p>
                     </c>
                     <c ca="center">
                        <p>0.242</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Staphylococcus aureus</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Mu50</p>
                     </c>
                     <c ca="left">
                        <p>BA000017</p>
                     </c>
                     <c ca="left">
                        <p>CG0076</p>
                     </c>
                     <c ca="left">
                        <p>2,878,134</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>32.9</p>
                     </c>
                     <c ca="center">
                        <p>0.404</p>
                     </c>
                     <c ca="center">
                        <p>0.205</p>
                     </c>
                     <c ca="center">
                        <p>0.277</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Staphylococcus aureus</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>N315</p>
                     </c>
                     <c ca="left">
                        <p>BA000018</p>
                     </c>
                     <c ca="left">
                        <p>CG0077</p>
                     </c>
                     <c ca="left">
                        <p>2,813,641</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>32.8</p>
                     </c>
                     <c ca="center">
                        <p>0.404</p>
                     </c>
                     <c ca="center">
                        <p>0.204</p>
                     </c>
                     <c ca="center">
                        <p>0.274</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Streptococcus pyogenes</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>SF370</p>
                     </c>
                     <c ca="left">
                        <p>AE004092</p>
                     </c>
                     <c ca="left">
                        <p>CG0078</p>
                     </c>
                     <c ca="left">
                        <p>1,852,441</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>38.5</p>
                     </c>
                     <c ca="center">
                        <p>0.439</p>
                     </c>
                     <c ca="center">
                        <p>0.297</p>
                     </c>
                     <c ca="center">
                        <p>0.325</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Sulfolobus solfataricus</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>P2</p>
                     </c>
                     <c ca="left">
                        <p>AE006641</p>
                     </c>
                     <c ca="left">
                        <p>CG0079</p>
                     </c>
                     <c ca="left">
                        <p>2,992,245</p>
                     </c>
                     <c ca="left">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>35.8</p>
                     </c>
                     <c ca="center">
                        <p>0.409</p>
                     </c>
                     <c ca="center">
                        <p>0.325</p>
                     </c>
                     <c ca="center">
                        <p>0.310</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p><it>Synechocystis</it> sp.</p>
                     </c>
                     <c ca="left">
                        <p>PCC 6803</p>
                     </c>
                     <c ca="left">
                        <p>AB001339</p>
                     </c>
                     <c ca="left">
                        <p>CG0010</p>
                     </c>
                     <c ca="left">
                        <p>3,573,470</p>
                     </c>
                     <c ca="left">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>47.7</p>
                     </c>
                     <c ca="center">
                        <p>0.503</p>
                     </c>
                     <c ca="center">
                        <p>0.468</p>
                     </c>
                     <c ca="center">
                        <p>0.421</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Thermoplasma acidophilum</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>DSM 1728</p>
                     </c>
                     <c ca="left">
                        <p>AL139299</p>
                     </c>
                     <c ca="left">
                        <p>CG0060</p>
                     </c>
                     <c ca="left">
                        <p>1,564,906</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>46.0</p>
                     </c>
                     <c ca="center">
                        <p>0.473</p>
                     </c>
                     <c ca="center">
                        <p>0.558</p>
                     </c>
                     <c ca="center">
                        <p>0.361</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Thermotoga maritima</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>MSB8</p>
                     </c>
                     <c ca="left">
                        <p>AE000512</p>
                     </c>
                     <c ca="left">
                        <p>CG0044</p>
                     </c>
                     <c ca="left">
                        <p>1,860,725</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>46.2</p>
                     </c>
                     <c ca="center">
                        <p>0.456</p>
                     </c>
                     <c ca="center">
                        <p>0.524</p>
                     </c>
                     <c ca="center">
                        <p>0.397</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Treponema pallidum</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Nichols</p>
                     </c>
                     <c ca="left">
                        <p>AE000520</p>
                     </c>
                     <c ca="left">
                        <p>CG0036</p>
                     </c>
                     <c ca="left">
                        <p>1,138,011</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>52.8</p>
                     </c>
                     <c ca="center">
                        <p>0.537</p>
                     </c>
                     <c ca="center">
                        <p>0.538</p>
                     </c>
                     <c ca="center">
                        <p>0.541</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Ureaplasma urealyticum</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>ATCC 700970</p>
                     </c>
                     <c ca="left">
                        <p>AF222894</p>
                     </c>
                     <c ca="left">
                        <p>CG0048</p>
                     </c>
                     <c ca="left">
                        <p>751,719</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>25.5</p>
                     </c>
                     <c ca="center">
                        <p>0.334</p>
                     </c>
                     <c ca="center">
                        <p>0.112</p>
                     </c>
                     <c ca="center">
                        <p>0.178</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p><it>Vibrio cholerae</it> chromosome 1</p>
                     </c>
                     <c ca="left">
                        <p>N16961</p>
                     </c>
                     <c ca="left">
                        <p>AE003852</p>
                     </c>
                     <c ca="left">
                        <p>CG0063</p>
                     </c>
                     <c ca="left">
                        <p>2,961,149</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>47.7</p>
                     </c>
                     <c ca="center">
                        <p>0.494</p>
                     </c>
                     <c ca="center">
                        <p>0.470</p>
                     </c>
                     <c ca="center">
                        <p>0.412</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p><it>Vibrio cholerae</it> chromosome 2</p>
                     </c>
                     <c ca="left">
                        <p>N16961</p>
                     </c>
                     <c ca="left">
                        <p>AE003853</p>
                     </c>
                     <c ca="left">
                        <p>CG0064</p>
                     </c>
                     <c ca="left">
                        <p>1,072,315</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>46.9</p>
                     </c>
                     <c ca="center">
                        <p>0.482</p>
                     </c>
                     <c ca="center">
                        <p>0.458</p>
                     </c>
                     <c ca="center">
                        <p>0.431</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Xylella fastidiosa</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>9a5c</p>
                     </c>
                     <c ca="left">
                        <p>AE003849</p>
                     </c>
                     <c ca="left">
                        <p>CG0061</p>
                     </c>
                     <c ca="left">
                        <p>2,679,306</p>
                     </c>
                     <c ca="left">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>52.7</p>
                     </c>
                     <c ca="center">
                        <p>0.535</p>
                     </c>
                     <c ca="center">
                        <p>0.552</p>
                     </c>
                     <c ca="center">
                        <p>0.468</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>The first column gives the species name, with the chromosome number where necessary, and the second the strain that was sequenced. ACC is the accession number in the DDBJ/EMBL/GenBank databases and EMGLib the accession number in the EMGLib database [<abbr bid="B88">88</abbr>]. Chir indicates whether a clear chirochore structure allowed subsequences (for example, CDS) to be classified into the leading or lagging group. G+C contents are given for the whole genome (G+C, as a percentage) and, as fractions, for first and second codon positions (<it>P</it><sub>12</sub>), third codon positions (<it>P</it><sub>3</sub>), and large intergenic spaces (<it>GC</it><sub>IGR</sub>), as defined in the Materials and methods section.</p>
               </tblfn>
            </tbl>
            <p>
               <graphic file="gb-2002-3-10-research0058-i3.gif"/>
            </p>
            <p>where <it>b</it> is an average of the number of third codon positions per gene [<abbr bid="B8">8</abbr>].</p>
         </sec>
         <sec>
            <st>
               <p>Difference in G+C content between leading and lagging strands</p>
            </st>
            <p>Statistically highly significant differences (<it>p</it> &lt; 0.01) in <it>P</it><sub>3</sub> between coding sequences located on the leading and lagging strands were found in 23 out of 43 chromosomes. In most cases (21 out of the 23), genes on the leading strand had a lower <it>P</it><sub>3</sub> than those on the lagging strand. (The two exceptions were <it>Mycoplasma genitalium</it>, which is known to have a peculiar behavior with respect to G+C content [<abbr bid="B5">5</abbr>,<abbr bid="B6">6</abbr>], and <it>Caulobacter crescentus</it>.) As far as we know, this is the first time that such a systematic difference has been reported, but note that the differences were always very small. The maximum, found between the leading and lagging strands of <it>Vibrio cholerae</it> chromosome 2, was only 0.035, compared with the 0.768 difference in average <it>P</it><sub>3</sub> between <it>Halobacterium</it> sp. and <it>Ureaplasma urealyticum</it>, or the 0.050 accepted range of genomic G+C content polymorphism [<abbr bid="B57">57</abbr>]. There was no significant correlation between the average <it>P</it><sub>3</sub> and the difference in <it>P</it><sub>3</sub> between the two groups of genes. A highly significant difference in G+C content between the leading and the lagging sequences was found in only five chromosomes for large intergenic regions, and in three chromosomes for small intergenic spaces.</p>
         </sec>
         <sec>
            <st>
               <p>PR2-biases in coding sequences</p>
            </st>
            <p>Out of 43 chromosomes, the differences in PR2-biases between the leading and lagging groups were highly significant (<it>p</it> &lt; 0.01) in 39 chromosomes for G<sub>3</sub>/(G<sub>3</sub>+C<sub>3</sub>), in 34 for A<sub>3</sub>/(A<sub>3</sub>+T<sub>3</sub>), and in 32 for both G<sub>3</sub>/(G<sub>3</sub>+C<sub>3</sub>) and A<sub>3</sub>/(A<sub>3</sub>+T<sub>3</sub>) simultaneously (see Additional data files). In the PR2-bias-plot analysis, we found a general pattern of <it>x</it><sub>1</sub> > <it>x</it><sub>2</sub> and <it>y</it><sub>1</sub> &lt; <it>y</it><sub>2</sub>, as in Figure <figr fid="F2">2b</figr>, meaning that coding sequences on the leading strand had higher G<sub>3</sub>/(G<sub>3</sub>+C<sub>3</sub>) values and lower A<sub>3</sub>/(A<sub>3</sub>+T<sub>3</sub>) values than those on the lagging strand: leading coding sequences were enriched in keto-bases (G and T) in the third codon position compared to lagging coding sequences. There were, however, exceptions to this general trend: all of the 39 chromosomes that were highly significant for G<sub>3</sub>/(G<sub>3</sub>+C<sub>3</sub>) had higher values in the leading strand, but of the 34 chromosomes that were highly significant for A<sub>3</sub>/(A<sub>3</sub>+T<sub>3</sub>), three did not follow the general trend (<it>Lactococcus lactis</it> and <it>Staphylococcus aureus</it> strains Mu50 and N315). Leading and lagging coding sequences were separated in PR2 plots as expected under replication-associated effects, and the leading group was almost always to the lower right of the lagging group. However, the extent to which the groups were separated differed between species. The two most extreme species were <it>B. burgdorferi</it>, where the two groups of genes were completely resolved, and <it>M. genitalium</it>, where the two groups overlapped almost completely (Figure <figr fid="F5">5</figr>). Other species with a striking difference in PR2 plots were <it>Chlamydia muridarum</it>, <it>Chlamydia trachomatis</it> and <it>Treponema pallidum</it>. Out of 43 chromosomes, the residual bias <it>B</it><sub><it>II</it></sub> was in most cases oriented as in Figure <figr fid="F2">2c</figr>, with <it>x</it><sub>c</sub> &lt; 0.5 (all except <it>Pyrococcus abyssi</it> and <it>T. pallidum</it>) and <it>y</it><sub>c</sub> &lt; 0.5 (all except <it>Thermotoga maritima</it>). The relative contribution of replication-coupled bias (<it>B</it><sub><it>I</it></sub>) to the total bias (<it>B</it><sub><it>I</it></sub> plus <it>B</it><sub><it>II</it></sub>) ranged from 6% (<it>Mycoplasma pulmonis</it>) to 86% (<it>B. burgdorferi</it>) with an average value of 52%.</p>
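            <p>As an illustration only, the two PR2 coordinates used above, G<sub>3</sub>/(G<sub>3</sub>+C<sub>3</sub>) and A<sub>3</sub>/(A<sub>3</sub>+T<sub>3</sub>), can be sketched in a few lines of Python. The function names, the toy sequences and the choice of averaging per-gene values within each group are assumptions made for this sketch, not a description of the software used in this study.</p>

```python
from collections import Counter


def third_position_pr2(cds):
    """Return (x, y) = (G3/(G3+C3), A3/(A3+T3)) for one coding sequence.

    Assumes `cds` is an in-frame DNA string given on the coding strand.
    """
    # Every third base, starting at index 2, is a third codon position.
    counts = Counter(cds.upper()[2::3])
    g, c, a, t = (counts[base] for base in "GCAT")
    if g + c == 0 or a + t == 0:
        raise ValueError("no G/C or no A/T in third codon positions")
    return g / (g + c), a / (a + t)


def group_mean(sequences):
    """Average the per-gene (x, y) points of a group (an assumed convention)."""
    points = [third_position_pr2(seq) for seq in sequences]
    n = len(points)
    return sum(p[0] for p in points) / n, sum(p[1] for p in points) / n


# Toy sequences invented for illustration; they are not data from this study.
leading = ["ATGGCTAAGTTGTAA", "ATGGGTCGTTTTTAG"]
lagging = ["ATGGCCAAATTCTAA", "ATGGCACGCCTATGA"]

x1, y1 = group_mean(leading)
x2, y2 = group_mean(lagging)
print(f"leading group: x1={x1:.3f}, y1={y1:.3f}")
print(f"lagging group: x2={x2:.3f}, y2={y2:.3f}")
```

            <p>With these invented sequences the leading group falls to the lower right of the lagging group (<it>x</it><sub>1</sub> > <it>x</it><sub>2</sub> and <it>y</it><sub>1</sub> &lt; <it>y</it><sub>2</sub>), which is the orientation described above for most chromosomes.</p>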
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>Example of detailed results for two extreme cases with <it>B. burgdorferi</it> (left) and <it>M. genitalium</it> (right)</p>
               </caption>
               <text>
                  <p>Example of detailed results for two extreme cases with <it>B. burgdorferi</it> (left) and <it>M. genitalium</it> (right). <b>(a)</b><it>P</it><sub>3</sub>-<it>P</it><sub>12</sub> plot for coding sequences; <b>(b)</b> PR2-plot in third codon position; <b>(c)</b> PR2 plot in large intergenic spaces. Leading-group sequences are represented by black circles, and lagging ones by white circles.</p>
               </text>
               <graphic file="gb-2002-3-10-research0058-5"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>PR2-biases in large intergenic regions</p>
            </st>
            <p>Out of 43 chromosomes, the differences in PR2-biases between the leading and lagging groups were highly significant (<it>p</it> &lt; 0.01) in 36 chromosomes for G/(G+C), in 34 for A/(A+T), and in 32 for both G/(G+C) and A/(A+T) simultaneously (see table in Additional data files). All 36 chromosomes highly significant for G/(G+C) had higher values in the leading strand (that is, <it>x</it><sub>1</sub> > <it>x</it><sub>2</sub>, as in coding sequences). Out of the 34 chromosomes highly significant for A/(A+T), 28 had lower values in the leading strand (that is, <it>y</it><sub>1</sub> &lt; <it>y</it><sub>2</sub>, as in coding sequences). The exceptions were <it>Bacillus subtilis</it>, <it>Bacillus halodurans</it>, <it>L. lactis</it>, <it>S. aureus</it> strains Mu50 and N315, and <it>Streptococcus pyogenes</it>. The distribution of intergenic regions in a PR2-plot was as expected under replication-associated effects (Figure <figr fid="F3">3a</figr>), and this was particularly visible in the case of <it>B. burgdorferi</it> (Figure <figr fid="F5">5c</figr>). Unlike in the third codon position, the leading and lagging groups clustered symmetrically around the center point of the PR2-bias plot. In intergenic regions, both <it>x</it><sub>c</sub> and <it>y</it><sub>c</sub> were always near 0.5, so that the residual bias <it>B</it><sub><it>II</it></sub> was always close to zero. The relative contribution of replication-coupled bias (<it>B</it><sub><it>I</it></sub>) to the total bias (<it>B</it><sub><it>I</it></sub> plus <it>B</it><sub><it>II</it></sub>) was therefore very high, on average 90%, and ranged from 88% (<it>Helicobacter pylori</it>) to 99.6% (<it>B. subtilis</it>) in the intergenic regions of species with significant <it>B</it><sub><it>I</it></sub> values.</p>
         </sec>
         <sec>
            <st>
               <p>PR2-biases in potentially transcribed untranslated regions</p>
            </st>
            <p>Out of 43 chromosomes, the differences in PR2-biases between the leading and lagging small intergenic spaces among co-oriented genes were highly significant (<it>p</it> &lt; 0.01) in 26 chromosomes for G/(G+C), in 7 for A/(A+T), and in 7 for both G/(G+C) and A/(A+T) together (see table in Additional data files). All 26 chromosomes with a highly significant difference in G/(G+C) showed the same pattern as in third codon positions in coding sequences, with <it>x</it><sub>1</sub> > <it>x</it><sub>2</sub>. Of the seven chromosomes highly significant for A/(A+T), all showed the same pattern as in third codon positions in coding sequences, with <it>y</it><sub>1</sub> &lt; <it>y</it><sub>2</sub>, except for <it>B. subtilis</it> and <it>S. pyogenes</it>. Out of 43 chromosomes, the orientation of the residual bias <it>B</it><sub><it>II</it></sub> showed no clear tendency (only 31 chromosomes had <it>x</it><sub>c</sub> > 0.5 and 23 had <it>y</it><sub>c</sub> > 0.5; the trend, if any, would be in the direction opposite to that in third codon positions). The relative contribution of replication-coupled bias (<it>B</it><sub><it>I</it></sub>) to the total bias (<it>B</it><sub><it>I</it></sub> plus <it>B</it><sub><it>II</it></sub>) ranged from 10% (<it>H. pylori</it> strain 26695) to 93% (<it>Thermoplasma acidophilum</it>) with an average value of 58%.</p>
         </sec>
         <sec>
            <st>
               <p>Comparison of third codon positions and intergenic spaces</p>
            </st>
            <p>The differences in PR2-biases between the leading and lagging strands in intergenic regions (<it>B</it><sub><it>I</it></sub>) were significantly and highly correlated with the PR2-biases in the third codon position, as is evident from the regression coefficient of approximately 0.6 and the squared correlation coefficient (r<sup>2</sup> = 0.77) among 43 chromosomes (Figure <figr fid="F6">6</figr>). The correlation was still significant when the extreme <it>B. burgdorferi</it> was removed from the analysis (r<sup>2</sup> = 0.68). The 95% confidence interval for the slope of the regression line [0.50-0.71] lies entirely below 1, meaning that replication-associated biases were significantly smaller in intergenic regions than in third codon positions. This slope value was very close to the one obtained for the regression line between G+C content in intergenic regions versus third codon positions (Figure <figr fid="F4">4</figr>).</p>
            <fig id="F6">
               <title>
                  <p>Figure 6</p>
               </title>
               <caption>
                  <p>Comparison of the absolute contribution of replication-associated bias <it>B</it><sub><it>I</it></sub> between intergenic and third codon positions</p>
               </caption>
               <text>
                  <p>Comparison of the absolute contribution of replication-associated bias <it>B</it><sub><it>I</it></sub> between intergenic and third codon positions.</p>
               </text>
               <graphic file="gb-2002-3-10-research0058-6"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Correlation between P<sub>3</sub> and replication-associated biases</p>
            </st>
            <p>If differences in the G+C content between species were dictated by asymmetric replication-associated mutation pressure, one would expect a correlation between the extent of PR2-bias and G+C content; however, no significant correlation between <it>B</it><sub><it>I</it></sub> in coding sequences and <it>P</it><sub>3</sub> was found (Figure <figr fid="F7">7</figr>).</p>
            <fig id="F7">
               <title>
                  <p>Figure 7</p>
               </title>
               <caption>
                  <p>Correlation between GC content and the extent of strand biases in third codon positions</p>
               </caption>
               <text>
                  <p>Correlation between GC content and the extent of strand biases in third codon positions.</p>
               </text>
               <graphic file="gb-2002-3-10-research0058-7"/>
            </fig>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <sec>
            <st>
               <p>Asymmetric mutation pressure</p>
            </st>
            <p>A general feature observed in the present study is that, whenever there is a difference in base composition between the leading and the lagging strand, the G and T bases are enriched in the leading strand, and the G+C content is somewhat lower in the leading strand (see Results). The direction of the observed biases is universal in bacteria [<abbr bid="B28">28</abbr>], although not every species is biased [<abbr bid="B58">58</abbr>]. This pattern is not restricted to the bacterial world, as GT-enriched leading strands have been documented in mitochondria [<abbr bid="B59">59</abbr>,<abbr bid="B60">60</abbr>,<abbr bid="B61">61</abbr>], in the chloroplast genome of <it>Euglena gracilis</it> [<abbr bid="B62">62</abbr>], in viruses [<abbr bid="B45">45</abbr>,<abbr bid="B46">46</abbr>,<abbr bid="B47">47</abbr>], and in the telomeric ends of the <it>Saccharomyces cerevisiae</it> genome [<abbr bid="B63">63</abbr>]. Not all sequences are biased, as shown by the well-documented &#946;-globin region in the <it>Homo sapiens</it> genome [<abbr bid="B37">37</abbr>]. Remarkably, absolutely no exceptions to the relative enrichment in G of the leading strand were found here, either in third codon positions or in intergenic positions. However, the sole general rule in biology is that there are no general rules, and during the revision of this manuscript the first exception was documented in the chromosome of <it>Streptomyces coelicolor</it> [<abbr bid="B64">64</abbr>]. The relative enrichment in T of the leading strand suffers from some exceptions that, interestingly, are all found here in the low-G+C group of Gram-positive bacteria (namely <it>B. subtilis</it>, <it>B. halodurans</it>, <it>L. lactis</it>, <it>S. aureus</it>, and <it>S. pyogenes</it>).</p>
            <p>Because of the strong correlation in direction and intensity of PR2 deviation between third codon positions and intergenic spaces (Figure <figr fid="F6">6</figr>), the most likely explanation for the GT enrichment of the leading strand is a replication-induced asymmetric mutation pressure between the leading and lagging strand.</p>
         </sec>
         <sec>
            <st>
               <p>Potential underlying causes</p>
            </st>
            <p>The universal excess of G and T in the leading strand is common to distantly related species, suggesting that asymmetric directional mutation pressure has a chemical basis. This does not mean that the mechanism is simple: there are 12 substitution rates under asymmetric molecular evolution, but only four base frequencies; many different substitution scenarios can explain the observed biases. Here we discuss only simple hypotheses in which only one of the 12 substitution rates is assumed to differ between the two strands: for example, only an excess of A &#8594; G or of C &#8594; T in the leading strand is able to produce a GT-rich leading strand. Because the leading strand spends more time in the single-stranded state than the lagging strand during DNA replication [<abbr bid="B65">65</abbr>], it is natural to look for mutations that are increased in the single-stranded state as possible explanations.</p>
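            <p>As a purely illustrative sketch of such a single-rate hypothesis, the following Python snippet iterates the base frequencies of one strand to equilibrium when all 12 substitution rates are equal except for one that is raised. The function names, the rate values and the Euler integration step are assumptions made for illustration; they are not taken from the analyses reported here.</p>

```python
BASES = "ACGT"

def equilibrium(excess, base_rate=1.0, dt=0.01, steps=20000):
    """Equilibrium base frequencies on one strand (toy model).

    All 12 substitution rates equal base_rate, except those listed in
    `excess`, a dict such as {("C", "T"): 0.5} giving an extra rate.
    """
    rates = {(x, y): base_rate for x in BASES for y in BASES if x != y}
    for pair, extra in excess.items():
        rates[pair] += extra
    freq = {b: 0.25 for b in BASES}
    for _ in range(steps):  # simple Euler integration of the rate equations
        new = dict(freq)
        for (x, y), rate in rates.items():
            flow = rate * freq[x] * dt
            new[x] -= flow
            new[y] += flow
        freq = new
    return freq

def summarize(freq):
    """Return G/(G+C), A/(A+T) and the G+C content."""
    gc = freq["G"] + freq["C"]
    at = freq["A"] + freq["T"]
    return freq["G"] / gc, freq["A"] / at, gc

# Hypothetical excess of 0.5 (in units of the base rate) on a single rate.
for label, excess in (("C->T", {("C", "T"): 0.5}), ("A->G", {("A", "G"): 0.5})):
    x, y, gc = summarize(equilibrium(excess))
    print(label, round(x, 3), round(y, 3), round(gc, 3))
```

            <p>Under these assumed rates, either excess gives G/(G+C) above 0.5 and A/(A+T) below 0.5 on the strand considered, that is, a GT-rich strand; in this toy model only the C to T excess also lowers the G+C content.</p>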
            <sec>
               <st>
                  <p>Oxidations</p>
               </st>
               <p>We are not aware of any studies comparing the rate of base oxidation between the single-stranded and double-stranded state in DNA. However, in a study of peroxyl radicals [<abbr bid="B66">66</abbr>], which have a longer biological half-life than other O<sub>2</sub>-derived oxidants, all products formed were from the major groove of the bases, implying that steric hindrance from Watson-Crick base pairing in double-stranded DNA is not a factor. Consequently, similar yields of products in both single- and double-stranded DNA were found. Additionally, 8-hydroxyguanine, a major product of oxidative damage in DNA, induces mainly G &#8594; T and A &#8594; C mutations [<abbr bid="B67">67</abbr>,<abbr bid="B68">68</abbr>]. In <it>Escherichia coli</it>, it has also been reported that 2-hydroxy-dATP induces G &#8594; T mutations [<abbr bid="B69">69</abbr>]. These results do not explain GT excess in the leading strand and therefore are not good candidates for a single mechanism of asymmetric mutation pressures.</p>
            </sec>
            <sec>
               <st>
                  <p>Deaminations</p>
               </st>
               <p>Within this group, only cytosine deamination to uracil (or 5-methylcytosine deamination directly to thymine) yields a C &#8594; T mutation, and only adenine deamination to hypoxanthine yields an A &#8594; G mutation. Interestingly, there is experimental evidence <it>in vitro</it> that the rate of these deaminations is increased in the single-stranded state. Deamination of cytosine occurs over 100 times faster in single-stranded than in double-stranded DNA [<abbr bid="B70">70</abbr>,<abbr bid="B71">71</abbr>]. This is also most likely true for adenine, but its deamination rate is one-fortieth of the rate of cytosine [<abbr bid="B72">72</abbr>], so that it is not possible to determine its rate in double-stranded DNA. The rate-determining step in cytosine deamination involves attack by water at the C-4 of the cytosine to give a tetrahedral intermediate, difficult to obtain in the rigid structure of double-stranded DNA because of Watson-Crick base pairing with guanine [<abbr bid="B73">73</abbr>]. Single-stranded-DNA-binding proteins do not offer specific protection from this type of mutation during DNA replication [<abbr bid="B74">74</abbr>] but may differ between species in their ability to protect from solvent attack. The single-stranded state during transcription induces C &#8594; T mutations [<abbr bid="B75">75</abbr>], and this effect increases with transcription level [<abbr bid="B76">76</abbr>] or when the duration of the single-stranded state is increased using a slower, mutant, RNA polymerase [<abbr bid="B77">77</abbr>]. Also consistent is the report [<abbr bid="B78">78</abbr>] that GT enrichment increases with the time spent in the single-stranded state during mitochondrial DNA replication. In a study of the mutation pattern estimated from pseudogene data in <it>B. burgdorferi</it>, it was found that C &#8594; T mutation was the most frequent in the leading strand [<abbr bid="B79">79</abbr>]. 
The small depletion in G+C content observed here in the leading strand is also consistent with an excess of C &#8594; T substitutions in the leading strand because C &#8594; T substitutions decrease the G+C content. This effect is also expected to be a second-order one because of the centripetal action of symmetric directional mutation pressure on the G+C content of both strands. Therefore, we found the cytosine deamination theory compatible in most cases with the data, but a more definitive answer may come from biochemical studies on an extreme case such as <it>B. burgdorferi</it>.</p>
            </sec>
         </sec>
         <sec>
            <st>
               <p>G+C content variation between species is not controlled by a unique asymmetric mutation pressure</p>
            </st>
            <p>From inspection of our data, we can reject the simple model in which a unique asymmetric mutation pressure causes variation in G+C content between species. For instance, if only C &#8594; T mutations were the underlying phenomenon, we would expect low G+C in species with the highest <it>B</it><sub><it>I</it></sub>. Figure <figr fid="F7">7</figr> shows that this is clearly not the case. Moreover, if asymmetric directional mutation pressures were the major origin of variation in G+C content between species, we would expect a much larger difference in terms of G+C content between the genes on the leading strand and those on the lagging strand.</p>
         </sec>
         <sec>
            <st>
               <p>Intergenic regions could be under greater selective constraints than expected</p>
            </st>
            <p>If we exclude as an explanation that many unannotated genes remain to be discovered, or that the boundaries of annotated genes should be greatly extended, so that some intergenic regions under analysis are not actually intergenic, then there are two non-exclusive interpretations of the fact that <it>B</it><sub><it>I</it></sub> is smaller in intergenic regions than in third codon positions. On the one hand, one may postulate that intergenic regions are under weaker selection pressure against inversions, so that deviations from PR2 are cancelled. On the other hand, one may assume that some positions are not free to deviate from PR2 because of selective pressure for some function. For instance, there are 100 copies per enterobacterial genome of palindromic sequences in intergenic spaces [<abbr bid="B80">80</abbr>], which may be under selective pressure to preserve their palindromic character and therefore follow PR2 (as pure palindromic sequences are effectively base-paired). This interpretation is compatible with the fact that interspecific G+C content variation is higher at the third codon position than in intergenic spaces [<abbr bid="B10">10</abbr>,<abbr bid="B11">11</abbr>], as shown here in Figure <figr fid="F4">4</figr>. Moreover, the surprisingly low level of polymorphism in large intergenic regions among natural strains of <it>E. coli</it> [<abbr bid="B81">81</abbr>] argues in favor of selective constraints.</p>
         </sec>
         <sec>
            <st>
               <p>Transcription-coupled PR2-biases</p>
            </st>
            <p>Francino and Ochman [<abbr bid="B22">22</abbr>,<abbr bid="B81">81</abbr>] pointed out that transcription is likely to cause more mutations in the sense strand than in the anti-sense strand because of its transient single-stranded state during transcription. We found no evidence for this in intergenic spaces (> 100 bp), which could be interpreted either as an absence of transcription-induced mutation pressure or as a low proportion of transcribed fragments in these regions. Third codon positions are known to be transcribed, so that we expect an effect of transcription-induced mutation pressure on the residual deviation from PR2 (<it>B</it><sub><it>II</it></sub>). However, selection for translation optimization interferes, so that we cannot interpret unambiguously the almost universal orientation of <it>B</it><sub><it>II</it></sub> (lower left in the PR2-plot, as in Figure <figr fid="F2">2c</figr>) as the result of transcription-induced mutation pressure. If we interpreted the results this way, we would have to postulate a different mechanism for replication-induced and transcription-induced mutation pressures, as <it>B</it><sub><it>I</it></sub> and <it>B</it><sub><it>II</it></sub> have different orientations. Moreover, in small intergenic spaces, which are potentially transcribed untranslated regions, no general pattern is evident for the residual deviation from PR2, so that no universal transcription-induced mutation pressure was found here as a strong contributor to the deviation from the PR2 state.</p>
         </sec>
         <sec>
            <st>
               <p>Asymmetric mutation pressure and chirochore structure</p>
            </st>
            <p>A chirochore is a contiguous segment of the chromosome homogeneous with respect to its deviation from PR2 [<abbr bid="B44">44</abbr>]. A strong asymmetric mutation pressure induces a chirochore structure, as observed in <it>B. burgdorferi</it> [<abbr bid="B82">82</abbr>], but the converse need not be true: a weak asymmetric mutation pressure does not imply an absence of chirochores. For example, in <it>M. genitalium</it>, replication-associated biases, <it>B</it><sub><it>I</it></sub>/(<it>B</it><sub><it>I</it></sub> + <it>B</it><sub><it>II</it></sub>), are the weakest among the species under study (<it>B</it><sub><it>I</it></sub> = 12%), but there is still a strong chirochore structure [<abbr bid="B83">83</abbr>] because of a large excess (80%) of coding sequences in the leading strand. The deviation from PR2 in the first and second codon positions, because of selection at the amino-acid level (for example [<abbr bid="B84">84</abbr>]), is then a major factor. It is not known whether chirochores could be adaptive, but recent results suggest a functional polarization of the <it>E. coli</it> chromosome [<abbr bid="B85">85</abbr>,<abbr bid="B86">86</abbr>] and may be relevant to this problem.</p>
         </sec>
         <sec>
            <st>
               <p>PR2-plot analysis</p>
            </st>
            <p>A prerequisite for using the simple graphical representations and quantifications used here is that the coding sequences and intergenic spaces can be divided into two groups: the leading and the lagging group. This is not always possible: for instance, all coding sequences may be in the leading group (this situation has never been encountered in bacteria but could be problematic in small genomes such as those of mitochondria). Another limitation is encountered when there are genes in both groups but we are unable to assign sequences to the correct group because of limited information about the location of the origin and terminus of replication used during the course of evolution. Although an estimation of <it>B</it><sub><it>I</it></sub> is not possible in this latter case, an oblong distribution of points in PR2-plots may suggest a non-zero value for <it>B</it><sub><it>I</it></sub>. When more than one strain was sequenced for a species (that is, <it>Chlamydophila pneumoniae</it>, <it>E. coli</it>, <it>H. pylori</it>, <it>Mycobacterium tuberculosis</it>, <it>Neisseria meningitidis</it>, and <it>S. aureus</it>), the <it>B</it><sub><it>I</it></sub> and <it>B</it><sub><it>II</it></sub> values in the third codon position were extremely stable within a species (see table in the Additional data files), but the maximum of three strains available here for any one species does not allow us to estimate their within-species variability, and therefore their taxonomic utility.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Materials and methods</p>
         </st>
         <sec>
            <st>
               <p>Source of data</p>
            </st>
            <p>Data on 51 complete bacterial chromosomes were retrieved from the EMGLib server [<abbr bid="B87">87</abbr>,<abbr bid="B88">88</abbr>] and used to extract annotated coding sequences and intergenic spaces (Figure <figr fid="F1">1</figr>). These 51 chromosomes were from 49 different strains (there are two chromosomes in each of <it>Deinococcus radiodurans</it> and <it>Vibrio cholerae</it>) and from 41 different species (there were three strains each of <it>C. pneumoniae</it> and <it>E. coli</it>, and two strains each of <it>H. pylori</it>, <it>M. tuberculosis</it>, <it>N. meningitidis</it> and <it>S. aureus</it>), as detailed in Table <tblr tid="T1">1</tblr>.</p>
         </sec>
         <sec>
            <st>
               <p>Leading and lagging group assignment</p>
            </st>
            <p>To classify sequences into the leading and the lagging groups (Figure <figr fid="F1">1</figr>), we must be able to locate them relative to the origin and terminus of replication. For 43 of the 51 bacterial chromosomes, the locations of the origin and the terminus were known from experimental evidence or were estimated from chirochore boundaries [<abbr bid="B44">44</abbr>] with the program Oriloc [<abbr bid="B89">89</abbr>,<abbr bid="B90">90</abbr>]. Details of assignments are given in the EMGLib database under a '/strand=' qualifier. To discriminate between the origin and the terminus of replication, we used the assumption that organisms tend to avoid head-on collisions between the DNA replication apparatus and the RNA polymerase transcription complex [<abbr bid="B54">54</abbr>,<abbr bid="B55">55</abbr>,<abbr bid="B56">56</abbr>]. We also assumed that coding sequences were more abundant on the leading than on the lagging strand. As far as we know, there is no exception to this rule, and the report that the trend is exacerbated in highly expressed genes supports this assumption [<abbr bid="B48">48</abbr>]. Moreover, the comparison of the maps of closely related bacterial chromosomes shows that observed chromosome rearrangements tend to preserve an excess of genes in the leading strand [<abbr bid="B91">91</abbr>,<abbr bid="B92">92</abbr>,<abbr bid="B93">93</abbr>,<abbr bid="B94">94</abbr>,<abbr bid="B95">95</abbr>,<abbr bid="B96">96</abbr>,<abbr bid="B97">97</abbr>,<abbr bid="B98">98</abbr>]. Our figures show sequences on the leading strand as filled circles, and those on the lagging strand as open circles.</p>
         </sec>
         <sec>
            <st>
               <p>Coding-sequence analysis</p>
            </st>
            <p>The nucleotide composition of coding sequences was always calculated from the sense strand (Figure <figr fid="F1">1</figr>) for each codon position. The sense strand (or anti-template strand) is the DNA strand opposite to the template strand for transcription. The average G+C content of the third codon position (<it>P</it><sub>3</sub>) and that of the first and second codon positions (<it>P</it><sub>12</sub>) were computed to draw neutrality plots relative to <it>P</it><sub>3</sub> as previously described [<abbr bid="B11">11</abbr>]: stop codons and codons from odd-numbered sets ATG (Met), TGG (Trp) and ATA (Ile) were excluded from the calculations of <it>P</it>-values below to avoid an extra cause of potential deviation from PR2. The coding sequences after position 1,950 kb on <it>D. radiodurans</it> chromosome 1 were discarded because of a frameshift error in the annotations. Some of the apparent coding sequences identified in genome sequences could be artifacts that do not correspond to actually expressible genes [<abbr bid="B99">99</abbr>]. To overcome this problem, coding sequences of less than 300 bp were removed from the analysis (less than 10% of sequences were discarded). As the probability of obtaining an open reading frame with more than 300 bp by chance is low, most (> 99%) sequences analyzed should correspond to actual coding sequences [<abbr bid="B100">100</abbr>].</p>
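            <p>The filtering and per-position counting described above can be sketched in Python as follows. This is only a minimal illustration: the function name is hypothetical, the stop codons are those of the standard genetic code (an assumption that would not hold for every genome), and applying the codon exclusions to both P12 and P3 is our reading of the text rather than a documented detail.</p>

```python
# Codons excluded from the counts: stop codons (standard code, assumed)
# and the odd-numbered sets ATG (Met), TGG (Trp) and ATA (Ile).
EXCLUDED = {"TAA", "TAG", "TGA", "ATG", "TGG", "ATA"}
MIN_LENGTH = 300  # coding sequences shorter than this are discarded

def neutrality_point(cds):
    """Return (P12, P3) for one sense-strand coding sequence, or None if filtered out."""
    cds = cds.upper()
    if MIN_LENGTH > len(cds):
        return None
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    kept = [codon for codon in codons if codon not in EXCLUDED]
    if not kept:
        return None
    p12 = sum((codon[0] in "GC") + (codon[1] in "GC") for codon in kept) / (2 * len(kept))
    p3 = sum(codon[2] in "GC" for codon in kept) / len(kept)
    return p12, p3

# Hypothetical 306-bp coding sequence used only to exercise the function.
toy_cds = "ATG" + "GCA" * 75 + "AAC" * 25 + "TAA"
print(neutrality_point(toy_cds))
```

            <p>Averaging such per-sequence values over a chromosome would give one point of a neutrality plot; the averaging scheme itself is not specified here and is left out of the sketch.</p>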
         </sec>
         <sec>
            <st>
               <p>Intergenic-sequence analysis</p>
            </st>
            <p>GC<sub>IGR</sub> denotes the G+C content of intergenic regions, and deviations from PR2 were characterized by G/(G+C) and A/(A+T) to draw PR2-plots in a way similar to coding-sequence analyses. Intergenic sequences were split into two non-overlapping groups (Figure <figr fid="F1">1</figr>). The first group contained small intergenic sequences among co-oriented genes that are potentially transcribed untranslated spacers. The strand used to compute nucleotide composition - chosen to allow direct comparisons with coding-sequence analyses - is the same as the anti-template strand for transcription of the two flanking genes (CDS, tRNA, rRNA). Therefore, when the two flanking genes are coding sequences, the strand is the same as the strand used for the analysis of coding sequences, that is, the sense strand (Figure <figr fid="F1">1</figr>). Their minimum size was set to 50 bp, a trade-off between the likelihood of being transcribed and the avoidance of too much scattering in PR2-plots. Their maximum size was set to 100 bp to avoid data overlap with large intergenic regions. The second group contained large intergenic regions that are potentially untranscribed untranslated regions. Their minimum size was set to 100 bp, which is an amount of information equivalent to the third codon position of 100 codons, the minimum size required in coding-sequence analyses. The analyzed strand cannot be defined unambiguously with respect to the flanking genes when they are convergently or divergently transcribed. The nucleotide composition was therefore always calculated from the published strand (Figure <figr fid="F1">1</figr>).</p>
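            <p>To make the role of the analyzed strand concrete, the following Python sketch computes G/(G+C), A/(A+T) and the G+C content for a sequence and for its reverse complement. The function names and the example spacer are hypothetical and serve only as an illustration.</p>

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the complementary strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def pr2_point(seq):
    """Return (G/(G+C), A/(A+T), G+C content) for one strand."""
    counts = Counter(seq.upper())
    g, c, a, t = (counts[base] for base in "GCAT")
    if g + c == 0 or a + t == 0:
        raise ValueError("both ratios need at least one G/C and one A/T base")
    return g / (g + c), a / (a + t), (g + c) / (g + c + a + t)

spacer = "GGGCAATTTG"  # hypothetical spacer, far shorter than the 50 bp minimum
print(pr2_point(spacer))
print(pr2_point(reverse_complement(spacer)))
```

            <p>Reading the complementary strand maps each ratio to one minus its value while leaving the G+C content unchanged, which is why the choice of strand must be fixed before points are placed in a PR2-plot.</p>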
         </sec>
         <sec>
            <st>
               <p>Statistical tests and their interpretation</p>
            </st>
            <p>The significance of the analyses was assessed using the unpaired t-test implemented in Statview 5.0 (SAS Institute Inc., North Carolina, USA) to compare the means of two groups - here the leading and the lagging groups. The null hypothesis is the equality of the mean value between the two groups. The critical value <it>p</it> is the probability of wrongly rejecting the null hypothesis when it is true. Two notes of caution should be introduced about the interpretation of results in the present context. First, we made runs of 43 comparisons, so that the probability of getting at least one falsely significant result, 1 - (1 - <it>p</it>)<sup>43</sup>, is 0.89, 0.35 and 0.04 for <it>p</it> = 0.05, <it>p</it> = 0.01 and <it>p</it> = 0.001, respectively. We selected the highly significant critical level (<it>p</it> &lt; 0.01) so that the probabilities of making exactly one, two, or three wrong decisions were 0.28, 0.06, and 0.01, respectively, for each run. Therefore, a simple figure to keep in mind when looking at the results is that the probability of getting by chance three or more highly significant differences in a run was less than 1 in 100. The second note of caution is that we were working with large data sets, typically one thousand values within each group, so that the power of the test, that is, its ability to reject the null hypothesis, was high. As a consequence, even a small difference between the means of the two groups was considered, correctly from the statistical point of view, as significant. However, statistically significant does not automatically mean biologically significant, because a second-order or side effect could easily be detected in this way. To help interpretation, the simple graphical representations used here, very close to raw data, were connected through hyperlinks to the tabulated test results. By this means, readers can quickly form their own opinions by viewing the original data.</p>
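            <p>The multiple-comparison figures quoted above can be checked with a few lines of Python. The sketch assumes that the 43 tests in a run are independent and that each has a false-positive probability exactly equal to the critical level; the function names are ours.</p>

```python
from math import comb

N_TESTS = 43  # comparisons per run, one per chromosome

def prob_at_least_one(p, n=N_TESTS):
    """Probability of at least one false positive among n independent tests."""
    return 1 - (1 - p) ** n

def prob_exactly(k, p, n=N_TESTS):
    """Binomial probability of exactly k false positives among n tests."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def prob_at_least(k, p, n=N_TESTS):
    """Probability of k or more false positives among n tests."""
    return 1 - sum(prob_exactly(i, p, n) for i in range(k))

for p in (0.05, 0.01, 0.001):
    print(p, round(prob_at_least_one(p), 2))
print([round(prob_exactly(k, 0.01), 2) for k in (1, 2, 3)])
print(round(prob_at_least(3, 0.01), 4))
```

            <p>Under these assumptions the script reproduces the values 0.89, 0.35 and 0.04, then 0.28, 0.06 and 0.01, and gives a probability just under 0.01 for three or more false positives in a run.</p>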
         </sec>
      </sec>
      <sec>
         <st>
            <p>Additional data files</p>
         </st>
         <p>Tables of the intragenomic <it>P</it><sub>3</sub> distributions of all species examined, and the PR2-biases between leading and lagging groups for coding sequences, and small and large intergenic regions, together with associated histograms are available as additional data files in <supplr sid="S1">pdf</supplr> format, with the tables also in text format (Tables <supplr sid="S2">1</supplr>,<supplr sid="S3">2</supplr>,<supplr sid="S4">3</supplr>,<supplr sid="S5">4</supplr>).</p>
         <suppl id="S1">
            <title>
               <p>Additional data file 1</p>
            </title>
            <caption>
               <p>Tables of the intragenomic <it>P</it><sub>3</sub> distributions of all species examined</p>
            </caption>
            <text>
               <p>Tables of the intragenomic <it>P</it><sub>3</sub> distributions of all species examined, and the PR2-biases between leading and lagging groups for coding sequences, and small and large intergenic regions, together with associated histograms</p>
            </text>
            <file name="gb-2002-3-10-research0058-S1.pdf">
               <p>Click here for additional data file</p>
            </file>
         </suppl>
         <suppl id="S2">
            <title>
               <p>Additional data file 2</p>
            </title>
            <caption>
               <p>Table 1 in text format</p>
            </caption>
            <text>
               <p>Table 1 in text format</p>
            </text>
            <file name="gb-2002-3-10-research0058-S2.txt">
               <p>Click here for additional data file</p>
            </file>
         </suppl>
         <suppl id="S3">
            <title>
               <p>Additional data file 3</p>
            </title>
            <caption>
               <p>Table 2 in text format</p>
            </caption>
            <text>
               <p>Table 2 in text format</p>
            </text>
            <file name="gb-2002-3-10-research0058-S3.txt">
               <p>Click here for additional data file</p>
            </file>
         </suppl>
         <suppl id="S4">
            <title>
               <p>Additional data file 4</p>
            </title>
            <caption>
               <p>Table 3 in text format</p>
            </caption>
            <text>
               <p>Table 3 in text format</p>
            </text>
            <file name="gb-2002-3-10-research0058-S4.txt">
               <p>Click here for additional data file</p>
            </file>
         </suppl>
         <suppl id="S5">
            <title>
               <p>Additional data file 5</p>
            </title>
            <caption>
               <p>Table 4 in text format</p>
            </caption>
            <text>
               <p>Table 4 in text format</p>
            </text>
            <file name="gb-2002-3-10-research0058-S5.txt">
               <p>Click here for additional data file</p>
            </file>
         </suppl>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We thank the anonymous referees for their helpful comments, and we thank Rob Knight for his kind help in improving this manuscript.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Contenu en bases puriques et pyrimidiques des acides d&#233;soxyribonucl&#233;iques des bact&#233;ries.</p>
            </title>
            <aug>
               <au>
                  <snm>Lee</snm>
                  <fnm>KY</fnm>
               </au>
               <au>
                  <snm>Wahl</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Barbu</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Ann Inst Pasteur</source>
            <pubdate>1956</pubdate>
            <volume>91</volume>
            <fpage>212</fpage>
            <lpage>224</lpage>
         </bibl>
         <bibl id="B2">
            <title>
               <p>A correlation between the compositions of deoxyribonucleic and ribonucleic acids.</p>
            </title>
            <aug>
               <au>
                  <snm>Belozersky</snm>
                  <fnm>AN</fnm>
               </au>
               <au>
                  <snm>Spirin</snm>
                  <fnm>AS</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>1958</pubdate>
            <volume>182</volume>
            <fpage>111</fpage>
            <lpage>112</lpage>
            <xrefbib>
               <pubid idtype="pmpid">13566202</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Heterogeneity in deoxyribonucleic acids. II. Dependence of the density of deoxyribonucleic acids on guanine-cytosine.</p>
            </title>
            <aug>
               <au>
                  <snm>Sueoka</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Marmur</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Doty</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>1959</pubdate>
            <volume>183</volume>
            <fpage>1427</fpage>
            <lpage>1431</lpage>
            <xrefbib>
               <pubid idtype="pmpid">13657152</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>The relative homogeneity of microbial DNA.</p>
            </title>
            <aug>
               <au>
                  <snm>Rolfe</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Meselson</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1959</pubdate>
            <volume>45</volume>
            <fpage>1039</fpage>
            <lpage>1043</lpage>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Systematic base composition variation around the genome of <it>Mycoplasma genitalium</it>, but not <it>Mycoplasma pneumoniae</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Kerr</snm>
                  <fnm>ARW</fnm>
               </au>
               <au>
                  <snm>Peden</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Sharp</snm>
                  <fnm>PM</fnm>
               </au>
            </aug>
            <source>Mol Microbiol</source>
            <pubdate>1997</pubdate>
            <volume>25</volume>
            <fpage>1177</fpage>
            <lpage>1179</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1365-2958.1997.5461902.x</pubid>
                  <pubid idtype="pmpid">9350873</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Prokaryotic genome evolution as assessed by multivariate analysis of codon usage patterns.</p>
            </title>
            <aug>
               <au>
                  <snm>McInerney</snm>
                  <fnm>JO</fnm>
               </au>
            </aug>
            <source>Microb Comp Genomics</source>
            <pubdate>1997</pubdate>
            <volume>2</volume>
            <fpage>1</fpage>
            <lpage>10</lpage>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Two aspects of DNA base composition: G+C content and translation-coupled deviation from intra-strand rule of A = T and G = C.</p>
            </title>
            <aug>
               <au>
                  <snm>Sueoka</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
  