<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
	<ui>gb-2005-6-2-r14</ui>
	<ji>GBJ</ji>
	<fm>
		<dochead>Research</dochead>
		<bibl>
			<title>
				<p>Accelerated evolution associated with genome reduction in a free-living prokaryote</p>
			</title>
			<aug>
				<au id="A1">
					<snm>Dufresne</snm>
					<fnm>Alexis</fnm>
					<insr iid="I1"/>
					<email>dufresne@sb-roscoff.fr</email>
				</au>
				<au id="A2">
					<snm>Garczarek</snm>
					<fnm>Laurence</fnm>
					<insr iid="I1"/>
					<email>garczare@sb-roscoff.fr</email>
				</au>
				<au id="A3" ca="yes">
					<snm>Partensky</snm>
					<fnm>Fr&#233;d&#233;ric</fnm>
					<insr iid="I1"/>
					<email>partensky@sb-roscoff.fr</email>
				</au>
			</aug>
			<insg>
				<ins id="I1">
					<p>Station Biologique, UMR 7127 CNRS et Universit&#233; Paris 6, BP74, 29682 Roscoff Cedex, France</p>
				</ins>
			</insg>
			<source>Genome Biology</source>
			<issn>1465-6906</issn>
			<pubdate>2005</pubdate>
			<volume>6</volume>
			<issue>2</issue>
			<fpage>R14</fpage>
			<url>http://genomebiology.com/2005/6/2/R14</url>
			<xrefbib>
				<pubidlist><pubid idtype="pmpid">15693943</pubid><pubid idtype="doi">10.1186/gb-2005-6-2-r14</pubid>
				</pubidlist></xrefbib>
		</bibl>
		<history>
			<rec>
				<date>
					<day>5</day>
					<month>10</month>
					<year>2004</year>
				</date>
			</rec>
			<revrec>
				<date>
					<day>2</day>
					<month>12</month>
					<year>2004</year>
				</date>
			</revrec>
			<acc>
				<date>
					<day>7</day>
					<month>12</month>
					<year>2004</year>
				</date>
			</acc>
			<pub>
				<date>
					<day>14</day>
					<month>1</month>
					<year>2005</year>
				</date>
			</pub>
		</history>
		<cpyrt>
			<year>2005</year>
			<collab>Dufresne et al.; licensee BioMed Central Ltd.</collab>
			<note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
		</cpyrt>
		<shorttitle>
			<p>Genome reduction in free-living bacteria</p>
		</shorttitle>
		<shortabs>
			<p><it>Prochlorococcus</it> sp. are marine bacteria with very small genomes. The mechanisms by which these reduced genomes have evolved appear, however, to be distinct from those that have led to small genome size in intracellular bacteria.</p>
		</shortabs>
		<abs>
			<sec>
				<st>
					<p>Abstract</p>
				</st>
				<sec>
					<st>
						<p>Background</p>
					</st>
					<p>Three complete genomes of <it>Prochlorococcus </it>species, the smallest and most abundant photosynthetic organism in the ocean, have recently been published. Comparative genome analyses reveal that genome shrinkage has occurred within this genus, associated with a sharp reduction in G+C content. As all examples of genome reduction characterized so far have been restricted to endosymbionts or pathogens, with a host-dependent lifestyle, the observed genome reduction in <it>Prochlorococcus </it>is the first documented example of such a process in a free-living organism.</p>
				</sec>
				<sec>
					<st>
						<p>Results</p>
					</st>
					<p>Our results clearly indicate that genome reduction has been accompanied by an increased rate of protein evolution in <it>P. marinus </it>SS120 that is even more pronounced in <it>P. marinus </it>MED4. This acceleration has affected every functional category of protein-coding genes. In contrast, the 16S rRNA gene seems to have evolved in a clock-like manner in this genus. We observed that MED4 and SS120 have lost several DNA-repair genes, the absence of which could be related to the mutational bias and the acceleration of amino-acid substitution.</p>
				</sec>
				<sec>
					<st>
						<p>Conclusions</p>
					</st>
					<p>We have examined the evolutionary mechanisms involved in this process, which are different from those known from host-dependent organisms. Indeed, most substitutions that have occurred in <it>Prochlorococcus </it>have to be selectively neutral, as the large size of populations imposes low genetic drift and strong purifying selection. We assume that the major driving force behind genome reduction within the <it>Prochlorococcus </it>radiation has been a selective process favoring the adaptation of this organism to its environment. A scenario is proposed for genome evolution in this genus.</p>
				</sec>
			</sec>
		</abs>
	</fm>
	<meta>
		<classifications>
			<classification type="BMC" subtype="man_spc_id" id="30010008">Evolution</classification>
			<classification type="BMC" subtype="man_spc_id" id="30010014">Microbiology and parasitology</classification>
			<classification type="BMC" subtype="man_spc_id" id="30010010">Genome studies</classification>
		</classifications>
	</meta>
	<bdy>
		<sec>
			<st>
				<p>Background</p>
			</st>
			<p>The size of bacterial genomes is primarily the result of two counteracting processes: the acquisition of new genes by gene duplication or by horizontal gene transfer; and the deletion of non-essential genes. Genomic flux created by these gains and losses of genetic information can substantially alter gene content. This process drives divergence of bacterial species and eventually adaptation to new ecological niches <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. In some cases, gene deletion may prevail over gene acquisition, leading to genome reduction. This process has occurred several times during evolution and has been well documented for cellular organelles <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr></abbrgrp>, obligate pathogens such as <it>Mycoplasma genitalium </it><abbrgrp><abbr bid="B4">4</abbr></abbrgrp> or phytoplasmas <abbrgrp><abbr bid="B5">5</abbr></abbrgrp> and symbionts such as the insect endosymbiont <it>Buchnera </it><abbrgrp><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp> or the hyperthermophile <it>Nanoarchaeum equitans </it><abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. In the case of organelles, the degree of genome reduction can be extensive as a result of massive gene transfer into the host nucleus, allowing maintenance of the corresponding functions in the resulting composite organism. Mitochondrial or chloroplast genomes, for instance, can be as small as 6 kilobases (kb) <abbrgrp><abbr bid="B10">10</abbr></abbrgrp> and 35 kb <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>, respectively. In the case of obligate host-dependent bacteria, the reduction is more limited because the relationships with their hosts are less intimate than for organelles in eukaryotic cells. Thus, obligate pathogens need to retain a minimum of functions that allow them to infect new hosts and to avoid host defenses, and obligate endosymbionts carry genes that are absolutely necessary for host survival. For instance, a substantial part (approximately 10%) of the <it>Buchnera </it>genome is devoted to biosynthesis of amino acids that are essential to its host <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>.</p>
			<p>So far, all characterized examples of genome reduction have been associated with a change from a free-living to a host-dependent lifestyle <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. It is therefore intriguing that a similar phenomenon of genome reduction has occurred within the free-living marine cyanobacterial genus <it>Prochlorococcus </it><abbrgrp><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr></abbrgrp>. This genus is present at high abundance (often over 10<sup>5 </sup>cells/ml) in all nutrient-poor areas of the world's oceans between 40&#176;N and 40&#176;S and is probably the most abundant photosynthetic organism on Earth <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr></abbrgrp>. It has been shown that two major ecotypes exist within this genus <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>. The first is adapted to grow at the base of the illuminated layer and displays a high divinyl-chlorophyll <it>b </it>to <it>a </it>ratio; the second inhabits the upper layer of the ocean and has a low divinyl-chlorophyll <it>b </it>to <it>a </it>ratio <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. The genomes of one high-light-adapted (HL) strain, <it>Prochlorococcus marinus </it>MED4 <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>, and of two low-light-adapted (LL) strains, <it>P. marinus </it>SS120 <abbrgrp><abbr bid="B13">13</abbr></abbrgrp> and <it>Prochlorococcus </it>species MIT9313 <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>, have recently been sequenced and annotated.</p>
			<p>Phylogenetic trees based on 16S rRNA sequences <abbrgrp><abbr bid="B18">18</abbr></abbrgrp> or 16S-23S ribosomal internal transcribed spacer sequences <abbrgrp><abbr bid="B20">20</abbr></abbrgrp> show that <it>Prochlorococcus </it>sp. MIT9313 branches at the base of the <it>Prochlorococcus </it>radiation, close to the <it>Synechococcus </it>group <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. In contrast, the <it>Prochlorococcus </it>HL clade, encompassing the MED4 strain, appears to be the most recently evolved <it>Prochlorococcus </it>group, consistent with the fact that this clade is much less diversified than are the LL clades.</p>
			<p>Despite the close relatedness of these strains, their genomes vary widely in terms of size, G+C content and the number of protein-coding genes (Table <tblr tid="T1">1</tblr>). While the general characteristics of the MIT9313 genome are very similar to those of the <it>Synechococcus </it>sp. WH8102 genome <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>, MED4 has the smallest genome for a photosynthetic organism known to date and the SS120 genome is only 90 kb larger. Furthermore, this genome reduction is clearly accompanied by a drift in G+C content, a phenomenon that commonly occurs during the evolution of host-dependent genomes <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>. However, the evolutionary mechanisms involved in the genome reductive process are most probably different from those that have occurred in host-dependent organisms. Using comparative sequence analyses of the four genomes of marine picocyanobacteria published to date, we have attempted to better understand the causes and consequences of this phenomenon and to address the relationships between genome reduction and niche adaptation in marine picocyanobacteria.</p>
			<tbl id="T1" hint_layout="single">
				<title>
					<p>Table 1</p>
				</title>
				<caption>
					<p>General features of the genomes of the four marine picocyanobacteria used in this study</p>
				</caption>
				<tblbdy cols="4">
					<r>
						<c ca="left">
							<p>Genome</p>
						</c>
						<c ca="center">
							<p>Size (Mbp)</p>
						</c>
						<c ca="center">
							<p>GC%</p>
						</c>
						<c ca="center">
							<p>Number of protein-coding genes</p>
						</c>
					</r>
					<r>
						<c cspan="4">
							<hr/>
						</c>
					</r>
					<r>
						<c ca="left">
							<p><it>P. marinus </it>MED4</p>
						</c>
						<c ca="center">
							<p>1.66</p>
						</c>
						<c ca="center">
							<p>30.8</p>
						</c>
						<c ca="center">
							<p>1,716</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p><it>P. marinus </it>SS120</p>
						</c>
						<c ca="center">
							<p>1.75</p>
						</c>
						<c ca="center">
							<p>36.4</p>
						</c>
						<c ca="center">
							<p>1,882</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p><it>Prochlorococcus </it>sp. MIT9313</p>
						</c>
						<c ca="center">
							<p>2.41</p>
						</c>
						<c ca="center">
							<p>50.7</p>
						</c>
						<c ca="center">
							<p>2,273</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p><it>Synechococcus </it>sp. WH8102</p>
						</c>
						<c ca="center">
							<p>2.43</p>
						</c>
						<c ca="center">
							<p>59.4</p>
						</c>
						<c ca="center">
							<p>2,525</p>
						</c>
					</r>
				</tblbdy>
			</tbl>
		</sec>
		<sec>
			<st>
				<p>Results</p>
			</st>
			<sec>
				<st>
					<p>Synteny and genome stability</p>
				</st>
				<p>Alignments of whole genomes show a strong conservation of the gene order between MED4 and SS120 (Figure <figr fid="F1">1a</figr>). There are only five inversions larger than 20 kb between these two genomes. In contrast, the large number of inversions and translocations and the shorter size of the colinear segments between SS120 and MIT9313 on the one hand and MIT9313 and WH8102 on the other hand (Figure <figr fid="F1">1b,c</figr>) indicate that extensive genome rearrangements have occurred not only between <it>Synechococcus </it>and <it>Prochlorococcus </it>but also between MIT9313 and the two other <it>Prochlorococcus </it>strains (see also Figure <figr fid="F2">2</figr> in <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>). The degree of synteny observed between the four marine picocyanobacteria genomes strengthens the hypothesis of a more recent divergence of the clades containing MED4 and SS120 than of the clade containing MIT9313.</p>
				<fig id="F1">
					<title>
						<p>Figure 1</p>
					</title>
					<caption>
						<p>Alignments of complete genome sequences of marine picocyanobacteria</p>
					</caption>
					<text>
						<p>Alignments of complete genome sequences of marine picocyanobacteria. Genome sequences are translated in their six reading frames. <b>(a) </b>Comparison of the MED4 and SS120 genomes; <b>(b) </b>comparison of the SS120 and MIT9313 genomes; <b>(c) </b>comparison of the MIT9313 and WH8102 genomes. Colinear segments are shown in red and inversions in green. Translocated segments are above or below the diagonal.</p>
					</text>
					<graphic file="gb-2005-6-2-r14-1"/>
				</fig>
				<fig id="F2">
					<title>
						<p>Figure 2</p>
					</title>
					<caption>
						<p>Phylogenetic tree of 16S rRNA genes from the four marine picocyanobacteria</p>
					</caption>
					<text>
						<p>Phylogenetic tree of 16S rRNA genes from the four marine picocyanobacteria. Neighbor-joining tree with Kimura 2-parameter correction. The bootstrap value (1,000 replications) is shown in boldface. Lengths of the branches dA, dB, dC and dX (see text) are given below the branches. N1, node 1, branchpoint between MED4 and SS120; N2, node 2, branchpoint between MIT9313 and Node 1.</p>
					</text>
					<graphic file="gb-2005-6-2-r14-2"/>
				</fig>
			</sec>
			<sec>
				<st>
					<p>Overall genome composition</p>
				</st>
				<p>The downsizing of MED4 and SS120 genomes during evolution is associated with a genome-wide adenine (A) and thymine (T) enrichment (Table <tblr tid="T1">1</tblr>). The bias is most pronounced at neutral sites such as intergenic regions (MED4, 76.6% A+T; SS120, 69.3% A+T) and third-codon positions of protein-coding genes (MED4, 79.7% A+T; SS120, 73.85% A+T). This bias has little effect on ribosomal RNA genes (5S, 16S and 23S) which have a G+C content greater than 50% in all four picocyanobacterial genomes. In both MED4 and SS120, the single rRNA gene cluster can easily be spotted as a G+C-rich anomaly compared to the rest of the genome (see for example, Figure <figr fid="F1">1</figr> in <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>). In direct contrast, protein-coding genes are strongly affected by the extreme base composition of these genomes. First, the bias influences codon usage since, for a given amino acid, AT-rich codons are preferentially used (Figure <figr fid="F3">3a</figr>). Second, the amino-acid composition of the proteins themselves is affected (Figure <figr fid="F3">3b</figr>). Indeed, when compared to <it>Prochlorococcus </it>sp. MIT9313 and <it>Synechococcus </it>sp. WH8102, the genes of <it>P. marinus </it>MED4 and SS120 contain fewer amino acids encoded by G+C-rich codons (for example, alanine or arginine) and more amino acids encoded by A+T-rich codons (for example, isoleucine or lysine).</p>
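				<p>As an illustration only, and not the procedure used in this study, the A+T percentage at third-codon positions can be obtained from an in-frame coding sequence as sketched below. The function names and the example sequence are hypothetical, and the sketch assumes that the sequence length is a multiple of three and that ambiguous bases are ignored.</p>

```python
def at_fraction(seq):
    """Return the fraction of A or T among the unambiguous bases of seq."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(base in "AT" for base in counted) / len(counted)


def third_position_at(cds):
    """Return the A+T fraction at third-codon positions of an in-frame CDS.

    Assumption: cds starts at the first base of a codon and its length
    is a multiple of three.
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    # Slice out every third base, starting from the third base of codon 1.
    return at_fraction(cds[2::3])


# Hypothetical toy sequence: codons ATG, AAA, TTT (third bases G, A, T).
example = third_position_at("ATGAAATTT")
```

				<p>Averaging this quantity over all protein-coding genes of a genome, weighted by gene length, would give a genome-wide figure comparable in kind to the percentages quoted above.</p>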
				<fig id="F3">
					<title>
						<p>Figure 3</p>
					</title>
					<caption>
						<p>Influence of mutational bias in codon usage and amino-acid usage</p>
					</caption>
					<text>
						<p>Influence of mutational bias in codon usage and amino-acid usage. <b>(a) </b>Percentage use of AT-rich codons in the four marine picocyanobacteria. Amino acids are ranked according to AT content of their respective codons. Methionine and tryptophan, which are both encoded by only one codon, have been discarded from the analysis. <b>(b) </b>Percentage use of amino acids in marine picocyanobacteria.</p>
					</text>
					<graphic file="gb-2005-6-2-r14-3"/>
				</fig>
			</sec>
			<sec>
				<st>
					<p>Orthologous gene pool size</p>
				</st>
				<p>A total of 1,306 orthologs belonging to all major functional categories are common to the four genomes (see Additional data file 1) and probably constitute an estimate of the core of genes conserved in all marine picocyanobacteria. This is appreciably more than the pool of around 1,000 orthologs identified by W.R. Hess <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. The difference certainly results from the use by the latter author of a low E-value threshold (10e<sup>-12</sup>) for BLAST comparisons. In contrast, our analysis is based on identification of reciprocal best hits without the use of any particular threshold (apart from the default BLAST threshold) and consequently allows the detection of orthologous relationships whatever the gene lengths or the level of similarity. Still, our ortholog identification process is rather strict and the set of orthologs identified in this study probably corresponds to a lower estimate of the actual number of orthologs shared by the four genomes. This set of genes represents a substantial percentage of the total pool of all protein-coding genes in <it>P. marinus </it>MED4 (73.2%) and SS120 (69.2%) and about half of the gene set in <it>Prochlorococcus </it>sp. MIT9313 (56.2%) and <it>Synechococcus </it>sp. WH8102 (51.1%). These percentages are consistent with the differences in the respective number of genes within these genomes (Table <tblr tid="T1">1</tblr>) and are compatible with the assumption that a massive gene loss has occurred in MED4 and SS120 during their evolution from a <it>Prochlorococcus </it>ancestor with a larger genome <abbrgrp><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr></abbrgrp>.</p>
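				<p>The reciprocal-best-hit criterion can be sketched for a single pair of genomes as follows. This is a minimal illustration under stated assumptions, not the pipeline used in this study: hits are assumed to be available as (query, subject, score) triples, the gene identifiers and scores are invented, and ties are resolved in favor of the first hit encountered.</p>

```python
def best_hit(hits):
    """Map each query to its highest-scoring subject.

    hits is an iterable of (query, subject, score) triples; on ties the
    first hit encountered is kept (an arbitrary choice for this sketch).
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hit(a_vs_b)
    reverse = best_hit(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical gene identifiers and scores, for illustration only.
a_vs_b = [("a1", "b1", 200.0), ("a1", "b2", 50.0),
          ("a2", "b2", 90.0), ("a3", "b2", 120.0)]
b_vs_a = [("b1", "a1", 198.0), ("b2", "a3", 118.0), ("b2", "a2", 91.0)]

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
```

				<p>In this toy example a2 is not retained, because its best hit b2 points back to a3. One possible way to extend the sketch to four genomes would be to keep only the genes whose reciprocal best hits agree across every pair of genomes.</p>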
			</sec>
			<sec>
				<st>
					<p>Accelerated rate of evolution of protein-coding genes in <it>Prochlorococcus</it></p>
				</st>
				<p>Because biased base composition seems to constrain amino-acid usage in the <it>Prochlorococcus </it>genomes, we have investigated whether it also affects the rate of protein sequence evolution in these genomes. We used the 1,306 orthologs common to the four genomes to estimate the amino-acid substitution rate in each genome. Branch lengths calculated for a given tree topology (the same topology as for the 16S rRNA gene tree; see Figure <figr fid="F2">2</figr>) are 0.46, 0.22, 0.16 and 0.14 amino acid substitutions per site for branches dA, dB, dC and dX, respectively. Using <it>Synechococcus </it>sp. WH8102 as the outgroup, we tested the rate-constancy hypothesis and computed the ratios of branch lengths. Relative rate tests (two-cluster and branch length tests) indicate that protein sequences evolved at significantly different rates (<it>P </it>&lt; 0.001) between MED4, SS120 and MIT9313. Therefore the hypothesis of a constant evolutionary rate between these strains can be rejected for protein-coding genes. The calculation of branch-length ratios reveals that the amino-acid substitution rate is 2.04-fold higher in MED4 than in SS120 (dA/dB) and 3.81-fold higher in MED4 than in MIT9313 ((dA+dX)/dC). This rate is also 2.31-fold higher for SS120 than for MIT9313 ((dB+dX)/dC). Computation of branch lengths for each functional category shows that the increased rate of amino-acid replacement in protein sequences concerns every category (Figure <figr fid="F4">4</figr> and Table <tblr tid="T2">2</tblr>). These results imply that the rate of amino-acid substitution increased during evolution of the <it>Prochlorococcus </it>genus concomitantly with genome reduction and increase in A+T content.</p>
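				<p>The branch-length ratios can be recomputed from the four values given above, as in the following sketch. The assignment of dA, dB and dC to the MED4, SS120 and MIT9313 branches and of dX to the internal branch between nodes N1 and N2 is inferred from the ratio formulas in the text, and the dictionary keys are illustrative names. With the rounded branch lengths, the ratios come out at about 2.09, 3.75 and 2.25, close to the reported 2.04, 3.81 and 2.31; the small differences presumably reflect rounding of the published branch lengths.</p>

```python
# Branch lengths in amino-acid substitutions per site, as quoted in the text.
BRANCH_LENGTHS = {"dA": 0.46, "dB": 0.22, "dC": 0.16, "dX": 0.14}


def rate_ratios(d):
    """Return relative substitution-rate ratios from branch lengths.

    Assumed branch assignment (inferred from the ratio formulas):
    dA = MED4, dB = SS120, dC = MIT9313, dX = internal branch N1-N2.
    """
    return {
        "MED4/SS120": d["dA"] / d["dB"],
        "MED4/MIT9313": (d["dA"] + d["dX"]) / d["dC"],
        "SS120/MIT9313": (d["dB"] + d["dX"]) / d["dC"],
    }


ratios = rate_ratios(BRANCH_LENGTHS)
for name, value in ratios.items():
    print(f"{name}: {value:.2f}")
```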
				<fig id="F4">
					<title>
						<p>Figure 4</p>
					</title>
					<caption>
						<p>Amino-acid substitution rate per functional category</p>
					</caption>
					<text>
						<p>Amino-acid substitution rate per functional category. Branch lengths computed for each functional category between <b>(a) </b>MED4 and SS120, <b>(b) </b>SS120 and MIT9313 and <b>(c) </b>MED4 and MIT9313. In the three comparisons, branch-length values are aligned along a line with a slope much greater than 1, indicating that acceleration of the substitution rates occurs in every functional category. Axes represent the number of amino-acid substitutions per site. Red circle, amino-acid transport and metabolism; green circle, carbohydrate transport and metabolism; yellow circle, cell-cycle control; blue triangle, cell wall/membrane biogenesis; pink circle, coenzyme metabolism; red square, defense mechanisms; green square, energy production and conversion; yellow square, function unknown; blue square, general function prediction only; pink square, inorganic ion transport and metabolism; red triangle, intracellular trafficking; green triangle, lipid transport and metabolism; yellow triangle, nucleotide transport and metabolism; blue circle, posttranslational modification, protein turnover; pink triangle, replication, recombination and repair; red diamond, secondary metabolite biosynthesis, transport and catabolism; green diamond, signal transduction mechanisms; yellow diamond, transcription; blue diamond, translation; black circle, miscellaneous.</p>
					</text>
					<graphic file="gb-2005-6-2-r14-4"/>
				</fig>
				<tbl id="T2" hint_layout="single">
					<title>
						<p>Table 2</p>
					</title>
					<caption>
						<p>Number and percentage of orthologous genes per functional category</p>
					</caption>
					<tblbdy cols="3">
						<r>
							<c ca="left">
								<p>Category</p>
							</c>
							<c ca="center">
								<p>Number of genes</p>
							</c>
							<c ca="center">
								<p>% of orthologs</p>
							</c>
						</r>
						<r>
							<c cspan="3">
								<hr/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Amino-acid transport and metabolism</p>
							</c>
							<c ca="center">
								<p>94</p>
							</c>
							<c ca="center">
								<p>7.2</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Carbohydrate transport and metabolism</p>
							</c>
							<c ca="center">
								<p>50</p>
							</c>
							<c ca="center">
								<p>3.8</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Cell-cycle control</p>
							</c>
							<c ca="center">
								<p>17</p>
							</c>
							<c ca="center">
								<p>1.3</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Cell wall/membrane biogenesis</p>
							</c>
							<c ca="center">
								<p>55</p>
							</c>
							<c ca="center">
								<p>4.2</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Coenzyme metabolism</p>
							</c>
							<c ca="center">
								<p>99</p>
							</c>
							<c ca="center">
								<p>7.6</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Defense mechanisms</p>
							</c>
							<c ca="center">
								<p>14</p>
							</c>
							<c ca="center">
								<p>1.1</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Energy production and conversion</p>
							</c>
							<c ca="center">
								<p>106</p>
							</c>
							<c ca="center">
								<p>8.1</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Function unknown</p>
							</c>
							<c ca="center">
								<p>269</p>
							</c>
							<c ca="center">
								<p>20.6</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>General function prediction only</p>
							</c>
							<c ca="center">
								<p>116</p>
							</c>
							<c ca="center">
								<p>8.9</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Inorganic ion transport and metabolism</p>
							</c>
							<c ca="center">
								<p>47</p>
							</c>
							<c ca="center">
								<p>3.6</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Intracellular trafficking</p>
							</c>
							<c ca="center">
								<p>13</p>
							</c>
							<c ca="center">
								<p>1.0</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Lipid transport and metabolism</p>
							</c>
							<c ca="center">
								<p>25</p>
							</c>
							<c ca="center">
								<p>1.9</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Nucleotide transport and metabolism</p>
							</c>
							<c ca="center">
								<p>39</p>
							</c>
							<c ca="center">
								<p>3.0</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Posttranslational modification, protein turnover</p>
							</c>
							<c ca="center">
								<p>59</p>
							</c>
							<c ca="center">
								<p>4.5</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Replication, recombination and repair</p>
							</c>
							<c ca="center">
								<p>51</p>
							</c>
							<c ca="center">
								<p>3.9</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Secondary metabolite biosynthesis, transport and catabolism</p>
							</c>
							<c ca="center">
								<p>6</p>
							</c>
							<c ca="center">
								<p>0.5</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Signal transduction mechanisms</p>
							</c>
							<c ca="center">
								<p>11</p>
							</c>
							<c ca="center">
								<p>0.8</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Transcription</p>
							</c>
							<c ca="center">
								<p>26</p>
							</c>
							<c ca="center">
								<p>2.0</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Translation</p>
							</c>
							<c ca="center">
								<p>127</p>
							</c>
							<c ca="center">
								<p>9.7</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Miscellaneous</p>
							</c>
							<c ca="center">
								<p>82</p>
							</c>
							<c ca="center">
								<p>6.3</p>
							</c>
						</r>
					</tblbdy>
				</tbl>
			</sec>
			<sec>
				<st>
					<p>Synonymous and nonsynonymous substitutions</p>
				</st>
				<p>The ratio of the rate of nonsynonymous substitutions (<it>d</it><sub>N</sub>) to the rate of synonymous substitutions (<it>d</it><sub>S</sub>) is commonly used to measure the relative rate of purifying selection acting at the protein level. We determined <it>d</it><sub>S </sub>and <it>d</it><sub>N </sub>for each gene pair of every group of orthologs and their values were averaged for each genome. Surprisingly, we observed saturation at synonymous sites for all genome pairs (<it>d</it><sub>S </sub>&gt; 2) and the calculation of the <it>d</it><sub>N</sub>/<it>d</it><sub>S </sub>ratio was thus impossible. Still, the average <it>d</it><sub>N </sub>was higher between MED4 and SS120 (0.36) than between SS120 and MIT9313 (0.32). The lowest <it>d</it><sub>N </sub>was observed between MIT9313 and WH8102 (0.24), a finding which is consistent with the relative acceleration of amino-acid substitutions in MED4 and in SS120.</p>
			</sec>
			<sec>
				<st>
					<p>DNA-repair systems</p>
				</st>
				<p>A shift in base composition may reflect the loss of DNA-repair genes and we therefore determined the presence or absence of genes involved in these mechanisms. As the mutational pressure is toward a high A+T content in both MED4 and SS120, we looked more closely at those genes whose absence could increase the frequency of G:C to A:T mutations. Among the genes putatively encoding DNA-repair enzymes identified in MIT9313 and WH8102, a few are missing in SS120 and/or MED4 (Table <tblr tid="T3">3</tblr>). Both MED4 and SS120 lack the <it>ada </it>gene, which encodes 6-O-methylguanine-DNA methyltransferase, an enzyme that repairs alkylated forms of guanine and thymine in DNA. Such alkylations generate lesions that can lead to G:C to A:T transitions <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. Interestingly, the MED4 genome is the only one among the four picocyanobacteria not to encode the A/G-specific DNA glycosylase MutY, as previously noted by Rocap and co-workers <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. This enzyme acts with MutT (NTP pyrophosphohydrolase) and MutM (formamido-pyrimidine-DNA glycosylase) in the GO system to avoid misincorporation of oxidized guanine (8-oxoG) in DNA and to repair A:8-oxoG mismatches <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. In <it>Escherichia coli</it>, knocking out both <it>mutM </it>and <it>mutY </it>translates into a 1,000-fold increase of G:C to T:A transversions in comparison to the wild-type strain <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>. In addition to MutT and MutY, MIT9313 and WH8102 encode a third enzyme of the NUDIX hydrolase family that is missing in MED4 and SS120. This hydrolase could act to prevent mutations. However, because of the broad substrate specificity of this family, one cannot know with certainty the function of this protein. Likewise, two genes coding for enzymes of the RecF pathway have been lost either by both MED4 and SS120 (DNA helicase RecQ) or only by MED4 (exonuclease RecJ).</p>
				<tbl id="T3">
					<title>
						<p>Table 3</p>
					</title>
					<caption>
						<p>DNA-repair genes missing only in <it>P. marinus </it>MED4 or in both MED4 and SS120</p>
					</caption>
					<tblbdy cols="7">
						<r>
							<c ca="left">
								<p>Gene</p>
							</c>
							<c ca="left">
								<p>COG</p>
							</c>
							<c ca="left">
								<p>Product</p>
							</c>
							<c ca="center">
								<p>MED4</p>
							</c>
							<c ca="center">
								<p>SS120</p>
							</c>
							<c ca="center">
								<p>MIT9313</p>
							</c>
							<c ca="center">
								<p>WH8102</p>
							</c>
						</r>
						<r>
							<c cspan="7">
								<hr/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>
									<b>
										<it>ada/ogt</it>
									</b>
								</p>
							</c>
							<c ca="left">
								<p>0350</p>
							</c>
							<c ca="left">
								<p>6-O-methylguanine-DNA methyltransferase</p>
							</c>
							<c ca="center">
								<p>-</p>
							</c>
							<c ca="center">
								<p>-</p>
							</c>
							<c ca="center">
								<p>PMT0269</p>
							</c>
							<c ca="center">
								<p>SYNW1680</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>
									<b>
										<it>mutY</it>
									</b>
								</p>
							</c>
							<c ca="left">
								<p>1194</p>
							</c>
							<c ca="left">
								<p>A/G-specific DNA glycosylase</p>
							</c>
							<c ca="center">
								<p>-</p>
							</c>
							<c ca="center">
								<p>Pro1789</p>
							</c>
							<c ca="center">
								<p>PMT0135</p>
							</c>
							<c ca="center">
								<p>SYNW0115</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>
									<it>recQ</it>
								</p>
							</c>
							<c ca="left">
								<p>0514</p>
							</c>
							<c ca="left">
								<p>Superfamily II DNA helicase</p>
							</c>
							<c ca="center">
								<p>-</p>
							</c>
							<c ca="center">
								<p>-</p>
							</c>
							<c ca="center">
								<p>PMT0189</p>
							</c>
							<c ca="center">
								<p>SYNW1958</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>
									<it>recJ</it>
								</p>
							</c>
							<c ca="left">
								<p>0608</p>
							</c>
							<c ca="left">
								<p>Single-stranded DNA-specific exonuclease</p>
							</c>
							<c ca="center">
								<p>-</p>
							</c>
							<c ca="center">
								<p>Pro0984</p>
							</c>
							<c ca="center">
								<p>PMT0761</p>
							</c>
							<c ca="center">
								<p>SYNW1206</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>
									<it>exoI/xseA</it>
								</p>
							</c>
							<c ca="left">
								<p>1570</p>
							</c>
							<c ca="left">
								<p>Exonuclease VII large subunit</p>
							</c>
							<c ca="center">
								<p>-</p>
							</c>
							<c ca="center">
								<p>Pro0111</p>
							</c>
							<c ca="center">
								<p>PMT1641</p>
							</c>
							<c ca="center">
								<p>SYNW2181</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>
									<it>xseB</it>
								</p>
							</c>
							<c ca="left">
								<p>1722</p>
							</c>
							<c ca="left">
								<p>Exonuclease VII small subunit</p>
							</c>
							<c ca="center">
								<p>-</p>
							</c>
							<c ca="center">
								<p>Pro0112</p>
							</c>
							<c ca="center">
								<p>PMT1642</p>
							</c>
							<c ca="center">
								<p>SYNW2182</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>-</p>
							</c>
							<c ca="left">
								<p>0494</p>
							</c>
							<c ca="left">
								<p>NUDIX hydrolase family</p>
							</c>
							<c ca="center">
								<p>-</p>
							</c>
							<c ca="center">
								<p>-</p>
							</c>
							<c ca="center">
								<p>PMT1026</p>
							</c>
							<c ca="center">
								<p>SYNW1334</p>
							</c>
						</r>
					</tblbdy>
					<tblfn>
						<p>Genes in bold are involved in repair of G:C to A:T mutations.</p>
					</tblfn>
				</tbl>
			</sec>
		</sec>
		<sec>
			<st>
				<p>Discussion</p>
			</st>
			<p>The process of genome reduction that has occurred within the <it>Prochlorococcus </it>radiation has, to our knowledge, never been observed in any other free-living prokaryote. Since <it>Prochlorococcus </it>sp. MIT9313 has a genome size very similar to that of <it>Synechococcus </it>sp. WH8102 (2.4 megabase pairs (Mbp)) and of several other marine <it>Synechococcus </it>spp. (M. Ostrowski and D. Scanlan, personal communication), it is reasonable to assume that the common ancestor of all <it>Prochlorococcus </it>species also had a genome size around 2.4 Mbp. Under this hypothesis, the genome reduction that has occurred in MED4 would amount to around 31% of the ancestral genome. By comparison, the extent of genome reduction in the insect endosymbiont <it>Buchnera</it>, as compared to a reconstructed ancestral genome, is around 77% <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. The genome of <it>P. marinus </it>SS120 - and <it>a fortiori </it>the MED4 genome - is considered to be near minimal for a free-living oxyphototrophic organism <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>. It would seem that genome reduction in these organisms probably cannot proceed below a certain limit, corresponding to a gene pool containing all the essential genes of biosynthetic pathways and housekeeping functions (probably including most of the 1,306 four-way orthologous genes identified in this study) plus a number of other genes, including genus-specific as well as niche-specific genes. For instance, MED4 encodes a number of photolyase-related proteins and a few specific ABC transporters (for cyanate, for example; <abbrgrp><abbr bid="B14">14</abbr></abbrgrp> and data not shown). These specific proteins might be critical for survival in the upper water layer, which receives high photon fluxes and UV light and is nutrient-depleted, but less so for life deeper in the water column.</p>
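			<p>The percentage quoted above for MED4 can be checked with a short calculation. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the ancestral size of 2.4 Mbp is the hypothesis made in the text, and the MED4 genome size of about 1.66 Mbp is an assumed input taken from the published genome sequence rather than from this section.</p>

```python
def genome_reduction_percent(ancestral_mbp: float, current_mbp: float) -> float:
    """Return the percentage of the ancestral genome that has been lost."""
    if ancestral_mbp == 0:
        raise ValueError("ancestral genome size must be non-zero")
    return 100.0 * (ancestral_mbp - current_mbp) / ancestral_mbp


ANCESTRAL_MBP = 2.4  # hypothesis stated in the text
MED4_MBP = 1.66      # assumption: approximate size of the published MED4 genome

reduction = genome_reduction_percent(ANCESTRAL_MBP, MED4_MBP)
print(f"Estimated genome reduction in MED4: {reduction:.0f}%")
```

			<p>With these inputs the estimate rounds to 31%; a different assumed ancestral size would shift the figure accordingly.</p>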
			<p>If both <it>Prochlorococcus </it>lineages and host-dependent organisms have undergone genome reduction associated with accelerated substitution rates, these phenomena must have arisen from very different causes, as the resulting gene repertoires of the two types of organisms differ tremendously. Indeed, the genome evolution of endosymbionts and obligate pathogens is driven by two main processes which have mutually reinforcing effects on genome size and evolutionary rates. Being confined inside their host, these bacteria have tiny population sizes and are regularly bottlenecked at each host generation or at each new host infection. Consequently, they experience strong genetic drift <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>, which entails an increase in substitution rate. This acceleration results in the accumulation at random of slightly deleterious mutations in protein-coding genes <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B29">29</abbr></abbrgrp> as well as in rRNA genes <abbrgrp><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr></abbrgrp>. This genetic drift enhances the downsizing of the genome through inactivation and then elimination of potentially beneficial but dispensable genes. Among these, there have been a number of DNA-repair genes, the disappearance of which could have further increased the mutation rate <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B31">31</abbr><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr></abbrgrp>. Furthermore, a number of genes may be subject to a relaxation of purifying selection, which is therefore rendered less effective in maintaining gene function. This relaxation particularly affects genes which have become useless because they are redundant with their host genome, such as genes involved in the biosynthesis of amino acids, nucleotides, fatty acids and even ATP <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B32">32</abbr></abbrgrp>. 
Selection pressure is also reduced for genes involved in environmental sensing and regulatory systems, such as two-component systems, because of the highly buffered environment offered by the host <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>.</p>
			<p>In the free-living genus <it>Prochlorococcus</it>, the very large size of field populations <abbrgrp><abbr bid="B34">34</abbr></abbrgrp> means that these populations are subject to much lower genetic drift and their genomes are subject to much stronger purifying selection than are those of endosymbionts and pathogens <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>. Consequently, the observed accelerated rate of evolution probably results simply from an increase in the mutation rate, which in turn is likely due to the loss of DNA-repair genes, although it should be noted that in <it>P. marinus </it>SS120 only two such genes are missing (Table <tblr tid="T3">3</tblr>). We observed a similar acceleration of amino-acid substitutions for all functional categories (Figure <figr fid="F4">4</figr>). This finding is more consistent with a global increase in the mutation rate than with relaxed selection, the latter being unlikely to occur to the same extent at all loci. We also assume that most amino-acid substitutions that have occurred in <it>Prochlorococcus </it>proteins are neutral; that is, they have not altered protein function. Indeed, populations of the HL clade which, like MED4, have the most derived protein sequences of all <it>Prochlorococcus </it>species, appear to be the most abundant photosynthetic organisms in the upper layer of the temperate and inter-tropical oceans <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. Such ecological success would hardly be possible for organisms handicapped by a large number of slightly deleterious mutations, especially given that most genes are single copy, so that compensation of gene function is generally not possible. The effect of sustained strong purifying selection in counteracting deleterious substitutions is particularly obvious in the rRNA genes. 
In contrast to the protein-coding genes, relative rate tests did not show any significant differences in the rates of evolution of the 16S rRNA genes in the four marine picocyanobacterial genomes, and thus there is no evidence that either SS120 or MED4 could have accumulated mutations destabilizing the secondary structure of their 16S rRNA molecule. One noteworthy consequence of the acceleration in the rates of evolution of protein-coding genes in <it>Prochlorococcus </it>is that phylogenetic reconstructions based on protein sequences are biased. Indeed, this acceleration leads to much longer branches for SS120 and MED4 than for MIT9313. The resulting tree topology most often does not support that obtained with the 16S rRNA gene, for which the molecular clock hypothesis holds true according to our analyses. Thus, rRNA genes are likely to be among the few genes that will give reliable estimates of the phylogenetic distances between <it>Prochlorococcus </it>strains.</p>
			<p>If neither the relaxation of purifying selection nor an increase in genetic drift has been the main factor causing <it>Prochlorococcus </it>genome reduction, an alternative possibility is that this reduction is the result of a selective process favoring the adaptation of <it>Prochlorococcus </it>to its environment. The apparently better ecological success in oligotrophic areas of <it>Prochlorococcus </it>species compared to their close relative <it>Synechococcus </it><abbrgrp><abbr bid="B16">16</abbr><abbr bid="B34">34</abbr></abbrgrp> strongly suggests that the reduction of <it>Prochlorococcus </it>genome size could provide a competitive advantage to the former. Indeed, extensive comparisons of the gene complements of these two organisms show very few examples - at least among genes for which function is known - of the occurrence of specific genes in MED4 which could explain its better adaptation (data not shown). One noteworthy exception is the presence in <it>Prochlorococcus</it>, but not <it>Synechococcus</it>, of flavodoxin and ferritin, two proteins that possibly give <it>Prochlorococcus </it>a better resistance to iron stress. Apart from that, <it>Synechococcus </it>appears more like a generalist, in particular with regard to nitrogen or phosphorus uptake and assimilation <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>, and should <it>a priori </it>be more suited to sustain competition. Hence, we assume that the key to the success of <it>Prochlorococcus </it>resides less in the development of a specific complex or pathway to cope better with unfavorable conditions than in the simplification of its genome and cell organization, which can allow this organism to make substantial economies in energy and material for cell maintenance.</p>
			<p>The mere reduction in genome size <it>per se </it>is a potential source of substantial economies for the cell, as it reduces the amount of nitrogen and phosphorus, two particularly limiting elements in the upper part of the ocean, which are necessary, for instance, in DNA synthesis. Another advantage is that it allows a concomitant reduction in cell volume. It has been previously suggested (see, for example <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>) that, for a phytoplanktonic organism, a small cell volume confers two selective advantages by reducing self-shading (the package effect) and by increasing the cell surface-to-volume ratio, which can improve nutrient uptake. The first advantage would improve the fitness of the LL strains, whereas the second would offer an advantage to the HL strains living in nutrient-depleted surface waters. Finally, cell division is less costly for a small than for a large cell. On the basis of these observations, we assume that the major driving force for genome reduction within the <it>Prochlorococcus </it>radiation has been the selection for a more economical lifestyle. The bias toward an A+T-rich genome in MED4 and SS120 is also consistent with this hypothesis, as it can be seen as a way to economize on nitrogen. Indeed, an AT base-pair contains seven atoms of nitrogen, one less than a GC base-pair.</p>
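			<p>The nitrogen argument can be made concrete with a small calculation. The sketch below counts only the nitrogen atoms in the bases (adenine 5, thymine 2, guanine 5, cytosine 3, from their standard molecular formulas), which gives the seven and eight atoms per AT and GC pair mentioned above. The function names are hypothetical, and the genome sizes and G+C fractions in the example are illustrative assumptions, not values reported in this section.</p>

```python
# Nitrogen atoms in each DNA base (standard molecular formulas).
N_ATOMS = {"A": 5, "T": 2, "G": 5, "C": 3}

N_PER_AT_PAIR = N_ATOMS["A"] + N_ATOMS["T"]  # 7 nitrogen atoms
N_PER_GC_PAIR = N_ATOMS["G"] + N_ATOMS["C"]  # 8 nitrogen atoms


def base_nitrogen_atoms(genome_bp: float, gc_fraction: float) -> float:
    """Nitrogen atoms held in the bases of one double-stranded genome copy."""
    gc_pairs = genome_bp * gc_fraction
    at_pairs = genome_bp - gc_pairs
    return gc_pairs * N_PER_GC_PAIR + at_pairs * N_PER_AT_PAIR


def nitrogen_saving_percent(bp_a, gc_a, bp_b, gc_b) -> float:
    """Percentage of base nitrogen saved by genome B relative to genome A."""
    n_a = base_nitrogen_atoms(bp_a, gc_a)
    n_b = base_nitrogen_atoms(bp_b, gc_b)
    return 100.0 * (n_a - n_b) / n_a


# Illustrative inputs only (assumptions): a 2.4 Mbp genome at 50% G+C
# compared with a 1.66 Mbp genome at 31% G+C.
saving = nitrogen_saving_percent(2_400_000, 0.50, 1_660_000, 0.31)
print(f"Base nitrogen saved per genome copy: {saving:.1f}%")
```

			<p>Under these assumed inputs, almost all of the saving comes from the smaller genome size; the shift in base composition contributes a further, much smaller, share.</p>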
			<p>With this hypothesis in mind, we propose a possible scenario for the evolution of <it>Prochlorococcus </it>genomes. Using a rate of 16S rRNA divergence of 1% per 50 million years <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>, one can estimate that <it>Prochlorococcus </it>and <it>Synechococcus </it>differentiated as recently as 150 million years ago, as the molecular clock hypothesis holds for this gene in both genera. The ancestral <it>Prochlorococcus </it>cells must have developed in the LL niche, a niche probably left free by other picocyanobacteria. Given the considerable difference in genome size between the LL strains MIT9313 and SS120, it appears that genome reduction itself must have started in one (or possibly several) lineage(s) within the LL niche some time after <it>Prochlorococcus </it>differentiation from its common ancestor with marine <it>Synechococcus </it>species. Why the selection has affected only one (or some?) and not all <it>Prochlorococcus </it>lineages remains unclear. Examination of the gene repertoire of <it>P. marinus </it>SS120 <abbrgrp><abbr bid="B13">13</abbr></abbrgrp> suggests that this genome reduction must have involved the random loss of dispensable genes from many different pathways. At some point during evolution, some genes involved in DNA repair were affected; these would include the <it>ada </it>gene, which may be responsible for the shift in base composition, but also possibly several others, not necessarily involved in GC to AT mutation repair (see Table <tblr tid="T3">3</tblr>). Loss of these genes may have led to an increase in the mutation rate and therefore in the rate of evolution of protein-coding genes, accompanied by a more rapid genome shrinkage and a shift of base composition toward AT. It is worth noting that one likely consequence of this genome-wide compositional shift is the absence of adaptive codon bias in the genomes of <it>Prochlorococcus </it>species MED4 and SS120. 
AT-rich codons are preferentially used whatever the amino acid (Figure <figr fid="F3">3a</figr>). Thus, codon usage in these genomes appears to reflect more the local base-composition bias than the selection for a more efficient translation through the use of optimal codons. The same conclusion has been drawn for other small genomes with high A+T content <abbrgrp><abbr bid="B28">28</abbr><abbr bid="B38">38</abbr></abbrgrp>.</p>
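			<p>The dating step at the start of this scenario is a linear molecular-clock conversion. A minimal sketch follows, assuming the rate of 1% 16S rRNA divergence per 50 million years given above; the function names are hypothetical, and the 3% divergence in the example is simply the value implied by 150 million years at that rate, not a measurement reported in the text.</p>

```python
def divergence_time_my(percent_divergence: float,
                       percent_per_interval: float = 1.0,
                       interval_my: float = 50.0) -> float:
    """Convert 16S rRNA divergence (%) into time (million years), assuming a constant rate."""
    if percent_per_interval == 0:
        raise ValueError("the clock rate must be non-zero")
    return percent_divergence / percent_per_interval * interval_my


def expected_divergence_percent(time_my: float,
                                percent_per_interval: float = 1.0,
                                interval_my: float = 50.0) -> float:
    """Inverse conversion: divergence (%) expected after a given time (million years)."""
    return time_my / interval_my * percent_per_interval


# 150 million years at 1% per 50 million years implies 3% divergence.
implied = expected_divergence_percent(150.0)
print(f"Implied divergence: {implied:.1f}%")
print(f"Back-converted time: {divergence_time_my(implied):.0f} million years")
```

			<p>The conversion is only meaningful where the molecular clock hypothesis holds, which is why it is applied to the 16S rRNA gene rather than to the protein-coding genes.</p>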
			<p>Later during evolution (around 80 million years ago, according to the degree of 16S rRNA sequence divergence between MED4 and SS120), one LL population, which probably already had a significantly reduced cell and genome size, must have progressively adapted to the HL niche and eventually recolonized the upper layer. How this change in ecological niche was possible is still hard to define. Comparison of the gene sets that differ between the LL-adapted SS120 and the HL-adapted MED4 shows that very few genes might be sufficient to shift from one niche to the other, including a multiplication of <it>hli </it>genes <abbrgrp><abbr bid="B39">39</abbr></abbrgrp> and the differential retention of genes which were present in the common ancestor of <it>Prochlorococcus </it>and <it>Synechococcus </it>(such as the photolyases and cyanate transporters mentioned above) and were secondarily lost in the LL-adapted lineages.</p>
		</sec>
		<sec>
			<st>
				<p>Conclusions</p>
			</st>
			<p>Genome evolution in the free-living genus <it>Prochlorococcus </it>has similar features to that in host-dependent prokaryotes: genome reduction, bias toward a low G+C content, acceleration in the evolution rate of protein-coding genes, and loss of DNA-repair genes. In contrast to the latter organisms, however, in <it>Prochlorococcus </it>this evolution does not appear to be the result of genetic drift or relaxed selection being exerted on some gene categories. Indeed, purifying selection is very efficient in <it>Prochlorococcus</it>, as rRNA genes have evolved at a similar rate in all genomes. Despite the decrease in G+C content and an accelerated rate of evolution of protein-coding genes, purifying selection must also act on these genes and avoid potentially deleterious mutations. We hypothesize that a reduction in genome size (which allows a concomitant reduction in cell size and substantial economies in energy and nutrients) can constitute a selective advantage for life in the open ocean, both at depths where photon energy is low and in surface waters where nutrients are scarce.</p>
			<p>Genome shrinkage in <it>Prochlorococcus </it>has led to populations highly specialized to narrow ecological niches, at the expense of versatility and competitiveness in changing conditions. Indeed, not only is the distribution of the <it>Prochlorococcus </it>genus limited to low latitudes (between 40&#176;N and 40&#176;S; see <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>) but the different ecotypes are themselves more or less confined to a restricted part of the euphotic layer <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>; for example, they experience only limited changes in temperature and salinity. Paradoxically, because warm oligotrophic areas constitute a very large part of the world's oceans, the ecological niches (both LL and HL) occupied by <it>Prochlorococcus </it>species are huge, and thus this organism appears globally, despite its specialization, as one of the most successful oxyphototrophs on Earth.</p>
		</sec>
		<sec>
			<st>
				<p>Materials and methods</p>
			</st>
			<sec>
				<st>
					<p>Genome sequence data</p>
				</st>
				<p>The complete genome sequences and annotations of <it>Prochlorococcus marinus </it>MED4, <it>P. marinus </it>SS120, <it>Prochlorococcus </it>sp. MIT9313 and <it>Synechococcus </it>sp. WH8102 (accession numbers: NC_005072, NC_005042, NC_005071 and NC_005070, respectively) were downloaded from the Genome division of the NCBI Entrez system. A few additional genes which were modeled in at least one genome and were present in the other genomes but not modeled (because of their small size, for example) were included in our dataset (see Additional data file 2).</p>
			</sec>
			<sec>
				<st>
					<p>Alignment of whole genomes</p>
				</st>
				<p>Genome sequences translated in their six reading frames were aligned with the Promer program of the MUMmer 3.0 system <abbrgrp><abbr bid="B41">41</abbr></abbrgrp>.</p>
			</sec>
			<sec>
				<st>
					<p>Codon and amino-acid usage</p>
				</st>
				<p>Codon usage was computed for every open reading frame (ORF) of each genome with the EMBOSS program cusp. Amino-acid usage was derived from the results produced by cusp.</p>
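				<p>For illustration only, the kind of tabulation described above can be sketched in plain Python. This is not the EMBOSS cusp program and does not reproduce its output format; the function names are hypothetical, and the codon table is the standard genetic code, whose amino-acid assignments are assumed to apply to these genomes.</p>

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' marks stop codons).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_usage(orf: str) -> Counter:
    """Count in-frame codons of one ORF, skipping codons with ambiguous bases."""
    orf = orf.upper()
    codons = (orf[i:i + 3] for i in range(0, len(orf) - 2, 3))
    return Counter(c for c in codons if c in CODON_TABLE)


def amino_acid_usage(codon_counts: Counter) -> Counter:
    """Derive amino-acid counts from codon counts, ignoring stop codons."""
    usage = Counter()
    for codon, count in codon_counts.items():
        amino_acid = CODON_TABLE[codon]
        if amino_acid != "*":
            usage[amino_acid] += count
    return usage


# Toy ORF (hypothetical sequence): Met-Lys-Phe followed by a stop codon.
counts = codon_usage("ATGAAATTTTAA")
print(dict(counts))
print(dict(amino_acid_usage(counts)))
```

				<p>Summing such counters over all ORFs of a genome gives a genome-wide codon and amino-acid usage table.</p>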
			</sec>
			<sec>
				<st>
					<p>Identification of orthologous proteins</p>
				</st>
				<p>We used a sequence-similarity-based approach similar to the procedure used for the clusters of orthologous groups (COGs) <abbrgrp><abbr bid="B42">42</abbr></abbrgrp>. For each genome pair, all-against-all BLAST <abbrgrp><abbr bid="B43">43</abbr></abbrgrp> comparisons were performed using protein sequences and reciprocal genome-specific best hits were identified. We considered genes as being probable orthologs when they were included in groups of size four in which each gene was the best hit of the three others. From similarity searches against the COG database, orthologs were assigned to functional categories according to those defined for the COG system. Because of the lack of a particular category for photosynthesis genes, the latter were assigned to the 'energy production and conversion' COG category. Other genes which fell into more than one of the 19 COG categories were assigned to a supplementary category called 'miscellaneous'.</p>
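				<p>The grouping criterion described above (groups of four in which each gene is the best hit of the three others) can be sketched as follows. This is a minimal illustration, not the pipeline actually used: the function name and the two input dictionaries are hypothetical, and the best hits are assumed to have been extracted from the BLAST results beforehand.</p>

```python
from itertools import permutations


def four_way_orthologs(best_hit, genome_of, genomes):
    """Return gene groups (one gene per genome) whose members are all mutual best hits.

    best_hit[(gene, genome)] is the best hit of `gene` in `genome` (hypothetical input);
    genome_of[gene] is the genome that encodes `gene`.
    """
    reference, others = genomes[0], genomes[1:]
    groups = []
    reference_genes = sorted(g for g, name in genome_of.items() if name == reference)
    for gene in reference_genes:
        group = {reference: gene}
        for other in others:
            hit = best_hit.get((gene, other))
            if hit is None:
                break  # no hit in one genome: no four-way group is possible
            group[other] = hit
        else:
            # Keep the group only if every ordered pair of members is a best hit.
            if all(best_hit.get((group[a], b)) == group[b]
                   for a, b in permutations(genomes, 2)):
                groups.append(tuple(group[name] for name in genomes))
    return groups
```

				<p>Seeding the search from one genome is sufficient here, because any valid group must contain exactly one gene from that genome and all twelve ordered best-hit relations are then checked.</p>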
			</sec>
			<sec>
				<st>
					<p>Phylogenetic branch length estimations</p>
				</st>
				<p>Protein sequences from each of the groups of four orthologous genes were aligned using ClustalW <abbrgrp><abbr bid="B44">44</abbr></abbrgrp> with default parameters. After exclusion of all gap sites, individual alignments were concatenated into one super-alignment of 388,120 sites. Gamma distances <abbrgrp><abbr bid="B45">45</abbr></abbrgrp> with an alpha parameter of 1 were estimated between each pair of sequences of the super-alignment. Phylogenetic branch lengths were calculated from distances with the ordinary least-squares method <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>. Relative rate tests (two-cluster test and branch-length test) were applied in order to test the constancy of amino-acid substitution rates between the three <it>Prochlorococcus </it>genomes (hypothesis of the molecular clock). The same analysis was applied to orthologs of each functional category.</p>
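				<p>The first steps of this procedure (gap removal, concatenation and distance estimation) can be sketched as follows. This is a minimal illustration with hypothetical function names, not the software used for the analysis. It assumes the usual form of the gamma distance for amino-acid sequences, d = a[(1 - p)^(-1/a) - 1], where p is the proportion of differing sites and a is the alpha parameter; with a = 1 this reduces to d = p/(1 - p). The least-squares branch lengths and relative rate tests are not reproduced.</p>

```python
def strip_gap_columns(aligned):
    """Drop every alignment column in which at least one sequence has a gap."""
    kept = [column for column in zip(*aligned) if "-" not in column]
    if not kept:
        return ["" for _ in aligned]
    return ["".join(chars) for chars in zip(*kept)]


def concatenate(alignments):
    """Join gap-free alignments end to end (same sequence order in each alignment)."""
    return ["".join(parts) for parts in zip(*alignments)]


def p_distance(seq_a, seq_b):
    """Proportion of sites that differ between two aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    differences = sum(1 for x, y in zip(seq_a, seq_b) if x != y)
    return differences / len(seq_a)


def gamma_distance(p, alpha=1.0):
    """Gamma distance a[(1 - p)^(-1/a) - 1] for a proportion p of differing sites."""
    if p >= 1.0:
        raise ValueError("distance is undefined when all sites differ")
    return alpha * ((1.0 - p) ** (-1.0 / alpha) - 1.0)


# Toy alignments (hypothetical sequences) for two ortholog groups of three sequences.
group_1 = strip_gap_columns(["MK-L", "MKAL", "M-AL"])
group_2 = strip_gap_columns(["AVD", "AIE", "AVE"])
super_alignment = concatenate([group_1, group_2])
p = p_distance(super_alignment[0], super_alignment[1])
print(super_alignment, p, gamma_distance(p))
```

				<p>Because the correction grows without bound as p approaches 1, the gamma distance increasingly exceeds the raw proportion of differences for the most divergent sequence pairs.</p>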
			</sec>
			<sec>
				<st>
					<p>Estimate of synonymous and nonsynonymous substitution rates</p>
				</st>
				<p>Nucleotide sequences of each group of orthologs were aligned with Protal2dna according to alignments of their corresponding amino-acid sequences <abbrgrp><abbr bid="B46">46</abbr></abbrgrp>. Pairwise estimates of the synonymous (<it>d</it><sub>S</sub>) and non-synonymous (<it>d</it><sub>N</sub>) substitution rates were obtained from the Yn00 program of the PAML 3.13 package <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>.</p>
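				<p>The principle of aligning nucleotide sequences according to a protein alignment can be sketched as follows. This is an illustrative re-implementation of the general idea, not the Protal2dna code: the function name is hypothetical, and each coding sequence is assumed to contain exactly three nucleotides per residue, with the stop codon already removed. The subsequent estimation of synonymous and non-synonymous rates is not reproduced.</p>

```python
def thread_codons(aligned_protein: str, cds: str) -> str:
    """Build one row of a codon alignment from an aligned protein and its unaligned CDS."""
    residues = aligned_protein.replace("-", "")
    if len(cds) != 3 * len(residues):
        raise ValueError("CDS length must be three times the number of residues")
    row, position = [], 0
    for symbol in aligned_protein:
        if symbol == "-":
            row.append("---")  # a gap in the protein becomes a three-nucleotide gap
        else:
            row.append(cds[position:position + 3])
            position += 3
    return "".join(row)


# Toy example (hypothetical sequences): two orthologs, one with a single-residue gap.
print(thread_codons("MAK", "ATGGCTAAA"))
print(thread_codons("M-K", "ATGAAG"))
```

				<p>Keeping gaps in multiples of three preserves the reading frame, which codon-based estimators of synonymous and non-synonymous rates require.</p>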
			</sec>
		</sec>
		<sec>
			<st>
				<p>Additional data files</p>
			</st>
			<p>The following additional data are available with the online version of this article. Additional data file <supplr sid="s1">1</supplr> lists the orthologous genes classified by functional category. Orthologous genes were assigned to the functional categories of the COG system. Photosynthesis genes were assigned to the 'energy production and conversion' COG category. Genes falling into more than one of the 19 COG categories were assigned to a supplementary category called 'miscellaneous'. Additional data file <supplr sid="s2">2</supplr> is a fasta file of orthologous genes which were modeled in at least one genome and present but not modeled in the other genomes.</p>
			<suppl id="s1">
				<title>
					<p>Additional data file 1</p>
				</title>
				<caption>
					<p>The orthologous genes classified by functional category</p>
				</caption>
				<text>
					<p>The orthologous genes classified by functional category</p>
				</text>
				<file name="gb-2005-6-2-r14-s1.xls">
					<p>Click here for additional data file</p>
				</file>
			</suppl>
			<suppl id="s2">
				<title>
					<p>Additional data file 2</p>
				</title>
				<caption>
					<p>A fasta file of orthologous genes which were modeled in at least one genome and present but not modeled in the other genomes</p>
				</caption>
				<text>
					<p>A fasta file of orthologous genes which were modeled in at least one genome and present but not modeled in the other genomes</p>
				</text>
				<file name="gb-2005-6-2-r14-s2.txt">
					<p>Click here for additional data file</p>
				</file>
			</suppl>
		</sec>
	</bdy>
	<bm>
		<ack>
			<sec>
				<st>
					<p>Acknowledgements</p>
				</st>
				<p>We are very grateful to Martin Ostrowski and Dave Scanlan for their critical reading of the manuscript. This work was supported by the European Union Program MARGENES (QLRT-2001-01226), the EU FP6 Network of Excellence 'Marine Genomics Europe' and by the French programs Genomer (R&#233;gion Bretagne) and Ouest-Genopole. AD is supported by a doctoral fellowship from R&#233;gion Bretagne.</p>
			</sec>
		</ack>
		<refgrp>
			<bibl id="B1">
				<title>
					<p>Genomic flux: genome evolution by gene loss and acquisition.</p>
				</title>
				<aug>
					<au>
						<snm>Lawrence</snm>
						<fnm>JG</fnm>
					</au>
					<au>
						<snm>Roth</snm>
						<fnm>JR</fnm>
					</au>
				</aug>
				<source>Organization of the Prokaryotic Genome</source>
				<publisher>Washington, DC: American Society for Microbiology</publisher>
				<editor>Charlebois RL</editor>
				<pubdate>1999</pubdate>
				<fpage>263</fpage>
				<lpage>289</lpage>
			</bibl>
			<bibl id="B2">
				<title>
					<p>Reductive evolution of resident genomes.</p>
				</title>
				<aug>
					<au>
						<snm>Andersson</snm>
						<fnm>SG</fnm>
					</au>
					<au>
						<snm>Kurland</snm>
						<fnm>CG</fnm>
					</au>
				</aug>
				<source>Trends Microbiol</source>
				<pubdate>1998</pubdate>
				<volume>6</volume>
				<fpage>263</fpage>
				<lpage>268</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0966-842X(98)01312-2</pubid>
						<pubid idtype="pmpid" link="fulltext">9717214</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B3">
				<title>
					<p>Gene transfer from organelles to the nucleus: frequent and in big chunks.</p>
				</title>
				<aug>
					<au>
						<snm>Martin</snm>
						<fnm>W</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>2003</pubdate>
				<volume>100</volume>
				<fpage>8612</fpage>
				<lpage>8614</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">166356</pubid>
						<pubid idtype="pmpid" link="fulltext">12861078</pubid>
						<pubid idtype="doi">10.1073/pnas.1633606100</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B4">
				<title>
					<p>The minimal gene complement of <it>Mycoplasma genitalium</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Fraser</snm>
						<fnm>CM</fnm>
					</au>
					<au>
						<snm>Gocayne</snm>
						<fnm>JD</fnm>
					</au>
					<au>
						<snm>White</snm>
						<fnm>O</fnm>
					</au>
					<au>
						<snm>Adams</snm>
						<fnm>MD</fnm>
					</au>
					<au>
						<snm>Clayton</snm>
						<fnm>RA</fnm>
					</au>
					<au>
						<snm>Fleischmann</snm>
						<fnm>RD</fnm>
					</au>
					<au>
						<snm>Bult</snm>
						<fnm>CJ</fnm>
					</au>
					<au>
						<snm>Kerlavage</snm>
						<fnm>AR</fnm>
					</au>
					<au>
						<snm>Sutton</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Kelley</snm>
						<fnm>JM</fnm>
					</au>
					<etal/>
				</aug>
				<source>Science</source>
				<pubdate>1995</pubdate>
				<volume>270</volume>
				<fpage>397</fpage>
				<lpage>403</lpage>
				<xrefbib>
					<pubid idtype="pmpid">7569993</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B5">
				<title>
					<p>Reductive evolution suggested from the complete genome sequence of a plant-pathogenic phytoplasma.</p>
				</title>
				<aug>
					<au>
						<snm>Oshima</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Kakizawa</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Nishigawa</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Jung</snm>
						<fnm>HY</fnm>
					</au>
					<au>
						<snm>Wei</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Suzuki</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Arashida</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Nakata</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Miyata</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Ugaki</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Namba</snm>
						<fnm>S</fnm>
					</au>
				</aug>
				<source>Nat Genet</source>
				<pubdate>2004</pubdate>
				<volume>36</volume>
				<fpage>27</fpage>
				<lpage>29</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/ng1277</pubid>
						<pubid idtype="pmpid" link="fulltext">14661021</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B6">
				<title>
					<p>Genome sequence of the endocellular bacterial symbiont of aphids <it>Buchnera </it>sp. APS.</p>
				</title>
				<aug>
					<au>
						<snm>Shigenobu</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Watanabe</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Hattori</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Sakaki</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Ishikawa</snm>
						<fnm>H</fnm>
					</au>
				</aug>
				<source>Nature</source>
				<pubdate>2000</pubdate>
				<volume>407</volume>
				<fpage>81</fpage>
				<lpage>86</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/35024074</pubid>
						<pubid idtype="pmpid" link="fulltext">10993077</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B7">
				<title>
					<p>50 million years of genomic stasis in endosymbiotic bacteria.</p>
				</title>
				<aug>
					<au>
						<snm>Tamas</snm>
						<fnm>I</fnm>
					</au>
					<au>
						<snm>Klasson</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Canback</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Naslund</snm>
						<fnm>AK</fnm>
					</au>
					<au>
						<snm>Eriksson</snm>
						<fnm>AS</fnm>
					</au>
					<au>
						<snm>Wernegreen</snm>
						<fnm>JJ</fnm>
					</au>
					<au>
						<snm>Sandstrom</snm>
						<fnm>JP</fnm>
					</au>
					<au>
						<snm>Moran</snm>
						<fnm>NA</fnm>
					</au>
					<au>
						<snm>Andersson</snm>
						<fnm>SG</fnm>
					</au>
				</aug>
				<source>Science</source>
				<pubdate>2002</pubdate>
				<volume>296</volume>
				<fpage>2376</fpage>
				<lpage>2379</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.1071278</pubid>
						<pubid idtype="pmpid" link="fulltext">12089438</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B8">
				<title>
					<p>Reductive genome evolution in <it>Buchnera aphidicola</it>.</p>
				</title>
				<aug>
					<au>
						<snm>van Ham</snm>
						<fnm>RC</fnm>
					</au>
					<au>
						<snm>Kamerbeek</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Palacios</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Rausell</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Abascal</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Bastolla</snm>
						<fnm>U</fnm>
					</au>
					<au>
						<snm>Fernandez</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Jimenez</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Postigo</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Silva</snm>
						<fnm>FJ</fnm>
					</au>
					<etal/>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>2003</pubdate>
				<volume>100</volume>
				<fpage>581</fpage>
				<lpage>586</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">141039</pubid>
						<pubid idtype="pmpid" link="fulltext">12522265</pubid>
						<pubid idtype="doi">10.1073/pnas.0235981100</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B9">
				<title>
					<p>The genome of <it>Nanoarchaeum equitans</it>: insights into early archaeal evolution and derived parasitism.</p>
				</title>
				<aug>
					<au>
						<snm>Waters</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Hohn</snm>
						<fnm>MJ</fnm>
					</au>
					<au>
						<snm>Ahel</snm>
						<fnm>I</fnm>
					</au>
					<au>
						<snm>Graham</snm>
						<fnm>DE</fnm>
					</au>
					<au>
						<snm>Adams</snm>
						<fnm>MD</fnm>
					</au>
					<au>
						<snm>Barnstead</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Beeson</snm>
						<fnm>KY</fnm>
					</au>
					<au>
						<snm>Bibbs</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Bolanos</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Keller</snm>
						<fnm>M</fnm>
					</au>
					<etal/>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>2003</pubdate>
				<volume>100</volume>
				<fpage>12984</fpage>
				<lpage>12988</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">240731</pubid>
						<pubid idtype="pmpid" link="fulltext">14566062</pubid>
						<pubid idtype="doi">10.1073/pnas.1735403100</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B10">
				<title>
					<p>Origin of <it>Plasmodium falciparum </it>malaria is traced by mitochondrial DNA.</p>
				</title>
				<aug>
					<au>
						<snm>Conway</snm>
						<fnm>DJ</fnm>
					</au>
					<au>
						<snm>Fanello</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Lloyd</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Al-Joubori</snm>
						<fnm>BM</fnm>
					</au>
					<au>
						<snm>Baloch</snm>
						<fnm>AH</fnm>
					</au>
					<au>
						<snm>Somanath</snm>
						<fnm>SD</fnm>
					</au>
					<au>
						<snm>Roper</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Oduola</snm>
						<fnm>AM</fnm>
					</au>
					<au>
						<snm>Mulder</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Povoa</snm>
						<fnm>MM</fnm>
					</au>
					<etal/>
				</aug>
				<source>Mol Biochem Parasitol</source>
				<pubdate>2000</pubdate>
				<volume>111</volume>
				<fpage>163</fpage>
				<lpage>171</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0166-6851(00)00313-3</pubid>
						<pubid idtype="pmpid" link="fulltext">11087926</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B11">
				<title>
					<p>A plastid of probable green algal origin in Apicomplexan parasites.</p>
				</title>
				<aug>
					<au>
						<snm>Kohler</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Delwiche</snm>
						<fnm>CF</fnm>
					</au>
					<au>
						<snm>Denny</snm>
						<fnm>PW</fnm>
					</au>
					<au>
						<snm>Tilney</snm>
						<fnm>LG</fnm>
					</au>
					<au>
						<snm>Webster</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Wilson</snm>
						<fnm>RJ</fnm>
					</au>
					<au>
						<snm>Palmer</snm>
						<fnm>JD</fnm>
					</au>
					<au>
						<snm>Roos</snm>
						<fnm>DS</fnm>
					</au>
				</aug>
				<source>Science</source>
				<pubdate>1997</pubdate>
				<volume>275</volume>
				<fpage>1485</fpage>
				<lpage>1489</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.275.5305.1485</pubid>
						<pubid idtype="pmpid" link="fulltext">9045615</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B12">
				<title>
					<p>Microbial minimalism: genome reduction in bacterial pathogens.</p>
				</title>
				<aug>
					<au>
						<snm>Moran</snm>
						<fnm>NA</fnm>
					</au>
				</aug>
				<source>Cell</source>
				<pubdate>2002</pubdate>
				<volume>108</volume>
				<fpage>583</fpage>
				<lpage>586</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0092-8674(02)00665-7</pubid>
						<pubid idtype="pmpid" link="fulltext">11893328</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B13">
				<title>
					<p>Genome sequence of the cyanobacterium <it>Prochlorococcus marinus </it>SS120, a nearly minimal oxyphototrophic genome.</p>
				</title>
				<aug>
					<au>
						<snm>Dufresne</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Salanoubat</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Partensky</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Artiguenave</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Axmann</snm>
						<fnm>IM</fnm>
					</au>
					<au>
						<snm>Barbe</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Duprat</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Galperin</snm>
						<fnm>MY</fnm>
					</au>
					<au>
						<snm>Koonin</snm>
						<fnm>EV</fnm>
					</au>
					<au>
						<snm>Le Gall</snm>
						<fnm>F</fnm>
					</au>
					<etal/>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>2003</pubdate>
				<volume>100</volume>
				<fpage>10020</fpage>
				<lpage>10025</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">187748</pubid>
						<pubid idtype="pmpid" link="fulltext">12917486</pubid>
						<pubid idtype="doi">10.1073/pnas.1733211100</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B14">
				<title>
					<p>Genome divergence in two <it>Prochlorococcus </it>ecotypes reflects oceanic niche differentiation.</p>
				</title>
				<aug>
					<au>
						<snm>Rocap</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Larimer</snm>
						<fnm>FW</fnm>
					</au>
					<au>
						<snm>Lamerdin</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Malfatti</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Chain</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Ahlgren</snm>
						<fnm>NA</fnm>
					</au>
					<au>
						<snm>Arellano</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Coleman</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Hauser</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Hess</snm>
						<fnm>WR</fnm>
					</au>
					<etal/>
				</aug>
				<source>Nature</source>
				<pubdate>2003</pubdate>
				<volume>424</volume>
				<fpage>1042</fpage>
				<lpage>1047</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/nature01947</pubid>
						<pubid idtype="pmpid" link="fulltext">12917642</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B15">
				<title>
					<p>Genome analysis of marine photosynthetic microbes and their global role.</p>
				</title>
				<aug>
					<au>
						<snm>Hess</snm>
						<fnm>WR</fnm>
					</au>
				</aug>
				<source>Curr Opin Biotechnol</source>
				<pubdate>2004</pubdate>
				<volume>15</volume>
				<fpage>191</fpage>
				<lpage>198</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/j.copbio.2004.03.007</pubid>
						<pubid idtype="pmpid" link="fulltext">15193326</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B16">
				<title>
					<p>Differential distribution and ecology of <it>Prochlorococcus </it>and <it>Synechococcus </it>in oceanic waters: a review.</p>
				</title>
				<aug>
					<au>
						<snm>Partensky</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Blanchot</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Vaulot</snm>
						<fnm>D</fnm>
					</au>
				</aug>
				<source>Marine Cyanobacteria</source>
				<publisher>Monaco: Mus&#233;e Oc&#233;anographique</publisher>
				<editor>Charpy L, Larkum AWD</editor>
				<pubdate>1999</pubdate>
				<fpage>457</fpage>
				<lpage>475</lpage>
			</bibl>
			<bibl id="B17">
				<title>
					<p>Estimates of cyanobacterial biomass and its distribution.</p>
				</title>
				<aug>
					<au>
						<snm>Garcia-Pichel</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Belnap</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Neuer</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Schanz</snm>
						<fnm>F</fnm>
					</au>
				</aug>
				<source>Archiv Hydrobiol</source>
				<pubdate>2003</pubdate>
				<volume>109</volume>
				<issue>Suppl 148</issue>
				<fpage>213</fpage>
				<lpage>228</lpage>
			</bibl>
			<bibl id="B18">
				<title>
					<p>Physiology and molecular phylogeny of coexisting <it>Prochlorococcus </it>ecotypes.</p>
				</title>
				<aug>
					<au>
						<snm>Moore</snm>
						<fnm>LR</fnm>
					</au>
					<au>
						<snm>Rocap</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Chisholm</snm>
						<fnm>SW</fnm>
					</au>
				</aug>
				<source>Nature</source>
				<pubdate>1998</pubdate>
				<volume>393</volume>
				<fpage>464</fpage>
				<lpage>467</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/30965</pubid>
						<pubid idtype="pmpid" link="fulltext">9624000</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B19">
				<title>
					<p>Photophysiology of the marine cyanobacterium <it>Prochlorococcus</it>: ecotypic differences among cultured isolates.</p>
				</title>
				<aug>
					<au>
						<snm>Moore</snm>
						<fnm>LR</fnm>
					</au>
					<au>
						<snm>Chisholm</snm>
						<fnm>SW</fnm>
					</au>
				</aug>
				<source>Limnol Oceanogr</source>
				<pubdate>1999</pubdate>
				<volume>44</volume>
				<fpage>628</fpage>
				<lpage>638</lpage>
			</bibl>
			<bibl id="B20">
				<title>
					<p>Resolution of <it>Prochlorococcus </it>and <it>Synechococcus </it>ecotypes by using 16S-23S ribosomal DNA internal transcribed spacer sequences.</p>
				</title>
				<aug>
					<au>
						<snm>Rocap</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Distel</snm>
						<fnm>DL</fnm>
					</au>
					<au>
						<snm>Waterbury</snm>
						<fnm>JB</fnm>
					</au>
					<au>
						<snm>Chisholm</snm>
						<fnm>SW</fnm>
					</au>
				</aug>
				<source>Appl Environ Microbiol</source>
				<pubdate>2002</pubdate>
				<volume>68</volume>
				<fpage>1180</fpage>
				<lpage>1191</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">123739</pubid>
						<pubid idtype="pmpid" link="fulltext">11872466</pubid>
						<pubid idtype="doi">10.1128/AEM.68.3.1180-1191.2002</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B21">
				<title>
					<p>Clade-specific 16S ribosomal DNA oligonucleotides reveal the predominance of a single marine <it>Synechococcus </it>clade throughout a stratified water column in the Red Sea.</p>
				</title>
				<aug>
					<au>
						<snm>Fuller</snm>
						<fnm>NJ</fnm>
					</au>
					<au>
						<snm>Marie</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Partensky</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Vaulot</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Post</snm>
						<fnm>AF</fnm>
					</au>
					<au>
						<snm>Scanlan</snm>
						<fnm>DJ</fnm>
					</au>
				</aug>
				<source>Appl Environ Microbiol</source>
				<pubdate>2003</pubdate>
				<volume>69</volume>
				<fpage>2430</fpage>
				<lpage>2443</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">154553</pubid>
						<pubid idtype="pmpid" link="fulltext">12732508</pubid>
						<pubid idtype="doi">10.1128/AEM.69.5.2430-2443.2003</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B22">
				<title>
					<p>The genome of a motile marine <it>Synechococcus</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Palenik</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Brahamsha</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Larimer</snm>
						<fnm>FW</fnm>
					</au>
					<au>
						<snm>Land</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Hauser</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Chain</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Lamerdin</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Regala</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Allen</snm>
						<fnm>EE</fnm>
					</au>
					<au>
						<snm>McCarren</snm>
						<fnm>J</fnm>
					</au>
					<etal/>
				</aug>
				<source>Nature</source>
				<pubdate>2003</pubdate>
				<volume>424</volume>
				<fpage>1037</fpage>
				<lpage>1042</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/nature01943</pubid>
						<pubid idtype="pmpid" link="fulltext">12917641</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B23">
				<title>
					<p>Tracing the evolution of gene loss in obligate bacterial symbionts.</p>
				</title>
				<aug>
					<au>
						<snm>Moran</snm>
						<fnm>NA</fnm>
					</au>
				</aug>
				<source>Curr Opin Microbiol</source>
				<pubdate>2003</pubdate>
				<volume>6</volume>
				<fpage>512</fpage>
				<lpage>518</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/j.mib.2003.08.001</pubid>
						<pubid idtype="pmpid" link="fulltext">14572545</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B24">
				<title>
					<p>DNA alkylation repair limits spontaneous base substitution mutations in <it>Escherichia coli</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Mackay</snm>
						<fnm>WJ</fnm>
					</au>
					<au>
						<snm>Han</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Samson</snm>
						<fnm>LD</fnm>
					</au>
				</aug>
				<source>J Bacteriol</source>
				<pubdate>1994</pubdate>
				<volume>176</volume>
				<fpage>3224</fpage>
				<lpage>3230</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">205492</pubid>
						<pubid idtype="pmpid">8195077</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B25">
				<title>
					<p>Evidence that MutY and MutM combine to prevent mutations by an oxidatively damaged form of guanine in DNA.</p>
				</title>
				<aug>
					<au>
						<snm>Michaels</snm>
						<fnm>ML</fnm>
					</au>
					<au>
						<snm>Cruz</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Grollman</snm>
						<fnm>AP</fnm>
					</au>
					<au>
						<snm>Miller</snm>
						<fnm>JH</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>1992</pubdate>
				<volume>89</volume>
				<fpage>7022</fpage>
				<lpage>7025</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">49637</pubid>
						<pubid idtype="pmpid" link="fulltext">1495996</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B26">
				<title>
					<p><it>Escherichia coli </it>mutator genes.</p>
				</title>
				<aug>
					<au>
						<snm>Horst</snm>
						<fnm>JP</fnm>
					</au>
					<au>
						<snm>Wu</snm>
						<fnm>TH</fnm>
					</au>
					<au>
						<snm>Marinus</snm>
						<fnm>MG</fnm>
					</au>
				</aug>
				<source>Trends Microbiol</source>
				<pubdate>1999</pubdate>
				<volume>7</volume>
				<fpage>29</fpage>
				<lpage>36</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0966-842X(98)01424-3</pubid>
						<pubid idtype="pmpid" link="fulltext">10068995</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B27">
				<title>
					<p>The process of genome shrinkage in the obligate symbiont <it>Buchnera aphidicola</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Moran</snm>
						<fnm>NA</fnm>
					</au>
					<au>
						<snm>Mira</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>Genome Biol</source>
				<pubdate>2001</pubdate>
				<volume>2</volume>
				<fpage>research0054.1</fpage>
				<lpage>0054.12</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">64839</pubid>
						<pubid idtype="pmpid" link="fulltext">11790257</pubid>
						<pubid idtype="doi">10.1186/gb-2001-2-12-research0054</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B28">
				<title>
					<p>Evidence for genetic drift in endosymbionts (<it>Buchnera</it>): analyses of protein-coding genes.</p>
				</title>
				<aug>
					<au>
						<snm>Wernegreen</snm>
						<fnm>JJ</fnm>
					</au>
					<au>
						<snm>Moran</snm>
						<fnm>NA</fnm>
					</au>
				</aug>
				<source>Mol Biol Evol</source>
				<pubdate>1999</pubdate>
				<volume>16</volume>
				<fpage>83</fpage>
				<lpage>97</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">10331254</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B29">
				<title>
					<p>Accelerated evolution and Muller's rachet in endosymbiotic bacteria.</p>
				</title>
				<aug>
					<au>
						<snm>Moran</snm>
						<fnm>NA</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>1996</pubdate>
				<volume>93</volume>
				<fpage>2873</fpage>
				<lpage>2878</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">39726</pubid>
						<pubid idtype="pmpid" link="fulltext">8610134</pubid>
						<pubid idtype="doi">10.1073/pnas.93.7.2873</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B30">
				<title>
					<p>Deleterious mutations destabilize ribosomal RNA in endosymbiotic bacteria.</p>
				</title>
				<aug>
					<au>
						<snm>Lambert</snm>
						<fnm>JD</fnm>
					</au>
					<au>
						<snm>Moran</snm>
						<fnm>NA</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>1998</pubdate>
				<volume>95</volume>
				<fpage>4458</fpage>
				<lpage>4462</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">22511</pubid>
						<pubid idtype="pmpid" link="fulltext">9539759</pubid>
						<pubid idtype="doi">10.1073/pnas.95.8.4458</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B31">
				<title>
					<p>Sequencing and analysis of bacterial genomes.</p>
				</title>
				<aug>
					<au>
						<snm>Koonin</snm>
						<fnm>EV</fnm>
					</au>
					<au>
						<snm>Mushegian</snm>
						<fnm>AR</fnm>
					</au>
					<au>
						<snm>Rudd</snm>
						<fnm>KE</fnm>
					</au>
				</aug>
				<source>Curr Biol</source>
				<pubdate>1996</pubdate>
				<volume>6</volume>
				<fpage>404</fpage>
				<lpage>416</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0960-9822(02)00508-0</pubid>
						<pubid idtype="pmpid">8723345</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B32">
				<title>
					<p>The complete sequence of the mucosal pathogen <it>Ureaplasma urealyticum</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Glass</snm>
						<fnm>JI</fnm>
					</au>
					<au>
						<snm>Lefkowitz</snm>
						<fnm>EJ</fnm>
					</au>
					<au>
						<snm>Glass</snm>
						<fnm>JS</fnm>
					</au>
					<au>
						<snm>Heiner</snm>
						<fnm>CR</fnm>
					</au>
					<au>
						<snm>Chen</snm>
						<fnm>EY</fnm>
					</au>
					<au>
						<snm>Cassell</snm>
						<fnm>GH</fnm>
					</au>
				</aug>
				<source>Nature</source>
				<pubdate>2000</pubdate>
				<volume>407</volume>
				<fpage>757</fpage>
				<lpage>762</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/35037619</pubid>
						<pubid idtype="pmpid" link="fulltext">11048724</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B33">
				<title>
					<p>Genome sequence of the endocellular obligate symbiont of tsetse flies, <it>Wigglesworthia glossinidia</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Akman</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Yamashita</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Watanabe</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Oshima</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Shiba</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Hattori</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Aksoy</snm>
						<fnm>S</fnm>
					</au>
				</aug>
				<source>Nat Genet</source>
				<pubdate>2002</pubdate>
				<volume>32</volume>
				<fpage>402</fpage>
				<lpage>407</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/ng986</pubid>
						<pubid idtype="pmpid" link="fulltext">12219091</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B34">
				<title>
					<p><it>Prochlorococcus</it>, a marine photosynthetic prokaryote of global significance.</p>
				</title>
				<aug>
					<au>
						<snm>Partensky</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Hess</snm>
						<fnm>WR</fnm>
					</au>
					<au>
						<snm>Vaulot</snm>
						<fnm>D</fnm>
					</au>
				</aug>
				<source>Microbiol Mol Biol Rev</source>
				<pubdate>1999</pubdate>
				<volume>63</volume>
				<fpage>106</fpage>
				<lpage>127</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">98958</pubid>
						<pubid idtype="pmpid" link="fulltext">10066832</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B35">
				<title>
					<p>The nearly neutral theory of molecular evolution.</p>
				</title>
				<aug>
					<au>
						<snm>Ohta</snm>
						<fnm>T</fnm>
					</au>
				</aug>
				<source>Annu Rev Ecol Syst</source>
				<pubdate>1992</pubdate>
				<volume>23</volume>
				<fpage>263</fpage>
				<lpage>286</lpage>
				<xrefbib>
					<pubid idtype="doi">10.1146/annurev.es.23.110192.001403</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B36">
				<title>
					<p>Phytoplankton size.</p>
				</title>
				<aug>
					<au>
						<snm>Chisholm</snm>
						<fnm>SW</fnm>
					</au>
				</aug>
				<source>Primary Productivity and Biogeochemical Cycles in the Sea</source>
				<publisher>New York: Plenum Press</publisher>
				<editor>Falkowski PG, Woodhead AD</editor>
				<pubdate>1992</pubdate>
				<fpage>213</fpage>
				<lpage>237</lpage>
			</bibl>
			<bibl id="B37">
				<title>
					<p>Evolution in bacteria: evidence for a universal substitution rate in cellular genomes.</p>
				</title>
				<aug>
					<au>
						<snm>Ochman</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Wilson</snm>
						<fnm>AC</fnm>
					</au>
				</aug>
				<source>J Mol Evol</source>
				<pubdate>1987</pubdate>
				<volume>26</volume>
				<fpage>74</fpage>
				<lpage>86</lpage>
				<xrefbib>
					<pubid idtype="pmpid">3125340</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B38">
				<title>
					<p>Codon usage and base composition in <it>Rickettsia prowazekii</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Andersson</snm>
						<fnm>SG</fnm>
					</au>
					<au>
						<snm>Sharp</snm>
						<fnm>PM</fnm>
					</au>
				</aug>
				<source>J Mol Evol</source>
				<pubdate>1996</pubdate>
				<volume>42</volume>
				<fpage>525</fpage>
				<lpage>536</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">8662004</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B39">
				<title>
					<p>Analysis of the <it>hli </it>gene family in marine and freshwater cyanobacteria.</p>
				</title>
				<aug>
					<au>
						<snm>Bhaya</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Dufresne</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Vaulot</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Grossman</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>FEMS Microbiol Lett</source>
				<pubdate>2002</pubdate>
				<volume>215</volume>
				<fpage>209</fpage>
				<lpage>219</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0378-1097(02)00934-5</pubid>
						<pubid idtype="pmpid" link="fulltext">12399037</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B40">
				<title>
					<p>Niche-partitioning of <it>Prochlorococcus </it>populations in a stratified water column in the eastern North Atlantic Ocean.</p>
				</title>
				<aug>
					<au>
						<snm>West</snm>
						<fnm>NJ</fnm>
					</au>
					<au>
						<snm>Scanlan</snm>
						<fnm>DJ</fnm>
					</au>
				</aug>
				<source>Appl Environ Microbiol</source>
				<pubdate>1999</pubdate>
				<volume>65</volume>
				<fpage>2585</fpage>
				<lpage>2591</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">91382</pubid>
						<pubid idtype="pmpid" link="fulltext">10347047</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B41">
				<title>
					<p>Versatile and open software for comparing large genomes.</p>
				</title>
				<aug>
					<au>
						<snm>Kurtz</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Phillippy</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Delcher</snm>
						<fnm>AL</fnm>
					</au>
					<au>
						<snm>Smoot</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Shumway</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Antonescu</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Salzberg</snm>
						<fnm>SL</fnm>
					</au>
				</aug>
				<source>Genome Biol</source>
				<pubdate>2004</pubdate>
				<volume>5</volume>
				<fpage>R12</fpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">395750</pubid>
						<pubid idtype="pmpid" link="fulltext">14759262</pubid>
						<pubid idtype="doi">10.1186/gb-2004-5-2-r12</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B42">
				<title>
					<p>A genomic perspective on protein families.</p>
				</title>
				<aug>
					<au>
						<snm>Tatusov</snm>
						<fnm>RL</fnm>
					</au>
					<au>
						<snm>Koonin</snm>
						<fnm>EV</fnm>
					</au>
					<au>
						<snm>Lipman</snm>
						<fnm>DJ</fnm>
					</au>
				</aug>
				<source>Science</source>
				<pubdate>1997</pubdate>
				<volume>278</volume>
				<fpage>631</fpage>
				<lpage>637</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.278.5338.631</pubid>
						<pubid idtype="pmpid" link="fulltext">9381173</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B43">
				<title>
					<p>Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.</p>
				</title>
				<aug>
					<au>
						<snm>Altschul</snm>
						<fnm>SF</fnm>
					</au>
					<au>
						<snm>Madden</snm>
						<fnm>TL</fnm>
					</au>
					<au>
						<snm>Schaffer</snm>
						<fnm>AA</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Miller</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Lipman</snm>
						<fnm>DJ</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>1997</pubdate>
				<volume>25</volume>
				<fpage>3389</fpage>
				<lpage>3402</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">146917</pubid>
						<pubid idtype="pmpid" link="fulltext">9254694</pubid>
						<pubid idtype="doi">10.1093/nar/25.17.3389</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B44">
				<title>
					<p>CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice.</p>
				</title>
				<aug>
					<au>
						<snm>Thompson</snm>
						<fnm>JD</fnm>
					</au>
					<au>
						<snm>Higgins</snm>
						<fnm>DG</fnm>
					</au>
					<au>
						<snm>Gibson</snm>
						<fnm>TJ</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>1994</pubdate>
				<volume>22</volume>
				<fpage>4673</fpage>
				<lpage>4680</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">308517</pubid>
						<pubid idtype="pmpid">7984417</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B45">
				<aug>
					<au>
						<snm>Nei</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Kumar</snm>
						<fnm>S</fnm>
					</au>
				</aug>
				<source>Molecular Evolution and Phylogenetics</source>
				<publisher>Oxford: Oxford University Press</publisher>
				<pubdate>2000</pubdate>
			</bibl>
			<bibl id="B46">
				<title>
					<p>protal2dna</p>
				</title>
				<url>http://bioweb.pasteur.fr/seqanal/interfaces/protal2dna.html</url>
			</bibl>
			<bibl id="B47">
				<title>
					<p>PAML: A program package for phylogenetic analysis by maximum likelihood.</p>
				</title>
				<aug>
					<au>
						<snm>Yang</snm>
						<fnm>Z</fnm>
					</au>
				</aug>
				<source>Comput Appl Biosci</source>
				<pubdate>1997</pubdate>
				<volume>13</volume>
				<fpage>555</fpage>
				<lpage>556</lpage>
				<xrefbib>
					<pubid idtype="pmpid">9367129</pubid>
				</xrefbib>
			</bibl>
		</refgrp>
	</bm>
</art>
