<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>gb-2010-11-4-r44</ui>
   <ji>GBJ</ji>
   <fm>
      <dochead>Method</dochead>
      <bibl>
         <title>
            <p>Restriction Site Tiling Analysis: accurate discovery and quantitative genotyping of genome-wide polymorphisms using nucleotide arrays</p>
         </title>
         <aug>
            <au ca="yes" id="A1"><snm>Pespeni</snm><mi>H</mi><fnm>Melissa</fnm><insr iid="I1"/><email>mpespeni@stanford.edu</email></au>
            <au id="A2"><snm>Oliver</snm><mi>A</mi><fnm>Thomas</fnm><insr iid="I1"/><email>toliver@stanford.edu</email></au>
            <au id="A3"><snm>Manier</snm><mi>K</mi><fnm>Mollie</fnm><insr iid="I1"/><insr iid="I2"/><email>maniermk@gmail.com</email></au>
            <au id="A4"><snm>Palumbi</snm><mi>R</mi><fnm>Stephen</fnm><insr iid="I1"/><email>spalumbi@stanford.edu</email></au>
         </aug>
         <insg>
            <ins id="I1"><p>Department of Biology, Stanford University, Hopkins Marine Station, Oceanview Blvd, Pacific Grove, CA 93950, USA</p></ins>
            <ins id="I2"><p>Current address: Department of Biology, 107 Life Sciences Complex, Syracuse University, Syracuse, NY 13244, USA</p></ins>
         </insg>
         <source>Genome Biology</source>
         <issn>1465-6906</issn>
         <pubdate>2010</pubdate>
         <volume>11</volume>
         <issue>4</issue>
         <fpage>R44</fpage>
         <url>http://genomebiology.com/2010/11/4/R44</url>
         <xrefbib><pubidlist><pubid idtype="doi">10.1186/gb-2010-11-4-r44</pubid><pubid idtype="pmpid">20403197</pubid></pubidlist></xrefbib>
      </bibl>
      <history><rec><date><day>2</day><month>2</month><year>2010</year></date></rec><revrec><date><day>7</day><month>4</month><year>2010</year></date></revrec><acc><date><day>19</day><month>4</month><year>2010</year></date></acc><pub><date><day>19</day><month>4</month><year>2010</year></date></pub></history>
      <cpyrt><year>2010</year><collab>Pespeni et al.; licensee BioMed Central Ltd.</collab><note>This is an open access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note></cpyrt>
	  <shorttitle>
<p>Restriction Site Tiling Analysis</p>
</shorttitle>
<shortabs>
<p>A method for the simultaneous identification of polymorphic loci and the quantitative genotyping of thousands of loci in individuals is presented.</p>
</shortabs>

      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <p>High-throughput genotype data can be used to identify genes important for local adaptation in wild populations, phenotypes in lab stocks, or disease-related traits in human medicine. Here we advance microarray-based genotyping for population genomics with Restriction Site Tiling Analysis. The approach simultaneously discovers polymorphisms and provides quantitative genotype data at tens of thousands of loci. It is highly accurate and free from ascertainment bias. We apply the approach to uncover genomic differentiation in the purple sea urchin.</p>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification id="endnote" subtype="user_supplied_xml" type="bmc"/>
      

<classification id="30010007" subtype="man_spc_id" type="BMC">Ecology</classification>
<classification id="30010008" subtype="man_spc_id" type="BMC">Evolution</classification>
<classification id="30010009" subtype="man_spc_id" type="BMC">Genetics</classification>
<classification id="30010010" subtype="man_spc_id" type="BMC">Genome studies</classification>
<classification id="30010013" subtype="man_spc_id" type="BMC">Methods</classification>
</classifications>
</meta>

   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Uncovering the genetic underpinnings of adaptive evolution is key to understanding the evolutionary processes that generate biodiversity <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. The combined use of genome scans and population genetic analyses has been applied in both model and non-model organisms to discover and document the role of specific genes in adaptive evolution <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr></abbrgrp>. Surveys of hundreds to thousands of genome-wide markers identified from SNP databases, microarray-based SNP survey methods, or sequences have been applied in humans, yeast, dogs, the malaria parasite <it>Plasmodium falciparum</it>, <it>Drosophila</it>, and <it>Arabidopsis </it><abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr></abbrgrp>. Based on massive sequencing efforts to identify polymorphisms, these approaches have led to insightful evaluation of genetic adaptation. However, these data sets can be complicated by ascertainment bias <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr></abbrgrp> and have historically required a large investment in SNP development.</p>
         <p>Approaches to non-model organisms have also resulted in powerful tools to characterize the imprint of selection across the genome at smaller numbers of loci. Tens to hundreds of anonymous genome-wide markers, such as amplified fragment length polymorphisms or microsatellites, have shown genetic patterns correlated to environmental conditions, indicating local adaptation in organisms, including periwinkle snails, lake whitefish, Atlantic salmon, common frogs, and beech trees <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr></abbrgrp>. These methods require little prior marker or sequence information. However, they are limited by the number of loci that can be examined (usually hundreds) and the focus on anonymous loci limits identification of functionally relevant genes <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>.</p>
         <p>Genome-wide scans of genetic diversity at tens of thousands of loci have become more accessible for non-model study systems with the development of microarray-based polymorphism detection approaches and as the synthesis of species-specific cDNA and high-density oligonucleotide arrays has become more affordable <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>. Specifically, array platforms have been used to detect single feature polymorphisms (SFPs) and restriction-site-associated DNA (RAD) markers by hybridization to species-specific arrays <abbrgrp><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr></abbrgrp>. In these methods, a polymorphism is detected as a binding signal difference between individuals or pooled population samples hybridized to arrays. In the SFP approach, labeled genomic DNA from different samples is separately hybridized to high-density arrays of species-specific 25-bp oligonucleotides. In the case of RAD, two individuals are labeled with different fluorescent dyes and co-hybridized to a single array to identify differences. Each approach has advantages: SFP markers are not restricted to restriction cut sites, and RAD markers can be identified using pre-existing cDNA arrays. However, these approaches generate binary data about the presence or absence of a polymorphism at a locus (rather than genotype data of an individual), and RAD requires pairwise competitive hybridization among samples to identify differences. 
In addition, these approaches have primarily been applied in inbred, genetically tractable study organisms: yeast, <it>Arabidopsis </it>strains, <it>Drosophila </it>isofemale lines, stickleback lines, zebrafish lines, and <it>Neurospora </it>mold <abbrgrp><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr><abbr bid="B31">31</abbr></abbrgrp>, with the exception of wild caught <it>Anopheles </it>mosquitoes <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>.</p>
         <p>Another potential approach for generating genome-wide polymorphism data in non-model organisms is the combination of next-generation sequencing with targeted SNP genotyping <abbrgrp><abbr bid="B33">33</abbr><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr></abbrgrp>. For example, for a species without a sequenced genome, the transcriptomes of multiple individuals could be labeled and pooled ('multiplexed') and sequenced in a single 454 sequencing run <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>. These sequence data can be used to identify common polymorphisms that can then be assayed across more study individuals using a SNP genotyping platform (for example, Illumina's GoldenGate or Infinium platforms or Affymetrix GeneChips). Though this is an attractive approach, there are two major disadvantages. First, only genes expressed in sampled individuals can be compared; genotypes at other genetic loci cannot be assayed. This highlights an important trade-off in 454 transcriptome sequencing between breadth of gene coverage across the genome and the depth of coverage necessary for polymorphism identification. Second, ascertainment bias would be introduced by surveying only common polymorphisms identified from a subset of individuals. Rare polymorphisms would not be detected in the sequence data or may be excluded as potential sequencing errors. The importance of rare polymorphisms was recently emphasized in two independent studies on human disease. Data from the complete genome sequences of 14 healthy and diseased individuals suggested that diseases, whether rare or common, were caused by rare mutations <abbrgrp><abbr bid="B37">37</abbr><abbr bid="B38">38</abbr></abbrgrp>. As a result, an approach that detects even rare substitutions is advantageous.</p>
         <p>For population genomics studies, there is a need for higher resolution genome-wide genotype data free from ascertainment bias and a less cumbersome ability to compare numerous individuals across multiple, wild populations. Though future resequencing technologies may allow genetic studies to map traits or search for adaptive genes by whole genome sequence comparisons <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B39">39</abbr></abbrgrp>, population level studies require comparing numerous individuals at the same loci. The sequencing coverage necessary to repeatedly sample many individuals across the same large set of loci drives resequencing strategies to be less cost-effective than array-based polymorphism discovery and genotyping assays.</p>
         <p>Here we present a generally applicable technique, Restriction Site Tiling Analysis (RSTA), which scans for restriction cut site polymorphisms across the genome of an individual using a microarray platform. The technique requires the sequence of a single genome, transcriptome, or large EST library from which to design a species-specific, high-density microarray. The approach allows simultaneous identification of polymorphic loci and the genotyping of individuals as homozygous for a cut site, homozygous for a mutation in a cut site, or heterozygous at thousands of loci. The approach is free from ascertainment bias and does not require competitive hybridization among individuals to identify polymorphisms. These qualities make it well suited for population genomics studies. Genotype data can be used to calculate F<sub>ST </sub>or heterozygosity, or look for patterns of linkage disequilibrium in two or more populations. We first validate the accuracy of the method in detecting polymorphic loci and genotyping individuals. Second, we explore its application for population genomics studies by comparing the genomes of 20 purple sea urchins from two geographically and environmentally distant populations.</p>
         <p>We developed this method using the purple sea urchin, <it>Strongylocentrotus purpuratus </it>(Stimpson, 1857), as a model system because we are ultimately interested in studying the balance between gene flow and adaptive evolution along environmental gradients. The purple sea urchin lives in intertidal and shallow subtidal habitats from the cold waters of Alaska to the warmer waters of Baja California, Mexico <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>. There is great potential for genetic mixing because larvae may travel far during a 4- to 12-week development phase <abbrgrp><abbr bid="B41">41</abbr><abbr bid="B42">42</abbr></abbrgrp>. In accordance with their high dispersal potential, previous studies have found little or no population structure along the coast of the United States <abbrgrp><abbr bid="B43">43</abbr><abbr bid="B44">44</abbr></abbrgrp>. In addition, the purple sea urchin is a highly fecund species <abbrgrp><abbr bid="B42">42</abbr></abbrgrp> and has dramatically large population sizes <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>. Theoretically, these characteristics maximize the effects of natural selection and minimize the effects of random genetic drift, making this species a good system in which to study adaptive evolution across the genome. Finally, the purple sea urchin has a published genome sequence <abbrgrp><abbr bid="B46">46</abbr></abbrgrp> and has been the subject of ecological studies for decades <abbrgrp><abbr bid="B47">47</abbr><abbr bid="B48">48</abbr></abbrgrp>. However, little is known about the adaptive potential of purple sea urchins despite their broad latitudinal distribution, ecological importance, and their role as a model species in developmental biology.</p>
         <p>The purple sea urchin genome is approximately 800 Mb in size, encoding approximately 28,000 genes. Gene number and gene structure are similar to those of the human genome, with about 8 exons and 7 introns per gene and each gene spanning on average 8 kb <abbrgrp><abbr bid="B46">46</abbr></abbrgrp>. Exon size is just over 100 nucleotides and intron size is about 750 nucleotides, shorter than introns in the human genome as expected with the smaller genome size. The species is highly polymorphic relative to other species with sequenced genomes. Using thermal DNA reassociation experiments, it was estimated that two individual urchins differ from each other in about 4% of the nucleotide pairs in single-copy DNA <abbrgrp><abbr bid="B49">49</abbr></abbrgrp>. Genome assembly revealed about one SNP per 100 bases and a comparable number of indel polymorphisms <abbrgrp><abbr bid="B46">46</abbr></abbrgrp> when aligning the sequenced DNA from the single inbred diploid individual sea urchin. Such high heterozygosity has impeded a more complete assembly of the genome. In the most recent build of the genome sequence (Spur_v2.1, September 2006), there were 114,222 scaffolds of which 16,057 had multiple contigs with an N50 of 183 kb. Scaffolds are not physically mapped to chromosomes.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>RSTA hybridization results</p>
            </st>
            <p>RSTA is based on differential binding of restriction digested and non-digested DNA from a single individual to a single array with 50-bp tiles designed to be centered on known restriction cut sites (Figure <figr fid="F1">1</figr>). Specifically, for each individual, genomic DNA is randomly sheared by sonication, restriction digested and internally labeled with fluorescent dCTP using random octamers (Cy3, green). Non-digested DNA from the same individual is labeled with a different color (Cy5, red). These genomic preparations from the same individual are then pooled and hybridized under conditions that favor binding of uncut DNA over cut DNA to the array tiles. DNA that matches the known genome sequence is cut by the restriction enzyme, resulting in poor binding to the array tiles, low Cy3 signal intensity, and a high Cy5 to Cy3 ratio. In contrast, DNA with a polymorphic mutation in the cut site remains intact, resulting in a high Cy3 signal intensity, and a more even Cy5 to Cy3 ratio (Figure <figr fid="F1">1</figr>).</p>
            <fig id="F1"><title><p>Figure 1</p></title><caption><p>Restriction site tiling analysis identifies polymorphisms and genotypes individuals by hybridization to a custom microarray</p></caption><text>
   <p><b>Restriction site tiling analysis identifies polymorphisms and genotypes individuals by hybridization to a custom microarray</b>. Fifty base pair tiles (white circles) are designed to be centered on restriction enzyme cut sites. DNA from an individual is extracted and randomly sheared by sonication. The sample is then divided in half: one part is treated with the restriction enzyme and labeled with green fluorescent dye (Cy3), the other part is treated as a control (without restriction enzyme) and labeled with red fluorescent dye (Cy5). The two parts are mixed and hybridized to the array. This DNA processing and hybridization result in different fluorescent signals reflecting the three possible genotypes for a polymorphic locus: when an individual is homozygous for the cut site (blue triangle) the digested DNA is cut and does not hybridize to the tile, resulting in a high red-to-green ratio (log2 Cy5/Cy3, left panel); however, if an individual is homozygous for a mutation in the cut site (yellow star) then the DNA remains intact and hybridizes to the tile, resulting in high green signal intensity or a low red-to-green ratio (right panel). Heterozygous individuals yield an intermediate red-to-green ratio. Polymorphic loci are identified based on the bi- or trimodal distribution of log ratios across sampled individuals. Individuals can be genotyped based on their log ratio.</p>
</text><graphic file="gb-2010-11-4-r44-1"/></fig>
            <p>We designed several types of tiles in order to confirm that genomic DNA from a diploid organism with a large, complex genome interacted with the array platform as predicted. There were five tile types on the array: restriction cut site centered tiles (n = 50,935), control tiles centered on non-cut sites in single copy genes (n = 10,523), negative control tiles that did not match anywhere in the genome based on BLASTN results (n = 1,036), positive control tiles that matched multi-copy ribosomal DNA (n = 100), and a degradation series to examine the effect of mutational differences between sample DNA and tile sequence on binding efficiency (n = 1,100). We surveyed Taq&#945;I restriction cut sites, though any restriction enzyme or combination of enzymes could be used as long as the 50-bp probes do not overlap. Taq&#945;I recognizes a four-base-pair site (TCGA), which is expected to occur on average once every 256 bases (4<sup>4</sup>, assuming equal base frequencies). The average intermarker distance was 15.7 kb between restriction cut site centered tiles across the 800 Mb genome.</p>
            <p>Both experimental and control tiles yielded expected signal intensities (a proxy for binding efficiency). Restriction digestion resulted in a significantly lower distribution of green (Cy3) signal intensities for restriction cut site centered tiles compared to the control red (Cy5) channel (Figure <figr fid="F2">2a</figr>; KS test, <it>P </it>&lt; 0.0001). Control non-cut site tiles showed strong Cy3 (digested DNA) signal intensities, indicating no effect of restriction digestion (KS test, <it>P </it>&lt; 0.0001). Negative control tiles had very low signal intensities, significantly lower than experimental tiles (Figure <figr fid="F2">2b</figr>; KS test, <it>P </it>&lt; 0.0001). Positive control tiles designed to match ribosomal DNA had much greater signal intensity than experimental tiles designed to single-copy loci (Figure <figr fid="F2">2b</figr>; KS test, <it>P </it>&lt; 0.0001). We assessed the repeatability of the RSTA approach by performing experimental and technical replicates (that is, independent extraction, processing and hybridization of DNA from a single individual to multiple arrays, and replicate tiles synthesized in triplicate on a single array). These experiments revealed that the signal intensities of corresponding tiles among replicate arrays were highly consistent (R<sup>2 </sup>= 0.92) and that there was low variance among replicate tiles on a single array (coefficient of variation = 0.08).</p>
            <fig id="F2"><title><p>Figure 2</p></title><caption><p>Frequency histograms of signal intensities for experimental and control tiles</p></caption><text>
   <p><b>Frequency histograms of signal intensities for experimental and control tiles</b>. <b>(a) </b>Digested DNA (green, labeled with Cy3) and non-digested DNA (red, Cy5) binding to restriction cut site centered tiles. <b>(b) </b>Cy5 signal intensities for negative control tiles (blue, randomly generated tiles that did not match anywhere in the genome according to BLASTN) and positive control tiles (magenta, matching multi-copy ribosomal DNA).</p>
</text><graphic file="gb-2010-11-4-r44-2"/></fig>
         </sec>
         <sec>
            <st>
               <p>Identification of polymorphic loci</p>
            </st>
            <p>We compared the genomes of 10 individual purple sea urchins from Boiler Bay, Oregon and 10 individuals from San Diego, California at 50,935 restriction cut sites using 20 RSTA arrays. We genotyped the ten northern sea urchins and the ten southern sea urchins at five known polymorphic restriction cut sites through PCR amplification and restriction digestion and sequencing. We then examined the RSTA array data from 50-bp tiles designed around each of these five loci. We found for each locus that RSTA data across the 20 individuals consisted of three clusters corresponding to the two homozygous and the heterozygous genotypes (Figure <figr fid="F3">3a</figr>). The homozygote clusters were separated by more than 0.7 log ratio units. We used these log ratio characteristics (three clusters and a range greater than 0.7) to identify polymorphic loci among the other 50,930 loci based on their RSTA array data. We used the Bayesian hierarchical clustering algorithm Mclust <abbrgrp><abbr bid="B50">50</abbr></abbrgrp> to determine the number of clusters that best described the log ratio data for the 20 individuals for each locus. These criteria identified 12,431 loci as polymorphic out of the 50,935 loci surveyed (24%). There were 6,859 polymorphisms in coding regions, 2,253 in putative regulatory regions, and 3,319 in intergenic regions. We confirmed individual genotypes for a subset of loci using PCR amplification and sequencing (see below) or restriction digestion gels (Figure <figr fid="F3">3b</figr>). We used the resulting genotype data to look for signals of population differentiation at specific loci (Figure <figr fid="F3">3c</figr>).</p>
            <fig id="F3"><title><p>Figure 3</p></title><caption><p>Polymorphic restriction cut site in pyruvate kinase muscle isozyme across 20 individuals</p></caption><text>
   <p><b>Polymorphic restriction cut site in pyruvate kinase muscle isozyme across 20 individuals</b>. <b>(a) </b>RSTA array log ratio data separate genotypes of individuals sampled. Cool colored circles represent individuals from Boiler Bay, Oregon; warm colored triangles represent individuals from San Diego, California. The data for each individual are in triplicate. <b>(b) </b>Individual genotypes confirmed by restriction digest gels. Lane 1 is an undigested PCR fragment for size reference, while lanes 2 to 10 are treated with the restriction enzyme; lanes 2, 3, 5, 6, 9, and 10 are from heterozygous individuals; lane 4 is from an individual homozygous for the cut site; lanes 7 and 8 are individuals homozygous for a mutation in the cut site. <b>(c) </b>Genotype data resulting from RSTA can be used to look for differences across populations.</p>
</text><graphic file="gb-2010-11-4-r44-3"/></fig>
         </sec>
         <sec>
            <st>
               <p>Accuracy of detecting polymorphic loci and genotyping</p>
            </st>
            <p>To determine the accuracy of the RSTA method and to determine the log ratio range for each genotype, we designed primers to amplify and sequence 15 loci, 7 putative polymorphic loci and 8 putative monomorphic loci, across the 20 individuals. We found 99.6% accuracy in genotypes called from RSTA array data (252 correct out of 253 genotypes surveyed). Of the 8 putative monomorphic loci, all were monomorphic; 139 out of 139 (100%) of the genotypes across the 20 individuals were homozygous for the Taq&#945;I cut site (TCGA). Out of the 114 polymorphic genotypes we confirmed with sequence data, 113 (99.1%) matched genotypes called from the RSTA array. From these confirmed genotypes, log ratio data for different genotypes reliably fell into three distinct clusters (less than -0.6 for homozygous uncut, between -0.6 and -0.1 for heterozygotes, and greater than -0.1 for homozygous cut). We used these cutoffs to call individual genotypes among all polymorphic loci from the population data set. These results show that our method of polymorphism identification and genotype calling was highly accurate under these conditions, distinguishing monomorphic and polymorphic loci and correctly calling genotypes of polymorphic loci.</p>
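            <p>The cutoff-based genotype call described above can be illustrated with a minimal sketch. The function name and the handling of values falling exactly on a cutoff are assumptions made for illustration; only the cutoffs themselves (-0.6 and -0.1 on the log ratio scale) come from the text.</p>

```python
# Hypothetical helper: cutoffs are those reported in the text; the treatment
# of values exactly equal to a cutoff (assigned to heterozygous) is an assumption.
UNCUT_CUTOFF = -0.6   # below this: homozygous for a mutation in the cut site
CUT_CUTOFF = -0.1     # above this: homozygous for the cut site


def call_genotype(log_ratio):
    """Call a genotype from a single log2(Cy5/Cy3) ratio for one locus."""
    if log_ratio > CUT_CUTOFF:
        return "homozygous cut"
    if log_ratio >= UNCUT_CUTOFF:
        return "heterozygous"
    return "homozygous uncut"


if __name__ == "__main__":
    for value in (0.2, -0.3, -0.9):
        print(value, call_genotype(value))
```

            <p>Under these assumptions, a log ratio of -0.3 would be called heterozygous and a log ratio of -0.9 would be called homozygous uncut.</p>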
            <p>We were also able to detect insertion-deletion polymorphisms (indels) in the RSTA array data. Indels affected the Cy5 (non-digested) signal such that alleles with a deletion had a low binding signal (signal intensity &lt;50), in the same range as background and negative control tiles. Alleles that matched the published genome sequence had a normal binding signal (signal intensity &gt;150, depending on tile sequence). To identify loci with indel polymorphisms, we used these signal intensity cutoffs and the presence of two or three clusters in the Cy5 signal intensity data. We found that 3% of loci in coding regions had indel polymorphisms. We sequence-confirmed one particularly interesting locus, a mannose receptor, and found that RSTA array data matched sequence data in all cases. The sequence data revealed a 3-bp deletion in seven of seven predicted deletions while five out of five sequences matched the tile sequence as predicted. Genes with indels could be top candidates for further study as they likely result in an amino acid sequence change, possibly affecting protein function.</p>
            <p>We found that approximately 24% of surveyed restriction cut sites contained a mutation among the 20 individuals surveyed, which equates to about one polymorphism per approximately 200 bp of the purple sea urchin genome. This is less than expected based on the genome assembly, which found at least one SNP every approximately 100 bp and an equal proportion of indels. Due to the high degree of genetic diversity in this species, it is likely that a large proportion of polymorphisms among the 20 individuals sampled went undetected. In highly polymorphic genomic regions, the sampled DNA will not bind to the microarray tile and polymorphisms cannot be detected in the surveyed cut site. This is supported by the observation that we had a significantly greater fraction of tiles with poor binding signal in non-coding regions (7.8%) where higher rates of polymorphism were expected than in coding regions (4.3%, chi-square = 5049.6, <it>P </it>&lt; 0.0001). To determine the effect on hybridization of mutational differences between sample DNA and microarray tiles designed from the published genome sequence, we designed tiles that were a perfect match to one place in the genome, then randomly mutated 1 to 10 bases, resulting in a series of 11 tiles per perfect match tile. We did this for 100 perfect match tiles, resulting in a degradation series data set of 1,100 tiles. We found that there was an 80% reduction in signal intensity with four mutational differences in the 50-bp tiles, resulting in near background signal intensity range. These data suggest that 8% sequence difference between a DNA sample and microarray tile results in near complete hybridization loss.</p>
         </sec>
         <sec>
            <st>
               <p>Population patterns of polymorphic loci</p>
            </st>
            <p>For the 12,431 polymorphic loci, we constructed a genotype matrix for the 20 individuals. We used this matrix to calculate heterozygosity and F<sub>ST</sub>. We found that San Diego individuals had a significantly higher mean heterozygosity (0.2427) than Oregon individuals (0.2258; KS test, <it>P </it>= 1.38 &#215; 10<sup>-7</sup>), supporting the hypothesis of higher gene flow (larval dispersal) from the north to the south along the US West coast <abbrgrp><abbr bid="B51">51</abbr></abbrgrp>. As expected, we found a higher frequency of the uncut homozygous genotype (different from the published genome sequence, where the individual sequenced was from southern California) in Oregon individuals (0.1035) than San Diego individuals (0.0869; KS test, <it>P </it>= 5.014 &#215; 10<sup>-11</sup>). We used the genotype matrix to calculate F<sub>ST </sub>for each locus as F<sub>ST </sub>= (H<sub>T </sub>- H<sub>S</sub>)/H<sub>T</sub>, using allele frequencies to estimate heterozygosity, where H<sub>T </sub>is the total heterozygosity across populations and H<sub>S </sub>is the mean of heterozygosity within populations <abbrgrp><abbr bid="B52">52</abbr></abbrgrp>. The genome-wide mean F<sub>ST </sub>was 0.0029 among populations, with single locus F<sub>ST </sub>values ranging from 0 to 0.5.</p>
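            <p>As a worked illustration of the formula above, the following sketch computes F<sub>ST </sub>for a single locus from two samples of diploid genotypes. It is not the code used in this study: the function names, the coding of genotypes as counts of the cut allele (0, 1 or 2), the use of 2p(1 - p) as the heterozygosity estimate, the pooling of both samples to obtain the total allele frequency, and the return value of zero for a locus without variation are all assumptions.</p>

```python
# Illustrative sketch only; names and conventions are assumptions (see lead-in).
def allele_frequency(genotypes):
    """Frequency of the cut allele; each genotype is a count of 0, 1 or 2."""
    return sum(genotypes) / (2 * len(genotypes))


def expected_heterozygosity(p):
    """Expected heterozygosity of a biallelic locus with allele frequency p."""
    return 2 * p * (1 - p)


def fst(pop_a, pop_b):
    """F_ST = (H_T - H_S) / H_T for one locus and two population samples."""
    p_a = allele_frequency(pop_a)
    p_b = allele_frequency(pop_b)
    # H_S: mean of the within-population heterozygosities
    h_s = (expected_heterozygosity(p_a) + expected_heterozygosity(p_b)) / 2
    # H_T: heterozygosity computed from the pooled allele frequency
    p_total = (sum(pop_a) + sum(pop_b)) / (2 * (len(pop_a) + len(pop_b)))
    h_t = expected_heterozygosity(p_total)
    if h_t == 0:
        return 0.0  # assumption: a locus with no variation is reported as 0
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    north = [2, 2, 1, 1]  # cut-allele frequency 0.75
    south = [0, 0, 1, 1]  # cut-allele frequency 0.25
    print(fst(north, south))
```

            <p>For the toy samples shown, H<sub>S </sub>is 0.375 and H<sub>T </sub>is 0.5, giving F<sub>ST </sub>= 0.25.</p>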
            <p>Genome-wide population patterns revealed that all loci were in Hardy-Weinberg equilibrium after multiple test correction. Among the top 100 highest F<sub>ST </sub>coding loci and the top 100 highest F<sub>ST </sub>loci overall, we found no linkage disequilibrium among any locus pairs after multiple test correction (using Genepop <abbrgrp><abbr bid="B53">53</abbr></abbrgrp>). We looked for patterns of linkage in 687 paired loci in coding regions and corresponding upstream regions of the same genes. We found a highly significant correlation between the F<sub>ST </sub>values of the paired loci (correlation coefficient = 0.3288, <it>P </it>&lt; 0.0001). These data suggest that similar forces are acting on genetic differentiation in coding and upstream regions, either because of linkage across the two tile sites (2 to 10 kb apart) or the joint action of selection.</p>
         </sec>
         <sec>
            <st>
               <p>Genetic differentiation along the species range</p>
            </st>
            <p>We applied Principal Components Analysis (PCA) to determine if there was a signal of population differentiation in the array data set. Analyzing the log ratio data of all polymorphic loci, we found that principal components two and three spatially separated Oregon and San Diego populations (Figure <figr fid="F4">4a</figr>). By removing loci in the tail of the F<sub>ST </sub>distribution (F<sub>ST </sub>&gt;0.1, defined by the mean F<sub>ST </sub>plus two times the standard deviation, approximately the top 4%), we found that the spatial split between populations was lost (Figure <figr fid="F4">4b</figr>). These results suggest that &gt;95% of the purple sea urchin genome has no signal of population differentiation, in accord with previously published descriptions of a few loci <abbrgrp><abbr bid="B43">43</abbr><abbr bid="B44">44</abbr></abbrgrp>. As expected, the high F<sub>ST </sub>loci (top 4%) show a strong separation of Oregon and San Diego individuals along PC2 (Figure <figr fid="F4">4c</figr>; see Additional file <supplr sid="S1">1</supplr> for a list of the top 100 loci and the corresponding gene annotations).</p>
            <suppl id="S1">
               <title>
                  <p>Additional file 1</p>
               </title>
               <text>
                  <p><b>Table of the top 100 highest F<sub>ST </sub>loci</b>. The table includes five columns of information: gene number (GLEAN3), gene annotation, RSTA array tile name, RSTA array tile oligonucleotide sequence, and F<sub>ST </sub>value.</p>
               </text>
               <file name="gb-2010-11-4-r44-S1.XLS">
   <p>Click here for file</p>
</file>
            </suppl>
            <fig id="F4"><title><p>Figure 4</p></title><caption><p>Principal Components Analysis using RSTA array log ratio data shows a signal of population differentiation in a high gene flow species</p></caption><text>
   <p><b>Principal Components Analysis using RSTA array log ratio data shows a signal of population differentiation in a high gene flow species</b>. Symbols represent individuals from Oregon (blue circles) and San Diego (red triangles). <b>(a) </b>All polymorphic coding loci, 6,859; <b>(b) </b>polymorphic coding loci excluding top F<sub>ST </sub>loci, 6,555; and <b>(c) </b>top F<sub>ST </sub>polymorphic coding loci, 304. Patterns were similar for other tiles in non-coding regions.</p>
</text><graphic file="gb-2010-11-4-r44-4"/></fig>
            <p>Overall F<sub>ST </sub>was low: 0.0029. To test the significance of this value, we randomly shuffled the alleles from all 20 individuals and recalculated F<sub>ST </sub>over 10,000 permutations for each polymorphic locus. We compared the observed genome-wide F<sub>ST </sub>distribution to the permuted distributions to determine if the observed F<sub>ST </sub>values were higher than would be predicted under panmixia. The observed distribution was significantly broader than 9,991 (99.91%) of the permuted distributions (KS test, <it>P </it>&lt; 0.0001; Figure <figr fid="F5">5</figr>). The observed mean was higher than the permuted mean (observed: 0.0029 &gt; permuted: 0.0026) in all 10,000 permutations. The mean and median of the observed distribution were higher than those of 100% of the permuted distributions. These results show that the observed data consistently had a higher F<sub>ST </sub>than expected under panmixia. Moreover, the observed distribution always had more loci with F<sub>ST </sub>&gt;0.2 than seen in the permuted distributions. The higher levels of F<sub>ST </sub>in the observed data set suggest that there is low but significant genetic differentiation between populations. Such differentiation could be due to low gene flow among populations, selection at some loci, or both.</p>
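            <p>The permutation logic described above can be illustrated for a single biallelic locus. The sketch below is not the authors' implementation: the F<sub>ST</sub> estimator (a simple heterozygosity-based form), the 0/1 allele coding, the function names, and the equal split of alleles between two populations are all assumptions made for illustration.</p>

```python
import random


def fst_biallelic(pop_a, pop_b):
    """Heterozygosity-based F_ST for one biallelic locus.

    Each population is a list of allele codes (0 or 1), two per diploid
    individual. Assumption: F_ST = (H_T - H_S) / H_T with H_S weighted by
    sample size; the study may have used a different estimator.
    """
    n_a, n_b = len(pop_a), len(pop_b)
    p_a = sum(pop_a) / n_a
    p_b = sum(pop_b) / n_b
    p_total = (sum(pop_a) + sum(pop_b)) / (n_a + n_b)
    h_total = 2 * p_total * (1 - p_total)
    if h_total == 0:
        return 0.0  # monomorphic locus: no differentiation to measure
    h_within = (n_a * 2 * p_a * (1 - p_a) + n_b * 2 * p_b * (1 - p_b)) / (n_a + n_b)
    return (h_total - h_within) / h_total


def permutation_p_value(pop_a, pop_b, n_perm=10000, seed=0):
    """Shuffle alleles across populations and compare permuted F_ST to observed."""
    rng = random.Random(seed)
    observed = fst_biallelic(pop_a, pop_b)
    pooled = list(pop_a) + list(pop_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        permuted = fst_biallelic(pooled[:len(pop_a)], pooled[len(pop_a):])
        if permuted >= observed:
            hits += 1
    # add-one correction keeps the estimate away from exactly zero
    return observed, (hits + 1) / (n_perm + 1)
```

            <p>Repeating this over every polymorphic locus yields one permuted genome-wide distribution per shuffle, which is what the observed distribution is compared against in Figure <figr fid="F5">5</figr>.</p>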
            <fig id="F5"><title><p>Figure 5</p></title><caption><p>Genome-wide distribution of F<sub>ST </sub>values</p></caption><text>
   <p><b>Genome-wide distribution of F<sub>ST </sub>values</b>. Open bars show the observed distribution for 12,431 polymorphisms. Solid bars show the mean of 10,000 random permutations. Error bars represent standard deviation for permuted distributions. Numbers in boxes show excess number of loci observed over mean permuted.</p>
</text><graphic file="gb-2010-11-4-r44-5"/></fig>
            <p>Detecting loci under selection depends on evaluating the distribution of F<sub>ST </sub>values among loci compared to that expected under neutrality <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. We searched for loci that showed significantly high F<sub>ST </sub>values using the procedure of Beaumont and Nichols as implemented in LOSITAN <abbrgrp><abbr bid="B54">54</abbr></abbrgrp>. Three significant loci were identified by this analysis (<it>P </it>&lt; 0.000002), along with a fourth that was marginally significant (<it>P </it>&lt; 0.00003). These conclusions are limited by the large number of tests, which requires a strong multiple test correction, but the distribution of <it>P</it>-values suggests that selection acts on more loci than just these three. Seven loci show <it>P</it>-values &lt; 0.0001 whereas fewer than one is expected. Likewise, the number of loci with <it>P</it>-values &lt; 0.001 or &lt; 0.01 is higher than expected (22 versus 7, and 93 versus 69, respectively).</p>
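            <p>The expected counts quoted above follow from multiplying the number of loci tested by the <it>P</it>-value threshold. The sketch below makes that arithmetic explicit. The number of loci tested is not stated in this passage; the value of 6,859 used here is an assumption (it is the number of polymorphic coding loci given in Figure <figr fid="F4">4</figr>) chosen because it reproduces expected counts of under one, about 7, and about 69.</p>

```python
def expected_false_positives(n_loci, alpha):
    """Expected number of loci with P below alpha if every locus were neutral."""
    return n_loci * alpha


N_LOCI = 6859  # assumption: number of loci tested, not stated in the text
OBSERVED = {0.0001: 7, 0.001: 22, 0.01: 93}  # observed counts reported in the text

for alpha, seen in OBSERVED.items():
    expected = expected_false_positives(N_LOCI, alpha)
    print(f"P below {alpha}: observed {seen}, expected about {expected:.1f}")
```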
            <p>A separate procedure, in which selection on loci is estimated from the data and the distribution of selection factors (&#945;) is tested against Bayesian expectation, was suggested by Beaumont and Balding <abbrgrp><abbr bid="B55">55</abbr></abbrgrp> and augmented by Foll and Gaggiotti <abbrgrp><abbr bid="B56">56</abbr></abbrgrp>. This test returns three strongly significant loci (Bayes factor &gt;10), two of which were detected in the previous analysis. The third significant locus is ranked fourth in the previous test. These loci show selection factors (&#945;) of 1.3 to 1.4. Simulations suggest that these values correspond to mild selection coefficients (s) of about 0.02 per generation <abbrgrp><abbr bid="B56">56</abbr></abbrgrp>. In summary, our data suggest that selection is acting detectably on a small number of loci and may also occur at other loci. In this high gene flow species, increased sampling at the individual and population levels using RSTA or other more targeted approaches would be needed to test robustly for selection across the genome.</p>
            <p>The top five genes in which loci were identified as outliers were mannose receptor C1, transcription factor 25, cubilin, a chromatin assembly factor (retinoblastoma binding protein 4 (RBBP4)), and a Golgi autoantigen. Mannose receptors bind to foreign cells and target them for destruction by the immune system <abbrgrp><abbr bid="B57">57</abbr></abbrgrp>. Polymorphisms in mannose-binding proteins in humans are associated with infection frequency <abbrgrp><abbr bid="B58">58</abbr></abbrgrp>, but no data exist yet on the role of sea urchin polymorphisms. Transcription factor 25 (TCF25) and the chromatin assembly factor (RBBP4) both negatively regulate transcription. Cubilin is a multi-ligand endocytic receptor important for the endocytosis of proteins, nutrients and vitamins, and is massively expressed in the yolk sac during development <abbrgrp><abbr bid="B59">59</abbr></abbrgrp>. The Golgi autoantigen (Golgin subfamily A member 3 (GOLGA3)) is an autoimmune antigen associated with the Golgi complex and has been shown to be important for successful spermatogenesis <abbrgrp><abbr bid="B60">60</abbr></abbrgrp>. These genes suggest important roles for immunity, transcriptional regulation, and reproduction and development. These processes have previously been shown to be targets of natural selection in other systems <abbrgrp><abbr bid="B61">61</abbr><abbr bid="B62">62</abbr><abbr bid="B63">63</abbr></abbrgrp>.</p>
            <p>Several other particularly interesting genes were among the highest F<sub>ST </sub>loci (Additional file <supplr sid="S1">1</supplr>) as potential targets of natural selection. These include a toll-like receptor (Tlr2.1), cytochrome P450, receptor for egg jelly 7, and a GABA-receptor, among others. Toll-like receptors and cytochrome P450 are environmental response genes that function during bacterial outbreaks <abbrgrp><abbr bid="B64">64</abbr><abbr bid="B65">65</abbr></abbrgrp> and environmental stress <abbrgrp><abbr bid="B66">66</abbr><abbr bid="B67">67</abbr></abbrgrp>. Receptors of egg jelly are expressed on the apical tip of sperm heads and are critical proteins in gamete recognition <abbrgrp><abbr bid="B63">63</abbr></abbrgrp>. GABA receptors function in some taxa as signals for larval settlement <abbrgrp><abbr bid="B68">68</abbr></abbrgrp>, and could play a role in habitat selection during early life. Alternatively, they could play some other role in larval nervous system function.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <sec>
            <st>
               <p>Comparison of RSTA to other high-throughput polymorphism discovery methods</p>
            </st>
            <p>RSTA significantly improves on other related high-throughput polymorphism discovery and genotyping methods by providing quantitative genotype data for each individual surveyed at each polymorphic locus identified (Table <tblr tid="T1">1</tblr>). Such data can be used to examine population allele frequencies at tens of thousands of loci, calculate F<sub>ST </sub>or test Hardy-Weinberg equilibrium, model neutrality, identify outlier loci, or apply any other downstream population genetic analysis that requires genotype data. We also demonstrate that RSTA is highly accurate in outcrossed populations sampled from the wild, making it useful for species that cannot be crossed in the lab. The application of RSTA for genome-wide surveys of wild populations can generate hypotheses regarding genes important for local adaptation in species that do not have a visible trait that might confer a fitness advantage.</p>
            <tbl id="T1"><title><p>Table 1</p></title><caption><p>Comparison of four high-throughput polymorphism detection approaches</p></caption><tblbdy cols="5">
      <r>
         <c ca="left">
            <p>
               <b>Parameter</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>SFP</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>RAD tagging</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>RAD sequencing</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>RSTA</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="5">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Marker type</p>
         </c>
         <c ca="left">
            <p>SNPs and indels</p>
         </c>
         <c ca="left">
            <p>Restriction cut site polymorphisms</p>
         </c>
         <c ca="left">
            <p>Sequence data: SNPs next to restriction cut sites</p>
         </c>
         <c ca="left">
            <p>Restriction cut site polymorphisms: distinguishes SNPs and indels</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Number of loci surveyed</p>
         </c>
         <c ca="left">
            <p>92,924</p>
         </c>
         <c ca="left">
            <p>19,200 (elements on an enriched RAD-tag microarray designed from stickleback)</p>
         </c>
         <c ca="left">
            <p>26 nucleotides at 41,622 RAD tags</p>
         </c>
         <c ca="left">
            <p>50,935</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Number of polymorphisms identified (informative marker rate)</p>
         </c>
         <c ca="left">
            <p>3,806 (4% at a 5% false discovery rate cutoff)</p>
         </c>
         <c ca="left">
            <p>1,990 (10% at a two-fold signal difference cutoff)</p>
         </c>
         <c ca="left">
            <p>Approximately 13,000 (31%)</p>
         </c>
         <c ca="left">
            <p>12,431 (24%)</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>False discovery rate</p>
         </c>
         <c ca="left">
            <p>3% (117 out of 121 confirmed correct by sequencing)</p>
         </c>
         <c ca="left">
            <p>9% (20 out of 22 confirmed correct by sequencing)</p>
         </c>
         <c ca="left">
            <p>Not reported</p>
         </c>
         <c ca="left">
            <p>&lt;1% (113 out of 114 confirmed correct by sequencing)</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Platform</p>
         </c>
         <c ca="left">
            <p>Custom high-density oligonucleotide array (Affymetrix), 25 bp oligo</p>
         </c>
         <c ca="left">
            <p>cDNA or genomic tiling array (in house synthesis)</p>
         </c>
         <c ca="left">
            <p>Illumina sequencing</p>
         </c>
         <c ca="left">
            <p>Custom high-density oligonucleotide array (Agilent), 50 bp oligo</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Prior information required</p>
         </c>
         <c ca="left">
            <p>EST, 454 or genome sequence</p>
         </c>
         <c ca="left">
            <p>EST or RAD-tag library for array synthesis</p>
         </c>
         <c ca="left">
            <p>EST or genome sequence to map short sequence reads</p>
         </c>
         <c ca="left">
            <p>EST, 454 or genome sequence</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Polymorphism identification</p>
         </c>
         <c ca="left">
            <p>Hybridization signal difference among study individuals</p>
         </c>
         <c ca="left">
            <p>Hybridization signal difference between two study individuals</p>
         </c>
         <c ca="left">
            <p>Custom Perl scripts for sequence alignment</p>
         </c>
         <c ca="left">
            <p>Genotype clusters across all study individuals</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Individual genotype data</p>
         </c>
         <c ca="left">
            <p>No</p>
         </c>
         <c ca="left">
            <p>No</p>
         </c>
         <c ca="left">
            <p>No</p>
         </c>
         <c ca="left">
            <p>Yes</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Organisms studied</p>
         </c>
         <c ca="left">
            <p>Yeast, <it>Arabidopsis, Anopheles</it>, several seed plants<sup>a</sup></p>
         </c>
         <c ca="left">
            <p><it>Drosophila</it>, stickleback, zebrafish, <it>Neurospora</it></p>
         </c>
         <c ca="left">
            <p>
               <it>Neurospora</it>
            </p>
         </c>
         <c ca="left">
            <p>Purple sea urchin</p>
         </c>
      </r>
   </tblbdy><tblfn>
      <p>Numbers are from studies that describe each method: SFP <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>; RAD tagging <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>; RAD sequencing <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. <sup>a</sup>See Gupta <it>et al</it>. <abbrgrp><abbr bid="B23">23</abbr></abbrgrp> for review of high-throughput applications in crop plants.</p>
   </tblfn></tbl>
            <p>RAD tagging, like RSTA, surveys the genome of a species for restriction cut site polymorphisms using an array platform <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. The RAD system compares the hybridization signal between two genome preparations that are co-hybridized, and provides a view of the relative degree of restriction digestion in the two genome preparations. Applying the RAD approach in our study system at the level of individual DNAs would have required 190 hybridizations in order to compare all individuals to one another in the way that 20 RSTA hybridizations allowed. In addition, the resulting 190 RAD hybridizations would produce a qualitative ranking of allele content among individuals, but not the precise genotypes at all loci. By contrast, the SFP <abbrgrp><abbr bid="B26">26</abbr></abbrgrp> approach could yield quantitative data, although this has not been demonstrated, because, like RSTA and unlike RAD, it has no PCR amplification step in DNA processing and each individual is hybridized to a single array. PCR amplification can generate differences in allele copy numbers between samples, so that comparisons between samples become qualitative rather than quantitative. However, the short oligonucleotide size (25 bp) in the SFP approach could add noise to the data through non-specific binding, particularly in species with large complex genomes, and could yield more subtle differences between genotypes at each polymorphic locus. This would necessitate large sample sizes to improve the signal to noise ratio for quantitative SFP genotype data. RSTA may be better suited for species with large genomes or high heterozygosity and may yield cleaner data for heterozygotes because of the longer oligonucleotides used (50 bp).</p>
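            <p>The figure of 190 hybridizations is the number of distinct pairs among 20 individuals, whereas RSTA needs one array per individual. A small sketch of this count follows; the function name and its boolean parameter are hypothetical and serve only to illustrate the arithmetic.</p>

```python
from math import comb


def hybridizations_needed(n_individuals, pairwise):
    """Arrays needed to compare all individuals.

    pairwise=True: one co-hybridization per pair of individuals.
    pairwise=False: one array per individual (each hybridized against
    its own undigested control).
    """
    return comb(n_individuals, 2) if pairwise else n_individuals


print(hybridizations_needed(20, pairwise=True))   # pairwise design
print(hybridizations_needed(20, pairwise=False))  # per-individual design
```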
            <p>RSTA, RAD, and SFP approaches can be applied to 'bulk' DNA pooled from individuals from a single population. This drastically reduces the number of arrays needed but also reduces the data to a qualitative assessment of gene frequency differences between pooled samples because there is not a precise relationship between hybridization signal difference and gene frequency difference. By contrast, the RSTA approach applied at the individual level allows gene frequencies to be precisely quantified among populations and produces multi-locus data sets of high accuracy at the individual and population levels.</p>
            <p>RAD tagging has been extended to use next-generation sequencing to identify polymorphisms <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. RAD sequencing reduces representation of the genome by sequencing adjacent to conserved restriction cut sites. The approach identifies a number of markers similar to RSTA, although it does not provide genotype data. Half of one Illumina run yielded approximately 0.4- to 1-fold coverage across the 96 individuals studied <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. An estimated 13-fold coverage is necessary for accurate identification of heterozygotes <abbrgrp><abbr bid="B69">69</abbr></abbrgrp>, making next-generation sequencing costly for genotype data at this stage.</p>
            <p>In applying RSTA, DNA processing and data analysis are simpler than in other approaches. DNA processing proceeds as follows: shear by sonication, restriction digest with the chosen enzyme, fluorescently label, then competitively hybridize with control, non-digested DNA from the same individual. Hybridization against control DNA from the same individual and screening for trimodal data across the population data set cleanly separate signal from noise in microarray data, likely resulting in the low false discovery rate (&lt;1%). The RSTA approach can also distinguish SNP and indel polymorphisms using the hybridization signal of the control, non-digested DNA.</p>
            <p>The major advantage of the RSTA system is that it produces highly accurate genotypes of individuals at many loci simultaneously without ascertainment bias. Other platforms can provide this information for well-defined systems, though there will be ascertainment bias if targeted SNPs are surveyed - for example, the Affymetrix platform used for humans, dogs, or yeast. In addition, there is a high upfront cost for microarrays that require mask development and there is little chance that such gene chips will become available for many species. In the field of population genomics, there is a need for and keen interest in generating genome-wide genotype data for wild populations of a species. RSTA provides such quantitative genome-wide genotype data in a technically and analytically straightforward approach and without an upfront microarray design cost.</p>
         </sec>
         <sec>
            <st>
               <p>Opportunities for expanded genome-wide population genetics</p>
            </st>
            <p>We present an accurate genome scanning method that allows simultaneous discovery of polymorphisms and genotyping of thousands of loci by surveying for restriction cut site polymorphisms using an affordable, species-specific microarray. The RSTA array approach can be applied to any species with a cDNA library database or 454 transcriptome sequence, for example. A combination of 454 transcriptome sequencing with a breadth of gene coverage and RSTA polymorphism discovery and genotyping could be very fruitful for the discovery of functionally important genes in non-model species. A breadth of gene coverage in transcriptome sequencing could be accomplished by pooling across multiple tissues, life history stages, and tissues sampled after treatment with various environmental stimuli. Because 4-base restriction sites occur at random about every 256 bp (for gene regions with equal nucleotide frequencies), 10,000 kb of sequence data (comparable to what was generated for the Glanville fritillary butterfly using 454 sequencing <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>) would provide on the order of 40,000 RSTA tiles. There is also great potential to increase genome-wide coverage by increasing the number of restriction cut sites surveyed. There is no compromise in data quality in assays of sites from multiple restriction enzymes as long as sites are more than 50 bases apart such that tiles do not overlap (data not shown).</p>
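            <p>The estimate of roughly 40,000 tiles follows from the expected spacing of a 4-base recognition site under equal nucleotide frequencies. The sketch below reproduces that arithmetic; the function names are hypothetical, and the calculation ignores any later filtering of tiles for uniqueness or overlap.</p>

```python
def expected_site_spacing(site_length, n_bases=4):
    """Mean distance in bp between occurrences of a fixed recognition site,
    assuming independent positions and equal nucleotide frequencies."""
    return n_bases ** site_length


def expected_tiles(sequence_bp, site_length=4):
    """Expected number of cut sites (candidate tiles) in sequence_bp of sequence."""
    return sequence_bp / expected_site_spacing(site_length)


print(expected_site_spacing(4))    # spacing for a 4-base site such as TCGA
print(expected_tiles(10_000_000))  # 10,000 kb of transcriptome sequence
```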
            <p>The application of RSTA in species with lower genetic diversity than purple sea urchins could reveal a lower proportion of polymorphic RSTA tiles. However, the high degree of genetic diversity in purple sea urchins (approximately 4% in single copy genes <abbrgrp><abbr bid="B49">49</abbr></abbrgrp>) may have dramatically reduced the proportion of polymorphic RSTA tiles detected in sections of the genome that have multiple substitutions, largely because such areas may not hybridize well. Thus, in species with less genetic diversity, it could be possible to identify a proportion of polymorphisms equal to or greater than that observed in this study, depending on the polymorphism rate in the species and the number of individuals sampled in the study.</p>
            <p>The absence of ascertainment bias in RSTA is a major advantage in SNP determination compared to targeted SNP genotyping. RSTA also has the ability to identify rare polymorphisms; the Mclust clustering algorithm defines the number of clusters that best describe the data regardless of the number of data points in each cluster. However, RSTA does not identify all polymorphisms in a gene, and there are many SNPs that remain undetected using this method.</p>
            <p>In species without a complete genome sequence, noise could be added to the data by failure to exclude probes that match multiple places in the genome. We excluded approximately 19% of probes due to redundancy when RSTA features were compared back to coding regions. This fraction of redundant probes could also be excluded if using a 454 transcriptome sequence that has a good breadth of gene coverage.</p>
            <p>Differences in gene frequencies between two sea urchin populations suggest that <it>S. purpuratus </it>is mildly differentiated along the US west coast, just as it is along the coast of Baja Mexico <abbrgrp><abbr bid="B70">70</abbr></abbrgrp>. Previous assays of population structure were derived from relatively few mitochondrial DNA, allozyme or microsatellite loci <abbrgrp><abbr bid="B43">43</abbr><abbr bid="B44">44</abbr><abbr bid="B71">71</abbr></abbrgrp>, and reported no population differentiation except for the southern end of the species range <abbrgrp><abbr bid="B44">44</abbr><abbr bid="B70">70</abbr></abbrgrp>, or between age classes at one locus <abbrgrp><abbr bid="B71">71</abbr></abbrgrp>. In the present study, population structure is indicated by F<sub>ST </sub>values that are higher than expected, by a greater fraction of homozygous uncut genotypes in Oregon than in California, and by a higher heterozygosity in the southern end of the species range. In addition, several loci appear more differentiated than expected under neutral evolution, a result that might be due to natural selection on these loci. Selection on single loci has been inferred in other marine species living across environmental gradients with allozymes <abbrgrp><abbr bid="B72">72</abbr><abbr bid="B73">73</abbr></abbrgrp> or through outlier F<sub>ST </sub>analyses <abbrgrp><abbr bid="B74">74</abbr></abbrgrp>. Conclusions about selection from our data are preliminary due to the potential impact of mild population structure on the distribution of F<sub>ST </sub>among loci. However, the outlier loci and highest F<sub>ST </sub>loci play roles in biological processes that we would predict to be important for local adaptation in this species: immunity, transcriptional regulation, environmental response, and reproduction and development.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusions</p>
         </st>
         <p>We have presented a new genome scanning technique that allows the discovery of polymorphic loci and returns quantitative genotype data at tens of thousands of markers. The approach requires genome or transcriptome sequence data from one individual, yet it is free from ascertainment bias because polymorphisms are discovered without any prior knowledge by screening all individuals studied. Genotype data can be paired with locus position information to map disease-related or adaptive phenotype-related traits to specific genomic regions or paired with coalescent simulations to identify divergent (F<sub>ST</sub>) outlier loci. This approach, and others like it that generate data on genome-wide distributions of polymorphisms, promises to aid in the identification of ecologically relevant genes and traits in both model and non-model organisms. Such high-throughput genotype data will allow a much greater understanding of the role of environmental variation in shaping genetic diversity patterns and help reveal the genetic basis of adaptive evolution in natural populations.</p>
      </sec>
      <sec>
         <st>
            <p>Materials and methods</p>
         </st>
         <sec>
            <st>
               <p>RSTA array design</p>
            </st>
            <p>We designed 50-bp oligonucleotide tiles by screening the published purple sea urchin genome sequence <abbrgrp><abbr bid="B46">46</abbr></abbrgrp> for Taq&#945;I restriction enzyme cut sites (TCGA). We centered tiles on Taq&#945;I cut sites and screened for uniqueness and complexity using BLASTN (NCBI), comparing tiles to the full genome sequence to reduce cross-reactivity. We excluded tiles with more than one hit greater than 90% sequence similarity. Across the genome, we included 50,935 Taq&#945;I cut sites: 27,128 in protein coding regions, 9,418 within 1,000 bases upstream of genes, and 14,389 in intergenic 'non-coding' regions. The average inter-marker distance was 15.7 kb across the 800 Mb purple urchin genome. We designed control tiles to non-cut sites (TTGA, n = 10,523), ribosomal DNA (positive control for hybridization efficiency, n = 100), and randomly generated tiles that did not match anywhere in the genome according to BLASTN results (negative control for background signal and cross-reactivity, n = 1,036). We also designed a degradation series of tiles in which we randomly changed 1 to 10 bases of a 50-bp tile that matched only one place in the genome (based on BLASTN). We did this for 100 unique tiles, resulting in 1,100 tiles. We used these tiles to estimate the effect of mutational differences between sample DNA and the published genome sequence from which tiles were designed. Tile design was done using MATLAB (2007a, The MathWorks, Natick, MA, USA). All tiles were synthesized in triplicate <it>in situ </it>on a 244K-feature high-density custom commercial microarray (Agilent-015554) by Agilent Technologies (Santa Clara, CA, USA). Agilent array probe length is typically 60 bp; 10 'T' nucleotides were first synthesized onto the glass slide before each probe sequence. 
All raw data files and array platform descriptions have been deposited in NCBI's Gene Expression Omnibus and are accessible through GEO Series accession number [GEO: GSE20857]. Tile names, sequences, and a detailed description of how the characters in the tile name reflect the tile type, position in the genome and gene number are accessible through GEO accession number [GEO: GPL10171].</p>
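            <p>The first step of the tile design described above, locating cut sites and centering a 50-bp tile on each, can be sketched as follows. The study used MATLAB and screened tiles with BLASTN; this Python fragment is only an illustration with hypothetical names, and it omits the uniqueness and complexity screening entirely.</p>

```python
def centered_tiles(sequence, site="TCGA", tile_len=50):
    """Return (site_start, tile) pairs for tiles centered on each cut site.

    The site is flanked by an equal number of bases on each side
    (23 + 4 + 23 = 50 for the defaults). Sites too close to either end of
    the sequence to yield a full-length tile are skipped.
    """
    seq = sequence.upper()
    flank = (tile_len - len(site)) // 2
    tiles = []
    start = seq.find(site)
    while start != -1:
        left = start - flank
        right = left + tile_len
        if left >= 0 and len(seq) >= right:
            tiles.append((start, seq[left:right]))
        start = seq.find(site, start + 1)
    return tiles
```

            <p>Passing a different recognition sequence (for example, the TTGA non-cut control motif) as the site argument would produce the corresponding control tiles under the same assumptions.</p>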
         </sec>
         <sec>
            <st>
               <p>DNA processing</p>
            </st>
            <p>We extracted genomic DNA from tube foot tissue using Nucleospin columns following the manufacturer's instructions (Macherey-Nagel, Bethlehem, PA, USA). We randomly sheared 10 &#956;g of DNA per individual, as quantified by NanoDrop (ThermoScientific, Waltham, MA, USA), by sonication (Branson Cell Sonifier, Danbury, CT, USA) for 10 seconds at output control level 3 in a 600 &#956;l volume, followed by ethanol precipitation. Note that although we used 10 &#956;g of DNA as this was readily available in this species, this amount is not required. Based on our experience and Agilent protocols, 250 ng to 1.5 &#956;g are recommended depending on the size of the array used, 60 thousand to 1 million features per array, respectively. We confirmed shearing on agarose gels (fragment size ranged from 100 to 1,000 bp) and DNA recovery by NanoDrop quantification. We then divided DNA from an individual into two samples of 5 &#956;g each. We treated one sample with a total of 10 units Taq&#945;I restriction enzyme (New England Biolabs, Ipswich, MA, USA) for 18 hours at 65&#176;C; we then added another 5 units of enzyme for 6 hours. We carried out restriction digestion in 2.5 &#956;g batches in 25 &#956;l reaction volumes using New England Biolabs buffers; we found these conditions important to ensure complete digestion. We heat inactivated the restriction enzyme by incubation at 80&#176;C for 15 minutes. We treated control DNA in the same buffer and temperature conditions, but without the restriction enzyme. We confirmed complete digestion by failure of PCR amplification for an exon with a known Taq&#945;I cut site compared to successful amplification of uncut DNA. We ethanol precipitated DNA before entering labeling reactions. 
We internally labeled DNA using random octamers and polymerase to incorporate Cy3 (or Cy5) labeled dCTPs (Invitrogen BioPrime labeling and purification kit, Carlsbad, CA, USA; Amersham Cy-dyes, GE Healthcare, Little Chalfont, Buckinghamshire, UK). We labeled non-digested DNA with Cy5-dCTP; we labeled digested DNA with Cy3-dCTP. Labeling efficiency, or specific activity (calculated as picomoles dye per microgram DNA and measured using NanoDrop), was between 80 and 100 pmol dye per microgram DNA for all samples, above the minimum recommended 50 pmol/&#956;g. We carried out ethanol precipitation after sonication and after restriction digestion by adding 1:20 (volume:volume) 3 M sodium acetate and 125 mM EDTA each, then 3:1 (volume:volume) ice cold high-grade 100% ethanol. We quickly vortexed samples, incubated them at -20&#176;C for 15 minutes, then spun them at 14,000 g for 30 minutes at 2 to 4&#176;C (TOMY centrifuge, TX-160, Fremont, CA, USA). We found this procedure to yield 95 to 100% DNA recovery based on NanoDrop quantification.</p>
         </sec>
         <sec>
            <st>
               <p>Microarray processing</p>
            </st>
            <p>We competitively hybridized equal amounts of digested (Cy3-labeled) and non-digested (Cy5-labeled) DNA from an individual to our custom microarray for 40 hours at 65&#176;C, rotating at 20 rpm, following the Agilent protocol for aCGH arrays. Arrays were scanned using a GenePix 4000B scanner (Axon, Molecular Devices, Silicon Valley, CA, USA) set at 5 &#956;m/pixel resolution. We dynamically set PMT gains for 650 (Cy5) and 550 (Cy3) wavelengths for each array such that the overall slide count ratio equaled one. In microarray scanners, the PMT (photomultiplier tube) converts photons into electrical signal, which is then digitized. Note that PMT gains for 650 and 550 wavelengths could be set such that the count ratio equaled one for a subset of tiles on the array, particularly control tiles that are not centered on restriction cut sites (for example, TTGA centered tiles). This would more accurately reflect signal intensity for each channel across the array as equal binding of Cy5 and Cy3 labeled DNA is expected for such control tiles while reduced Cy3 signal intensity is expected for restriction cut site centered tiles (TCGA), the dominant tile type across the array. Though it does not affect the accuracy of polymorphism detection or genotyping, setting the overall slide count ratio equal to one unnecessarily amplifies the Cy3 signal intensity. We extracted and normalized data from the scanned microarray image using Agilent Feature Extraction software. We used the resulting log ratio data (log2 of the ratio of Cy5 (non-digested) signal intensity to Cy3 (digested) signal intensity) to identify polymorphisms and genotype individuals.</p>
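            <p>The log ratio used for genotyping can be written out explicitly. The sketch below assumes background-corrected, normalized intensities are already available for each feature; in the study this step was performed by Agilent Feature Extraction software. The function names are hypothetical, and averaging the log ratios of the three replicate features (rather than averaging raw intensities first) is an assumption.</p>

```python
from math import log2
from statistics import mean


def log_ratio(cy5_undigested, cy3_digested):
    """log2 of the non-digested (Cy5) over digested (Cy3) signal for one feature.

    A cut site removes target from the digested channel, so the Cy3 signal
    drops and the log ratio rises.
    """
    return log2(cy5_undigested / cy3_digested)


def tile_log_ratio(replicates):
    """Average log ratio over replicate features of one tile.

    replicates: iterable of (cy5, cy3) intensity pairs, one per replicate.
    """
    return mean(log_ratio(cy5, cy3) for cy5, cy3 in replicates)
```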
         </sec>
         <sec>
            <st>
               <p>SNP identification</p>
            </st>
            <p>To identify polymorphic loci across the population data set of 20 individuals, we screened for loci with a range in log ratio greater than 0.7 and more than one cluster according to a Bayesian hierarchical clustering algorithm, Mclust <abbrgrp><abbr bid="B50">50</abbr></abbrgrp>, implemented in R <abbrgrp><abbr bid="B75">75</abbr></abbrgrp>. We used the average of triplicate tiles for this and subsequent analyses. We used Mclust to determine the number of clusters, from one to four, that best described the log ratio data for all 20 individuals for each locus. We allowed four clusters rather than three as the maximum because the algorithm better assigns three clusters if a fourth is an option <abbrgrp><abbr bid="B50">50</abbr></abbrgrp>. We used a one-dimensional model, parameterization identifier 'VII', with log ratio as input data. Data with one cluster were considered monomorphic. The combined criteria of clusters and log ratio range resulted in trimodal data that reflected the three genotypes of homozygous uncut (low log ratio), heterozygote (intermediate log ratio), and homozygous cut (high log ratio). Note that the homozygous uncut genotype did not result in a log ratio equal to zero (even binding of Cy5 and Cy3) because the whole array image when scanned was normalized for a log ratio equal to zero, offsetting the homozygous uncut genotype to less than zero.</p>
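            <p>The first half of this screen, averaging triplicate tiles and keeping loci whose log ratio range exceeds 0.7, can be sketched in Python as below. The data layout and function names are hypothetical, and the sketch deliberately omits the Mclust step: loci returned here would still need to be clustered in R to confirm that more than one cluster is supported.</p>

```python
from statistics import mean

RANGE_THRESHOLD = 0.7  # minimum spread in log ratio, taken from the text


def locus_means(triplicates_by_individual):
    """Average the triplicate tile log ratios for each individual."""
    return [mean(reps) for reps in triplicates_by_individual]


def passes_range_screen(log_ratios, threshold=RANGE_THRESHOLD):
    """True when the range of per-individual log ratios exceeds the threshold."""
    return max(log_ratios) - min(log_ratios) > threshold


def candidate_loci(data):
    """Return the names of loci that pass the range screen, sorted by name."""
    return sorted(
        name for name, reps in data.items()
        if passes_range_screen(locus_means(reps))
    )


if __name__ == "__main__":
    # Invented log ratios: three individuals, three replicate tiles each.
    data = {
        "locus_A": [[-0.6, -0.5, -0.4], [0.1, 0.2, 0.0], [0.9, 1.0, 1.1]],
        "locus_B": [[0.0, 0.1, 0.2], [0.2, 0.2, 0.2], [0.3, 0.3, 0.3]],
    }
    print(candidate_loci(data))
```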
         </sec>
         <sec>
            <st>
               <p>Sequencing</p>
            </st>
            <p>We designed primers using Primer3 <abbrgrp><abbr bid="B76">76</abbr></abbrgrp> and the published genome sequence <abbrgrp><abbr bid="B77">77</abbr></abbrgrp> to amplify approximately 200 bp within an exon around each restriction cut site. We chose primers such that the 3' end of each primer terminated in the second base position of a codon. We performed PCR amplification using a touchdown protocol for all primer pairs, from 62 to 48&#176;C for 40 cycles. We sequenced amplified DNA using an ABI3100 sequencer.</p>
         </sec>
         <sec>
            <st>
               <p>Data visualization and analyses</p>
            </st>
            <p>We used MATLAB plotting tools to examine log ratio patterns of the five known polymorphic loci and of subsequent loci identified using Mclust. We used MATLAB functions to perform Kolmogorov-Smirnov tests and calculate correlation statistics. We wrote MATLAB programs to calculate heterozygosity, F<sub>ST</sub>, and Hardy-Weinberg equilibrium, and to permute the data to simulate panmixia. We used the princomp function in R to perform PCA with loci as rows and samples as columns. We corrected for multiple tests using the Benjamini-Hochberg method <abbrgrp><abbr bid="B78">78</abbr></abbrgrp> and Fisher's combined probability test <abbrgrp><abbr bid="B79">79</abbr></abbrgrp>.</p>
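            <p>The two multiple-testing procedures named above can be illustrated with a short Python sketch. This is not the MATLAB code used in the study; the function names are hypothetical. The Fisher calculation uses the closed-form upper tail of the chi-square distribution, which applies because the test statistic has an even number of degrees of freedom (twice the number of p-values).</p>

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def fisher_combined(p_values):
    """Return (chi-square statistic, combined p-value) for Fisher's method."""
    if not p_values or not all(p > 0 for p in p_values):
        raise ValueError("need at least one p-value, all greater than zero")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # statistic / 2
    # Upper tail of chi-square with 2k df: exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return 2.0 * half_stat, math.exp(-half_stat) * total


if __name__ == "__main__":
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
    print(fisher_combined([0.5, 0.5]))
```

            <p>The p-values in the example are invented. A locus would be reported as significant at a chosen false discovery rate when its adjusted p-value falls at or below that rate.</p>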
         </sec>
      </sec>
      <sec>
         <st>
            <p>Abbreviations</p>
         </st>
         <p>bp: base pair; EST: expressed sequence tag; GABA: gamma-aminobutyric acid; NCBI: National Center for Biotechnology Information; PCA: Principal Components Analysis; RAD: restriction-site-associated DNA; RSTA: Restriction Site Tiling Analysis; SFP: single feature polymorphism; SNP: single nucleotide polymorphism.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>MHP and SRP conceived of the study. MHP developed technical aspects of the method, performed experiments, and performed data analysis with assistance from SRP. TAO designed tile sequences and contributed intellectually to the development of the method. MKM performed gene expression studies. MHP and SRP wrote the manuscript, with contributions from all authors.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>This work was supported by NSF SGER-0714997 (SRP), an NSF Graduate Research Fellowship (MHP), and the Partnership for Interdisciplinary Studies of Coastal Oceans (PISCO). We thank D Garfield, E Jacobs-Palmer, B Lockwood, M Pinsky, C Tepolt, D Hirschberg, and G Nestorova for technical and computing help. We also thank M Samanta and V Stolc for helpful discussions early in the development of the array design, and A Sivasundar and V Vacquier for assistance with sample collections. Finally, we thank two anonymous reviewers for their valuable comments on this paper.</p>
         </sec>
      </ack>
      <refgrp><bibl id="B1"><aug><au><snm>Dobzhansky</snm><fnm>T</fnm></au></aug><source>Genetics and the Origin of Species</source><publisher>New York, NY: Columbia University Press</publisher><pubdate>1937</pubdate></bibl><bibl id="B2"><title><p>Population genomics: Genome-wide sampling of insect populations.</p></title><aug><au><snm>Black</snm><fnm>WC</fnm></au><au><snm>Baer</snm><fnm>CF</fnm></au><au><snm>Antolin</snm><fnm>MF</fnm></au><au><snm>DuTeau</snm><fnm>NM</fnm></au><au><snm>Berenbaum</snm><fnm>MR</fnm></au><au><snm>Carde</snm><fnm>RT</fnm></au><au><snm>Robinson</snm><fnm>GE</fnm></au></aug><source>Annu Rev Entomol</source><pubdate>2001</pubdate><volume>46</volume><fpage>441</fpage><lpage>469</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1146/annurev.ento.46.1.441</pubid><pubid idtype="pmpid" link="fulltext">11112176</pubid></pubidlist></xrefbib></bibl><bibl id="B3"><title><p>The power and promise of population genomics: from genotyping to genome typing.</p></title><aug><au><snm>Luikart</snm><fnm>G</fnm></au><au><snm>England</snm><fnm>PR</fnm></au><au><snm>Tallmon</snm><fnm>D</fnm></au><au><snm>Jordan</snm><fnm>S</fnm></au><au><snm>Taberlet</snm><fnm>P</fnm></au></aug><source>Nat Rev Genet</source><pubdate>2003</pubdate><volume>4</volume><fpage>981</fpage><lpage>994</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nrg1226</pubid><pubid idtype="pmpid" link="fulltext">14631358</pubid></pubidlist></xrefbib></bibl><bibl id="B4"><title><p>Land ahead: using genome scans to identify molecular markers of adaptive relevance.</p></title><aug><au><snm>Holderegger</snm><fnm>R</fnm></au><au><snm>Herrmann</snm><fnm>D</fnm></au><au><snm>Poncet</snm><fnm>B</fnm></au><au><snm>Gugerli</snm><fnm>F</fnm></au><au><snm>Thuiller</snm><fnm>W</fnm></au><au><snm>Taberlet</snm><fnm>P</fnm></au><au><snm>Gielly</snm><fnm>L</fnm></au><au><snm>Rioux</snm><fnm>D</fnm></au><au><snm>Brodbeck</snm><fnm>S</fnm></au><au><snm>Aubert</snm><fnm>S</fnm></au></aug><source>Plant Ecol 
Diversity</source><pubdate>2008</pubdate><volume>1</volume><fpage>273</fpage><lpage>283</lpage><xrefbib><pubid idtype="doi">10.1080/17550870802338420</pubid></xrefbib></bibl><bibl id="B5"><title><p>Recent and ongoing selection in the human genome.</p></title><aug><au><snm>Nielsen</snm><fnm>R</fnm></au><au><snm>Hellmann</snm><fnm>I</fnm></au><au><snm>Hubisz</snm><fnm>M</fnm></au><au><snm>Bustamante</snm><fnm>C</fnm></au><au><snm>Clark</snm><fnm>AG</fnm></au></aug><source>Nat Rev Genet</source><pubdate>2007</pubdate><volume>8</volume><fpage>857</fpage><lpage>868</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nrg2187</pubid><pubid idtype="pmcid">2933187</pubid><pubid idtype="pmpid" link="fulltext">17943193</pubid></pubidlist></xrefbib></bibl><bibl id="B6"><title><p>Using genome scans of DNA polymorphism to infer adaptive population divergence.</p></title><aug><au><snm>Storz</snm><fnm>JF</fnm></au></aug><source>Mol Ecol</source><pubdate>2005</pubdate><volume>14</volume><fpage>671</fpage><lpage>688</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1111/j.1365-294X.2005.02437.x</pubid><pubid idtype="pmpid" link="fulltext">15723660</pubid></pubidlist></xrefbib></bibl><bibl id="B7"><title><p>Large-scale identification, mapping, and genotyping of single-nucleotide polymorphisms in the human genome.</p></title><aug><au><snm>Wang</snm><fnm>DG</fnm></au><au><snm>Fan</snm><fnm>JB</fnm></au><au><snm>Siao</snm><fnm>CJ</fnm></au><au><snm>Berno</snm><fnm>A</fnm></au><au><snm>Young</snm><fnm>P</fnm></au><au><snm>Sapolsky</snm><fnm>R</fnm></au><au><snm>Ghandour</snm><fnm>G</fnm></au><au><snm>Perkins</snm><fnm>N</fnm></au><au><snm>Winchester</snm><fnm>E</fnm></au><au><snm>Spencer</snm><fnm>J</fnm></au></aug><source>Science</source><pubdate>1998</pubdate><volume>280</volume><fpage>1077</fpage><lpage>1082</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.280.5366.1077</pubid><pubid idtype="pmpid" link="fulltext">9582121</pubid></pubidlist></xrefbib></bibl><bibl 
id="B8"><title><p>Interrogating a high-density SNP map for signatures of natural selection.</p></title><aug><au><snm>Akey</snm><fnm>JM</fnm></au><au><snm>Zhang</snm><fnm>G</fnm></au><au><snm>Zhang</snm><fnm>K</fnm></au><au><snm>Jin</snm><fnm>L</fnm></au><au><snm>Shriver</snm><fnm>MD</fnm></au></aug><source>Genome Res</source><pubdate>2002</pubdate><volume>12</volume><fpage>1805</fpage><lpage>1814</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gr.631202</pubid><pubid idtype="pmcid">187574</pubid><pubid idtype="pmpid">12466284</pubid></pubidlist></xrefbib></bibl><bibl id="B9"><title><p>Natural selection on protein-coding genes in the human genome.</p></title><aug><au><snm>Bustamante</snm><fnm>CD</fnm></au><au><snm>Fledel-Alon</snm><fnm>A</fnm></au><au><snm>Williamson</snm><fnm>S</fnm></au><au><snm>Nielsen</snm><fnm>R</fnm></au><au><snm>Hubisz</snm><fnm>MT</fnm></au><au><snm>Glanowski</snm><fnm>S</fnm></au><au><snm>Tanenbaum</snm><fnm>DM</fnm></au><au><snm>White</snm><fnm>TJ</fnm></au><au><snm>Sninsky</snm><fnm>JJ</fnm></au><au><snm>Hernandez</snm><fnm>RD</fnm></au><au><snm>Civello</snm><fnm>D</fnm></au><au><snm>Adams</snm><fnm>MD</fnm></au><au><snm>Cargill</snm><fnm>M</fnm></au><au><snm>Clark</snm><fnm>AG</fnm></au></aug><source>Nature</source><pubdate>2005</pubdate><volume>437</volume><fpage>1153</fpage><lpage>1157</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature04240</pubid><pubid idtype="pmpid" link="fulltext">16237444</pubid></pubidlist></xrefbib></bibl><bibl id="B10"><title><p>Population genomics of domestic and wild 
yeasts.</p></title><aug><au><snm>Liti</snm><fnm>G</fnm></au><au><snm>Carter</snm><fnm>DM</fnm></au><au><snm>Moses</snm><fnm>AM</fnm></au><au><snm>Warringer</snm><fnm>J</fnm></au><au><snm>Parts</snm><fnm>L</fnm></au><au><snm>James</snm><fnm>SA</fnm></au><au><snm>Davey</snm><fnm>RP</fnm></au><au><snm>Roberts</snm><fnm>IN</fnm></au><au><snm>Burt</snm><fnm>A</fnm></au><au><snm>Koufopanou</snm><fnm>V</fnm></au><au><snm>Tsai</snm><fnm>IJ</fnm></au><au><snm>Bergman</snm><fnm>CM</fnm></au><au><snm>Bensasson</snm><fnm>D</fnm></au><au><snm>O&apos;Kelly</snm><fnm>MJT</fnm></au><au><snm>van Oudenaarden</snm><fnm>A</fnm></au><au><snm>Barton</snm><fnm>DBH</fnm></au><au><snm>Bailes</snm><fnm>E</fnm></au><au><snm>Ba</snm><fnm>ANN</fnm></au><au><snm>Jones</snm><fnm>M</fnm></au><au><snm>Quail</snm><fnm>MA</fnm></au><au><snm>Goodhead</snm><fnm>I</fnm></au><au><snm>Sims</snm><fnm>S</fnm></au><au><snm>Smith</snm><fnm>F</fnm></au><au><snm>Blomberg</snm><fnm>A</fnm></au><au><snm>Durbin</snm><fnm>R</fnm></au><au><snm>Louis</snm><fnm>EJ</fnm></au></aug><source>Nature</source><pubdate>2009</pubdate><volume>458</volume><fpage>337</fpage><lpage>341</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature07743</pubid><pubid idtype="pmcid">2659681</pubid><pubid idtype="pmpid">19212322</pubid></pubidlist></xrefbib></bibl><bibl id="B11"><title><p>A single IGF1 allele is a major determinant of small size in dogs.</p></title><aug><au><snm>Sutter</snm><fnm>NB</fnm></au><au><snm>Bustamante</snm><fnm>CD</fnm></au><au><snm>Chase</snm><fnm>K</fnm></au><au><snm>Gray</snm><fnm>MM</fnm></au><au><snm>Zhao</snm><fnm>K</fnm></au><au><snm>Zhu</snm><fnm>L</fnm></au><au><snm>Padhukasahasram</snm><fnm>B</fnm></au><au><snm>Karlins</snm><fnm>E</fnm></au><au><snm>Davis</snm><fnm>S</fnm></au><au><snm>Jones</snm><fnm>PG</fnm></au></aug><source>Science</source><pubdate>2007</pubdate><volume>316</volume><fpage>112</fpage><lpage>115</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1137045</pubid><pubid 
idtype="pmcid">2789551</pubid><pubid idtype="pmpid">17412960</pubid></pubidlist></xrefbib></bibl><bibl id="B12"><title><p>Genome-wide diversity map of <it>Plasmodium falciparum</it>.</p></title><aug><au><snm>Volkman</snm><fnm>SK</fnm></au><au><snm>Sabeti</snm><fnm>PC</fnm></au><au><snm>DeCaprio</snm><fnm>D</fnm></au><au><snm>Neafsey</snm><fnm>D</fnm></au><au><snm>Schaffner</snm><fnm>S</fnm></au><au><snm>Derr</snm><fnm>A</fnm></au><au><snm>Milner</snm><fnm>D</fnm></au><au><snm>Stange-Thomann</snm><fnm>N</fnm></au><au><snm>Shamovsky</snm><fnm>O</fnm></au><au><snm>Onofrio</snm><fnm>R</fnm></au><au><snm>Richter</snm><fnm>DJ</fnm></au><au><snm>Waggoner</snm><fnm>S</fnm></au><au><snm>Mauceli</snm><fnm>E</fnm></au><au><snm>Gnerre</snm><fnm>S</fnm></au><au><snm>Zainoun</snm><fnm>J</fnm></au><au><snm>Daily</snm><fnm>JP</fnm></au><au><snm>Sarr</snm><fnm>O</fnm></au><au><snm>Mboup</snm><fnm>S</fnm></au><au><snm>Wiegand</snm><fnm>R</fnm></au><au><snm>Hartl</snm><fnm>DL</fnm></au><au><snm>Jaffe</snm><fnm>D</fnm></au><au><snm>Birren</snm><fnm>B</fnm></au><au><snm>Galagan</snm><fnm>JE</fnm></au><au><snm>Lander</snm><fnm>E</fnm></au><au><snm>Wirth</snm><fnm>DF</fnm></au></aug><source>Am J Tropical Med Hygiene</source><pubdate>2006</pubdate><volume>75</volume><fpage>95</fpage></bibl><bibl id="B13"><title><p>Population genomics: whole-genome analysis of polymorphism and divergence in <it>Drosophila simulans</it>.</p></title><aug><au><snm>Begun</snm><fnm>DJ</fnm></au><au><snm>Holloway</snm><fnm>AK</fnm></au><au><snm>Stevens</snm><fnm>K</fnm></au><au><snm>Hillier</snm><fnm>LW</fnm></au><au><snm>Poh</snm><fnm>Y-P</fnm></au><au><snm>Hahn</snm><fnm>MW</fnm></au><au><snm>Nista</snm><fnm>PM</fnm></au><au><snm>Jones</snm><fnm>CD</fnm></au><au><snm>Kern</snm><fnm>AD</fnm></au><au><snm>Dewey</snm><fnm>CN</fnm></au><au><snm>Pachter</snm><fnm>L</fnm></au><au><snm>Myers</snm><fnm>E</fnm></au><au><snm>Langley</snm><fnm>CH</fnm></au></aug><source>PLoS 
Biol</source><pubdate>2007</pubdate><volume>5</volume><fpage>e310</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.pbio.0050310</pubid><pubid idtype="pmcid">2062478</pubid><pubid idtype="pmpid">17988176</pubid></pubidlist></xrefbib></bibl><bibl id="B14"><title><p>High-diversity genes in the <it>Arabidopsis </it>genome.</p></title><aug><au><snm>Cork</snm><fnm>JM</fnm></au><au><snm>Purugganan</snm><fnm>MD</fnm></au></aug><source>Genetics</source><pubdate>2005</pubdate><volume>170</volume><fpage>1897</fpage><lpage>1911</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1534/genetics.104.036152</pubid><pubid idtype="pmcid">1449776</pubid><pubid idtype="pmpid">15911589</pubid></pubidlist></xrefbib></bibl><bibl id="B15"><title><p>Usefulness of single nucleotide polymorphism data for estimating population parameters.</p></title><aug><au><snm>Kuhner</snm><fnm>MK</fnm></au><au><snm>Beerli</snm><fnm>P</fnm></au><au><snm>Yamato</snm><fnm>J</fnm></au><au><snm>Felsenstein</snm><fnm>J</fnm></au></aug><source>Genetics</source><pubdate>2000</pubdate><volume>156</volume><fpage>439</fpage><lpage>447</lpage><xrefbib><pubidlist><pubid idtype="pmcid">1461258</pubid><pubid idtype="pmpid">10978306</pubid></pubidlist></xrefbib></bibl><bibl id="B16"><title><p>Correcting for ascertainment biases when analyzing SNP data: applications to the estimation of linkage disequilibrium.</p></title><aug><au><snm>Nielsen</snm><fnm>R</fnm></au><au><snm>Signorovitch</snm><fnm>J</fnm></au></aug><source>Theor Popul Biol</source><pubdate>2003</pubdate><volume>63</volume><fpage>245</fpage><lpage>255</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/S0040-5809(03)00005-4</pubid><pubid idtype="pmpid" link="fulltext">12689795</pubid></pubidlist></xrefbib></bibl><bibl id="B17"><title><p>Differential gene exchange between parapatric morphs of <it>Littorina saxatilis </it>detected using AFLP 
markers.</p></title><aug><au><snm>Wilding</snm><fnm>CS</fnm></au><au><snm>Butlin</snm><fnm>RK</fnm></au><au><snm>Grahame</snm><fnm>J</fnm></au></aug><source>J Evol Biol</source><pubdate>2001</pubdate><volume>14</volume><fpage>611</fpage><lpage>619</lpage><xrefbib><pubid idtype="doi">10.1046/j.1420-9101.2001.00304.x</pubid></xrefbib></bibl><bibl id="B18"><title><p>Generic scan using AFLP markers as a means to assess the role of directional selection in the divergence of sympatric whitefish ecotypes.</p></title><aug><au><snm>Campbell</snm><fnm>D</fnm></au><au><snm>Bernatchez</snm><fnm>L</fnm></au></aug><source>Mol Biol Evol</source><pubdate>2004</pubdate><volume>21</volume><fpage>945</fpage><lpage>956</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/molbev/msh101</pubid><pubid idtype="pmpid" link="fulltext">15014172</pubid></pubidlist></xrefbib></bibl><bibl id="B19"><title><p>Expressed sequence tag-linked microsatellites as a source of gene-associated polymorphisms for detecting signatures of divergent selection in Atlantic salmon (<it>Salmo salar </it>L.).</p></title><aug><au><snm>Vasemagi</snm><fnm>A</fnm></au><au><snm>Nilsson</snm><fnm>J</fnm></au><au><snm>Primmer</snm><fnm>CR</fnm></au></aug><source>Mol Biol Evol</source><pubdate>2005</pubdate><volume>22</volume><fpage>1067</fpage><lpage>1076</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/molbev/msi093</pubid><pubid idtype="pmpid" link="fulltext">15689532</pubid></pubidlist></xrefbib></bibl><bibl id="B20"><title><p>Explorative genome scan to detect candidate loci for adaptation along a gradient of altitude in the common frog (<it>Rana temporaria</it>).</p></title><aug><au><snm>Bonin</snm><fnm>A</fnm></au><au><snm>Taberlet</snm><fnm>P</fnm></au><au><snm>Miaud</snm><fnm>C</fnm></au><au><snm>Pompanon</snm><fnm>F</fnm></au></aug><source>Mol Biol Evol</source><pubdate>2006</pubdate><volume>23</volume><fpage>773</fpage><lpage>783</lpage><xrefbib><pubidlist><pubid 
idtype="doi">10.1093/molbev/msj087</pubid><pubid idtype="pmpid" link="fulltext">16396915</pubid></pubidlist></xrefbib></bibl><bibl id="B21"><title><p>Natural selection and climate change: temperature-linked spatial and temporal trends in gene frequency in <it>Fagus sylvatica</it>.</p></title><aug><au><snm>Jump</snm><fnm>AS</fnm></au><au><snm>Hunt</snm><fnm>JM</fnm></au><au><snm>Martinez-Izquierdo</snm><fnm>JA</fnm></au><au><snm>Penuelas</snm><fnm>J</fnm></au></aug><source>Mol Ecol</source><pubdate>2006</pubdate><volume>15</volume><fpage>3469</fpage><lpage>3480</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1111/j.1365-294X.2006.03027.x</pubid><pubid idtype="pmpid" link="fulltext">16968284</pubid></pubidlist></xrefbib></bibl><bibl id="B22"><title><p>Combining population genomics and quantitative genetics: finding the genes underlying ecologically important traits.</p></title><aug><au><snm>Stinchcombe</snm><fnm>JR</fnm></au><au><snm>Hoekstra</snm><fnm>HE</fnm></au></aug><source>Heredity</source><pubdate>2007</pubdate><volume>100</volume><fpage>158</fpage><lpage>170</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/sj.hdy.6800937</pubid><pubid idtype="pmpid" link="fulltext">17314923</pubid></pubidlist></xrefbib></bibl><bibl id="B23"><title><p>Array-based high-throughput DNA markers for crop improvement.</p></title><aug><au><snm>Gupta</snm><fnm>PK</fnm></au><au><snm>Rustgi</snm><fnm>S</fnm></au><au><snm>Mir</snm><fnm>RR</fnm></au></aug><source>Heredity</source><pubdate>2008</pubdate><volume>101</volume><fpage>5</fpage><lpage>18</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/hdy.2008.35</pubid><pubid idtype="pmpid" link="fulltext">18461083</pubid></pubidlist></xrefbib></bibl><bibl id="B24"><title><p>Direct allelic variation scanning of the yeast 
genome.</p></title><aug><au><snm>Winzeler</snm><fnm>EA</fnm></au><au><snm>Richards</snm><fnm>DR</fnm></au><au><snm>Conway</snm><fnm>AR</fnm></au><au><snm>Goldstein</snm><fnm>AL</fnm></au><au><snm>Kalman</snm><fnm>S</fnm></au><au><snm>McCullough</snm><fnm>MJ</fnm></au><au><snm>McCusker</snm><fnm>JH</fnm></au><au><snm>Stevens</snm><fnm>DA</fnm></au><au><snm>Wodicka</snm><fnm>L</fnm></au><au><snm>Lockhart</snm><fnm>DJ</fnm></au></aug><source>Science</source><pubdate>1998</pubdate><volume>281</volume><fpage>1194</fpage><lpage>1197</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.281.5380.1194</pubid><pubid idtype="pmpid" link="fulltext">9712584</pubid></pubidlist></xrefbib></bibl><bibl id="B25"><title><p>Rapid and cost-effective polymorphism identification and genotyping using restriction site associated DNA (RAD) markers.</p></title><aug><au><snm>Miller</snm><fnm>MR</fnm></au><au><snm>Dunham</snm><fnm>JP</fnm></au><au><snm>Amores</snm><fnm>A</fnm></au><au><snm>Cresko</snm><fnm>WA</fnm></au><au><snm>Johnson</snm><fnm>EA</fnm></au></aug><source>Genome Res</source><pubdate>2007</pubdate><volume>17</volume><fpage>240</fpage><lpage>248</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gr.5681207</pubid><pubid idtype="pmcid">1781356</pubid><pubid idtype="pmpid">17189378</pubid></pubidlist></xrefbib></bibl><bibl id="B26"><title><p>Large-scale identification of single-feature polymorphisms in complex genomes.</p></title><aug><au><snm>Borevitz</snm><fnm>JO</fnm></au><au><snm>Liang</snm><fnm>D</fnm></au><au><snm>Plouffe</snm><fnm>D</fnm></au><au><snm>Chang</snm><fnm>HS</fnm></au><au><snm>Zhu</snm><fnm>T</fnm></au><au><snm>Weigel</snm><fnm>D</fnm></au><au><snm>Berry</snm><fnm>CC</fnm></au><au><snm>Winzeler</snm><fnm>E</fnm></au><au><snm>Chory</snm><fnm>J</fnm></au></aug><source>Genome Res</source><pubdate>2003</pubdate><volume>13</volume><fpage>513</fpage><lpage>523</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gr.541303</pubid><pubid 
idtype="pmcid">430246</pubid><pubid idtype="pmpid">12618383</pubid></pubidlist></xrefbib></bibl><bibl id="B27"><title><p>Genetic diversity in yeast assessed with whole-genome oligonucleotide arrays.</p></title><aug><au><snm>Winzeler</snm><fnm>EA</fnm></au><au><snm>Castillo-Davis</snm><fnm>CI</fnm></au><au><snm>Oshiro</snm><fnm>G</fnm></au><au><snm>Liang</snm><fnm>D</fnm></au><au><snm>Richards</snm><fnm>DR</fnm></au><au><snm>Zhou</snm><fnm>Y</fnm></au><au><snm>Hartl</snm><fnm>DL</fnm></au></aug><source>Genetics</source><pubdate>2003</pubdate><volume>163</volume><fpage>79</fpage><lpage>89</lpage><xrefbib><pubidlist><pubid idtype="pmcid">1462430</pubid><pubid idtype="pmpid">12586698</pubid></pubidlist></xrefbib></bibl><bibl id="B28"><title><p>RAD marker microarrays enable rapid mapping of zebrafish mutations.</p></title><aug><au><snm>Miller</snm><fnm>MR</fnm></au><au><snm>Atwood</snm><fnm>TS</fnm></au><au><snm>Eames</snm><fnm>BF</fnm></au><au><snm>Eberhart</snm><fnm>JK</fnm></au><au><snm>Yan</snm><fnm>YL</fnm></au><au><snm>Postlethwait</snm><fnm>JH</fnm></au><au><snm>Johnson</snm><fnm>E</fnm></au></aug><source>Genome Biol</source><pubdate>2007</pubdate><volume>8</volume><fpage>R105</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/gb-2007-8-6-r105</pubid><pubid idtype="pmcid">2394753</pubid><pubid idtype="pmpid">17553171</pubid></pubidlist></xrefbib></bibl><bibl id="B29"><title><p>High-density detection of restriction-site-associated DNA markers for rapid mapping of mutated loci in <it>Neurospora</it>.</p></title><aug><au><snm>Lewis</snm><fnm>ZA</fnm></au><au><snm>Shiver</snm><fnm>AL</fnm></au><au><snm>Stiffler</snm><fnm>N</fnm></au><au><snm>Miller</snm><fnm>MR</fnm></au><au><snm>Johnson</snm><fnm>EA</fnm></au><au><snm>Selker</snm><fnm>EU</fnm></au></aug><source>Genetics</source><pubdate>2007</pubdate><volume>177</volume><fpage>1163</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1534/genetics.107.078147</pubid><pubid idtype="pmcid">2034621</pubid><pubid 
idtype="pmpid">17660537</pubid></pubidlist></xrefbib></bibl><bibl id="B30"><title><p>Rapid SNP discovery and genetic mapping using sequenced RAD markers.</p></title><aug><au><snm>Baird</snm><fnm>NA</fnm></au><au><snm>Etter</snm><fnm>PD</fnm></au><au><snm>Atwood</snm><fnm>TS</fnm></au><au><snm>Currey</snm><fnm>MC</fnm></au><au><snm>Shiver</snm><fnm>AL</fnm></au><au><snm>Lewis</snm><fnm>ZA</fnm></au><au><snm>Selker</snm><fnm>EU</fnm></au><au><snm>Cresko</snm><fnm>WA</fnm></au><au><snm>Johnson</snm><fnm>EA</fnm></au></aug><source>PLoS ONE</source><pubdate>2008</pubdate><volume>3</volume><fpage>e3376</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.pone.0003376</pubid><pubid idtype="pmcid">2557064</pubid><pubid idtype="pmpid">18852878</pubid></pubidlist></xrefbib></bibl><bibl id="B31"><title><p>Genome-wide investigation of reproductive isolation in experimental lineages and natural species of <it>Neurospora </it>: identifying candidate regions by microarray-based genotyping and mapping.</p></title><aug><au><snm>Dettman</snm><fnm>JR</fnm></au><au><snm>Anderson</snm><fnm>JB</fnm></au><au><snm>Kohn</snm><fnm>LM</fnm></au></aug><source>Evolution</source><pubdate>2009</pubdate><volume>64</volume><fpage>694</fpage><lpage>709</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1111/j.1558-5646.2009.00863.x</pubid><pubid idtype="pmpid" link="fulltext">19817850</pubid></pubidlist></xrefbib></bibl><bibl id="B32"><title><p>Genomic islands of speciation in <it>Anopheles gambiae</it>.</p></title><aug><au><snm>Turner</snm><fnm>TL</fnm></au><au><snm>Hahn</snm><fnm>MW</fnm></au><au><snm>Nuzhdin</snm><fnm>SV</fnm></au></aug><source>PLoS Biol</source><pubdate>2005</pubdate><volume>3</volume><fpage>e285</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.pbio.0030285</pubid><pubid idtype="pmcid">1182689</pubid><pubid idtype="pmpid">16076241</pubid></pubidlist></xrefbib></bibl><bibl id="B33"><title><p>Construction of a high-resolution genetic linkage map and 
comparative genome analysis for the reef-building coral <it>Acropora millepora</it>.</p></title><aug><au><snm>Wang</snm><fnm>S</fnm></au><au><snm>Zhang</snm><fnm>L</fnm></au><au><snm>Meyer</snm><fnm>E</fnm></au><au><snm>Matz</snm><fnm>M</fnm></au></aug><source>Genome Biol</source><pubdate>2009</pubdate><volume>10</volume><fpage>R126</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/gb-2009-10-11-r126</pubid><pubid idtype="pmpid" link="fulltext">19900279</pubid></pubidlist></xrefbib></bibl><bibl id="B34"><title><p>Rapidly developing functional genomics in ecological model systems via 454 transcriptome sequencing.</p></title><aug><au><snm>Wheat</snm><fnm>CW</fnm></au></aug><source>Genetica</source><pubdate>2010</pubdate><volume>138</volume><fpage>433</fpage><lpage>451</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1007/s10709-008-9326-y</pubid><pubid idtype="pmpid" link="fulltext">18931921</pubid></pubidlist></xrefbib></bibl><bibl id="B35"><title><p>Rapid transcriptome characterization for a nonmodel organism using 454 pyrosequencing.</p></title><aug><au><snm>Vera</snm><fnm>JC</fnm></au><au><snm>Wheat</snm><fnm>CW</fnm></au><au><snm>Fescemyer</snm><fnm>HW</fnm></au><au><snm>Frilander</snm><fnm>MJ</fnm></au><au><snm>Crawford</snm><fnm>DL</fnm></au><au><snm>Hanski</snm><fnm>I</fnm></au><au><snm>Marden</snm><fnm>JH</fnm></au></aug><source>Mol Ecol</source><pubdate>2008</pubdate><volume>17</volume><fpage>1636</fpage><lpage>1647</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1111/j.1365-294X.2008.03666.x</pubid><pubid idtype="pmpid" link="fulltext">18266620</pubid></pubidlist></xrefbib></bibl><bibl id="B36"><title><p>Parallel tagged sequencing on the 454 platform.</p></title><aug><au><snm>Meyer</snm><fnm>M</fnm></au><au><snm>Stenzel</snm><fnm>U</fnm></au><au><snm>Hofreiter</snm><fnm>M</fnm></au></aug><source>Nat Protoc</source><pubdate>2008</pubdate><volume>3</volume><fpage>267</fpage><lpage>278</lpage><xrefbib><pubidlist><pubid 
idtype="doi">10.1038/nprot.2007.520</pubid><pubid idtype="pmpid" link="fulltext">18274529</pubid></pubidlist></xrefbib></bibl><bibl id="B37"><title><p>Whole-genome sequencing in a patient with Charcot-Marie-Tooth neuropathy.</p></title><aug><au><snm>Lupski</snm><fnm>JR</fnm></au><au><snm>Reid</snm><fnm>JG</fnm></au><au><snm>Gonzaga-Jauregui</snm><fnm>C</fnm></au><au><snm>Rio Deiros</snm><fnm>D</fnm></au><au><snm>Chen</snm><fnm>DCY</fnm></au><au><snm>Nazareth</snm><fnm>L</fnm></au><au><snm>Bainbridge</snm><fnm>M</fnm></au><au><snm>Dinh</snm><fnm>H</fnm></au><au><snm>Jing</snm><fnm>C</fnm></au><au><snm>Wheeler</snm><fnm>DA</fnm></au></aug><source>New Engl J Med</source><pubdate>2010</pubdate><volume>362</volume><fpage>1181</fpage><lpage>1191</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1056/NEJMoa0908094</pubid><pubid idtype="pmpid" link="fulltext">20220177</pubid></pubidlist></xrefbib></bibl><bibl id="B38"><title><p>Analysis of genetic inheritance in a family quartet by whole-genome sequencing.</p></title><aug><au><snm>Roach</snm><fnm>JC</fnm></au><au><snm>Glusman</snm><fnm>G</fnm></au><au><snm>Smit</snm><fnm>AFA</fnm></au><au><snm>Huff</snm><fnm>CD</fnm></au><au><snm>Hubley</snm><fnm>R</fnm></au><au><snm>Shannon</snm><fnm>PT</fnm></au><au><snm>Rowen</snm><fnm>L</fnm></au><au><snm>Pant</snm><fnm>KP</fnm></au><au><snm>Goodman</snm><fnm>N</fnm></au><au><snm>Bamshad</snm><fnm>M</fnm></au><au><snm>Shendure</snm><fnm>J</fnm></au><au><snm>Drmanac</snm><fnm>R</fnm></au><au><snm>Jorde</snm><fnm>LB</fnm></au><au><snm>Hood</snm><fnm>L</fnm></au><au><snm>Galas</snm><fnm>DJ</fnm></au></aug><source>Science</source><pubdate>2010</pubdate><inpress/><xrefbib><pubid idtype="pmpid" link="fulltext">20220176</pubid></xrefbib></bibl><bibl id="B39"><title><p>The impact of next-generation sequencing technology on genetics.</p></title><aug><au><snm>Mardis</snm><fnm>ER</fnm></au></aug><source>Trends 
Genet</source><pubdate>2008</pubdate><volume>24</volume><fpage>133</fpage><lpage>141</lpage><xrefbib><pubid idtype="pmpid" link="fulltext">18262675</pubid></xrefbib></bibl><bibl id="B40"><aug><au><snm>Schultz</snm><fnm>H</fnm></au></aug><source>Sea-urchins, a Guide to Worldwide Shallow Water Species</source><publisher>Germany: Heinke &amp; Peter Schultz Partner Scientific Publications</publisher><pubdate>2005</pubdate></bibl><bibl id="B41"><title><p>Length of pelagic period in echinoderms with feeding larvae from the Northeast Pacific.</p></title><aug><au><snm>Strathmann</snm><fnm>R</fnm></au></aug><source>J Exp Marine Biol Ecol</source><pubdate>1978</pubdate><volume>34</volume><fpage>23</fpage><lpage>28</lpage><xrefbib><pubid idtype="doi">10.1016/0022-0981(78)90054-0</pubid></xrefbib></bibl><bibl id="B42"><title><p>Phylum Echinodermata, Class Echinoidea.</p></title><aug><au><snm>Strathmann</snm><fnm>MF</fnm></au></aug><source>Reproduction and Development of Marine Invertebrates of the Northern Pacific Coast: Data and Methods for the Study of Eggs, Embryos, and Larvae</source><publisher>Seattle, WA: University of Washington Press</publisher><pubdate>1987</pubdate><fpage>511</fpage><lpage>534</lpage></bibl><bibl id="B43"><title><p>Mitochondrial DNA diversity in the sea urchins <it>Strongylocentrotus purpuratus </it>and <it>S. 
droebachiensis</it>.</p></title><aug><au><snm>Palumbi</snm><fnm>SR</fnm></au><au><snm>Wilson</snm><fnm>AC</fnm></au></aug><source>Evolution</source><pubdate>1990</pubdate><volume>44</volume><fpage>403</fpage><lpage>415</lpage><xrefbib><pubid idtype="doi">10.2307/2409417</pubid></xrefbib></bibl><bibl id="B44"><title><p>Allozyme and mitochondrial DNA evidence of population subdivision in the purple sea urchin <it>Strongylocentrotus purpuratus</it>.</p></title><aug><au><snm>Edmands</snm><fnm>S</fnm></au><au><snm>Moberg</snm><fnm>PE</fnm></au><au><snm>Burton</snm><fnm>RS</fnm></au></aug><source>Marine Biol</source><pubdate>1996</pubdate><volume>126</volume><fpage>443</fpage><lpage>450</lpage><xrefbib><pubid idtype="doi">10.1007/BF00354626</pubid></xrefbib></bibl><bibl id="B45"><title><p>Latitudinal variation in size structure of the west coast purple sea urchin: a correlation with headlands.</p></title><aug><au><snm>Ebert</snm><fnm>TA</fnm></au><au><snm>Russell</snm><fnm>MP</fnm></au></aug><source>Limnol Oceanography</source><pubdate>1988</pubdate><volume>33</volume><fpage>286</fpage><lpage>294</lpage><xrefbib><pubid idtype="doi">10.4319/lo.1988.33.2.0286</pubid></xrefbib></bibl><bibl id="B46"><title><p>The genome of the sea urchin <it>Strongylocentrotus purpuratus</it>.</p></title><aug><au><cnm>Sea Urchin Genome Sequencing 
Consortium</cnm></au><au><snm>Sodergren</snm><fnm>E</fnm></au><au><snm>Weinstock</snm><fnm>GM</fnm></au><au><snm>Davidson</snm><fnm>EH</fnm></au><au><snm>Cameron</snm><fnm>RA</fnm></au><au><snm>Gibbs</snm><fnm>RA</fnm></au><au><snm>Angerer</snm><fnm>RC</fnm></au><au><snm>Angerer</snm><fnm>LM</fnm></au><au><snm>Arnone</snm><fnm>MI</fnm></au><au><snm>Burgess</snm><fnm>DR</fnm></au><au><snm>Burke</snm><fnm>RD</fnm></au><au><snm>Coffman</snm><fnm>JA</fnm></au><au><snm>Dean</snm><fnm>M</fnm></au><au><snm>Elphick</snm><fnm>MR</fnm></au><au><snm>Ettensohn</snm><fnm>CA</fnm></au><au><snm>Foltz</snm><fnm>KR</fnm></au><au><snm>Hamdoun</snm><fnm>A</fnm></au><au><snm>Hynes</snm><fnm>RO</fnm></au><au><snm>Klein</snm><fnm>WH</fnm></au><au><snm>Marzluff</snm><fnm>W</fnm></au><au><snm>McClay</snm><fnm>DR</fnm></au><au><snm>Morris</snm><fnm>RL</fnm></au><au><snm>Mushegian</snm><fnm>A</fnm></au><au><snm>Rast</snm><fnm>JP</fnm></au><au><snm>Smith</snm><fnm>LC</fnm></au><au><snm>Thorndyke</snm><fnm>MC</fnm></au><au><snm>Vacquier</snm><fnm>VD</fnm></au><au><snm>Wessel</snm><fnm>GM</fnm></au><au><snm>Wray</snm><fnm>G</fnm></au><au><snm>Zhang</snm><fnm>L</fnm></au><etal/></aug><source>Science</source><pubdate>2006</pubdate><volume>314</volume><fpage>941</fpage><lpage>952</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1133609</pubid><pubid idtype="pmpid" link="fulltext">17095691</pubid></pubidlist></xrefbib></bibl><bibl id="B47"><title><p>Ecological role of purple sea urchins.</p></title><aug><au><snm>Pearse</snm><fnm>JS</fnm></au></aug><source>Science</source><pubdate>2006</pubdate><volume>314</volume><fpage>940</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1131888</pubid><pubid idtype="pmpid" link="fulltext">17095690</pubid></pubidlist></xrefbib></bibl><bibl id="B48"><title><p>The ecology of <it>Strongylocentrotus franciscanus </it>and <it>Strongylocentrotus purpuratus</it>.</p></title><aug><au><snm>Rogers-Bennett</snm><fnm>L</fnm></au></aug><source>Edible Sea 
Urchins Biol Ecol</source><pubdate>2007</pubdate><fpage>393</fpage><lpage>425</lpage></bibl><bibl id="B49"><title><p>The single-copy DNA sequence polymorphism of the sea urchin <it>Strongylocentrotus purpuratus</it>.</p></title><aug><au><snm>Britten</snm><fnm>RJ</fnm></au><au><snm>Cetta</snm><fnm>A</fnm></au><au><snm>Davidson</snm><fnm>EH</fnm></au></aug><source>Cell</source><pubdate>1978</pubdate><volume>15</volume><fpage>1175</fpage><lpage>1186</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/0092-8674(78)90044-2</pubid><pubid idtype="pmpid" link="fulltext">728997</pubid></pubidlist></xrefbib></bibl><bibl id="B50"><title><p>Enhanced model-based clustering, density estimation, and discriminant analysis software: MCLUST.</p></title><aug><au><snm>Fraley</snm><fnm>C</fnm></au><au><snm>Raftery</snm><fnm>AE</fnm></au></aug><source>J Classification</source><pubdate>2003</pubdate><volume>20</volume><fpage>263</fpage><lpage>286</lpage><xrefbib><pubid idtype="doi">10.1007/s00357-003-0015-3</pubid></xrefbib></bibl><bibl id="B51"><title><p>A comparative study of asymmetric migration events across a marine biogeographic boundary.</p></title><aug><au><snm>Wares</snm><fnm>JP</fnm></au><au><snm>Gaines</snm><fnm>SD</fnm></au><au><snm>Cunningham</snm><fnm>CW</fnm></au></aug><source>Evolution</source><pubdate>2001</pubdate><volume>55</volume><fpage>295</fpage><lpage>306</lpage><xrefbib><pubid idtype="pmpid">11308087</pubid></xrefbib></bibl><bibl id="B52"><title><p>Isolation by distance.</p></title><aug><au><snm>Wright</snm><fnm>S</fnm></au></aug><source>Genetics</source><pubdate>1943</pubdate><volume>28</volume><fpage>114</fpage><lpage>138</lpage><xrefbib><pubidlist><pubid idtype="pmcid">1209196</pubid><pubid idtype="pmpid">17247074</pubid></pubidlist></xrefbib></bibl><bibl id="B53"><title><p>GENEPOP: population genetics software for exact tests and ecumenicism.</p></title><aug><au><snm>Raymond</snm><fnm>M</fnm></au><au><snm>Rousset</snm><fnm>F</fnm></au></aug><source>J 
Heredity</source><pubdate>1995</pubdate><volume>86</volume><fpage>248</fpage><lpage>249</lpage></bibl><bibl id="B54"><title><p>LOSITAN: A workbench to detect molecular adaptation based on a Fst-outlier method.</p></title><aug><au><snm>Antao</snm><fnm>T</fnm></au><au><snm>Lopes</snm><fnm>A</fnm></au><au><snm>Lopes</snm><fnm>RJ</fnm></au><au><snm>Beja-Pereira</snm><fnm>A</fnm></au><au><snm>Luikart</snm><fnm>G</fnm></au></aug><source>BMC Bioinformatics</source><pubdate>2008</pubdate><volume>9</volume><fpage>323</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2105-9-323</pubid><pubid idtype="pmcid">2515854</pubid><pubid idtype="pmpid">18662398</pubid></pubidlist></xrefbib></bibl><bibl id="B55"><title><p>Identifying adaptive genetic divergence among populations from genome scans.</p></title><aug><au><snm>Beaumont</snm><fnm>MA</fnm></au><au><snm>Balding</snm><fnm>DJ</fnm></au></aug><source>Mol Ecol</source><pubdate>2004</pubdate><volume>13</volume><fpage>969</fpage><lpage>980</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1111/j.1365-294X.2004.02125.x</pubid><pubid idtype="pmpid" link="fulltext">15012769</pubid></pubidlist></xrefbib></bibl><bibl id="B56"><title><p>A genome-scan method to identify selected loci appropriate for both dominant and codominant markers: a Bayesian perspective.</p></title><aug><au><snm>Foll</snm><fnm>M</fnm></au><au><snm>Gaggiotti</snm><fnm>O</fnm></au></aug><source>Genetics</source><pubdate>2008</pubdate><volume>180</volume><fpage>977</fpage><lpage>993</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1534/genetics.108.092221</pubid><pubid idtype="pmcid">2567396</pubid><pubid idtype="pmpid">18780740</pubid></pubidlist></xrefbib></bibl><bibl id="B57"><title><p>The mannose receptor family.</p></title><aug><au><snm>East</snm><fnm>L</fnm></au><au><snm>Isacke</snm><fnm>CM</fnm></au></aug><source>BBA-General Subjects</source><pubdate>2002</pubdate><volume>1572</volume><fpage>364</fpage><lpage>386</lpage><xrefbib><pubidlist><pubid 
idtype="doi">10.1016/S0304-4165(02)00319-7</pubid><pubid idtype="pmpid" link="fulltext">12223280</pubid></pubidlist></xrefbib></bibl><bibl id="B58"><title><p>Polymorphisms in CD14, mannose-binding lectin, and Toll-like receptor-2 are associated with increased prevalence of infection in critically ill adults*.</p></title><aug><au><snm>Sutherland</snm><fnm>AM</fnm></au><au><snm>Walley</snm><fnm>KR</fnm></au><au><snm>Russell</snm><fnm>JA</fnm></au></aug><source>Crit Care Med</source><pubdate>2005</pubdate><volume>33</volume><fpage>638</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1097/01.CCM.0000156242.44356.C5</pubid><pubid idtype="pmpid" link="fulltext">15753758</pubid></pubidlist></xrefbib></bibl><bibl id="B59"><title><p>Megalin and cubilin: multifunctional endocytic receptors.</p></title><aug><au><snm>Christensen</snm><fnm>EI</fnm></au><au><snm>Birn</snm><fnm>H</fnm></au></aug><source>Nat Rev Mol Cell Biol</source><pubdate>2002</pubdate><volume>3</volume><fpage>258</fpage><lpage>268</lpage><xrefbib><pubid idtype="doi">10.1038/nrm778</pubid></xrefbib></bibl><bibl id="B60"><title><p>Golgi matrix protein gene, Golga3/Mea2, rearranged and re-expressed in pachytene spermatocytes restores spermatogenesis in the mouse.</p></title><aug><au><snm>Banu</snm><fnm>Y</fnm></au><au><snm>Matsuda</snm><fnm>M</fnm></au><au><snm>Yoshihara</snm><fnm>M</fnm></au><au><snm>Kondo</snm><fnm>M</fnm></au><au><snm>Sutou</snm><fnm>S</fnm></au><au><snm>Matsukuma</snm><fnm>S</fnm></au></aug><source>Mol Reprod Dev</source><pubdate>2002</pubdate><volume>61</volume><fpage>288</fpage><lpage>301</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/mrd.10035</pubid><pubid idtype="pmpid" link="fulltext">11835574</pubid></pubidlist></xrefbib></bibl><bibl id="B61"><title><p>Natural selection drives <it>Drosophila </it>immune system 
evolution.</p></title><aug><au><snm>Schlenke</snm><fnm>TA</fnm></au><au><snm>Begun</snm><fnm>DJ</fnm></au></aug><source>Genetics</source><pubdate>2003</pubdate><volume>164</volume><fpage>1471</fpage><xrefbib><pubidlist><pubid idtype="pmcid">1462669</pubid><pubid idtype="pmpid">12930753</pubid></pubidlist></xrefbib></bibl><bibl id="B62"><title><p>The evolution of transcriptional regulation in eukaryotes.</p></title><aug><au><snm>Wray</snm><fnm>GA</fnm></au><au><snm>Hahn</snm><fnm>MW</fnm></au><au><snm>Abouheif</snm><fnm>E</fnm></au><au><snm>Balhoff</snm><fnm>JP</fnm></au><au><snm>Pizer</snm><fnm>M</fnm></au><au><snm>Rockman</snm><fnm>MV</fnm></au><au><snm>Romano</snm><fnm>LA</fnm></au></aug><source>Mol Biol Evol</source><pubdate>2003</pubdate><volume>20</volume><fpage>1377</fpage><lpage>1419</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/molbev/msg140</pubid><pubid idtype="pmpid" link="fulltext">12777501</pubid></pubidlist></xrefbib></bibl><bibl id="B63"><title><p>Positive selection in the carbohydrate recognition domains of sea urchin sperm receptor for egg jelly (suREJ) proteins.</p></title><aug><au><snm>Mah</snm><fnm>SA</fnm></au><au><snm>Swanson</snm><fnm>WJ</fnm></au><au><snm>Vacquier</snm><fnm>VD</fnm></au></aug><source>Mol Biol Evol</source><pubdate>2005</pubdate><volume>22</volume><fpage>533</fpage><lpage>541</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/molbev/msi037</pubid><pubid idtype="pmpid" link="fulltext">15525699</pubid></pubidlist></xrefbib></bibl><bibl id="B64"><title><p>Invertebrate immune systems-not homogeneous, not simple, not well understood.</p></title><aug><au><snm>Loker</snm><fnm>ES</fnm></au><au><snm>Adema</snm><fnm>CM</fnm></au><au><snm>Zhang</snm><fnm>SM</fnm></au><au><snm>Kepler</snm><fnm>TB</fnm></au></aug><source>Immunol Rev</source><pubdate>2004</pubdate><volume>198</volume><fpage>10</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1111/j.0105-2896.2004.0117.x</pubid><pubid idtype="pmpid" 
link="fulltext">15199951</pubid></pubidlist></xrefbib></bibl><bibl id="B65"><title><p>Genomic insights into the immune system of the sea urchin.</p></title><aug><au><snm>Rast</snm><fnm>JP</fnm></au><au><snm>Smith</snm><fnm>LC</fnm></au><au><snm>Loza-Coll</snm><fnm>M</fnm></au><au><snm>Hibino</snm><fnm>T</fnm></au><au><snm>Litman</snm><fnm>GW</fnm></au></aug><source>Science</source><pubdate>2006</pubdate><volume>314</volume><fpage>952</fpage><lpage>956</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1134301</pubid><pubid idtype="pmpid" link="fulltext">17095692</pubid></pubidlist></xrefbib></bibl><bibl id="B66"><title><p>Cytochrome P450 enzymes belonging to the CYP4 family from marine invertebrates.</p></title><aug><au><snm>Snyder</snm><fnm>MJ</fnm></au></aug><source>Biochem Biophys Res Commun</source><pubdate>1998</pubdate><volume>249</volume><fpage>187</fpage><lpage>190</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1006/bbrc.1998.9104</pubid><pubid idtype="pmpid" link="fulltext">9705854</pubid></pubidlist></xrefbib></bibl><bibl id="B67"><title><p>The chemical defensome: environmental sensing and response genes in the <it>Strongylocentrotus purpuratus </it>genome.</p></title><aug><au><snm>Goldstone</snm><fnm>JV</fnm></au><au><snm>Hamdoun</snm><fnm>A</fnm></au><au><snm>Cole</snm><fnm>BJ</fnm></au><au><snm>Howard-Ashby</snm><fnm>M</fnm></au><au><snm>Nebert</snm><fnm>DW</fnm></au><au><snm>Scally</snm><fnm>M</fnm></au><au><snm>Dean</snm><fnm>M</fnm></au><au><snm>Epel</snm><fnm>D</fnm></au><au><snm>Hahn</snm><fnm>ME</fnm></au><au><snm>Stegeman</snm><fnm>JJ</fnm></au></aug><source>Dev Biol</source><pubdate>2006</pubdate><volume>300</volume><fpage>366</fpage><lpage>384</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.ydbio.2006.08.066</pubid><pubid idtype="pmpid" link="fulltext">17097629</pubid></pubidlist></xrefbib></bibl><bibl id="B68"><title><p>Gamma-aminobutyric acid, a neurotransmitter, induces planktonic abalone larvae to settle and begin 
metamorphosis.</p></title><aug><au><snm>Morse</snm><fnm>DE</fnm></au><au><snm>Hooker</snm><fnm>N</fnm></au><au><snm>Duncan</snm><fnm>H</fnm></au><au><snm>Jensen</snm><fnm>L</fnm></au></aug><source>Science</source><pubdate>1979</pubdate><volume>204</volume><fpage>407</fpage><lpage>410</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.204.4391.407</pubid><pubid idtype="pmpid" link="fulltext">17758015</pubid></pubidlist></xrefbib></bibl><bibl id="B69"><title><p>The complete genome of an individual by massively parallel DNA sequencing.</p></title><aug><au><snm>Wheeler</snm><fnm>DA</fnm></au><au><snm>Srinivasan</snm><fnm>M</fnm></au><au><snm>Egholm</snm><fnm>M</fnm></au><au><snm>Shen</snm><fnm>Y</fnm></au><au><snm>Chen</snm><fnm>L</fnm></au><au><snm>McGuire</snm><fnm>A</fnm></au><au><snm>He</snm><fnm>W</fnm></au><au><snm>Chen</snm><fnm>YJ</fnm></au><au><snm>Makhijani</snm><fnm>V</fnm></au><au><snm>Roth</snm><fnm>GT</fnm></au></aug><source>Nature</source><pubdate>2008</pubdate><volume>452</volume><fpage>872</fpage><lpage>876</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature06884</pubid><pubid idtype="pmpid" link="fulltext">18421352</pubid></pubidlist></xrefbib></bibl><bibl id="B70"><title><p>Population structure of purple sea urchin <it>Strongylocentrotus purpuratus </it>along the Baja California peninsula.</p></title><aug><au><snm>Olivares-Banuelos</snm><fnm>NC</fnm></au><au><snm>Enriquez-Paredes</snm><fnm>LM</fnm></au><au><snm>Ladah</snm><fnm>LB</fnm></au><au><snm>De La Rosa-Velez</snm><fnm>J</fnm></au></aug><source>Fisheries Sci</source><pubdate>2008</pubdate><volume>74</volume><fpage>804</fpage><lpage>812</lpage><xrefbib><pubid idtype="doi">10.1111/j.1444-2906.2008.01592.x</pubid></xrefbib></bibl><bibl id="B71"><title><p>The recruitment sweepstakes has many winners: genetic evidence from the sea urchin <it>Strongylocentrotus 
purpuratus</it>.</p></title><aug><au><snm>Flowers</snm><fnm>JM</fnm></au><au><snm>Schroeter</snm><fnm>SC</fnm></au><au><snm>Burton</snm><fnm>RS</fnm></au></aug><source>Evolution</source><pubdate>2002</pubdate><volume>56</volume><fpage>1445</fpage><lpage>1453</lpage><xrefbib><pubid idtype="pmpid">12206244</pubid></xrefbib></bibl><bibl id="B72"><title><p>Adaptive maintenance of genetic polymorphism in an intertidal barnacle: habitat-and life-stage-specific survivorship of MPI genotypes.</p></title><aug><au><snm>Schmidt</snm><fnm>PS</fnm></au><au><snm>Rand</snm><fnm>DM</fnm></au></aug><source>Evolution</source><pubdate>2001</pubdate><volume>55</volume><fpage>1336</fpage><lpage>1344</lpage><xrefbib><pubid idtype="pmpid">11525458</pubid></xrefbib></bibl><bibl id="B73"><title><p>Maintenance of an aminopeptidase allele frequency cline by natural selection.</p></title><aug><au><snm>Koehn</snm><fnm>RK</fnm></au><au><snm>Newell</snm><fnm>RIE</fnm></au><au><snm>Immermann</snm><fnm>F</fnm></au></aug><source>Proc Natl Acad Sci USA</source><pubdate>1980</pubdate><volume>77</volume><fpage>5385</fpage><lpage>5389</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1073/pnas.77.9.5385</pubid><pubid idtype="pmcid">350063</pubid><pubid idtype="pmpid">6933563</pubid></pubidlist></xrefbib></bibl><bibl id="B74"><title><p>Disentangling the effects of evolutionary, demographic, and environmental factors influencing genetic structure of natural populations: Atlantic herring as a case study.</p></title><aug><au><snm>Gaggiotti</snm><fnm>OE</fnm></au><au><snm>Bekkevold</snm><fnm>D</fnm></au><au><snm>Jorgensen</snm><fnm>HBH</fnm></au><au><snm>Foll</snm><fnm>M</fnm></au><au><snm>Carvalho</snm><fnm>GR</fnm></au><au><snm>Andre</snm><fnm>C</fnm></au><au><snm>Ruzzante</snm><fnm>DE</fnm></au></aug><source>Evolution</source><pubdate>2009</pubdate><volume>63</volume><fpage>2939</fpage><lpage>2951</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1111/j.1558-5646.2009.00779.x</pubid><pubid idtype="pmpid" 
link="fulltext">19624724</pubid></pubidlist></xrefbib></bibl><bibl id="B75"><aug><au><cnm>R Development Core Team</cnm></au></aug><source>R: A Language and Environment for Statistical Computing</source><publisher>Vienna, Austria: R Foundation for Statistical Computing</publisher><pubdate>2009</pubdate></bibl><bibl id="B76"><title><p>Primer3 on the WWW for general users and for biologist programmers.</p></title><aug><au><snm>Rozen</snm><fnm>S</fnm></au><au><snm>Skaletsky</snm><fnm>H</fnm></au></aug><source>Methods Mol Biol</source><pubdate>2000</pubdate><volume>132</volume><fpage>365</fpage><lpage>386</lpage><xrefbib><pubid idtype="pmpid">10547847</pubid></xrefbib></bibl><bibl id="B77"><title><p>Genboree.</p></title><url>http://www.genboree.org/java-bin/PurpleUrchin/index.jsp?isPublic=Yes</url></bibl><bibl id="B78"><title><p>Controlling the false discovery rate: a practical and powerful approach to multiple testing.</p></title><aug><au><snm>Benjamini</snm><fnm>Y</fnm></au><au><snm>Hochberg</snm><fnm>Y</fnm></au></aug><source>J Roy Stat Soc Series B (Methodological)</source><pubdate>1995</pubdate><volume>57</volume><fpage>289</fpage><lpage>300</lpage></bibl><bibl id="B79"><title><p>Combining independent tests of significance.</p></title><aug><au><snm>Fisher</snm><fnm>RA</fnm></au></aug><source>Am Stat</source><pubdate>1948</pubdate><volume>2</volume><fpage>30</fpage><xrefbib><pubid idtype="doi">10.2307/2681650</pubid></xrefbib></bibl></refgrp>
   </bm>
</art>