<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>gb-2010-11-4-r36</ui>
   <ji>GBJ</ji>
   <fm>
      <dochead>Method</dochead>
      <bibl>
         <title>
            <p>Optimized design and data analysis of tag-based cytosine methylation assays</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Suzuki</snm>
               <fnm>Masako</fnm>
               <insr iid="I1"/>
               <email>masako.suzuki@einstein.yu.edu</email>
            </au>
            <au id="A2">
               <snm>Jing</snm>
               <fnm>Qiang</fnm>
               <insr iid="I1"/>
               <email>qiang.jing@einstein.yu.edu</email>
            </au>
            <au id="A3">
               <snm>Lia</snm>
               <fnm>Daniel</fnm>
               <insr iid="I1"/>
               <email>DL1297@nyu.edu</email>
            </au>
            <au id="A4">
               <snm>Pascual</snm>
               <fnm>Mari&#233;n</fnm>
               <insr iid="I1"/>
               <email>marienpdp@gmail.com</email>
            </au>
            <au id="A5">
               <snm>McLellan</snm>
               <fnm>Andrew</fnm>
               <insr iid="I1"/>
               <email>andrew.mclellan@einstein.yu.edu</email>
            </au>
            <au ca="yes" id="A6">
               <snm>Greally</snm>
               <mi>M</mi>
               <fnm>John</fnm>
               <insr iid="I1"/>
               <email>john.greally@einstein.yu.edu</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Genetics (Computational Genetics), Center for Epigenomics, Albert Einstein College of Medicine, 1301 Morris Park Avenue, Bronx, NY 10461, USA</p>
            </ins>
         </insg>
         <source>Genome Biology</source>
         <issn>1465-6906</issn>
         <pubdate>2010</pubdate>
         <volume>11</volume>
         <issue>4</issue>
         <fpage>R36</fpage>
         <url>http://genomebiology.com/2010/11/4/R36</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="doi">10.1186/gb-2010-11-4-r36</pubid>
               <pubid idtype="pmpid">20359321</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>8</day>
               <month>1</month>
               <year>2010</year>
            </date>
         </rec>
         <revrec>
            <date>
               <day>16</day>
               <month>3</month>
               <year>2010</year>
            </date>
         </revrec>
         <acc>
            <date>
               <day>1</day>
               <month>4</month>
               <year>2010</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>1</day>
               <month>4</month>
               <year>2010</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2010</year>
         <collab>Suzuki et al.; licensee BioMed Central Ltd.</collab>
         <note>This is an open access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited</note>
      </cpyrt>
      <shorttitle>
         <p>Cytosine methylation</p>
      </shorttitle>
      <shortabs>
         <p>Genome-wide, tag-based cytosine methylation analysis is optimized.</p>
      </shortabs>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <p>Using the type III restriction-modification enzyme EcoP15I, we isolated sequences flanking sites digested by the methylation-sensitive HpaII enzyme or its methylation-insensitive MspI isoschizomer for massively parallel sequencing. A novel data transformation allows us to normalize HpaII by MspI counts, resulting in more accurate quantification of methylation at >1.8 million loci in the human genome. This HELP-tagging assay is not sensitive to sequence polymorphism or base composition and allows exploration of both CG-rich and CG-depleted genomic contexts.</p>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification id="300100010" subtype="man_spc_id" type="BMC">Genome studies</classification>
         <classification id="300100013" subtype="man_spc_id" type="BMC">Methods</classification>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Epigenetic mechanisms of transcriptional regulation are increasingly being studied for their potential influences in human disease pathogenesis. Much of this interest is based on the paradigm of neoplastic transformation, in which epigenetic changes appear to be universal, widespread throughout the genome, causative of critical transcriptional changes and predictive of disease prognosis (reviewed in <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>). Furthermore, these epigenetic changes represent potential pharmacological targets for reversal and amelioration of the disease process <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>.</p>
         <p>Of the large number of regulatory processes referred to as epigenetic, there exist numerous assays to study chromatin component distribution, cytosine methylation and microRNA expression genome-wide. The chromatin components include a large number of post-translational modifications of histones, variant histones, DNA-binding proteins and associated complexes, all tested by chromatin immunoprecipitation (ChIP) approaches coupled with microarray hybridization or massively parallel sequencing (MPS). MicroRNAs can be identified and quantified by using microarrays and MPS, while cytosine methylation can be definitively studied by converting the DNA of the genome using sodium bisulfite, shotgun sequencing the product using MPS and mapping this back to the genome to count how frequently cytosines remain unconverted, indicating their methylation in the starting material, due to the resistance of methylcytosine to bisulfite conversion compared with unmethylated cytosines. This allows nucleotide resolution, strand-specific, quantitative assessment of cytosine methylation, with such studies performed in <it>Arabidopsis </it><abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr></abbrgrp> and human cells to date <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>.</p>
         <p>While this approach represents the ideal means of testing cytosine methylation, the amount of sequencing necessary (for the human genome, over 1 billion sequences of ~75 bp each <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>) to generate quantitative information genome-wide remains prohibitive in terms of cost, limiting these studies to the few referred to above. When studying human disease, the emphasis remains on cytosine methylation assays, as it is generally easier to collect clinical samples for DNA purification than for ChIP or even RNA assays. However, the cell populations harvested are rarely of high purity, and we generally do not know the degree of change in cytosine methylation in the disease of interest and thus the quantitative discrimination required for an assay, with some studies to date indicating that the changes may be quite subtle <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>. These concerns emphasize the need for cytosine methylation assays that can detect methylation levels intermediate in value and changes in disease that are relatively modest in magnitude. Certain microarray-based assays to study cytosine methylation have addressed this issue, with the methylated DNA immunoprecipitation (meDIP) assay amenable to such quantification when used for CpG islands <abbrgrp><abbr bid="B8">8</abbr></abbrgrp> and possibly also for less CG dinucleotide-rich regions <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. Restriction enzyme-based assays used with microarrays have also proven to be reasonably quantitative, whether based on methylation-sensitive (for example, the HELP assay <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>) or methylation-dependent (for example, MethylMapper <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>) enzymes. 
A promising new MPS-based assay is reduced representation bisulfite sequencing (RRBS), which is designed to study the CG-dense regions defined by short MspI fragments, and provides nucleotide resolution, quantitative data <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>.</p>
         <p>The use of MPS for what were previously microarray-based assays has been associated with improved performance <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>, as we found when we modified our HELP (HpaII tiny fragment Enrichment by Ligation-mediated PCR) assay <abbrgrp><abbr bid="B10">10</abbr></abbrgrp> for MPS, creating an assay similar to Methyl-Seq <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. The strength of the HELP assay lies in the comparison of the HpaII representation with the methylation-insensitive MspI representation, allowing a normalization step that makes the assay semi-quantitative <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. The HELP representation approach was improved upon by Ball <it>et al. </it><abbrgrp><abbr bid="B15">15</abbr></abbrgrp>, who developed the Methyl-Sensitive Cut Counting (MSCC) assay, which involves digesting DNA with HpaII, ligating an adapter to the cohesive end formed, using a restriction enzyme site within the adapter to digest at a flanking sequence and thus capturing the sequence immediately adjacent to the HpaII site. By adding a second MPS-compatible adapter, a library can be generated for MPS, allowing the counting of reads at these sites to represent the degree of methylation at the site. The authors demonstrated the assay to be reasonably quantitative, testing over 1.3 million sites in the human genome, representing not only HpaII sites clustered in CG-dense regions of the genome (approximately 12% of all HpaII sites are located in annotated CpG islands in the human genome <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>) but also the remaining majority of the genome in which CG dinucleotides are depleted, a genomic compartment not tested by RRBS as currently designed. 
A focus on the CG-dense minority of the genome will fail to observe changes such as those at CG-depleted promoters (such as <it>OCT4 </it><abbrgrp><abbr bid="B17">17</abbr></abbrgrp>) and CpG island shores <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>, and within gene bodies where cytosine methylation has been found to be positively correlated with gene transcription <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. It is likely, therefore, that an assay system that can study both CG-dense and CG-depleted regions will acquire substantially more information about epigenomic states than those directed at the CG-dense compartment alone.</p>
         <p>In the current study, we tested whether the use of an MspI control would improve MSCC assay performance, as we had found for microarray-based HELP, and whether we could develop an analytical pipeline for routine use of this assay in epigenome-wide association studies. We also explored the use of longer tags than those employed in MSCC, and added T7 RNA polymerase and reverse transcription steps to allow the generation of libraries without contaminating products, thus obviating the need for gel extraction. The influences of base composition and fragment length as potential sources of bias were also tested, using the H1 (WA01) human embryonic stem (ES) cell line. The outcome is a modified assay that combines the strengths of MSCC and HELP-seq/Methyl-seq, together with a supporting analytical workflow that maximizes the quantitative capabilities of the data generated.</p>
      </sec>
      <sec>
         <st>
            <p>Results and discussion</p>
         </st>
         <sec>
            <st>
               <p>Library preparation and sequencing</p>
            </st>
            <p>We generated HELP-tagging libraries with HpaII- or MspI-digested DNA derived from human ES cells using the experimental approach shown in Figure <figr fid="F1">1</figr>. The assay differs from MSCC <abbrgrp><abbr bid="B15">15</abbr></abbrgrp> in two respects: it uses EcoP15I instead of MmeI, generating longer flanking sequences (27 as opposed to 18 to 19 bp), and it adds a T7 polymerase and reverse transcription step that allows the library to be generated without contaminating single-adapter products, obviating the need for gel extraction. After library preparation, a single band of 125 bp in length was generated, as expected. Libraries were sequenced using an Illumina Genome Analyzer (36-bp single-end reads) and the sequences were analyzed and aligned using Illumina pipeline software version 1.3 or 1.4. A summary of the Illumina analysis results for each replicate is shown in Table S1 in Additional file <supplr sid="S1">1</supplr>.</p>
            <suppl id="S1">
               <title>
                  <p>Additional file 1</p>
               </title>
               <text>
                  <p>Supplemental data containing two figures (Figures S1 and S2) and nine tables (Tables S1 to S9).</p>
               </text>
               <file name="gb-2010-11-4-r36-S1.PDF">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>HELP-tagging assay design and library preparation</p>
               </caption>
               <text>
                  <p><b>HELP-tagging assay design and library preparation</b>. The genomic DNA is digested by HpaII or MspI, the former only cutting at CCGG sequences where the central CG dinucleotide is unmethylated. The first Illumina adapter (AE) is ligated to the compatible cohesive end created, juxtaposing an EcoP15I site beside the HpaII/MspI digestion site and allowing EcoP15I to digest within the flanking DNA sequence as shown. An A overhang is created, allowing the ligation of the second Illumina adapter (AS, green). This will create not only AE-insert-AS products but also AS-insert-AS molecules. By performing a T7 polymerase-mediated <it>in vitro </it>transcription from a promoter sequence located on the AE adapter, we can selectively enrich for the AE-insert-AS product, following which limited PCR amplification is performed to generate a single sized product for Illumina sequencing. RT, reverse transcription.</p>
               </text>
               <graphic file="gb-2010-11-4-r36-1"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Data quality and reproducibility</p>
            </st>
            <p>Based on our experimental design, successfully generated products would be expected to possess a 5'-CGCTGCTG sequence at the 3' end of the read, the first two nucleotides (CG) representing the cohesive end for ligation of HpaII/MspI digestion products and the remaining six nucleotides representing the EcoP15I restriction enzyme recognition site. In order to evaluate the yield of desired products, we counted the number of reads containing this sequence and plotted the starting positions of this sequence within the reads obtained. We observed that approximately two-thirds of the reads contained the expected sequence, and found that in the majority of these reads it started at base position 25 or 26, consistent with the known digestion properties of the restriction enzyme <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. Removal of the approximately 30% of reads lacking the CG-EcoP15I sequence was performed to eliminate spurious sequences. In order to investigate sequence quality further, we also determined the number and relative position of Ns (ambiguous base calls) within the reads obtained. Overall, few reads were found to contain Ns, and where they were present, they were found to be evenly distributed by position within the sequence. To test data reproducibility, we compared the results of three experimental replicates against each other using the Pearson correlation coefficient. All replicates were highly correlated (all <it>r </it>values exceeded 0.9), confirming that the technical reproducibility of this assay was excellent (Table S2 in Additional file <supplr sid="S1">1</supplr>).</p>
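            <p>The read-filtering step described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names, the 1-based position convention and the toy reads are assumptions made for illustration, and only the CGCTGCTG junction sequence is taken from the text.</p>
            <p><![CDATA[
```python
from collections import Counter

# Expected junction: CG (HpaII/MspI cohesive end) followed by the EcoP15I site.
ADAPTER_JUNCTION = "CGCTGCTG"


def junction_start(read, junction=ADAPTER_JUNCTION):
    """Return the 1-based start of the first junction match, or None if absent."""
    index = read.find(junction)
    return None if index == -1 else index + 1


def filter_reads(reads):
    """Keep reads that contain the junction and tally where it starts."""
    kept = []
    start_positions = Counter()
    for read in reads:
        start = junction_start(read)
        if start is None:
            continue  # no CG-EcoP15I junction: treated as a spurious product
        kept.append(read)
        start_positions[start] += 1
    return kept, start_positions


if __name__ == "__main__":
    # Hypothetical 36-base reads, invented purely for this example.
    toy_reads = [
        "A" * 25 + ADAPTER_JUNCTION + "AAA",
        "T" * 24 + ADAPTER_JUNCTION + "TTTT",
        "G" * 36,
    ]
    kept, positions = filter_reads(toy_reads)
    print(len(kept), dict(positions))
```
]]></p>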
         </sec>
         <sec>
            <st>
               <p>Distribution of MspI/HpaII sequence tags</p>
            </st>
            <p>We merged three lanes of MspI data and observed that approximately 80% of the 2,292,198 annotated HpaII sites in the human genome (hg18) were represented by at least one read, for a total of over 1.8 million loci throughout the genome. The mean numbers of reads per locus for MspI and HpaII were 3.94 and 1.82, respectively, and MspI counts were distributed evenly across all genomic compartments examined (Table S3 in Additional file <supplr sid="S1">1</supplr>). We hypothesize that a combination of incomplete genomic coverage and polymorphisms within some CCGG sites (as we have previously observed <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>) accounts for the 20% of HpaII sites that were not represented by any reads.</p>
         </sec>
         <sec>
            <st>
               <p>Normalization of HpaII by MspI counts and data transformation</p>
            </st>
            <p>When we plot the MspI count on the x-axis and HpaII count on the y-axis for each HpaII site, we can see two major groups of values in the plot (Figure <figr fid="F2">2a</figr>), separated into loci with high or with minimal HpaII counts. This plot helped us to develop a new method for normalizing HpaII counts in terms of variability of the MspI representation. We recognize that hypomethylated loci are associated with relatively greater HpaII counts and a larger angle B (Figure <figr fid="F2">2b</figr>, left) whereas methylated loci will be defined by smaller angle values (Figure <figr fid="F2">2b</figr>, middle). Furthermore, some loci will tend to be sequenced more readily than others, and may have identical B values but differing distances from the origin (c distance), allowing a confidence score for identical methylation values (B) in terms of the c distance values (Figure <figr fid="F2">2b</figr>, right). To test this model, we used bisulfite MassArray to test quantitatively the cytosine methylation values for 61 HpaII sites (Tables S4, S5 and S6 in Additional file <supplr sid="S1">1</supplr>), choosing loci representing all components of the B angle spectrum of values. In Figure <figr fid="F2">2c, d</figr> we show the correlations between these gold standard cytosine methylation values and raw HpaII counts or B angle values. We find that there is the same negative correlation (R<sup>2 </sup>= 0.502) between HpaII counts and cytosine methylation values as demonstrated in the MSCC technique <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>, and that the angular transformation of the data incorporating the MspI normalization substantially improves this correlation (R<sup>2 </sup>= 0.826), defining the optimal approach for processing of these data. We represent the data for University of California Santa Cruz (UCSC) genome browser visualization as wiggle tracks, with higher B angle values defining less methylated loci. 
Methylated loci with zero values that would be otherwise difficult to visualize as having been tested are represented as small negative values. We show the details of the analytical workflow in Figure <figr fid="F3">3</figr> and an example of a UCSC genome browser representation of HELP-tagging data in Figure S1 in Additional file <supplr sid="S1">1</supplr>. All data are available through the Gene Expression Omnibus database (accession number [GEO:GSE19937]) and as UCSC genome browser tracks <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>.</p>
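            <p>A minimal Python sketch of the angular transformation follows. The text does not give an explicit formula, so the sketch assumes that angle B is measured at the origin from the MspI (x) axis to the point (MspI count, HpaII count), that the c distance is the Euclidean distance of that point from the origin, and that raw counts are used without further scaling. The function names are hypothetical.</p>
            <p><![CDATA[
```python
import math


def b_angle(hpaii_count, mspi_count):
    """Angle in degrees from the MspI (x) axis to the point (MspI, HpaII).

    Assumed geometry: larger angles mean relatively more HpaII reads,
    which the text associates with less methylation.
    """
    return math.degrees(math.atan2(hpaii_count, mspi_count))


def c_distance(hpaii_count, mspi_count):
    """Distance of the point (MspI, HpaII) from the origin.

    Assumed to serve as the confidence measure: more reads, longer c.
    """
    return math.hypot(mspi_count, hpaii_count)


if __name__ == "__main__":
    # Hypothetical (HpaII, MspI) count pairs, invented for illustration.
    for hpaii, mspi in [(0, 6), (4, 4), (8, 8), (9, 3)]:
        print(hpaii, mspi, round(b_angle(hpaii, mspi), 1),
              round(c_distance(hpaii, mspi), 2))
```
]]></p>
            <p>In this sketch the pairs (4, 4) and (8, 8) give the same angle but different c distances, which mirrors the situation described for loci with identical B values but differing read depth.</p>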
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Data transformation and bisulfite validation</p>
               </caption>
               <text>
                  <p><b>Data transformation and bisulfite validation</b>. <b>(a) </b>Scatter plot showing the relationship between the number of HpaII and MspI reads at each locus. <b>(b) </b>The location of the data point on the scatter plot indicates whether it is likely to be less or more methylated with larger or smaller angles B subtended as shown, while the confidence of the measurement will be greater when more reads represent the data point, represented by the length of line c. <b>(c) </b>The HpaII count correlates negatively with the degree of methylation, with more counts occurring at loci with less methylation. <b>(d) </b>Transformation of the data to the B angle measure to normalize HpaII by MspI counts substantially improves the correlation with bisulfite MassArray validation data.</p>
               </text>
               <graphic file="gb-2010-11-4-r36-2"/>
            </fig>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>HELP-tagging analysis workflow</p>
               </caption>
               <text>
                  <p><b>HELP-tagging analysis workflow</b>. The analysis workflow for HELP-tagging data is illustrated. Only sequence reads that contain the adapter sequence and map either to a single site or to &#8804; 10 sites are retained, with reads from the latter repetitive sequences distributed by weighting among the matched loci. Potential polymorphic loci are annotated. Normalization of HpaII by MspI using the angle calculation described in the previous figure is performed and files are generated for genome browser visualization. UCSC, University of California Santa Cruz.</p>
               </text>
               <graphic file="gb-2010-11-4-r36-3"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Potential sources of bias: base composition and fragment length</p>
            </st>
            <p>As the number of reads at CCGG sites following MspI digestion should not be influenced by methylation, the representation obtained from MspI digestion allowed us to look for systematic sources of bias inherent to the assay. A major concern was that base composition could be a source of such bias, as it has been reported that Illumina sequencing can be influenced by GC composition <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>, possibly because of the gel extraction step <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. Our protocol does not require gel extraction and only begins to show an under-representation of sequences when the (G+C) content exceeds approximately 80% (Figure <figr fid="F4">4a</figr>). We also tested to see whether the sizes of the MspI fragments generated influenced the counts obtained, as the digestion by type III endonucleases like EcoP15I is most efficient when a pair of enzymes is present in convergent orientation on the same DNA molecule <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. We find that there is indeed an over-representation for shorter (&#8804;300 bp) and a corresponding modest under-representation for larger MspI fragments (Figure <figr fid="F4">4b</figr>).</p>
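            <p>The base composition comparison can be sketched in Python as follows. This is an illustrative assumption about how (G+C) content might be binned into 2% intervals for a set of sequences, not the code used to produce the figure. The function names and toy sequences are hypothetical, and integer arithmetic is used to avoid floating-point effects at bin edges.</p>
            <p><![CDATA[
```python
from collections import Counter


def gc_bin(sequence, bin_percent=2):
    """Return the index of the (G+C) percentage bin for one sequence."""
    seq = sequence.upper()
    gc_count = seq.count("G") + seq.count("C")
    last_bin = 100 // bin_percent - 1  # 100% G+C falls into the top bin
    return min((100 * gc_count) // (bin_percent * len(seq)), last_bin)


def gc_distribution(sequences, bin_percent=2):
    """Map each bin's lower edge (in %) to the proportion of sequences in it."""
    counts = Counter(gc_bin(seq, bin_percent) for seq in sequences)
    total = sum(counts.values())
    return {index * bin_percent: n / total
            for index, n in sorted(counts.items())}


if __name__ == "__main__":
    # Hypothetical 50-bp windows; real input would be the windows around
    # annotated CCGG sites (reference) or around sequenced MspI tags (observed).
    reference = ["AT" * 23 + "CCGG", "GC" * 25, "ACGT" * 12 + "AC"]
    observed = ["AT" * 23 + "CCGG", "ACGT" * 12 + "AC"]
    print(gc_distribution(reference))
    print(gc_distribution(observed))
```
]]></p>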
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Base composition and fragment length influences on sequence counts</p>
               </caption>
               <text>
                  <p><b>Base composition and fragment length influences on sequence counts</b>. <b>(a) </b>The proportion of (G+C) nucleotides was calculated for the 50-bp sequence centered around each annotated CCGG in the reference human genome. The base composition of all of the MspI sequences generated from the human ES cell line studied was also calculated. The relative proportion for (G+C) content in 2% bins for each set of data was calculated and plotted as shown. The black line shows the proportions in the reference genome, while the red line illustrates the distribution we observed in our MspI experiment. Two peaks representing base composition in repetitive sequences are apparent. The MspI distribution closely matches the expected distribution except when the base composition exceeds approximately 80%, when it is slightly under-represented. <b>(b) </b>We calculated the relative frequencies of MspI digestion product sizes in the human reference genome. In this case we found that the shorter fragments are more likely to be sequenced than larger (&#8805;300 bp) fragments. The three major peaks observed represent Alu short interspersed repetitive element (SINE) sequences.</p>
               </text>
               <graphic file="gb-2010-11-4-r36-4"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Identification of polymorphic CCGG sequences</p>
            </st>
            <p>Whereas MSCC uses MmeI and generates an 18- to 19-bp sequence flanking the HpaII site <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>, our use of EcoP15I generates a 27-bp flanking sequence. We asked whether this size difference influenced our ability to align sequences to the reference genome. We truncated our sequence reads to 19 bp to mimic the MSCC read length and found that this caused a profound loss of ability to align reads unambiguously (Table S7 in Additional file <supplr sid="S1">1</supplr>). To compensate for the low alignment rate, the MSCC report described an ingenious strategy of alignment to the sequences immediately flanking the annotated HpaII sites in the reference genome <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>, an approach sufficiently powerful that it generated the well-validated data that they described. However, it does not offer the possibility of identifying polymorphic HpaII sites at the high frequencies that we previously observed for our HELP-seq assay <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. We tested whether our longer sequences allowed the identification of loci at which an HpaII site is annotated in the reference genome but we obtain no sequence reads, and the opposite situation where we observed at least four MspI reads (the average number per annotated MspI/HpaII site) flanking a locus not annotated in the reference genome. In Table S8 in Additional file <supplr sid="S1">1</supplr> we list approximately 6,600 candidate polymorphic HpaII sites, of which examples are shown in Figure <figr fid="F5">5</figr>, confirmed by targeted resequencing of those loci. The 6,600 loci were selected based on overlap with dbSNP entries, allowing us to evaluate the pattern of sequence variability at these loci. Approximately 80% of the SNPs are C:G to T:A transitions, consistent with deamination-mediated decay of methylcytosine being the cause of the polymorphism <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>. 
Polymorphic CG dinucleotides are major potential sources of error not only for microarrays, which are designed to a consensus genomic sequence, but also for both bisulfite sequencing, which would read the C to T transition as unmethylated, and mass spectrometry-based assays, requiring the development of specific analytical approaches such as we have described <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>.</p>
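            <p>The two situations used to flag candidate polymorphic sites can be expressed as a small Python sketch. The threshold of four MspI reads comes from the text; the function name, return labels and input representation are hypothetical. An annotated site without reads may also reflect incomplete coverage, so the labels mark candidates only and are not confirmed polymorphisms.</p>
            <p><![CDATA[
```python
MIN_MSPI_READS = 4  # threshold quoted in the text for unannotated loci


def classify_site(annotated_in_reference, mspi_reads, min_reads=MIN_MSPI_READS):
    """Label a CCGG locus as a candidate polymorphic site, or return None.

    annotated_in_reference: True if the reference genome has CCGG here.
    mspi_reads: number of MspI reads flanking the locus.
    """
    if annotated_in_reference and mspi_reads == 0:
        # Site expected from the reference but never observed.
        return "annotated_without_reads"
    if not annotated_in_reference and mspi_reads >= min_reads:
        # Site observed repeatedly but absent from the reference.
        return "unannotated_with_reads"
    return None


if __name__ == "__main__":
    # Hypothetical loci as (annotated, MspI read count) pairs.
    for annotated, reads in [(True, 0), (True, 5), (False, 4), (False, 2)]:
        print(annotated, reads, classify_site(annotated, reads))
```
]]></p>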
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>Polymorphic HpaII sites identified by HELP-tagging</p>
               </caption>
               <text>
                  <p><b>Polymorphic HpaII sites identified by HELP-tagging</b>. Examples of HpaII sites <b>(a) </b>annotated in the reference genome sequence but not represented by MspI reads or <b>(b) </b>not annotated in the reference human genome and represented by at least four MspI reads are shown. In each case there is a SNP defined by dbSNP that indicates the C:G to T:A transition that eliminates or restores the CCGG HpaII site.</p>
               </text>
               <graphic file="gb-2010-11-4-r36-5"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>DNA methylation studies of human embryonic stem cells</p>
            </st>
            <p>To test whether the HELP-tagging assay was generating data that are biologically plausible, we tested the methylation of different genomic sequence compartments as density plots of B angle values for the human ES cells used in these studies. In Figure S2a in Additional file <supplr sid="S1">1</supplr> we show how promoters (defined as -2 kb to 2 kb from the transcription start site of RefSeq genes), gene bodies (the remaining region within the RefSeq gene) and intergenic (all other) sequences compare, finding the expected enrichment of hypomethylated loci with larger B angle values in promoter regions. When we compared unique with repetitive sequences, again we found the expected pattern of increased methylation of repetitive DNA compared with unique sequences (Figure S2b in Additional file <supplr sid="S1">1</supplr>). Combining these observations, we tested whether the transposable element component of annotated repetitive DNA sequences showed any tendency to unusual methylation near gene promoters. In Figure <figr fid="F6">6</figr> we show that while transposable elements are generally methylated and are depleted near gene promoters, those that are proximal to promoters tend to be less methylated than those located more distally. While many types of transposable elements were represented in this promoter-proximal hypomethylated group, we found a subset to be the most markedly over-represented, as shown in Figure <figr fid="F6">6c</figr>.</p>
            <fig id="F6">
               <title>
                  <p>Figure 6</p>
               </title>
               <caption>
                  <p>Identification of a position effect on DNA methylation in transposable elements located close to gene promoters</p>
               </caption>
               <text>
                  <p><b>Identification of a position effect on DNA methylation in transposable elements located close to gene promoters. </b>The distance from RefSeq gene transcription start sites and DNA methylation status are shown. The x-axis displays the distance from transcription start sites (TSSs). HpaII sites were categorized into three groups by angle, 0 to 30 (blue), 31 to 60 (red) and 61 to 90 (green). <b>(a)</b> Number of HpaII sites; <b>(b)</b> proportions of each angle category (%).</p>
               </text>
               <graphic file="gb-2010-11-4-r36-6"/>
            </fig>
            <p>The outcome of these studies was an improvement in the previously described MSCC <abbrgrp><abbr bid="B15">15</abbr></abbrgrp> and HELP-seq <abbrgrp><abbr bid="B10">10</abbr></abbrgrp> assays, not only by means of technical modifications such as the use of EcoP15I but also because of the concurrent use of MspI for normalization. The effect of these modifications was not only to increase the accuracy of the assay but also to enhance the ability to align sequences to the genome and thus identify polymorphic HpaII/MspI sites. The means of normalization of HpaII by MspI using an angular metric is an innovation that improved the data accuracy substantially and may have applications in other MPS assay normalization strategies. We were also able to discard reads that did not contain the expected adapter sequences, and created a straightforward data analytical pipeline that will facilitate processing of these HELP-tagging data by others.</p>
            <p>The potential sources of systematic artifacts due to base composition or digestion product size were evaluated. Apart from a modest decrease in representation in regions above approximately 80% (G+C) content, base composition did not cause biases in representations, possibly in part due to our avoidance of a gel purification step in library preparation <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. Fragment length does influence the outcome, most likely due to effects on EcoP15I digestion <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>, although the effects should be similar for both HpaII and MspI and should, therefore, largely cancel each other out in the normalization step. It is possible that endogenous EcoP15I sites could influence the representations, but to have an effect they would have to be located within the 27 bp adjacent to HpaII/MspI sites and would cause digestion of the ligated adapter, causing those loci to be under-represented in both HpaII and MspI datasets. The most likely effect of these endogenous sites is that they contribute to the proportion of loci at which we could not obtain sequence reads.</p>
            <p>Our exploration of the distribution of cytosine methylation in the same human ES cell line studied by Lister <it>et al. </it><abbrgrp><abbr bid="B6">6</abbr></abbrgrp> showed consistent results, with hypomethylation of transcription start sites and methylation of transposable elements, as expected from long-standing observations in the field. We furthermore discovered a limited subset of transposable elements that is hypomethylated when in close proximity to transcription start sites. When this subset was studied to determine whether certain types of transposable elements were disproportionately over-represented, we found two broad classes, one of transposable element fossils with no innate capacity to replicate themselves (the ancient DNA, long interspersed repetitive elements (LINEs) and short interspersed repetitive elements (SINEs) shown in Figure <figr fid="F6">6c</figr>) and younger ERV1 long terminal repeat retroelements. Loss of methylation of functionally inactive transposable elements is likely to be of no negative consequence to the host genome, consistent with the host defense hypothesis <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>, while the young ERV long terminal repeats represent a group of transposons whose function has been harnessed as promoters of endogenous genes <abbrgrp><abbr bid="B26">26</abbr><abbr bid="B27">27</abbr></abbrgrp>. This observation demonstrates the value of a high-resolution, genome-wide assay like HELP-tagging to define potential functional elements in an unbiased manner.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusions</p>
         </st>
         <p>We propose that MPS-based assays such as RRBS <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>, MSCC <abbrgrp><abbr bid="B15">15</abbr></abbrgrp> and HELP-tagging will prove to be the assays of choice for epigenome-wide association studies in human disease, with the latter two preferable as we begin to explore the CG-depleted majority of the genome. It should not be necessary to run MspI assays every time a HELP-tagging assay is performed, suggesting that a common MspI dataset can serve as a universal reference for a species, allowing a single lane of Illumina sequencing of the HpaII library to provide the methylation data for that sample. The development of analytical pipelines to support analysis of these datasets will be critical to the success of these projects, while the careful ongoing assessment of potential sources of bias will also be essential for improving assay performance.</p>
      </sec>
      <sec>
         <st>
            <p>Materials and methods</p>
         </st>
         <sec>
            <st>
               <p>Cell preparation and DNA purification</p>
            </st>
            <p>H1 human ES cells (NIH code WA01 from WiCell Research Institute, Madison, WI, USA) were cultured on Matrigel (BD Biosciences, San Diego, CA, USA) at 37&#176;C, 5% O<sub>2</sub> and 5% CO<sub>2</sub>. Pluripotency of the amplified human ES cells was assessed by flow cytometry with SSEA4, CD24 and Oct4 markers. To extract DNA, the cells were suspended in 10 ml of a solution of 10 mM Tris-HCl (pH 8.0), 0.1 M EDTA and 1 ml of 10% SDS to which 10 &#956;l of RNase A (20 mg/ml) was added. After incubation for 1 hour at 37&#176;C, 50 &#956;l of proteinase K (20 mg/ml) was added and the solution was gently mixed and incubated in a 50&#176;C water bath overnight. To purify the lysate, it was extracted three times using saturated phenol, then twice with chloroform, and dialyzed for 16 hours at 4&#176;C against three changes of 0.2&#215; SSC. Following dialysis, the DNA was concentrated by coating the dialysis bags in polyethylene glycol (molecular weight 20,000). The purity and final concentration of the purified DNA were checked by spectrophotometry (Nanodrop, Wilmington, DE, USA).</p>
         </sec>
         <sec>
            <st>
               <p>Illumina library preparation</p>
            </st>
            <p>The sample preparation steps are illustrated in Figure <figr fid="F1">1</figr>. Two custom adapters were created for HELP-tagging, referred to as AE and AS. As well as an Illumina adapter sequence, adapter AE contains an EcoP15I recognition site and a T7 promoter sequence. Adapter AS contains an Illumina sequencing primer sequence. The adapter and primer sequences for library preparation are listed in Table S9 in Additional file <supplr sid="S1">1</supplr>. Genomic DNA (5 &#956;g) was digested with HpaII and MspI in separate 200 &#956;l reactions and purified by phenol/chloroform extraction followed by ethanol precipitation. The digested genomic DNA was ligated to adapter AE using a New England Biolabs Quick Ligation Kit (25 &#956;l of 2&#215; Quick ligase buffer, 3 &#956;g of HpaII-digested DNA or 1 &#956;g of MspI-digested DNA, 0.1 &#956;l of adapter AE (1 &#956;M), 3 &#956;l of Quick Ligase in a final volume of 50 &#956;l). After AE ligation, the products were purified using Agencourt AMpure beads (Beckman Coulter, Brea, CA, USA), then digested with EcoP15I (New England Biolabs). The restriction fragments were end-repaired to inhibit dimerization of adapters, and tailed with a single dA at the 3' end. After the dA tailing reaction, adapter AS was ligated to the dA-tailed fragments using a New England Biolabs Quick Ligation Kit (25 &#956;l of 2&#215; Quick ligase buffer, 2.5 &#956;l of adapter AS (10 &#956;M), 2.5 &#956;l of Quick Ligase in a final volume of 50 &#956;l). After ligation, products were purified using the MinElute PCR purification kit (Qiagen, Hilden, Germany) and <it>in vitro</it>-transcribed using the Ambion MEGAshortscript kit (Life Technologies, Carlsbad, CA, USA). Following <it>in vitro </it>transcription, products were purified with an RNeasy clean-up kit (Qiagen) before reverse transcription was performed using the Invitrogen SuperScript III kit (Life Technologies). The first strand cDNA produced was used as a template for PCR using the following conditions: 96&#176;C for 2 minutes, then 18 cycles of 96&#176;C for 15 seconds and 72&#176;C for 15 seconds followed by 5 minutes at 72&#176;C for the final extension. After PCR, the library was purified using a QIAQuick PCR clean-up kit (Qiagen).</p>
         </sec>
         <sec>
            <st>
               <p>Single-locus quantitative validation assays</p>
            </st>
            <p>Bisulfite conversion and MassArray (Sequenom, San Diego, CA, USA) were performed using an aliquot of the same sample of DNA as was used for the high-throughput assays described above. Bisulfite conversion was performed with an EZ DNA Methylation kit (Zymo Research, Orange, CA, USA). Bisulfite primers were designed using MethPrimer <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>, specifying the desired product length (250 to 450 bp), primer length (23 to 29 bp) and primer Tm (56 to 62&#176;C). PCR was performed using FastStart High Fidelity Taq polymerase (Roche, Basel, Switzerland) with the following conditions: 95&#176;C for 10 minutes, then 42 cycles of 95&#176;C for 30 seconds, primer-specific Tm for 30 seconds and 72&#176;C for 1 minute, followed by 72&#176;C for 10 minutes for the final extension. Primer-specific Tm and sequence information are provided in Table S6 in Additional file <supplr sid="S1">1</supplr>. Bisulfite MassArray assays were performed by the institutional Genomics Core Facility. The data were analyzed using the analytical pipeline we have previously described <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>.</p>
         </sec>
         <sec>
            <st>
               <p>Bioinformatic analysis</p>
            </st>
            <p>Four lanes of sequencing were performed using an Illumina GA IIx Sequencer at the institutional Epigenomics Shared Facility. Three lanes were used for technical replicates of MspI, for the methylation-insensitive reference dataset. Images generated by the Illumina sequencer were analyzed by Illumina pipeline software (versions 1.3 to 1.4). Initial data processing was performed using the default read length of 36 bp, after which we isolated the sequences in which we found adapter sequences on the 3'-end, replaced the adapter sequence with a poly(N) sequence of the same length, and re-ran the Illumina ELAND pipeline on these sequences with the sequence length set at 27 bp (the 2 to 28 bp subsequence). The data within the ELAND_extended.txt files were used for counting the number of aligned sequences adjacent to each CCGG (HpaII/MspI) site annotated in the hg18 freeze of the human genome at the UCSC genome browser. We permitted up to two mismatches in each sequence, and allowed a sequence to align to a maximum of 10 locations within the genome. For non-unique alignments, a sequence was assigned a partial count for each alignment location amounting to 1/n, where n represents the total number of aligned positions. To normalize the data between experiments, the number of sequences associated with each HpaII site was divided by the total number of sequences (including partial counts) aligning to all HpaII sites in the same sample. We refer to this figure as the fixed count below.</p>
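            <p>The partial-count and normalization rules described above can be sketched as follows. This is a minimal illustration rather than the pipeline used in this study: the function names, the input shape (one list of CCGG site identifiers per read) and the parameter name max_locations are assumptions introduced for the example, and each alignment location is assumed to map to exactly one annotated site.</p>

```python
from collections import defaultdict


def site_counts(alignments, max_locations=10):
    """Accumulate per-site counts, giving each read a weight of 1/n per location.

    alignments: iterable of lists; each inner list holds the identifiers of the
    CCGG sites adjacent to the n locations one read aligned to (hypothetical
    input shape, not the ELAND_extended.txt format itself).
    """
    counts = defaultdict(float)
    for sites in alignments:
        n = len(sites)
        # Skip unaligned reads and reads with more than max_locations alignments.
        if n == 0 or n > max_locations:
            continue
        for site in sites:
            counts[site] += 1.0 / n
    return dict(counts)


def normalize(counts):
    """Divide each site count by the total count over all sites in the sample."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {site: value / total for site, value in counts.items()}
```

            <p>For example, a read aligning uniquely to one site and a second read aligning to that site and one other would give raw counts of 1.5 and 0.5, which normalize to 0.75 and 0.25 of the sample total.</p>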
            <p>To examine the influence of (G+C) mononucleotide content on counts of sequences obtained, we extracted the (G+C) annotation from the hg18 freeze of the human genome at UCSC and examined the distribution of sequence counts according to (G+C) content. Annotated percentages of (G+C) content were available for adjacent 5-bp windows. For each annotated HpaII site, we calculated the mean percentage (G+C) for a 50-bp region centered at the restriction site. Counts of sequences associated with HpaII sites were obtained for 50 sequential non-overlapping windows of 2% (G+C) (the minimum possible in a sample of 50-bp regions). These data were then normalized as a proportion of the total number of fragments. Comparisons were made to the expected frequencies, which for each 2% (G+C) bin were represented by the count of HpaII sites falling within that range relative to the total number of HpaII sites in the genome. This analysis was performed on both HpaII and MspI-digested DNA for comparison.</p>
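            <p>A minimal sketch of this binning, under stated assumptions, is given below. The function names and the input shape (one tuple of mean (G+C) percentage and sequence count per HpaII site) are hypothetical, and folding a value of exactly 100% into the last bin is a choice made for the example rather than a detail taken from the analysis.</p>

```python
def mean_gc(window_percentages):
    """Mean (G+C) percentage over the ten 5-bp windows spanning a 50-bp region."""
    return sum(window_percentages) / len(window_percentages)


def gc_bin(gc_percent, width=2):
    """Index of the 2% (G+C) bin; a value of exactly 100% goes in the last bin."""
    last_bin = 100 // width - 1
    return min(int(gc_percent // width), last_bin)


def observed_vs_expected(sites, width=2):
    """Return (observed, expected) proportions per (G+C) bin.

    sites: list of (gc_percent, sequence_count) tuples, one per HpaII site
    (hypothetical input shape).
    observed: share of all sequence counts falling in each bin.
    expected: share of all HpaII sites falling in each bin.
    """
    n_bins = 100 // width
    observed = [0.0] * n_bins
    expected = [0.0] * n_bins
    for gc_percent, count in sites:
        index = gc_bin(gc_percent, width)
        observed[index] += count
        expected[index] += 1.0
    total_counts = sum(observed)
    total_sites = len(sites)
    if total_counts == 0 or total_sites == 0:
        return observed, expected
    observed = [value / total_counts for value in observed]
    expected = [value / total_sites for value in expected]
    return observed, expected
```

            <p>Comparing the two returned lists bin by bin shows where sequence counts depart from what the genomic distribution of HpaII sites alone would predict.</p>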
            <p>The potential effect of distance between HpaII sites on sequence counts obtained at each HpaII site was measured by summing the counts of sequences aligning within each restriction fragment, and normalizing the result with respect to total sequence count. As with the (G+C) analysis above, this was performed for both MspI and HpaII digested restriction fragments. The data were compared with the expected distribution determined by performing virtual restriction digestion using genomic HpaII site coordinates, and normalizing the number of virtual fragments of each size with respect to the total number of these virtual fragments.</p>
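            <p>The comparison of observed and virtual fragment-size distributions can be illustrated with the short sketch below. It assumes that site coordinates for a single chromosome are supplied as integers and that fragment length is taken as the distance between consecutive site coordinates; the function names and input shapes are hypothetical and are not taken from the pipeline used here.</p>

```python
from collections import Counter


def virtual_fragments(site_positions):
    """Lengths of virtual fragments between consecutive CCGG sites on one chromosome."""
    ordered = sorted(site_positions)
    return [end - start for start, end in zip(ordered, ordered[1:])]


def expected_size_distribution(lengths):
    """Proportion of virtual fragments at each fragment length."""
    total = len(lengths)
    if total == 0:
        return {}
    tally = Counter(lengths)
    return {size: n / total for size, n in tally.items()}


def observed_size_distribution(fragment_counts):
    """Proportion of all sequence counts contributed by each fragment length.

    fragment_counts: list of (fragment_length, sequence_count) tuples, one per
    restriction fragment (hypothetical input shape).
    """
    by_size = Counter()
    for length, count in fragment_counts:
        by_size[length] += count
    total = sum(by_size.values())
    if total == 0:
        return {}
    return {size: value / total for size, value in by_size.items()}
```

            <p>Running the observed function separately on the HpaII and MspI datasets and setting each against the expected distribution gives the comparison described above.</p>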
         </sec>
      </sec>
      <sec>
         <st>
            <p>Abbreviations</p>
         </st>
         <p>bp: base pair; CG/CpG: cytosine-guanine dinucleotide; ChIP: chromatin immunoprecipitation; ES: embryonic stem; (G+C): guanine and cytosine mononucleotides; HELP: HpaII tiny fragment Enrichment by Ligation-mediated PCR; MPS: massively-parallel sequencing; MSCC: methyl-sensitive cut counting; RRBS: reduced representation bisulfite sequencing; UCSC: University of California Santa Cruz.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>MS and JMG designed the assays and strategies for their analysis, MS performed all library preparation and characterization, MS, DL and MP performed bisulfite validation studies, while QJ and AMcL performed computational analyses. JMG and MS prepared the manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>This work is supported by a grant from the National Institutes of Health (NIH, R01 HG004401) to JMG. The authors thank Shahina Maqbool PhD, Raul Olea and Gael Westby of the Einstein Epigenomics Shared Facility for their contributions, Drs Eric Bouhassira and Emmanuel Olivier (Einstein) for the WA01/H1 human ES cell line, and Einstein's Center for Epigenomics.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Epigenetics in cancer.</p>
            </title>
            <aug>
               <au>
                  <snm>Esteller</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>N Engl J Med</source>
            <pubdate>2008</pubdate>
            <volume>358</volume>
            <fpage>1148</fpage>
            <lpage>1159</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1056/NEJMra072067</pubid>
                  <pubid idtype="pmpid">18337604</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>The epigenome as a target for cancer chemoprevention.</p>
            </title>
            <aug>
               <au>
                  <snm>Kopelovich</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Crowell</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Fay</snm>
                  <fnm>JR</fnm>
               </au>
            </aug>
            <source>J Natl Cancer Inst</source>
            <pubdate>2003</pubdate>
            <volume>95</volume>
            <fpage>1747</fpage>
            <lpage>1757</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">14652236</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Genome-wide analysis of <it>Arabidopsis thaliana </it>DNA methylation uncovers an interdependence between methylation and transcription.</p>
            </title>
            <aug>
               <au>
                  <snm>Zilberman</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Gehring</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Tran</snm>
                  <fnm>RK</fnm>
               </au>
               <au>
                  <snm>Ballinger</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Henikoff</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2007</pubdate>
            <volume>39</volume>
            <fpage>61</fpage>
            <lpage>69</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1929</pubid>
                  <pubid idtype="pmpid" link="fulltext">17128275</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Shotgun bisulphite sequencing of the <it>Arabidopsis </it>genome reveals DNA methylation patterning.</p>
            </title>
            <aug>
               <au>
                  <snm>Cokus</snm>
                  <fnm>SJ</fnm>
               </au>
               <au>
                  <snm>Feng</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Merriman</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Haudenschild</snm>
                  <fnm>CD</fnm>
               </au>
               <au>
                  <snm>Pradhan</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Nelson</snm>
                  <fnm>SF</fnm>
               </au>
               <au>
                  <snm>Pellegrini</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Jacobsen</snm>
                  <fnm>SE</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2008</pubdate>
            <volume>452</volume>
            <fpage>215</fpage>
            <lpage>219</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature06745</pubid>
                  <pubid idtype="pmcid">2377394</pubid>
                  <pubid idtype="pmpid" link="fulltext">18278030</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Highly integrated single-base resolution maps of the epigenome in <it>Arabidopsis</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Lister</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>O'Malley</snm>
                  <fnm>RC</fnm>
               </au>
               <au>
                  <snm>Tonti-Filippini</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Gregory</snm>
                  <fnm>BD</fnm>
               </au>
               <au>
                  <snm>Berry</snm>
                  <fnm>CC</fnm>
               </au>
               <au>
                  <snm>Millar</snm>
                  <fnm>AH</fnm>
               </au>
               <au>
                  <snm>Ecker</snm>
                  <fnm>JR</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>2008</pubdate>
            <volume>133</volume>
            <fpage>523</fpage>
            <lpage>536</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.cell.2008.03.029</pubid>
                  <pubid idtype="pmcid">2723732</pubid>
                  <pubid idtype="pmpid" link="fulltext">18423832</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Human DNA methylomes at base resolution show widespread epigenomic differences.</p>
            </title>
            <aug>
               <au>
                  <snm>Lister</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Pelizzola</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Dowen</snm>
                  <fnm>RH</fnm>
               </au>
               <au>
                  <snm>Hawkins</snm>
                  <fnm>RD</fnm>
               </au>
               <au>
                  <snm>Hon</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Tonti-Filippini</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Nery</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Ye</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Ngo</snm>
                  <fnm>QM</fnm>
               </au>
               <au>
                  <snm>Edsall</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Antosiewicz-Bourget</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Stewart</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Ruotti</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Millar</snm>
                  <fnm>AH</fnm>
               </au>
               <au>
                  <snm>Thomson</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Ren</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Ecker</snm>
                  <fnm>JR</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2009</pubdate>
            <volume>462</volume>
            <fpage>315</fpage>
            <lpage>322</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">19829295</pubid>
                  <pubid idtype="doi">10.1038/nature08514</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Persistent epigenetic differences associated with prenatal exposure to famine in humans.</p>
            </title>
            <aug>
               <au>
                  <snm>Heijmans</snm>
                  <fnm>BT</fnm>
               </au>
               <au>
                  <snm>Tobi</snm>
                  <fnm>EW</fnm>
               </au>
               <au>
                  <snm>Stein</snm>
                  <fnm>AD</fnm>
               </au>
               <au>
                  <snm>Putter</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Blauw</snm>
                  <fnm>GJ</fnm>
               </au>
               <au>
                  <snm>Susser</snm>
                  <fnm>ES</fnm>
               </au>
               <au>
                  <snm>Slagboom</snm>
                  <fnm>PE</fnm>
               </au>
               <au>
                  <snm>Lumey</snm>
                  <fnm>LH</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2008</pubdate>
            <volume>105</volume>
            <fpage>17046</fpage>
            <lpage>17049</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1073/pnas.0806560105</pubid>
                  <pubid idtype="pmcid">2579375</pubid>
                  <pubid idtype="pmpid" link="fulltext">18955703</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Development of a novel output value for quantitative assessment in methylated DNA immunoprecipitation-CpG island microarray analysis.</p>
            </title>
            <aug>
               <au>
                  <snm>Yamashita</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Hosoya</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Gyobu</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Takeshima</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Ushijima</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>DNA Res</source>
            <pubdate>2009</pubdate>
            <volume>16</volume>
            <fpage>275</fpage>
            <lpage>286</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/dnares/dsp017</pubid>
                  <pubid idtype="pmcid">2762412</pubid>
                  <pubid idtype="pmpid" link="fulltext">19767598</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>A Bayesian deconvolution strategy for immunoprecipitation-based DNA methylome analysis.</p>
            </title>
            <aug>
               <au>
                  <snm>Down</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Rakyan</snm>
                  <fnm>VK</fnm>
               </au>
               <au>
                  <snm>Turner</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Flicek</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Kulesha</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Graf</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Johnson</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Herrero</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Tomazou</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Thorne</snm>
                  <fnm>NP</fnm>
               </au>
               <au>
                  <snm>Backdahl</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Herberth</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Howe</snm>
                  <fnm>KL</fnm>
               </au>
               <au>
                  <snm>Jackson</snm>
                  <fnm>DK</fnm>
               </au>
               <au>
                  <snm>Miretti</snm>
                  <fnm>MM</fnm>
               </au>
               <au>
                  <snm>Marioni</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Birney</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Hubbard</snm>
                  <fnm>TJ</fnm>
               </au>
               <au>
                  <snm>Durbin</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Tavare</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Beck</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Nat Biotechnol</source>
            <pubdate>2008</pubdate>
            <volume>26</volume>
            <fpage>779</fpage>
            <lpage>785</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nbt1414</pubid>
                  <pubid idtype="pmcid">2644410</pubid>
                  <pubid idtype="pmpid" link="fulltext">18612301</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>High-resolution genome-wide cytosine methylation profiling with simultaneous copy number analysis and optimization for limited cell numbers.</p>
            </title>
            <aug>
               <au>
                  <snm>Oda</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Glass</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Thompson</snm>
                  <fnm>RF</fnm>
               </au>
               <au>
                  <snm>Mo</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Olivier</snm>
                  <fnm>EN</fnm>
               </au>
               <au>
                  <snm>Figueroa</snm>
                  <fnm>ME</fnm>
               </au>
               <au>
                  <snm>Selzer</snm>
                  <fnm>RR</fnm>
               </au>
               <au>
                  <snm>Richmond</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Dannenberg</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Green</snm>
                  <fnm>RD</fnm>
               </au>
               <au>
                  <snm>Melnick</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Hatchwell</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Bouhassira</snm>
                  <fnm>EE</fnm>
               </au>
               <au>
                  <snm>Verma</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Greally</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2009</pubdate>
            <volume>37</volume>
            <fpage>3829</fpage>
            <lpage>3839</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/gkp260</pubid>
                  <pubid idtype="pmcid">2709560</pubid>
                  <pubid idtype="pmpid" link="fulltext">19386619</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>MethylMapper: a method for high-throughput, multilocus bisulfite sequence analysis and reporting.</p>
            </title>
            <aug>
               <au>
                  <snm>Ordway</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Bedell</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Citek</snm>
                  <fnm>RW</fnm>
               </au>
               <au>
                  <snm>Nunberg</snm>
                  <fnm>AN</fnm>
               </au>
               <au>
                  <snm>Jeddeloh</snm>
                  <fnm>JA</fnm>
               </au>
            </aug>
            <source>Biotechniques</source>
            <pubdate>2005</pubdate>
            <volume>39</volume>
            <note>464, 466, 468 passim.</note>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16235556</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Genome-scale DNA methylation maps of pluripotent and differentiated cells.</p>
            </title>
            <aug>
               <au>
                  <snm>Meissner</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Mikkelsen</snm>
                  <fnm>TS</fnm>
               </au>
               <au>
                  <snm>Gu</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Wernig</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hanna</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Sivachenko</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Bernstein</snm>
                  <fnm>BE</fnm>
               </au>
               <au>
                  <snm>Nusbaum</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Jaffe</snm>
                  <fnm>DB</fnm>
               </au>
               <au>
                  <snm>Gnirke</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Jaenisch</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2008</pubdate>
            <volume>454</volume>
            <fpage>766</fpage>
            <lpage>770</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">18600261</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Deep sequencing-based expression analysis shows major advances in robustness, resolution and inter-lab portability over five microarray platforms.</p>
            </title>
            <aug>
               <au>
                  <snm>'t Hoen</snm>
                  <fnm>PA</fnm>
               </au>
               <au>
                  <snm>Ariyurek</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Thygesen</snm>
                  <fnm>HH</fnm>
               </au>
               <au>
                  <snm>Vreugdenhil</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Vossen</snm>
                  <fnm>RH</fnm>
               </au>
               <au>
                  <snm>de Menezes</snm>
                  <fnm>RX</fnm>
               </au>
               <au>
                  <snm>Boer</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>van Ommen</snm>
                  <fnm>GJ</fnm>
               </au>
               <au>
                  <snm>den Dunnen</snm>
                  <fnm>JT</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2008</pubdate>
            <volume>36</volume>
            <fpage>e141</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/gkn705</pubid>
                  <pubid idtype="pmcid">2588528</pubid>
                  <pubid idtype="pmpid" link="fulltext">18927111</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Distinct DNA methylation patterns characterize differentiated human embryonic stem cells and developing human fetal liver.</p>
            </title>
            <aug>
               <au>
                  <snm>Brunner</snm>
                  <fnm>AL</fnm>
               </au>
               <au>
                  <snm>Johnson</snm>
                  <fnm>DS</fnm>
               </au>
               <au>
                  <snm>Kim</snm>
                  <fnm>SW</fnm>
               </au>
               <au>
                  <snm>Valouev</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Reddy</snm>
                  <fnm>TE</fnm>
               </au>
               <au>
                  <snm>Neff</snm>
                  <fnm>NF</fnm>
               </au>
               <au>
                  <snm>Anton</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Medina</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Nguyen</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Chiao</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Oyolu</snm>
                  <fnm>CB</fnm>
               </au>
               <au>
                  <snm>Schroth</snm>
                  <fnm>GP</fnm>
               </au>
               <au>
                  <snm>Absher</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Baker</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Myers</snm>
                  <fnm>RM</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2009</pubdate>
            <volume>19</volume>
            <fpage>1044</fpage>
            <lpage>1056</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.088773.108</pubid>
                  <pubid idtype="pmcid">2694474</pubid>
                  <pubid idtype="pmpid" link="fulltext">19273619</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Targeted and genome-scale strategies reveal gene-body methylation signatures in human cells.</p>
            </title>
            <aug>
               <au>
                  <snm>Ball</snm>
                  <fnm>MP</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>JB</fnm>
               </au>
               <au>
                  <snm>Gao</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>JH</fnm>
               </au>
               <au>
                  <snm>LeProust</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Park</snm>
                  <fnm>IH</fnm>
               </au>
               <au>
                  <snm>Xie</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Daley</snm>
                  <fnm>GQ</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>GM</fnm>
               </au>
            </aug>
            <source>Nat Biotechnol</source>
            <pubdate>2009</pubdate>
            <volume>27</volume>
            <fpage>361</fpage>
            <lpage>368</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nbt.1533</pubid>
                  <pubid idtype="pmpid" link="fulltext">19329998</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Epigenomics: beyond CpG islands.</p>
            </title>
            <aug>
               <au>
                  <snm>Fazzari</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Greally</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>Nat Rev Genet</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <fpage>446</fpage>
            <lpage>455</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nrg1349</pubid>
                  <pubid idtype="pmpid" link="fulltext">15153997</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Epigenetic control of mouse Oct-4 gene expression in embryonic stem cells and trophoblast stem cells.</p>
            </title>
            <aug>
               <au>
                  <snm>Hattori</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Nishino</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Ko</snm>
                  <fnm>YG</fnm>
               </au>
               <au>
                  <snm>Hattori</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Ohgane</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Tanaka</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Shiota</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2004</pubdate>
            <volume>279</volume>
            <fpage>17063</fpage>
            <lpage>17069</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.M309002200</pubid>
                  <pubid idtype="pmpid" link="fulltext">14761969</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>The human colon cancer methylome shows similar hypo- and hypermethylation at conserved tissue-specific CpG island shores.</p>
            </title>
            <aug>
               <au>
                  <snm>Irizarry</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Ladd-Acosta</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Wen</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Wu</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Montano</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Onyango</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Cui</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Gabo</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Rongione</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Webster</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ji</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Potash</snm>
                  <fnm>JB</fnm>
               </au>
               <au>
                  <snm>Sabunciyan</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Feinberg</snm>
                  <fnm>AP</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2009</pubdate>
            <volume>41</volume>
            <fpage>178</fpage>
            <lpage>186</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng.298</pubid>
                  <pubid idtype="pmcid">2729128</pubid>
                  <pubid idtype="pmpid" link="fulltext">19151715</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Functional characterization and modulation of the DNA cleavage efficiency of type III restriction endonuclease EcoP15I in its interaction with two sites in the DNA target.</p>
            </title>
            <aug>
               <au>
                  <snm>M&#246;ncke-Buchner</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Rothenberg</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Reich</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Wagenf&#252;hr</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Matsumura</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Terauchi</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Kr&#252;ger</snm>
                  <fnm>DH</fnm>
               </au>
               <au>
                  <snm>Reuter</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2009</pubdate>
            <volume>387</volume>
            <fpage>1309</fpage>
            <lpage>1319</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.jmb.2009.02.047</pubid>
                  <pubid idtype="pmpid" link="fulltext">19250940</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Greallylab</p>
            </title>
            <url>http://greallylab.aecom.yu.edu/HELP-tagging/</url>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Substantial biases in ultra-short read data sets from high-throughput DNA sequencing.</p>
            </title>
            <aug>
               <au>
                  <snm>Dohm</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Lottaz</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Borodina</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Himmelbauer</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2008</pubdate>
            <volume>36</volume>
            <fpage>e105</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/gkn425</pubid>
                  <pubid idtype="pmcid">2532726</pubid>
                  <pubid idtype="pmpid" link="fulltext">18660515</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>A large genome center's improvements to the Illumina sequencing system.</p>
            </title>
            <aug>
               <au>
                  <snm>Quail</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Kozarewa</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Scally</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Stephens</snm>
                  <fnm>PJ</fnm>
               </au>
               <au>
                  <snm>Durbin</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Swerdlow</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Turner</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>Nat Methods</source>
            <pubdate>2008</pubdate>
            <volume>5</volume>
            <fpage>1005</fpage>
            <lpage>1010</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nmeth.1270</pubid>
                  <pubid idtype="pmcid">2610436</pubid>
                  <pubid idtype="pmpid" link="fulltext">19034268</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Mutagenic deamination of cytosine residues in DNA.</p>
            </title>
            <aug>
               <au>
                  <snm>Duncan</snm>
                  <fnm>BK</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>JH</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>1980</pubdate>
            <volume>287</volume>
            <fpage>560</fpage>
            <lpage>561</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/287560a0</pubid>
                  <pubid idtype="pmpid">6999365</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>A pipeline for the quantitative analysis of CG dinucleotide methylation using mass spectrometry.</p>
            </title>
            <aug>
               <au>
                  <snm>Thompson</snm>
                  <fnm>RF</fnm>
               </au>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Lau</snm>
                  <fnm>KW</fnm>
               </au>
               <au>
                  <snm>Greally</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2009</pubdate>
            <volume>25</volume>
            <fpage>2164</fpage>
            <lpage>2170</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btp382</pubid>
                  <pubid idtype="pmpid" link="fulltext">19561019</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Cytosine methylation and the ecology of intragenomic parasites.</p>
            </title>
            <aug>
               <au>
                  <snm>Yoder</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Walsh</snm>
                  <fnm>CP</fnm>
               </au>
               <au>
                  <snm>Bestor</snm>
                  <fnm>TH</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>1997</pubdate>
            <volume>13</volume>
            <fpage>335</fpage>
            <lpage>340</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(97)01181-5</pubid>
                  <pubid idtype="pmpid" link="fulltext">9260521</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Retroviral promoters in the human genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Conley</snm>
                  <fnm>AB</fnm>
               </au>
               <au>
                  <snm>Piriyapongsa</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Jordan</snm>
                  <fnm>IK</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2008</pubdate>
            <volume>24</volume>
            <fpage>1563</fpage>
            <lpage>1567</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btn243</pubid>
                  <pubid idtype="pmpid" link="fulltext">18535086</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Endogenous retroviral LTRs as promoters for human genes: a critical assessment.</p>
            </title>
            <aug>
               <au>
                  <snm>Cohen</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Lock</snm>
                  <fnm>WM</fnm>
               </au>
               <au>
                  <snm>Mager</snm>
                  <fnm>DL</fnm>
               </au>
            </aug>
            <source>Gene</source>
            <pubdate>2009</pubdate>
            <volume>448</volume>
            <fpage>105</fpage>
            <lpage>114</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.gene.2009.06.020</pubid>
                  <pubid idtype="pmpid" link="fulltext">19577618</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>MethPrimer</p>
            </title>
            <url>http://www.urogene.org/methprimer/</url>
         </bibl>
      </refgrp>
   </bm>
</art>
