<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
<ui>gb-2012-13-8-r70</ui>
<ji>1465-6906</ji>
<fm>
<dochead>Research</dochead>
<bibl>
<title><p>The mouse DXZ4 homolog retains Ctcf binding and proximity to <it>Pls3 </it>despite substantial organizational differences compared to the primate macrosatellite</p></title>
<aug>
<au id="A1"><snm>Horakova</snm><mi>H</mi><fnm>Andrea</fnm><insr iid="I1"/><email>horakova.andrea@gmail.com</email></au>
<au id="A2"><snm>Calabrese</snm><fnm>J Mauro</fnm><insr iid="I2"/><email>jmcalabr@med.unc.edu</email></au>
<au id="A3"><snm>McLaughlin</snm><mi>R</mi><fnm>Christine</fnm><insr iid="I1"/><email>crm07h@my.fsu.edu</email></au>
<au id="A4"><snm>Tremblay</snm><mi>C</mi><fnm>Deanna</fnm><insr iid="I1"/><email>dtremblay@fsu.edu</email></au>
<au id="A5"><snm>Magnuson</snm><fnm>Terry</fnm><insr iid="I2"/><email>terry_magnuson@med.unc.edu</email></au>
<au id="A6" ca="yes"><snm>Chadwick</snm><mi>P</mi><fnm>Brian</fnm><insr iid="I1"/><email>chadwick@bio.fsu.edu</email></au>
</aug>
<insg><ins id="I1"><p>Department of Biological Science, Florida State University, Tallahassee, FL 32306-4295, USA</p></ins><ins id="I2"><p>Department of Genetics, Carolina Center for Genome Sciences and Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599, USA</p></ins>
</insg>
<source>Genome Biology</source>
<issn>1465-6906</issn>
<pubdate>2012</pubdate>
<volume>13</volume>
<issue>8</issue>
<fpage>R70</fpage><url>http://genomebiology.com/2012/13/8/R70</url>
<xrefbib><pubidlist><pubid idtype="doi">10.1186/gb-2012-13-8-r70</pubid><pubid idtype="pmpid">22906166</pubid></pubidlist></xrefbib>
</bibl>
<history><rec><date><day>25</day><month>4</month><year>2012</year></date></rec><revrec><date><day>17</day><month>7</month><year>2012</year></date></revrec><acc><date><day>20</day><month>8</month><year>2012</year></date></acc><pub><date><day>20</day><month>8</month><year>2012</year></date></pub></history>
<cpyrt><year>2012</year><collab>Horakova et al.; licensee BioMed Central Ltd</collab><note>This is an open access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note></cpyrt>
<abs>
<sec><st><p>Abstract</p></st>
<sec><st><p>Background</p></st>
<p>The X-linked macrosatellite DXZ4 is a large homogenous tandem repeat that in females adopts an alternative chromatin organization on the primate X chromosome in response to X-chromosome inactivation. It is packaged into heterochromatin on the active X chromosome but into euchromatin and bound by the epigenetic organizer protein CTCF on the inactive X chromosome. Because its DNA sequence diverges rapidly beyond the New World monkeys, the existence of DXZ4 outside the primate lineage is unknown.</p>
</sec>
<sec><st><p>Results</p></st>
<p>Here we extend our comparative genome analysis and report the identification and characterization of the mouse homolog of the macrosatellite. Furthermore, we provide evidence of DXZ4 in a conserved location downstream of the <it>PLS3 </it>gene in a diverse group of mammals, and reveal that DNA sequence conservation is restricted to the CTCF binding motif, supporting a central role for this protein at this locus. However, many features that characterize primate DXZ4 differ in mouse, including the overall size of the array, the mode of transcription, the chromatin organization, and the conservation of DNA sequence and length between adjacent repeat units. Ctcf binds Dxz4, but binding is not exclusive to the inactive X chromosome, as evidenced by association in some males and equal binding to both X chromosomes in trophoblast stem cells.</p>
</sec>
<sec><st><p>Conclusions</p></st>
<p>Characterization of Dxz4 reveals substantial differences in the organization of DNA sequence, chromatin packaging, and the mode of transcription, so the potential roles performed by this sequence in mouse have probably diverged from those on the primate X chromosome.</p>
</sec>
</sec>
</abs>
</fm>
<bdy>
<sec><st><p>Background</p></st>
<p>Over two-thirds of the human genome is likely to be composed of repetitive DNA <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>, of which a significant proportion is tandem repeat DNA <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. The tandem repeats consist of homologous DNA sequences arranged head to tail, and the number of repeat units is invariably polymorphic from one individual to the next <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. The size of the individual repeat unit varies substantially, from the simple microsatellite composed of individual repeat units of 1 to 6 bp spanning tens to hundreds of base pairs <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> to those consisting of individual repeat units of several kilobases that can cover hundreds to thousands of kilobases <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. For some tandem repeat DNA, deciphering of function is assisted by location, as for the alpha satellite DNA that defines active centromeres <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and the telomeric minisatellite <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>, but the roles of others in our genome remain unknown, which in the past led to the view that they serve no purpose <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr></abbrgrp>.</p>
<p>Despite a lack of functional understanding for these sequences, their contribution to disease susceptibility is obvious, as is demonstrated by the devastating impact of simple repeat expansions <abbrgrp><abbr bid="B10">10</abbr></abbrgrp> or macrosatellite contraction <abbrgrp><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr></abbrgrp>.</p>
<p>Macrosatellites are tandem repeat DNA with some of the largest individual repeat units (most &gt;2 kb), which can extend over hundreds to thousands of kilobases <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B11">11</abbr><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr></abbrgrp>. Most occupy specific locations on one or two chromosomes, like the X-linked macrosatellite DXZ4, which is unique to Xq23 <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. Because of its physical location on the X chromosome, DXZ4 is exposed to the process of X-chromosome inactivation (XCI). XCI is the mammalian form of dosage compensation, an epigenetic process that serves to balance the levels of X-linked gene expression in the two sexes <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>. It occurs early in female development and shuts down gene expression from the X chromosome (Xi) chosen to become inactive by repackaging the DNA into facultative heterochromatin <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. One characteristic difference between Xi chromatin and that of the active X chromosome (Xa) is hypermethylation of cytosine residues at CpG islands (CGIs) <abbrgrp><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr></abbrgrp>, but DXZ4, which is itself one of the largest CGIs in the human genome, does not conform. Instead, DXZ4 CpG residues are hypomethylated on the Xi and hypermethylated on the Xa <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B22">22</abbr></abbrgrp>. 
Consistent with the DNA methylation profile of DXZ4, its nucleosomes are characterized by the heterochromatin-associated modification histone H3 trimethylated at lysine 9 <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr></abbrgrp> on the Xa and the euchromatin-associated modification histone H3 dimethylated at lysine 4 (H3K4me2) <abbrgrp><abbr bid="B23">23</abbr></abbrgrp> on the Xi <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B25">25</abbr></abbrgrp>. Furthermore, the multifunctional zinc-finger protein CCCTC-binding factor (CTCF) <abbrgrp><abbr bid="B26">26</abbr></abbrgrp> associates specifically with the euchromatic form of DXZ4 on the Xi <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B27">27</abbr></abbrgrp>. The role DXZ4 performs on the Xi when packaged as CTCF-bound euchromatin flanked by heterochromatin or on the Xa and male X chromosome when packaged into heterochromatin flanked by euchromatin remains unclear. However, we have recently shown that, in humans, DXZ4 mediates Xi-specific CTCF-dependent long-range intrachromosomal interactions with other tandem repeat DNA <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>, suggesting a structural role for DXZ4 that may orchestrate the alternative three-dimensional organization of the Xi relative to the Xa <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. 
To gain insight into DXZ4 function, we previously investigated DXZ4 in a variety of representative primates and found that CTCF binding at the Xi was conserved, as were the chromatin organization, expression, and arrangement of the macrosatellite into large homogenous tandem arrays <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>, but beyond the New World monkey branch, primary DNA sequence composition and tandem-repeat unit size diverged rapidly from that observed in humans, with the notable exception of a relatively small proportion of DXZ4 that encompassed the CTCF binding site and promoter element <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B30">30</abbr></abbrgrp>. To further our understanding of DXZ4, we extended our analysis beyond the primate lineage in an attempt to identify a homolog of DXZ4 in mouse. Mouse has been the logical model organism of choice for investigation of XCI, and much of what we understand about the process has been obtained through mouse manipulations <it>in vivo </it>and <it>in vitro </it><abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. Despite differences in the early stages of XCI between humans and mice <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>, and differences in the extent of escape from XCI <abbrgrp><abbr bid="B33">33</abbr><abbr bid="B34">34</abbr></abbrgrp>, identification of a mouse homolog of DXZ4 would provide a tractable system in which to investigate function. Here we report the identification and characterization of the mouse homolog of DXZ4. We show that DNA sequence conservation is restricted to a short DNA sequence corresponding to the CTCF binding site, but many features of DXZ4 differ substantially in the mouse, and as a result manipulation of mouse Dxz4 is unlikely to provide insight into all aspects of DXZ4 function in primates.</p>
</sec>
<sec><st><p>Results and discussion</p></st>
<sec><st><p>Genomic organization of a mouse candidate for Dxz4</p></st>
<p>A comparison of a human DXZ4 3.0-kb tandem repeat monomer against the assembled mouse genome (mm9) with Blast-Like Alignment Tool (BLAT) produced no significant matches on the Ensembl genome browser <abbrgrp><abbr bid="B35">35</abbr></abbrgrp> and a limited number of autosomal and X-linked matches on the UCSC Genome Browser <abbrgrp><abbr bid="B36">36</abbr></abbrgrp> (data not shown). We therefore explored conserved gene order in human and mouse to identify a DXZ4 homolog <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>. DXZ4 resides at Xq23 <abbrgrp><abbr bid="B14">14</abbr></abbrgrp> and is located between the t-Plastin gene (<it>PLS3</it>) and the Angiotensin II receptor, type-2 gene (<it>AGTR2</it>) (Figure <figr fid="F1">1a</figr>). Comparative analysis of human genes in the vicinity of DXZ4 in the mouse genome revealed several differences in the gene order (Figure <figr fid="F1">1a</figr>), including a break point between the mouse <it>PLS3 </it>and <it>AGTR2 </it>orthologs <it>Pls3 </it>and <it>Agtr2 </it>and between <it>Pls3 </it>and the mouse ortholog of <it>HTR2C</it>. In mouse, the nearest proximal gene to <it>Pls3 </it>is <it>Rab39b </it>(&gt;200 kb proximal), whereas the nearest distal gene is <it>Tbl1x</it>, located 1.8 Mb distal to <it>Pls3</it>. In humans, the respective orthologs of these two genes are located &gt;39 Mb distal to and &gt;20 Mb proximal to <it>PLS3</it>, indicating that <it>Pls3 </it>alone defines the block of synteny for this region with the human X chromosome. In primates, DXZ4 is a homogenous tandem repeat <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B30">30</abbr></abbrgrp>; we therefore performed pair-wise alignments of the genomic DNA sequence upstream of <it>Agtr2 </it>and downstream of <it>Pls3 </it>to look for evidence of tandem repeat DNA. Approximately 150 kb upstream of <it>Agtr2</it>, we identified an inverted repeat (Figure <figr fid="F1">1b</figr>) but no obvious tandem repeats. 
In contrast, pairwise alignments of the genomic DNA sequence distal to <it>Pls3 </it>identified an extensive tandem repeat spanning approximately 35 kb located 19 kb 3' to <it>Pls3 </it>(Figure <figr fid="F1">1c</figr>). In addition, an extensive minisatellite sequence spanning approximately 30 kb was located a further 24 kb downstream of the tandem repeat. The minisatellite was composed primarily of a novel gamma satellite sequence that is interrupted by several L1 and SYNREP repetitive elements, and the sequence itself displayed an inversion almost midway through the locus (Additional file <supplr sid="S1">1a</supplr>). This downstream repeat showed significant sequence matches only to the mouse X chromosome, and no homologous repeat exists on the human or rat X chromosomes (data not shown). We therefore focused primarily on the 35-kb tandem repeat.</p>
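<p>The idea behind using a self-alignment to reveal tandem repeat DNA, as described above, can be illustrated with a much simpler stand-in. The following Python sketch is not YASS and is not the procedure used in this study; it only records the spacing between consecutive occurrences of identical k-mers and reports the most frequent spacing as a candidate repeat-unit length. The function name candidate_period, the k-mer length, and the toy sequence are assumptions made for illustration.</p>

```python
from collections import Counter

def candidate_period(seq, k=8):
    """Most frequent spacing between consecutive identical k-mers, or None.

    A crude stand-in for a self dot plot: in a tandem array, the same
    k-mer tends to recur at a spacing equal to the repeat-unit length.
    """
    last_seen = {}    # k-mer -> start position of its most recent occurrence
    gaps = Counter()  # spacing -> number of times that spacing was observed
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in last_seen:
            gaps[i - last_seen[kmer]] += 1
        last_seen[kmer] = i
    if not gaps:
        return None
    return gaps.most_common(1)[0][0]

# Toy array: five exact copies of a hypothetical 12-bp unit (not Dxz4)
toy_array = "ACGTTGCAAGCT" * 5
print(candidate_period(toy_array))
```

<p>Exact k-mer matching is sensitive to sequence divergence and to variation in monomer length, so for real genomic sequence this sketch would be expected to give a noisier answer than a seeded alignment tool.</p>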
<fig id="F1"><title><p>Figure 1</p></title><caption><p>Genomic characterization of the mouse Dxz4 locus</p></caption><text>
   <p><b>Genomic characterization of the mouse Dxz4 locus</b>. <b>(a) </b>Ideograms of the human (HSAX) and mouse (MMUX) X chromosomes. Regions relevant to the search for Dxz4 are expanded to the right of the chromosome. Genes are represented by solid arrows pointing in the direction of transcription. Length represents extent of the gene. Human DXZ4 is represented as the red box. The location of the putative Dxz4 homolog and the downstream tandem repeat are highlighted proximal to mouse <it>Pls3 </it>as red and blue boxes, respectively. <b>(b) </b>Pair-wise alignment of approximately 360 kb (scale in kilobases given on the y-axis) downstream of the mouse <it>Agtr2 </it>gene (20.7 to 21.1 Mb, mm9, indicated for the x-axis). Sequence similarity is shown in blue with inverted similarity in yellow. Black bars on the top and left edges indicate extensive repeats. <b>(c) </b>Pair-wise alignment of approximately 240 kb encompassing the <it>Pls3 </it>gene (72.9 to 73.1 Mb, mm9) and distal sequence. <b>(d) </b>Pairwise alignment of the 36-kb mouse Dxz4 array. The block arrows on the top and left edges represent Dxz4 tandem repeat monomers. <b>(e) </b>Pairwise alignment of the largest and smallest Dxz4 monomers (block arrows on top and left edges) highlighting the existence of an internal variable number tandem repeat (VNTR) represented by the black arrows above the blue boxes. Perpendicular black lines within the monomers indicate the locations of simple repeats. <b>(f) </b>Extended DNA fiber fluorescence <it>in situ </it>hybridization (FISH) of the Dxz4 array. At the top is a schematic of a single Dxz4 monomer. The regions of Dxz4 used to generate direct-labeled FISH probes are indicated to the left (red) and right (green) of the VNTR (blue). Immediately below are examples of hybridization results. 
All pairwise alignments used the DNA sequence compared with a repeat-masked version of itself with the exception of that shown in (c), which compared non-repeat-masked sequences to show the inverted satellite repeat. Alignments were all made with YASS <abbrgrp><abbr bid="B71">71</abbr></abbrgrp>, and the output was pseudocolored to avoid red-green.</p>
</text><graphic file="gb-2012-13-8-r70-1"/></fig><suppl id="S1">
<title><p>Additional file 1</p></title>
<text><p><b>Genomic organization and expression of the downstream tandem repeat</b>. The pair-wise alignment and repeat content of the Ds-TR as well as expression as demonstrated by RT-PCR.</p></text><file name="gb-2012-13-8-r70-S1.PDF">
   <p>Click here for file</p>
</file></suppl><p>We next asked how frequently a tandem repeat of comparable size (35 kb) occurs on the mouse X chromosome, to determine whether detecting such a sequence downstream of <it>Pls3 </it>would be likely to occur by chance. Pair-wise alignments along the length of the mouse X chromosome indicated that large tandem repeats are not common (Additional file <supplr sid="S2">2</supplr>), supporting the possibility that this might be the mouse homolog of DXZ4.</p><suppl id="S2">
<title><p>Additional file 2</p></title>
<text><p><b>X chromosome tandem repeat survey</b>. Table summarizing large tandem repeat elements along the mouse X chromosome.</p></text><file name="gb-2012-13-8-r70-S2.PDF">
   <p>Click here for file</p>
</file></suppl><p>Pair-wise alignment of the tandem repeat sequence revealed that, unlike DXZ4 in primates, where repeat units are very similar in size within a species <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B30">30</abbr></abbrgrp>, the individual repeating units of the mouse tandem repeat varied from 3.8 to 5.7 kb (Figure <figr fid="F1">1d</figr>). Closer examination showed that the size variation was accounted for by the presence of an internal variable number tandem repeat (VNTR) of an approximately 900-bp sequence present in one to three copies per monomer (Figure <figr fid="F1">1e</figr>). As in primate DXZ4 <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B17">17</abbr><abbr bid="B30">30</abbr></abbrgrp>, less than 6% of the smallest monomer DNA sequence (3.8 kb) was repeat masked, and all of the masked regions corresponded to simple repeats. Examination of the largest monomer (5.7 kb) revealed that the first 147 bp of the internal VNTR was derived from an ERV class II long terminal repeat and that the other edge of the VNTR was defined by a simple repeat. The location of these repeat sequences may contribute to the observed copy-number variation. Three other defining features of human DXZ4 were examined for the novel mouse tandem repeat: CpG content, sequence variation between monomers, and size of the tandem array. Human DXZ4 DNA is 62.2% GC, contains 186 CpG dinucleotides per monomer <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>, and shows less than 1% sequence divergence between adjacent monomers <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. In contrast, the mouse 3.8-kb monomer is 53.4% GC, contains 36 CpG dinucleotides, and shows greater than 5% sequence divergence from other monomers in the tandem array. In primates, DXZ4 is composed of as many as 100 repeat units spanning hundreds of kilobases on the X chromosome <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B17">17</abbr></abbrgrp>. 
In the current build of the mouse genome, the tandem repeat is composed of approximately seven repeat units. Given the inherent difficulty with the computer-based assembly of tandem repeats <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>, the actual array could be more extensive. We have previously used extended DNA fiber fluorescence <it>in situ </it>hybridization (FISH) to confirm tandem arrangement and copy-number variation of human DXZ4 <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. We applied the same procedure to examine such variation in the mouse tandem repeat, revealing approximately six tandem repeats in two independent mouse cell lines (Figure <figr fid="F1">1f</figr>). This result suggested that the mouse tandem repeat is relatively small, and the presence of the tandem repeat and extensive flanking DNA sequences entirely within the inserts of at least ten independent mouse bacterial artificial chromosomes from three different libraries derived from two <it>Mus musculus </it>subspecies lends additional support (Additional file <supplr sid="S3">3</supplr>). The logical interpretation of these observations was that the mouse sequence downstream of <it>Pls3 </it>is a tandem repeat but that the overall copy number of repeat units is low, resulting in a smaller array than in primates. Despite these differences from primate DXZ4, the tandem repeat remains a good candidate for the mouse homolog and from this point forward is referred to as Dxz4.</p><suppl id="S3">
<title><p>Additional file 3</p></title>
<text><p><b>Mouse BAC clones encompassing Dxz4</b>. BAC clones that completely span the mouse Dxz4 tandem repeat.</p></text><file name="gb-2012-13-8-r70-S3.PDF">
   <p>Click here for file</p>
</file></suppl>
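<p>The composition metrics compared above (percent GC and the number of CpG dinucleotides per monomer) can be computed directly from a monomer sequence. The following Python sketch is illustrative only; the function name composition and the toy sequence are assumptions and are not taken from this study.</p>

```python
def composition(seq):
    """Return (percent GC, number of CpG dinucleotides) for a DNA sequence."""
    s = seq.upper()
    if not s:
        raise ValueError("empty sequence")
    gc = s.count("G") + s.count("C")
    # "CG" cannot overlap itself, so str.count gives the exact CpG count
    cpg = s.count("CG")
    return 100.0 * gc / len(s), cpg

# Toy sequence for illustration (hypothetical, not a Dxz4 monomer)
toy = "ACGTCGGGAT"
gc_percent, cpg_count = composition(toy)
print(f"{gc_percent:.1f}% GC, {cpg_count} CpG dinucleotides")
```

<p>Applied to a full monomer sequence, the same function would return the two values that are compared between human DXZ4 and the mouse 3.8-kb monomer in the text.</p>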
</sec>
<sec><st><p>Expression of Dxz4</p></st>
<p>Primate DXZ4 is expressed, and all regions of a monomer can be detected in complementary DNA (cDNA) <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B22">22</abbr><abbr bid="B30">30</abbr></abbrgrp>. Six regions of Dxz4 were assessed in cDNA from several different mouse total-RNA sources. The example shown in Figure <figr fid="F2">2a</figr> indicates that mouse Dxz4 was also expressed and that all parts of the Dxz4 monomer are transcribed into RNA. This result was confirmed by RNA FISH showing readily detectable Dxz4 primary transcript by means of direct-labeled Dxz4 probes (Figure <figr fid="F2">2b</figr>). In humans, DXZ4 is primarily transcribed from one strand (since designated the sense strand), but antisense transcript can be detected in females and is therefore interpreted as originating from the Xi <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. Our previous data showed that only sense transcript could be detected in macaque <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. To assess the relative frequencies of sense and antisense transcription of Dxz4, we primed male and female cDNA from total RNA using oligonucleotides that would prime sense or antisense cDNA synthesis. As in macaque, only sense transcript was readily detected (Figure <figr fid="F2">2c</figr>). In humans, DXZ4 transcript can be detected from the Xa and the Xi <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B22">22</abbr></abbrgrp>, although expression in macaque is almost exclusively restricted to the Xa <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. RNA FISH was performed on female mouse cells with a direct-labeled Dxz4 probe and a probe to the X inactive specific transcript (Xist) <abbrgrp><abbr bid="B40">40</abbr><abbr bid="B41">41</abbr></abbrgrp> to define the location of the Xi (Figure <figr fid="F2">2d</figr>). As in macaque, Dxz4 could only be readily detected from the Xa (Figure <figr fid="F2">2e</figr>). 
Collectively, we interpret these data to indicate that expression of Dxz4 is restricted to the Xa allele and to one strand.</p>
<fig id="F2"><title><p>Figure 2</p></title><caption><p>Characterization of unspliced Dxz4 transcript</p></caption><text>
   <p><b>Characterization of unspliced Dxz4 transcript</b>. <b>(a) </b>Schematic map of a Dxz4 monomer. The internal VNTR is represented by the black box. Below it are indicated six intervals (i to vi) assessed by reverse-transcription PCR (RT-PCR). The RT-PCR results for i to vi are given as images of ethidium bromide-stained agarose gels for NIH/3T3 complementary DNA (cDNA). Samples include water (W), RNA incubated with (+RT) and without (-RT) reverse transcriptase, and genomic DNA. <b>(b) </b>RNA FISH results of direct-labeled Spectrum-Orange or Spectrum-Green probes for regions i to vi in NIH/3T3 cells. Signals are indicated by white arrows merged with DAPI (black and white). <b>(c) </b>Strand-specific quantitative RT-PCR analysis of Dxz4 expression in two independent male and female samples. Graph shows fold expression of Dxz4 in sense (left) and anti-sense (right) primed cDNA relative to cDNA prepared with no gene-specific primer. Error bars show standard deviation. <b>(d) </b>RNA FISH analysis of unspliced Dxz4 (red) and Xist RNA (green) merged with DAPI (black and white) in female cells. Dxz4 is indicated by the white arrowheads and inactive X chromosome-specific transcript (Xist) by the white arrows. <b>(e) </b>Frequency of Dxz4 RNA FISH signals overlapping Xist in female cells.</p>
</text><graphic file="gb-2012-13-8-r70-2"/></fig>
<p>Examination of the GenBank mouse mRNA annotation for the Dxz4 locus on the UCSC Genome Browser <abbrgrp><abbr bid="B36">36</abbr></abbrgrp> revealed the presence of two alternatively spliced transcripts spanning Dxz4. Both transcripts originated at an exon almost 2.2 kb from the distal edge of the array (Figure <figr fid="F3">3a</figr>). Each transcript was then spliced to the same 163-nucleotide sequence within each of the monomers before being spliced to an exon located 1.1 kb proximal to the array. One of the two spliced transcripts was further spliced to two additional exons approximately 16.0 kb downstream, whereas the other read through the splice site before terminating after a further 2.0 kb. To confirm the existence of the spliced forms of Dxz4, we performed reverse-transcription PCR (RT-PCR) between different combinations of the exons. The anticipated product was detected for each of the RT-PCR experiments (Figure <figr fid="F3">3b</figr>). Furthermore, the RT-PCR confirmed that the transcript contains multiple copies of the 163-nucleotide exon, as can be seen from the laddered effect of progressively larger PCR products (see the PCR of exon 1 to 2 or 2 to 3 as examples). This exon was also alternatively spliced, with some transcripts omitting one or more 163-nucleotide exons. This result could be observed as smaller laddered bands when RT-PCR was performed across the entire array (Figure <figr fid="F3">3b</figr>, exon 1 to 9/10).</p>
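<p>The laddered RT-PCR products described above follow simple arithmetic: each additional copy of the 163-nucleotide exon retained in a transcript adds 163 bp to the product. The following Python sketch lists the expected ladder under that assumption. The function name ladder_sizes and the flanking amplicon length used in the example are hypothetical and are not values from this study.</p>

```python
EXON_LEN = 163  # length of the repeated internal exon reported in the text

def ladder_sizes(flank_bp, max_copies, exon_len=EXON_LEN):
    """Expected product sizes (bp) for 0..max_copies copies of the repeated exon.

    flank_bp is the part of the amplicon contributed by the flanking exons
    between the two primers (a hypothetical value in the example below).
    """
    if 0 > flank_bp or 0 > max_copies:
        raise ValueError("flank_bp and max_copies must be non-negative")
    return [flank_bp + n * exon_len for n in range(max_copies + 1)]

# Hypothetical 200-bp flank with up to three copies of the internal exon
print(ladder_sizes(200, 3))
```

<p>Transcripts that omit one or more copies of the exon would fall on lower rungs of the same ladder, which is consistent with the smaller laddered bands noted in the text.</p>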
<fig id="F3"><title><p>Figure 3</p></title><caption><p>Expression of spliced Dxz4 and promoter characterization</p></caption><text>
   <p><b>Expression of spliced Dxz4 and promoter characterization</b>. <b>(a) </b>Schematic map of the Dxz4 region representing 72.95 to 73.01 Mb of the mouse X chromosome (mm9). The map is inverted for simplicity and the distal direction toward <it>Pls3 </it>indicated. Open block arrows represent Dxz4 monomers. A downstream CGI is indicated. Immediately below is a map indicating location and type of repeat elements for the interval: LINE, long interspersed nuclear element; LTR, long terminal repeat; SINE, short interspersed nuclear element. Below that are the maps of two putative alternatively spliced transcripts based on expressed sequence tag evidence. <b>(b) </b>Confirmation of spliced transcripts by RT-PCR. Each of the seven panels is an image of an ethidium bromide-stained agarose gel showing RT-PCR results for PCR between the exons indicated above. To the left of each image is the predicted product size. Samples include water control (W) and RNA incubated with (+RT) and without (-RT) reverse transcriptase. <b>(c) </b>DNA sequence feature map of the 1.3-kb region immediately upstream of Dxz4 exon 1 (green). Repetitive elements are indicated above the corresponding colored boxes. Immediately below are the regions cloned upstream of a promoterless luciferase reporter gene: construct A (Con.A) and construct B (Con.B). <b>(d) </b>Luciferase activity measured in NIH/3T3 cell extracts 72 hours after transfection with the promoterless luciferase vector (pGL4.10) or the same vector containing inserts for construct A or B. Fold activation of luciferase is shown to the left. Data represent the mean and standard deviation of replicate experiments each performed in triplicate.</p>
</text><graphic file="gb-2012-13-8-r70-3"/></fig>
<p>Both the spliced and unspliced transcripts corresponded to the sense transcript, and therefore probably originated from a common promoter, unlike human DXZ4, which contains a region with promoter activity within each monomer <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. Examination of histone modification profiles from the Encyclopedia of DNA Elements (ENCODE) <abbrgrp><abbr bid="B42">42</abbr></abbrgrp> revealed a distinct peak of histone H3 trimethylated at lysine 4 (H3K4me3) <abbrgrp><abbr bid="B43">43</abbr></abbrgrp> in the vicinity of exon 1 (data not shown). H3K4me3 is a modification associated with transcriptional start sites <abbrgrp><abbr bid="B44">44</abbr></abbrgrp>. We therefore cloned the DNA sequence 5' of Dxz4 exon 1 immediately upstream of a promoterless luciferase reporter gene. Two constructs were generated. The first consisted of a 1.2-kb sequence that contained several repetitive elements that are located immediately upstream of exon 1 (Figure <figr fid="F3">3c</figr>). The second construct consisted of a 238-bp unique sequence 5' of exon 1. Robust promoter activity was detected for both constructs (Figure <figr fid="F3">3d</figr>); the highest activity consistently originated from the smaller unique sequence construct, confirming the location of the minimal Dxz4 promoter.</p>
<p>We next tested whether the Dxz4 tandem repeat possessed intrinsic promoter activity like human DXZ4 <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. Two overlapping fragments encompassing a complete Dxz4 monomer were PCR amplified (Additional file <supplr sid="S4">4a</supplr>), TA cloned and sequence verified. The DNA was then subcloned upstream of the promoterless luciferase reporter gene and assessed for promoter activity alongside the Dxz4 minimal promoter described above. Neither Dxz4 fragment showed obvious activity, whereas the Dxz4 minimal promoter consistently activated luciferase more than 200-fold over the empty vector (Additional file <supplr sid="S4">4b</supplr>). Therefore, our interpretation of this result is that both the spliced and unspliced Dxz4 transcripts likely originate from transcription initiating from the minimal promoter. Consequently, it should be possible to detect by RT-PCR a transcript that spans exon 1 directly to the tandem repeat (Additional file <supplr sid="S5">5a</supplr>). Despite the relatively large size (approximately 2.5 kb) and proximity to the very 5' end of the message, this transcript could be detected in cDNA (Additional file <supplr sid="S5">5b</supplr>).</p><suppl id="S4">
<title><p>Additional file 4</p></title>
<text><p><b>Assessing Dxz4 for promoter activity</b>. Assessment of mouse Dxz4 for internal promoter activity.</p></text><file name="gb-2012-13-8-r70-S4.PDF">
   <p>Click here for file</p>
</file></suppl><suppl id="S5">
<title><p>Additional file 5</p></title>
<text><p><b>Dxz4 exon1 to tandem repeat monomer RT-PCR</b>. The presence of primary transcript bridging exon 1 of Dxz4 into the tandem repeat.</p></text><file name="gb-2012-13-8-r70-S5.PDF">
   <p>Click here for file</p>
</file></suppl><p>When the H3K4me3 profile of Dxz4 was examined, an additional major peak was noticed immediately distal to the downstream inverted tandem repeat (Ds-TR; data not shown), suggesting promoter activity within this region and the possibility that, like Dxz4, the Ds-TR is expressed. RT-PCR confirmed expression of Ds-TR in both male and female samples (Additional file <supplr sid="S1">1b</supplr>).</p>
</sec>
<sec><st><p>CpG methylation analysis in and around Dxz4</p></st>
<p>DXZ4 is unusual in that CpG dinucleotides are hypermethylated on the Xa but hypomethylated on the Xi <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B22">22</abbr></abbrgrp>, a trait that is conserved in macaque <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. We selected several regions in and around Dxz4 at which to determine and compare the CpG methylation profiles of males and females (Figure <figr fid="F4">4a</figr>). These sites included the Dxz4 promoter, a region of relatively high CpG incidence within the Dxz4 internal VNTR, a CGI immediately downstream of the Dxz4 array (DD-CGI), and two regions in the vicinity of the H3K4me3 peak adjacent to the Ds-TR.</p>
<fig id="F4"><title><p>Figure 4</p></title><caption><p>DNA methylation of elements in the vicinity of Dxz4</p></caption><text>
   <p><b>DNA methylation of elements in the vicinity of Dxz4</b>. <b>(a) </b>Schematic map of the region encompassing Dxz4 and the downstream satellite repeat (diverging open arrows). Left-pointing arrows represent Dxz4, and the location of the Dxz4 promoter and CGI are indicated. The red boxes indicate regions assessed for DNA methylation by PCR of bisulfite-modified DNA, cloning, and sequencing. The location of bisulfite analysis within the Dxz4 array is shown for a single monomer immediately below the array. <b>(b) </b>Cytosine methylation at CpG dinucleotides for the five regions shown in (a). Data are given for two independent males (top) and two independent females (bottom). Methylated cytosine is represented by a black circle whereas unmethylated is represented by an open circle. DNA variants that result in a sequence that is no longer a CpG are represented by dashes. Each row of circles represents DNA sequence obtained from a single clone, and each set of data consists of at least nine independent clones.</p>
</text><graphic file="gb-2012-13-8-r70-4"/></fig>
<p>The Dxz4 promoter (Figure <figr fid="F4">4b</figr>, far right) showed a significantly higher percentage of CpG methylation in females than in males (<it>P </it>= 0.0052, two-sample <it>t</it>-test). This result is consistent with our expression analysis (Figure <figr fid="F2">2</figr>), suggesting that transcription of Dxz4 is subject to XCI <abbrgrp><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr></abbrgrp> and explaining why Dxz4 transcript was only detected from the Xa (Figure <figr fid="F2">2e</figr>).</p>
<p>Males and females did not differ significantly in methylation of the sequence closest to the Ds-TR (profile on far left in Figure <figr fid="F4">4b</figr>; <it>P </it>= 0.7580) or in the region immediately distal to it (<it>P </it>= 0.0577), but the two regions differed vastly; the proximal sequence was almost entirely methylated, and the distal sequence hypomethylated, on both X chromosomes. Both sites overlap a broad signal of H3K4me3 (data not shown), but examination of other ENCODE features <abbrgrp><abbr bid="B42">42</abbr></abbrgrp> at these two regions revealed that the hypomethylated sequence overlapped a major peak of occupancy for Ctcf <abbrgrp><abbr bid="B45">45</abbr></abbrgrp> and a DNaseI hypersensitive site <abbrgrp><abbr bid="B46">46</abbr></abbrgrp>, whereas the hypermethylated site did not (Additional file <supplr sid="S6">6</supplr>). Binding of Ctcf to target sites containing CpG is sensitive to methylation <abbrgrp><abbr bid="B47">47</abbr><abbr bid="B48">48</abbr></abbrgrp>. The hypomethylation in males and females suggests that Ctcf has the potential to bind this region on both the Xa and the Xi.</p><suppl id="S6">
<title><p>Additional file 6</p></title>
<text><p><b>CpG methylation relative to Ctcf and DNaseI hypersensitivity</b>. The location of a Ctcf and DNaseI peak relative to CpG methylation immediately adjacent to the Ds-TR.</p></text><file name="gb-2012-13-8-r70-S6.PDF">
   <p>Click here for file</p>
</file></suppl><p>Males and females did differ significantly in CpG methylation at the Dxz4 array (<it>P </it>= 0.0027) similar to what we and others have reported for primate DXZ4 <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B22">22</abbr><abbr bid="B30">30</abbr></abbrgrp>. However, many sites of CpG residues predicted on the basis of the reference genome sequence (mm9) are not conserved, as demonstrated by the numerous gaps in the bisulfite profiles. Methylated cytosine in CpG is prone to mutation by deamination, whereas mutation rates of unmethylated CpG are lower <abbrgrp><abbr bid="B49">49</abbr></abbrgrp>. As a consequence, hypomethylated CGIs are evolutionarily conserved <abbrgrp><abbr bid="B50">50</abbr></abbrgrp>. The apparent lack of conservation of CpG dinucleotides at Dxz4 is consistent with the overall hypermethylated profiles (Figure <figr fid="F4">4b</figr>). This situation differs from that of primate DXZ4, where CpG residues are highly conserved <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B30">30</abbr></abbrgrp>, consistent with evolutionary maintenance of DXZ4 as an extensive CGI <abbrgrp><abbr bid="B50">50</abbr></abbrgrp>.</p>
<p>Furthermore, males and females differed significantly in methylation at DD-CGI (<it>P </it>&lt; 0.0001); more hypomethylated clones were obtained from the female samples (Figure <figr fid="F4">4b</figr>). Our interpretation of these data is that DD-CGI is hypomethylated on the Xi. DD-CGI spans 333 bp and contains 40 CpGs on the basis of the C57BL/6J reference genome sequence (mm9). None of the genomic feature annotations generated by ENCODE <abbrgrp><abbr bid="B42">42</abbr></abbrgrp>, including Ctcf, highlight DD-CGI, and therefore the significance of Xi hypomethylation remains unclear.</p>
</sec>
<sec><st><p>Histone methylation and Ctcf association with sequences in the vicinity of Dxz4</p></st>
<p>Next we sought to complement the DNA methylation analysis by examining histone methylation and Ctcf binding in and around Dxz4. Several sites were selected, including the Dxz4 promoter, the Dxz4 VNTR region, DD-CGI, Ds-TR, and the <it>Pls3 </it>promoter as a control for mouse genes subject to XCI <abbrgrp><abbr bid="B34">34</abbr></abbrgrp> (Figure <figr fid="F5">5a</figr>).</p>
<fig id="F5"><title><p>Figure 5</p></title><caption><p>Characterization of chromatin in the vicinity of Dxz4</p></caption><text>
   <p><b>Characterization of chromatin in the vicinity of Dxz4</b>. <b>(a) </b>Schematic map of the region encompassing Dxz4 and the downstream satellite repeat (diverging open arrows). Left-pointing arrows represent Dxz4, and the location of the CGI and promoters for Dxz4 and <it>Pls3 </it>are indicated. The angled double strike through the map between the Dxz4 and <it>Pls3 </it>promoters represents an approximately 114-kb gap. The red boxes indicate the regions assessed by chromatin immunoprecipitation (ChIP)-PCR. <b>(b) </b>Graphs showing results of ChIP assayed by quantitative PCR. The mean and standard deviation for the ChIP elution (IP) and for a negative control rabbit serum (RS) are shown as percentage of the input. For H3K4me2 and H3K27me3 at the Dxz4 promoter (Dxz4-Prom) and <it>Pls3 </it>promoter (Pls3-Prom), data for one male and one female are shown. For H3K4me2 and Ctcf at Dxz4, Ds-TR and DD-CGI, data are shown for two independent male and female samples. <b>(c) </b>Pie charts showing the percentage of C57BL/6J (B6) or castaneus (Cast) informative allele calls for Ctcf ChIP-Seq fragments for Dxz4 and the downstream tandem repeat (Ds-TR) Ctcf binding sites.</p>
</text><graphic file="gb-2012-13-8-r70-5"/></fig>
<p>Consistent with the expression analysis (Figure <figr fid="F2">2</figr>) and CpG methylation (Figure <figr fid="F4">4b</figr>), the Dxz4 promoter was characterized by the euchromatin mark H3K4me2 in male and female cells, whereas the facultative heterochromatin marker histone H3 trimethylated at lysine 27 (H3K27me3) was only a feature of the female samples (Figure <figr fid="F5">5b</figr>). The same profile is obtained for the <it>Pls3 </it>promoter, which is subject to XCI in mouse <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. Given that genes on the Xi are silenced by H3K27me3 <abbrgrp><abbr bid="B51">51</abbr><abbr bid="B52">52</abbr></abbrgrp>, these data further support the conclusion that Dxz4 expression is subject to XCI.</p>
<p>In primates, H3K4me2 is a feature of DXZ4 on the Xi <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B30">30</abbr></abbrgrp>, although this modification can be detected on the male X at low levels in some individuals and as a result of cellular transformation <abbrgrp><abbr bid="B53">53</abbr></abbrgrp>. In contrast, H3K4me2 was readily detected at Dxz4 in males and females (Figure <figr fid="F5">5b</figr>), another difference between mouse and primate DXZ4. Somewhat surprisingly, given the methylation profile at DD-CGI (Figure <figr fid="F4">4b</figr>), H3K4me2 could also be detected at this site in males and females. One possible explanation is that the DD-CGI is located within the transcriptional unit of one of the spliced Dxz4 transcripts (Figure <figr fid="F3">3a</figr>). Therefore, the detection of the euchromatin mark may reflect variable levels of H3K4me2 in the body of active genes <abbrgrp><abbr bid="B44">44</abbr></abbrgrp>.</p>
<p>A defining feature of primate DXZ4 is the association of CTCF with the Xi allele <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B30">30</abbr></abbrgrp>. Ctcf was readily detected at Dxz4 in multiple independent female samples, but Ctcf was also detected, albeit at lower levels, in some but not all males (Figure <figr fid="F5">5b</figr> and data not shown). To investigate further the relationship between Ctcf and Dxz4 on the Xa and Xi, we examined DNA sequence reads from Ctcf chromatin immunoprecipitation (ChIP) combined with next-generation sequencing (ChIP-Seq) performed on trophoblast stem cells (TSCs), which are derived from the extraembryonic material and undergo imprinted XCI with preferential inactivation of the paternal X chromosome <abbrgrp><abbr bid="B54">54</abbr></abbrgrp>. The TSCs were derived from a cross of a male C57BL/6J (BL6) with a female castaneus (cast) mouse. As a result, the BL6 X chromosome will be the Xi. ChIP-Seq reads were compared to BL6 and cast variant sequences for the Dxz4 interval assessed by ChIP-PCR and, where informative, were designated as originating from the Xa (cast) or Xi (BL6). Of 152 ChIP-Seq reads, approximately half were assigned to the Xa and half to the Xi (Figure <figr fid="F5">5c</figr>), consistent with detection of Ctcf at the Xa in some males. One interpretation of these data is that Ctcf binds Dxz4 at the Xa and Xi equally, but the failure to detect Ctcf at Dxz4 in all males, even when it is readily detected in the same samples at a known Ctcf binding site within the H19 imprinted control region <abbrgrp><abbr bid="B47">47</abbr><abbr bid="B48">48</abbr></abbrgrp>, suggests that binding of Ctcf to Dxz4 varies. This variation could reflect subtle differences in CpG methylation (compare the two male bisulfite profiles in Figure <figr fid="F4">4b</figr>), or strain or cell-type differences. 
Nevertheless, these observations are consistent with the differences we report above for Dxz4 chromatin organization at the Xa and Xi between mouse and primates. Notably, the association of Ctcf within the VNTR region means that although the array itself is relatively small, the potential Ctcf occupancy is higher than one per repeat monomer.</p>
<p>As mentioned above, the unique sequence (Ds-TR) located immediately distal to the large inverted satellite repeat (Figure <figr fid="F5">5a</figr>; Additional file <supplr sid="S1">1</supplr>) is characterized by DNaseI hypersensitivity and Ctcf binding (Additional file <supplr sid="S6">6</supplr>). Ctcf ChIP-PCR confirmed association with this sequence in males and females (Figure <figr fid="F5">5b</figr>), and as anticipated given the CpG hypomethylation (Figure <figr fid="F4">4b</figr>), the region was characterized by H3K4me2. To determine whether Ctcf at Ds-TR is associated with the Xa alone or with Xa and Xi, we used informative BL6 and cast SNPs to assign Ctcf ChIP-Seq reads to their X chromosome of origin. Unlike at Dxz4, Ctcf binding at Ds-TR was biased toward the Xa, although Ctcf was also detected on the Xi to a lesser extent (Figure <figr fid="F5">5c</figr>).</p>
</sec>
<sec><st><p>Conservation of a large tandem repeat downstream of <it>PLS3 </it>in mammals</p></st>
<p>Thus far we have shown that, as in primates <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B17">17</abbr><abbr bid="B22">22</abbr><abbr bid="B30">30</abbr></abbrgrp>, a large tandem repeat is present downstream of <it>Pls3 </it>on the mouse X chromosome despite extensive shuffling of the locations of genes from the same interval (Figure <figr fid="F1">1a</figr>). We sought to determine whether a tandem repeat was present downstream of <it>PLS3 </it>in a diverse set of mammals for which genome assemblies were sufficiently complete. Pairwise alignment of genomic sequence distal to <it>PLS3 </it>was performed for seven different mammals. Each revealed the presence of a tandem repeat within 28 to 110 kb of the 3' end of <it>PLS3 </it>(Figure <figr fid="F6">6</figr>).</p>
<fig id="F6"><title><p>Figure 6</p></title><caption><p>Identification of a tandem repeat downstream of <it>PLS3 </it>in eight different mammals</p></caption><text>
   <p><b>Identification of a tandem repeat downstream of <it>PLS3 </it>in eight different mammals</b>. Pairwise alignment of genomic DNA sequence encompassing and extending downstream of <it>PLS3 </it>for each mammal (labeled above each plot). The structure and location of <it>PLS3 </it>is indicated on the top and left edge of each alignment. Distance in kilobases is indicated to the right of each plot. The distance between the 3' end of <it>PLS3 </it>and the downstream tandem repeat is highlighted above each plot. The extent of the tandem repeat is highlighted by the black bar above and to the left of each plot. Arrows pointing down from the top or rightward from the left edge indicate gaps in the genome assembly.</p>
</text><graphic file="gb-2012-13-8-r70-6"/></fig>
</sec>
<sec><st><p>Conservation of the CTCF binding sequence at DXZ4</p></st>
<p>Previously we have shown that a region encompassing the CTCF binding site is conserved in primates, but outside of this interval divergence of the sequence of DXZ4 and size of the individual tandem repeat unit increases substantially with distance down the primate tree <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. Focusing only on this region, we identified 74% nucleotide identity over 100 bp between human DXZ4 and the VNTR region within each mouse Dxz4 monomer. Similar levels of nucleotide identity over the same interval were identified within the tandem repeat DNAs shown in Figure <figr fid="F6">6</figr>. We used this interval to extract homologous DNA sequence entries from 25 different mammals before aligning all of the sequences. Most of the mammals examined formed clades corresponding to their respective orders and suborders, such as the primates, which all branch from a single node (Figure <figr fid="F7">7a</figr>). These data support evolution of DXZ4 from a common ancestor in a manner analogous to that of coding sequences. Close examination of the DNA sequence alignment revealed a subregion of the conserved DNA sequence in which several nucleotides were identical in all 25 mammals. A 34-bp sequence encompassing all invariable nucleotides was extracted from each sequence and used to generate a position weight matrix <abbrgrp><abbr bid="B55">55</abbr></abbrgrp> that clearly revealed the nonrandom nature of this sequence (Figure <figr fid="F7">7b</figr>). 
Given that this sequence is entirely contained within the region assessed by PCR in primate CTCF ChIP <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B30">30</abbr></abbrgrp>, mouse Ctcf ChIP (Figure <figr fid="F5">5b</figr>), and mouse Ctcf ChIP-Seq (Figure <figr fid="F5">5c</figr>), the position weight matrix sequence was compared with a previously defined Ctcf consensus sequence <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>, and as can be seen in Figure <figr fid="F7">7b</figr>, the most conserved DXZ4 sequence in all mammals examined was a good match to this consensus.</p>
<fig id="F7"><title><p>Figure 7</p></title><caption><p>Identification of a conserved DNA sequence element with homology to a CTCF consensus sequence in mammalian DXZ4</p></caption><text>
   <p><b>Identification of a conserved DNA sequence element with homology to a CTCF consensus sequence in mammalian DXZ4</b>. <b>(a) </b>Schematic representation of a mouse Dxz4 monomer. The green arrowhead indicates the spliced exon. The blue vertical bars indicate repeat-masked sequence. The black bar represents the VNTR. The yellow box within the VNTR (bases 919 to 1,061) represents the conserved Dxz4 sequence. This sequence was used to align to the corresponding sequences from the mammals listed to generate the cladogram. The tree image was generated with MUSCLE version 3.8 <abbrgrp><abbr bid="B72">72</abbr></abbrgrp> and ClustalW2 <abbrgrp><abbr bid="B73">73</abbr></abbrgrp>. Classification of the groups is given to the right. <b>(b) </b>Schematic representation of a mouse Dxz4 monomer as above. The yellow box within the VNTR (bases 978 to 1,011) represents the DNA sequence that contains nucleotides invariable in all mammalian DXZ4 sequences assessed. This 34-bp sequence from each mammal was used to generate the position weight matrix through WebLogo <abbrgrp><abbr bid="B55">55</abbr></abbrgrp>. Beneath the matrix is a previously determined Ctcf consensus sequence that is adapted from Martin <it>et al. </it><abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. Note that the position weight matrix is the reverse complement of that shown in the referenced manuscript.</p>
</text><graphic file="gb-2012-13-8-r70-7" hint_layout="single"/></fig>
<p>The Ctcf match to the conserved sequence only accounts for bases 3 to 21, yet conservation of DNA sequence across the diverse group of mammals extends for an additional 13 bp. It is conceivable that this extended conservation reflects retention of an additional binding motif(s) for other DNA binding protein(s). To explore this possibility, the consensus sequence was compared to motifs in JASPAR <abbrgrp><abbr bid="B56">56</abbr></abbrgrp>. Two motifs showed good matches to this region. The first is a 9 out of 10 base match to the recently determined mouse consensus for the CCAAT/enhancer-binding protein alpha (Cebpa) <abbrgrp><abbr bid="B57">57</abbr></abbrgrp>, whereas the second is a 9 out of 9 base match to the human consensus for ETS-domain protein 4 (ELK4) <abbrgrp><abbr bid="B58">58</abbr></abbrgrp> (Additional file <supplr sid="S7">7</supplr>). Cebpa is a basic-leucine zipper DNA binding protein that performs essential roles in the development of myeloid cells <abbrgrp><abbr bid="B59">59</abbr></abbrgrp> and in liver function <abbrgrp><abbr bid="B60">60</abbr></abbrgrp>. ELK4 is a ubiquitous serum response factor accessory protein <abbrgrp><abbr bid="B61">61</abbr></abbrgrp> that is found at many locations in the genome <abbrgrp><abbr bid="B62">62</abbr></abbrgrp>. Whether either protein binds to Dxz4 has yet to be determined, but given the broad cross-species conservation of the DNA sequence and good matches with each DNA binding consensus sequence <abbrgrp><abbr bid="B57">57</abbr><abbr bid="B58">58</abbr></abbrgrp>, both are candidates worthy of further investigation.</p><suppl id="S7">
<title><p>Additional file 7</p></title>
<text><p><b>Motif alignments to Dxz4 conserved region</b>. Alignment of the Dxz4 conserved region with DNA binding protein motifs in JASPAR.</p></text><file name="gb-2012-13-8-r70-S7.PDF">
   <p>Click here for file</p>
</file></suppl>
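<p>As a rough illustration of how a position weight matrix summarizes conservation across aligned sequences, the following minimal sketch computes the per-position information content (in bits) for a set of equal-length DNA sequences. The function name and the toy sequences are hypothetical and are not the 34-bp sequences analyzed in this study; the calculation is the uncorrected Shannon measure, and sequence logo tools may additionally apply a small-sample correction that is omitted here.</p>

```python
from collections import Counter
from math import log2


def column_information(seqs):
    """Return information content (bits) per column of aligned DNA sequences.

    Assumes ungapped sequences of equal length over the alphabet A, C, G, T.
    A fully invariant column scores 2.0 bits; a column with all four bases
    at equal frequency scores 0.0 bits.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    bits = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        total = sum(counts.values())
        # Shannon entropy of the observed base frequencies in this column
        entropy = -sum((n / total) * log2(n / total) for n in counts.values())
        bits.append(2.0 - entropy)
    return bits


# Hypothetical toy alignment (illustration only, not study data)
toy_alignment = ["ACGT", "ACGA", "ACGC", "ACGG"]
print(column_information(toy_alignment))
```

<p>In this toy alignment the first three columns are invariant and the last column is fully variable, which mirrors the distinction between invariable nucleotides and unconstrained positions in a conserved element.</p>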
</sec>
</sec>
<sec><st><p>Conclusions</p></st>
<p>Comparative genomics is a powerful means of uncovering important functional DNA elements through DNA sequence conservation <abbrgrp><abbr bid="B63">63</abbr></abbrgrp>, but mouse Dxz4 was initially identified not through primary DNA sequence conservation but instead through conservation of DNA sequence organization within a syntenic region of the mouse genome. This work led to the subsequent identification of DXZ4 in a diverse group of distantly related mammals. DNA sequence comparisons revealed a highly conserved region within each DXZ4 monomer that corresponds to the CTCF binding motif that is bound by CTCF in all mammals tested thus far. Furthermore, the highly conserved sequence immediately adjacent to the Ctcf consensus site suggests a second DNA binding protein may associate alongside Ctcf. Therefore, on the basis of conservation, several features of DXZ4 appear to have functional importance in eutherian mammals: CTCF binding, tandem-repeat organization, expression, and location downstream of <it>PLS3</it>.</p>
<p>In primates CTCF association with DXZ4 is almost exclusively Xi-specific <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B30">30</abbr></abbrgrp>, yet the analysis of mouse Dxz4 we report here suggests that its chromosome specificity is not as clearly defined; it apparently binds to both the Xa and the Xi to varying degrees. Primates and mouse appear to differ in several other aspects of DXZ4. First, primate DXZ4 is composed of a large number of tandem repeat units in which adjacent repeat monomers share very high DNA sequence identity and length <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B30">30</abbr></abbrgrp>. The same is not true of mouse Dxz4. The tandem array is small in comparison, and individual repeat monomers display pronounced sequence variation and the presence of an internal VNTR. Perhaps near-identical sequence composition and monomer size are a prerequisite for expansion, whether through the complex gene conversion mechanisms reported for minisatellites <abbrgrp><abbr bid="B64">64</abbr></abbrgrp> or through alternative processes such as intrachromatid recombination or unequal exchange <abbrgrp><abbr bid="B65">65</abbr></abbrgrp>. Second, DXZ4 DNA sequence is GC-rich in primates <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B17">17</abbr><abbr bid="B22">22</abbr><abbr bid="B30">30</abbr></abbrgrp> but not in mouse. Third, DXZ4 in humans contains a DNA sequence with inherent promoter activity in each monomer <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. This sequence is not conserved in mouse and intrinsic promoter activity is not obvious within the Dxz4 monomers. Instead, a promoter located to one side of Dxz4 drives transcription across the entire array, but tandem repeat units in several other mammals do show substantial DNA sequence homology to human DXZ4 beyond the CTCF binding region, encompassing the promoter sequence. 
These include cat, dog, horse, elephant, dolphin, microbat, rabbit, and flying fox (data not shown), suggesting that these mammals will likely retain internal promoter activity, negating the need for the external promoter. Fourth, although all DXZ4 examined is transcribed <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B22">22</abbr><abbr bid="B30">30</abbr></abbrgrp>, at least some mouse Dxz4 is spliced, a feature not observed in primates. Finally, euchromatin is largely restricted to DXZ4 on the Xi in primates <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B30">30</abbr></abbrgrp> yet H3K4me2 is a feature of Dxz4 on the Xa in mouse. One feature that is consistent between the mouse and primate macrosatellite is a significantly higher incidence of CpG hypomethylation in females that we interpret as originating from the Xi. However, the overall profile is more methylated in mouse than in primates <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B22">22</abbr><abbr bid="B30">30</abbr></abbrgrp>. Conceivably, the hypermethylation of Dxz4 combined with lower overall GC content is accelerating mutation of CpG dinucleotides <abbrgrp><abbr bid="B66">66</abbr></abbrgrp>.</p>
<p>Collectively, these observations suggest that the functions performed by DXZ4 in primates are not all necessarily conserved in mouse. We hypothesize that primate DXZ4 has important but distinct roles on the Xa and Xi that both necessitate a large homogeneous tandem array. On the Xa this role involves expression and packaging into heterochromatin. Given the extreme copy-number variation of DXZ4 <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B17">17</abbr></abbrgrp>, the macrosatellite could conceivably modulate the transcription of the adjacent <it>PLS3 </it>gene, which shows considerable variation in expression levels between individuals <abbrgrp><abbr bid="B67">67</abbr></abbrgrp>. In contrast, on the Xi a euchromatic organization bound by CTCF is required. The fact that CTCF is central to mediating genome organization <abbrgrp><abbr bid="B68">68</abbr></abbrgrp>, and that, at least in humans, CTCF-bound DXZ4 mediates Xi-specific long-range intrachromosomal interactions with other Xi-specific CTCF-bound tandem repeats <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>, suggests that DXZ4 performs a structural role on the Xi. Mouse Dxz4 may or may not perform either function, and the difference could contribute to some of the observed differences between the biology of the human and mouse X chromosome, such as the variable escape of <it>PLS3 </it>expression from the Xi in humans <abbrgrp><abbr bid="B33">33</abbr></abbrgrp> but not in mouse <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. The distinct differences between DXZ4 and Dxz4 suggest that, if Dxz4 performs a similar function, it has evolved alternative strategies in order to do so. Nevertheless, the evolutionarily constrained association of CTCF/Ctcf with mammalian DXZ4 appears central even if conservation of function is not.</p>
</sec>
<sec><st><p>Materials and methods</p></st>
<sec><st><p>Cells</p></st>
<p>Mouse male fibroblast cell line NIH/3T3 (CRL-1658) and female fibroblast cell line Balb/3T3 (CCL-163) were obtained from ATCC. Mouse female fibroblast cell line BC06 (hybrid C57BL/6J &#215; castaneus) was obtained from Laura Carrel. Male and female CD-1 and C57BL/6J mouse embryonic fibroblasts were derived by standard techniques <abbrgrp><abbr bid="B69">69</abbr></abbrgrp>. All cells were maintained in Dulbecco's modified Eagle's medium containing 10% fetal bovine serum supplemented with 1&#215; nonessential amino acids, 2 mM L-glutamine, 100 U/ml penicillin, and 0.1 mg/ml streptomycin. All medium components were obtained from Invitrogen (Life Technologies Corp, Grand Island, NY, USA); NIH/3T3 cells were cultured in media containing Hyclone bovine calf serum (Thermo Scientific, Rockford, IL, USA) in place of fetal bovine serum.</p>
</sec>
<sec><st><p>Bisulfite modification of DNA, cloning and sequencing</p></st>
<p>Genomic DNA was isolated from primary cells with the NucleoSpin Tissue kit (Macherey-Nagel, Bethlehem, PA, USA). Genomic DNA was isolated from mouse tail snips by standard techniques <abbrgrp><abbr bid="B69">69</abbr></abbrgrp>. Unmethylated cytosines were converted to uracil with the EpiTect bisulfite modification kit (Qiagen, Valencia, CA, USA). Bisulfite-modified DNA was used as a template for PCR with OneTaq<sup>&#174;</sup> master mix (NEB, Ipswich, MA, USA) and the primers listed in Additional file <supplr sid="S8">8</supplr>. PCR products were cloned into pDrive TA vector (Qiagen), and positive clones were sequenced (Eurofins MWG Operon, Huntsville, AL, USA) and analyzed with Sequencher 5.0 (Gene Codes Corp., Ann Arbor, MI, USA). Statistically significant differences in methylation between males and females were determined as follows. The percent methylation for individual clones (a single horizontal line in the profiles) was determined, and the mean and standard deviation were calculated for the males and females. These were compared using the two-tailed <it>t</it>-test with differing variance as described previously for methylation profiles <abbrgrp><abbr bid="B70">70</abbr></abbrgrp>.</p><suppl id="S8">
<title><p>Additional file 8</p></title>
<text><p><b>Table listing all oligonucleotides used in this study</b>.</p></text><file name="gb-2012-13-8-r70-S8.PDF">
   <p>Click here for file</p>
</file></suppl>
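<p>The statistical comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in this study: the clone encoding (M for a methylated CpG, U for an unmethylated CpG, and a dash for a site that is no longer a CpG), the function names, and the example values are all hypothetical. The sketch computes percent methylation per clone and the unequal-variance (Welch) <it>t </it>statistic with its approximate degrees of freedom; a two-tailed <it>P </it>value would then be obtained from the <it>t </it>distribution with those degrees of freedom.</p>

```python
from math import sqrt
from statistics import mean, variance


def percent_methylation(clone):
    """Percent of informative CpG sites that are methylated in one clone.

    Hypothetical encoding: 'M' methylated, 'U' unmethylated, '-' site lost.
    Sites marked '-' are excluded from the calculation.
    """
    calls = [c for c in clone if c in "MU"]
    if not calls:
        raise ValueError("clone has no informative CpG sites")
    return 100.0 * calls.count("M") / len(calls)


def welch_t(a, b):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical clones (illustration only, not study data)
male_clones = ["MMMM-M", "MMMU-M", "MMMMUM"]
female_clones = ["MUUU-M", "UUMU-U", "MMUUUM"]
males = [percent_methylation(c) for c in male_clones]
females = [percent_methylation(c) for c in female_clones]
print(welch_t(males, females))
```

<p>Treating each clone as one observation, as in this sketch, matches the description above, in which percent methylation is determined for each horizontal line of a profile before the group means are compared.</p>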
</sec>
<sec><st><p>RNA and extended DNA fiber FISH</p></st>
<p>Mouse Dxz4 fragments were PCR amplified and cloned into the TA vector pCR2.1 (Life Technologies Corp.) before sequence verification. Direct-labeled FISH probes were generated from Dxz4-pCR2.1-isolated DNA with SpectrumOrange&#8482; or SpectrumGreen&#8482; and a nick translation kit (Abbott Molecular, Abbott Park, IL, USA). Probes were heat inactivated at 68&#176;C for 10 minutes before ethanol precipitation and resuspension in Hybrisol VII (MP Biomedicals, Santa Ana, CA, USA). RNA FISH was performed on cells grown directly on microscope slides. Cells were rinsed with 1&#215; phosphate-buffered saline (PBS) before being fixed and extracted for 10 minutes at room temperature in 3.7% formaldehyde, 0.1% Triton X-100 in 1&#215; PBS. Slides were rinsed twice in 1&#215; PBS, dehydrated for 3 minutes in 70% and 100% ethanol, and air-dried. Probes were denatured in a thermal cycler at 72&#176;C for 10 minutes before the temperature was reduced to 37&#176;C, at which point the probe was applied directly to the slide, sealed under a cover glass, and hybridized overnight at 37&#176;C. Cover slips were removed and the samples washed twice at room temperature for 2 minutes each in 50% formamide/2&#215; SSC, once for 3 minutes at 37&#176;C in 50% formamide/2&#215; SSC, and once for 3 minutes at 37&#176;C in 2&#215; SSC before addition of ProLong<sup>&#174;</sup> Gold antifade reagent supplemented with DAPI (Life Technologies Corp.). Mouse extended DNA fibers were prepared and FISH performed essentially as previously described <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. Images were collected either with a Zeiss Axiovert 200 M fitted with an AxioCam MRm and managed with AxioVision 4.4 software (Carl Zeiss MicroImaging) or with a DeltaVision pDV. DeltaVision images were deconvolved with softWoRx 3.7.0 (Applied Precision, Issaquah, WA, USA) and compiled with Adobe Photoshop CS2 (Adobe Systems).</p>
</sec>
<sec><st><p>Standard and strand-specific cDNA preparation and PCR</p></st>
<p>Total RNA was extracted from cells with the NucleoSpin RNA II kit (Macherey-Nagel). For standard RT-PCR, first-strand cDNA was prepared from 2 &#956;g of total RNA with random hexamers with and without M-MuLV reverse transcriptase (RT) according to the manufacturer's instructions (NEB). cDNAs prepared with and without RT were used as templates for PCR with either OneTaq<sup>&#174;</sup> master mix (NEB) or HotStar Taq (Qiagen) with the primers listed in Additional file <supplr sid="S8">8</supplr>. PCR was performed using an initial denaturation of 10 minutes at 94&#176;C, followed by 35 cycles of: 94&#176;C for 30 seconds, 58&#176;C for 30 seconds and 72&#176;C for 30 seconds for all products of up to 750 bp, 1 minute for all products up to 1,250 bp and 1 minute 30 seconds for products up to 2 kb. The cycling was followed by 10 minutes at 72&#176;C before holding at 15&#176;C. Strand-specific cDNA was prepared as above except that first-strand cDNA was primed with 1.5 pmol of a specified oligonucleotide (Additional file <supplr sid="S8">8</supplr>) in place of random hexamers, and an additional control that included RT but no oligonucleotide was used to determine the background levels of cDNA synthesized in the absence of a gene-specific primer. Strand-specific cDNA was assessed by quantitative RT-PCR using the primers given (Additional file <supplr sid="S8">8</supplr>) with a SYBR-Green qPCR Mastermix (SABiosciences, Qiagen) on a CFX96 (Bio-Rad, Hercules, CA, USA). PCR was performed using an initial 10-minute denaturing step at 95&#176;C followed by 40 cycles of: 15 seconds at 95&#176;C, 30 seconds at 60&#176;C and 30 seconds at 72&#176;C. The cycle was followed by a melt-curve. PCR was performed in triplicate and the transcript level determined relative to background.</p>
</sec>
<sec><st><p>Promoter luciferase assay</p></st>
<p>DNA fragments initiating in and extending upstream of Dxz4 exon 1 were generated by PCR with Platinum<sup>&#174;</sup>Taq (Life Technologies Corp.; 94&#176;C for 2 minutes followed by 40 cycles of: 94&#176;C for 30 seconds, 58&#176;C for 30 seconds and 68&#176;C for 1 minute 20 seconds for construct A or 68&#176;C for 30 seconds for construct B) and cloned into pDrive (Qiagen). Inserts were verified by DNA sequencing before subcloning into the <it>Kpn</it>I and <it>Xho</it>I sites of pGL4.10[luc2] (Promega, Madison, WI, USA). The Dxz4-promoter pGL4.10[luc2] firefly luciferase reporter constructs were co-transfected in triplicate on two separate occasions with the <it>Renilla</it>-luciferase expression vector pGL4.74[hRluc/TK] (Promega) into NIH/3T3 cells by means of Lipofectamine 2000 (Life Technologies Corp.). Cells were assayed for luciferase activity on a Glomax-20/20 Luminometer (Promega) 72 hours after transfection with the dual-luciferase reporter assay system, according to the manufacturer's recommendations (Promega).</p>
</sec>
<sec><st><p>ChIP and analysis</p></st>
<p>Standard ChIP was performed on mouse cells essentially as described previously <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> except that cross-linking was performed with 0.75% formaldehyde rather than 1.0%. Chromatin was sheared with a Bioruptor (Diagenode, Denville, NJ, USA) set at 8 cycles of 30 seconds on and 30 seconds off on high setting. Rabbit polyclonal antibodies used were all obtained from Millipore (Billerica, MA, USA) and included anti-H3K4me2 (07-030), anti-H3K27me3 (07-449), and anti-CTCF (07-729). ChIP was assessed by quantitative PCR using the primers given (Additional file <supplr sid="S8">8</supplr>) with a SYBR-Green qPCR Mastermix (SABiosciences, Qiagen) on a CFX96 (Bio-Rad). PCR was performed using an initial 10-minute denaturing step at 95&#176;C followed by 40 cycles of: 15 seconds at 95&#176;C, 30 seconds at 60&#176;C and 30 seconds at 72&#176;C. The cycle was followed by a melt-curve. Standard curves were prepared by making a 1:5 serial dilution of the input for each ChIP. ChIP and mock (rabbit serum) samples were assessed in triplicate and the percentage of quantitative PCR product normalized and determined from the standard curve using Bio-Rad CFX Manager 2.1 software (Bio-Rad). Each ChIP experiment and all PCR assessments were replicated on at least three independent occasions. Anti-Ctcf ChIP on mouse TSCs derived from a C57BL/6J &#215; CAST/EiJ cross was combined with next-generation sequencing (100-bp paired-end reads) as described in detail elsewhere (Calabrese JM and Magnuson T, in preparation). Briefly, ChIP was performed on 10 to 40 &#215; 10<sup>6 </sup>feeder-free TSCs. Cells were cross-linked for 10 minutes at room temperature in 0.6% formaldehyde before quenching in 125 mM glycine for 5 minutes. Cells were resuspended in 50 mM Tris-HCl pH 7.5, 140 mM NaCl, 1 mM EDTA, 1 mM EGTA, 0.1% Na-deoxycholate and 0.1% SDS. 
Cells were sonicated to generate fragments averaging 200 to 500 bp, cleared by centrifugation and resuspended at 20 &#215; 10<sup>6</sup> cells/ml in the buffer above supplemented with 1% Triton X-100. ChIP was performed with 10 &#956;g of antibody. After ChIP, three washes were performed with the buffer used for the ChIP, followed by one wash with the same buffer but with 500 mM NaCl, one wash with 20 mM Tris pH 8.0, 1 mM EDTA, 250 mM LiCl, 0.5% Na-deoxycholate, and one wash with TE buffer. Chromatin was eluted for 15 minutes at 65&#176;C in 50 mM Tris pH 8.0, 10 mM EDTA and 1% SDS. A ChIP-Seq library was prepared according to Illumina instructions using 10 to 200 ng of ChIP DNA and sequenced on Illumina's Genome Analyzer IIx or HiSeq2000 instrument. Ctcf ChIP-Seq data have been deposited with Gene Expression Omnibus and assigned the provisional accession number GSE40667. The DNA sequence of the mouse Dxz4 array was used to extract ChIP-Seq hits with homology to Dxz4. An approximately 232-bp DNA fragment spanning the putative mouse Dxz4 Ctcf binding site was amplified from C57BL/6J and castaneus genomic DNA isolated from tail snips. PCR was performed using HotStar Taq (Qiagen) with an initial denaturation of 10 minutes at 94&#176;C, followed by 35 cycles of: 94&#176;C for 30 seconds, 58&#176;C for 30 seconds and 72&#176;C for 30 seconds. The PCR product was cloned into pDrive, and for each DNA source over 100 clones were isolated and sequenced. Sequence variants specific to C57BL/6J or castaneus were then manually aligned to the Ctcf ChIP-Seq Dxz4 sequences, requiring 100% sequence identity over a minimum of 30 bp, and each ChIP-Seq sequence was designated either C57BL/6J or castaneus. All SNP variants have been deposited with dbSNP. Details can be found in Additional file <supplr sid="S9">9</supplr>.</p><suppl id="S9">
<title><p>Additional file 9</p></title>
<text><p><b>Dxz4 SNP data</b>. List of SNPs identified in proximity to the Dxz4 Ctcf site in BL6 and cast DNA that were used to assign Ctcf ChIP-Seq fragments to the BL6 or cast chromosome in Figure <figr fid="F5">5c</figr>.</p></text><file name="gb-2012-13-8-r70-S9.PDF">
   <p>Click here for file</p>
</file></suppl>
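<p>The allele assignment described above was carried out manually. As a rough automated analogue, and only as a sketch under stated assumptions, the Python below treats a ChIP-Seq sequence as assignable when it contains an exact stretch of at least 30 bp that occurs in the clone sequences of one strain but not the other. The function names and the toy sequences are hypothetical; they are not Dxz4 sequence and do not reproduce the authors' procedure.</p>

```python
MIN_MATCH = 30  # minimum length of 100% identity, as stated in the text


def kmers(seq, k=MIN_MATCH):
    """Return the set of all exact substrings of length k in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def strain_specific_kmers(bl6_seqs, cast_seqs, k=MIN_MATCH):
    """Return (BL6-only, cast-only) k-mer sets from the clone sequences."""
    bl6 = set().union(*(kmers(s, k) for s in bl6_seqs))
    cast = set().union(*(kmers(s, k) for s in cast_seqs))
    return bl6 - cast, cast - bl6


def assign_read(read, bl6_only, cast_only, k=MIN_MATCH):
    """Designate a read BL6 or cast only if it matches one strain exclusively."""
    read_kmers = kmers(read, k)
    hits_bl6 = bool(read_kmers.intersection(bl6_only))
    hits_cast = bool(read_kmers.intersection(cast_only))
    if hits_bl6 and not hits_cast:
        return "BL6"
    if hits_cast and not hits_bl6:
        return "cast"
    return "unassigned"


# Toy 41-bp clone sequences differing at a single position (index 20).
bl6_ref = "ACGT" * 5 + "A" + "TGCA" * 5
cast_ref = "ACGT" * 5 + "G" + "TGCA" * 5
bl6_only, cast_only = strain_specific_kmers([bl6_ref], [cast_ref])

print(assign_read(bl6_ref[5:40], bl6_only, cast_only))   # read spanning the BL6 variant
print(assign_read(cast_ref[5:40], bl6_only, cast_only))  # read spanning the cast variant
print(assign_read(bl6_ref[:20], bl6_only, cast_only))    # too short to span 30 bp
```

<p>Reads that match neither strain exclusively are left unassigned rather than guessed, which mirrors the requirement for perfect identity over the full 30 bp.</p>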
</sec>
</sec>
<sec><st><p>Abbreviations</p></st>
<p>BAC: bacterial artificial chromosome; BL6: C57BL/6J mouse; bp: base pair; cast: castaneus mouse; cDNA: complementary DNA; CGI: CpG island; ChIP: chromatin immunoprecipitation; ChIP-Seq: ChIP combined with next-generation sequencing; CTCF: CCCTC-binding factor; DD-CGI: CGI immediately downstream of the Dxz4 array; Ds-TR: downstream inverted tandem repeat; ENCODE: Encyclopedia of DNA Elements; FISH: fluorescence <it>in situ </it>hybridization; H3K4me2: histone H3 dimethylated at lysine 4; H3K4me3: histone H3 trimethylated at lysine 4; H3K27me3: histone H3 trimethylated at lysine 27; PBS: phosphate buffered saline; PCR: polymerase chain reaction; RT: reverse transcriptase; TSC: trophoblast stem cell; VNTR: variable number tandem repeat; Xa: active X chromosome; XCI: X-chromosome inactivation; Xi: inactive X chromosome; Xist: X-inactive specific transcript.</p>
</sec>
<sec><st><p>Competing interests</p></st>
<p>The authors declare that they have no competing interests.</p>
</sec>
<sec><st><p>Authors' contributions</p></st>
<p>BPC conceived the study, analyzed and interpreted data, performed experiments, and wrote the manuscript. AHH performed experiments and analyzed the data. JMC performed Ctcf ChIP-Seq and analyzed the data. CRM and DCT carried out experiments. All authors reviewed and contributed to the manuscript.</p>
</sec>
</bdy>
<bm><ack>
<sec><st><p>Acknowledgements</p></st>
<p>This work was supported by grants from the National Institute of General Medical Sciences to BPC (NIH R01 GM073120) and TRM (NIH R01 GM10974). We are grateful to Danielle Maatouk and Blanche Capel for assistance with derivation of mouse embryonic fibroblasts and to Laura Carrel for use of the BC06 cell line. We are indebted to A Thistle for critically evaluating the manuscript.</p>
</sec></ack>
<refgrp><bibl id="B1"><title><p>Repetitive elements may comprise over two-thirds of the human genome.</p></title><aug><au><snm>de Koning</snm><fnm>AP</fnm></au><au><snm>Gu</snm><fnm>W</fnm></au><au><snm>Castoe</snm><fnm>TA</fnm></au><au><snm>Batzer</snm><fnm>MA</fnm></au><au><snm>Pollock</snm><fnm>DD</fnm></au></aug><source>PLoS Genet</source><pubdate>2011</pubdate><volume>7</volume><fpage>e1002384</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.pgen.1002384</pubid><pubid idtype="pmcid">3228813</pubid><pubid idtype="pmpid" link="fulltext">22144907</pubid></pubidlist></xrefbib></bibl><bibl id="B2"><title><p>Initial sequencing and analysis of the human genome.</p></title><aug><au><snm>Lander</snm><fnm>ES</fnm></au><au><snm>Linton</snm><fnm>LM</fnm></au><au><snm>Birren</snm><fnm>B</fnm></au><au><snm>Nusbaum</snm><fnm>C</fnm></au><au><snm>Zody</snm><fnm>MC</fnm></au><au><snm>Baldwin</snm><fnm>J</fnm></au><au><snm>Devon</snm><fnm>K</fnm></au><au><snm>Dewar</snm><fnm>K</fnm></au><au><snm>Doyle</snm><fnm>M</fnm></au><au><snm>FitzHugh</snm><fnm>W</fnm></au><au><snm>Funke</snm><fnm>R</fnm></au><au><snm>Gage</snm><fnm>D</fnm></au><au><snm>Harris</snm><fnm>K</fnm></au><au><snm>Heaford</snm><fnm>A</fnm></au><au><snm>Howland</snm><fnm>J</fnm></au><au><snm>Kann</snm><fnm>L</fnm></au><au><snm>Lehoczky</snm><fnm>J</fnm></au><au><snm>LeVine</snm><fnm>R</fnm></au><au><snm>McEwan</snm><fnm>P</fnm></au><au><snm>McKernan</snm><fnm>K</fnm></au><au><snm>Meldrim</snm><fnm>J</fnm></au><au><snm>Mesirov</snm><fnm>JP</fnm></au><au><snm>Miranda</snm><fnm>C</fnm></au><au><snm>Morris</snm><fnm>W</fnm></au><au><snm>Naylor</snm><fnm>J</fnm></au><au><snm>Raymond</snm><fnm>C</fnm></au><au><snm>Rosetti</snm><fnm>M</fnm></au><au><snm>Santos</snm><fnm>R</fnm></au><au><snm>Sheridan</snm><fnm>A</fnm></au><au><snm>Sougnez</snm><fnm>C</fnm></au><etal/></aug><source>Nature</source><pubdate>2001</pubdate><volume>409</volume><fpage>860</fpage><lpage>921</lpage><xrefbib><pubidlist><pubid 
idtype="doi">10.1038/35057062</pubid><pubid idtype="pmpid" link="fulltext">11237011</pubid></pubidlist></xrefbib></bibl><bibl id="B3"><title><p>Tandem repeat polymorphisms: modulators of disease susceptibility and candidates for "missing heritability.".</p></title><aug><au><snm>Hannan</snm><fnm>AJ</fnm></au></aug><source>Trends Genet</source><pubdate>2010</pubdate><volume>26</volume><fpage>59</fpage><lpage>65</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.tig.2009.11.008</pubid><pubid idtype="pmpid" link="fulltext">20036436</pubid></pubidlist></xrefbib></bibl><bibl id="B4"><title><p>Microsatellites: simple sequences with complex evolution.</p></title><aug><au><snm>Ellegren</snm><fnm>H</fnm></au></aug><source>Nat Rev Genet</source><pubdate>2004</pubdate><volume>5</volume><fpage>435</fpage><lpage>445</lpage><xrefbib><pubid idtype="pmpid" link="fulltext">15153996</pubid></xrefbib></bibl><bibl id="B5"><title><p>Analysis of the largest tandemly repeated DNA families in the human genome.</p></title><aug><au><snm>Warburton</snm><fnm>PE</fnm></au><au><snm>Hasson</snm><fnm>D</fnm></au><au><snm>Guillem</snm><fnm>F</fnm></au><au><snm>Lescale</snm><fnm>C</fnm></au><au><snm>Jin</snm><fnm>X</fnm></au><au><snm>Abrusan</snm><fnm>G</fnm></au></aug><source>BMC Genomics</source><pubdate>2008</pubdate><volume>9</volume><fpage>533</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2164-9-533</pubid><pubid idtype="pmcid">2588610</pubid><pubid idtype="pmpid" link="fulltext">18992157</pubid></pubidlist></xrefbib></bibl><bibl id="B6"><title><p>Genomic and genetic definition of a functional human centromere.</p></title><aug><au><snm>Schueler</snm><fnm>MG</fnm></au><au><snm>Higgins</snm><fnm>AW</fnm></au><au><snm>Rudd</snm><fnm>MK</fnm></au><au><snm>Gustashaw</snm><fnm>K</fnm></au><au><snm>Willard</snm><fnm>HF</fnm></au></aug><source>Science</source><pubdate>2001</pubdate><volume>294</volume><fpage>109</fpage><lpage>115</lpage><xrefbib><pubidlist><pubid 
idtype="doi">10.1126/science.1065042</pubid><pubid idtype="pmpid" link="fulltext">11588252</pubid></pubidlist></xrefbib></bibl><bibl id="B7"><title><p>Stringent sequence requirements for the formation of human telomeres.</p></title><aug><au><snm>Hanish</snm><fnm>JP</fnm></au><au><snm>Yanowitz</snm><fnm>JL</fnm></au><au><snm>de Lange</snm><fnm>T</fnm></au></aug><source>Proc Natl Acad Sci USA</source><pubdate>1994</pubdate><volume>91</volume><fpage>8861</fpage><lpage>8865</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1073/pnas.91.19.8861</pubid><pubid idtype="pmcid">44706</pubid><pubid idtype="pmpid" link="fulltext">8090736</pubid></pubidlist></xrefbib></bibl><bibl id="B8"><title><p>So much "junk" DNA in our genome.</p></title><aug><au><snm>Ohno</snm><fnm>S</fnm></au></aug><source>Brookhaven Symp Biol</source><pubdate>1972</pubdate><volume>23</volume><fpage>366</fpage><lpage>370</lpage><xrefbib><pubid idtype="pmpid">5065367</pubid></xrefbib></bibl><bibl id="B9"><title><p>Selfish DNA: the ultimate parasite.</p></title><aug><au><snm>Orgel</snm><fnm>LE</fnm></au><au><snm>Crick</snm><fnm>FH</fnm></au></aug><source>Nature</source><pubdate>1980</pubdate><volume>284</volume><fpage>604</fpage><lpage>607</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/284604a0</pubid><pubid idtype="pmpid">7366731</pubid></pubidlist></xrefbib></bibl><bibl id="B10"><title><p>The biological effects of simple tandem repeats: lessons from the repeat expansion diseases.</p></title><aug><au><snm>Usdin</snm><fnm>K</fnm></au></aug><source>Genome Res</source><pubdate>2008</pubdate><volume>18</volume><fpage>1011</fpage><lpage>1019</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gr.070409.107</pubid><pubid idtype="pmpid" link="fulltext">18593815</pubid></pubidlist></xrefbib></bibl><bibl id="B11"><title><p>FSHD associated DNA rearrangements are due to deletions of integral copies of a 3.2 kb tandemly repeated unit.</p></title><aug><au><snm>van 
Deutekom</snm><fnm>JC</fnm></au><au><snm>Wijmenga</snm><fnm>C</fnm></au><au><snm>van Tienhoven</snm><fnm>EA</fnm></au><au><snm>Gruter</snm><fnm>AM</fnm></au><au><snm>Hewitt</snm><fnm>JE</fnm></au><au><snm>Padberg</snm><fnm>GW</fnm></au><au><snm>van Ommen</snm><fnm>GJ</fnm></au><au><snm>Hofker</snm><fnm>MH</fnm></au><au><snm>Frants</snm><fnm>RR</fnm></au></aug><source>Hum Mol Genet</source><pubdate>1993</pubdate><volume>2</volume><fpage>2037</fpage><lpage>2042</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/hmg/2.12.2037</pubid><pubid idtype="pmpid" link="fulltext">8111371</pubid></pubidlist></xrefbib></bibl><bibl id="B12"><title><p>Chromosome 4q DNA rearrangements associated with facioscapulohumeral muscular dystrophy.</p></title><aug><au><snm>Wijmenga</snm><fnm>C</fnm></au><au><snm>Hewitt</snm><fnm>JE</fnm></au><au><snm>Sandkuijl</snm><fnm>LA</fnm></au><au><snm>Clark</snm><fnm>LN</fnm></au><au><snm>Wright</snm><fnm>TJ</fnm></au><au><snm>Dauwerse</snm><fnm>HG</fnm></au><au><snm>Gruter</snm><fnm>AM</fnm></au><au><snm>Hofker</snm><fnm>MH</fnm></au><au><snm>Moerer</snm><fnm>P</fnm></au><au><snm>Williamson</snm><fnm>R</fnm></au><au><snm>Vanommen</snm><fnm>GJB</fnm></au><au><snm>Padberg</snm><fnm>GW</fnm></au><au><snm>Frants</snm><fnm>RR</fnm></au></aug><source>Nat Genet</source><pubdate>1992</pubdate><volume>2</volume><fpage>26</fpage><lpage>30</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/ng0992-26</pubid><pubid idtype="pmpid" link="fulltext">1363881</pubid></pubidlist></xrefbib></bibl><bibl id="B13"><title><p>Long tandem repeats as a form of genomic copy number variation: structure and length polymorphism of a chromosome 5p repeat in control and schizophrenia 
populations.</p></title><aug><au><snm>Bruce</snm><fnm>HA</fnm></au><au><snm>Sachs</snm><fnm>N</fnm></au><au><snm>Rudnicki</snm><fnm>DD</fnm></au><au><snm>Lin</snm><fnm>SG</fnm></au><au><snm>Willour</snm><fnm>VL</fnm></au><au><snm>Cowell</snm><fnm>JK</fnm></au><au><snm>Conroy</snm><fnm>J</fnm></au><au><snm>McQuaid</snm><fnm>DE</fnm></au><au><snm>Rossi</snm><fnm>M</fnm></au><au><snm>Gaile</snm><fnm>DP</fnm></au><au><snm>Nowak</snm><fnm>NJ</fnm></au><au><snm>Holmes</snm><fnm>SE</fnm></au><au><snm>Sklar</snm><fnm>P</fnm></au><au><snm>Ross</snm><fnm>CA</fnm></au><au><snm>DeLisi</snm><fnm>LE</fnm></au><au><snm>Margolis</snm><fnm>RL</fnm></au></aug><source>Psychiatr Genet</source><pubdate>2009</pubdate><volume>19</volume><fpage>64</fpage><lpage>71</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1097/YPG.0b013e3283207ff6</pubid><pubid idtype="pmpid" link="fulltext">19672138</pubid></pubidlist></xrefbib></bibl><bibl id="B14"><title><p>A novel GC-rich human macrosatellite VNTR in Xq24 is differentially methylated on active and inactive X chromosomes.</p></title><aug><au><snm>Giacalone</snm><fnm>J</fnm></au><au><snm>Friedes</snm><fnm>J</fnm></au><au><snm>Francke</snm><fnm>U</fnm></au></aug><source>Nat Genet</source><pubdate>1992</pubdate><volume>1</volume><fpage>137</fpage><lpage>143</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/ng0592-137</pubid><pubid idtype="pmpid" link="fulltext">1302007</pubid></pubidlist></xrefbib></bibl><bibl id="B15"><title><p>A novel tandem repeat sequence located on human chromosome 4p: isolation and characterization.</p></title><aug><au><snm>Kogi</snm><fnm>M</fnm></au><au><snm>Fukushige</snm><fnm>S</fnm></au><au><snm>Lefevre</snm><fnm>C</fnm></au><au><snm>Hadano</snm><fnm>S</fnm></au><au><snm>Ikeda</snm><fnm>JE</fnm></au></aug><source>Genomics</source><pubdate>1997</pubdate><volume>42</volume><fpage>278</fpage><lpage>283</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1006/geno.1997.4746</pubid><pubid idtype="pmpid" 
link="fulltext">9192848</pubid></pubidlist></xrefbib></bibl><bibl id="B16"><title><p>Expression, tandem repeat copy number variation and stability of four macrosatellite arrays in the human genome.</p></title><aug><au><snm>Tremblay</snm><fnm>DC</fnm></au><au><snm>Alexander</snm><fnm>G</fnm><suf>Jr</suf></au><au><snm>Moseley</snm><fnm>S</fnm></au><au><snm>Chadwick</snm><fnm>BP</fnm></au></aug><source>BMC Genomics</source><pubdate>2010</pubdate><volume>11</volume><fpage>632</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2164-11-632</pubid><pubid idtype="pmcid">3018141</pubid><pubid idtype="pmpid" link="fulltext">21078170</pubid></pubidlist></xrefbib></bibl><bibl id="B17"><title><p>Variation in array size, monomer composition and expression of the macrosatellite DXZ4.</p></title><aug><au><snm>Tremblay</snm><fnm>DC</fnm></au><au><snm>Moseley</snm><fnm>S</fnm></au><au><snm>Chadwick</snm><fnm>BP</fnm></au></aug><source>PLoS One</source><pubdate>2011</pubdate><volume>6</volume><fpage>e18969</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.pone.0018969</pubid><pubid idtype="pmcid">3081327</pubid><pubid idtype="pmpid" link="fulltext">21544201</pubid></pubidlist></xrefbib></bibl><bibl id="B18"><title><p>Gene action in the X-chromosome of the mouse (<it>Mus musculus </it>L.).</p></title><aug><au><snm>Lyon</snm><fnm>MF</fnm></au></aug><source>Nature</source><pubdate>1961</pubdate><volume>190</volume><fpage>372</fpage><lpage>373</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/190372a0</pubid><pubid idtype="pmpid">13764598</pubid></pubidlist></xrefbib></bibl><bibl id="B19"><title><p>Gene silencing in X-chromosome inactivation: advances in understanding facultative heterochromatin formation.</p></title><aug><au><snm>Wutz</snm><fnm>A</fnm></au></aug><source>Nat Rev Genet</source><pubdate>2011</pubdate><volume>12</volume><fpage>542</fpage><lpage>553</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nrg3035</pubid><pubid idtype="pmpid" 
link="fulltext">21765457</pubid></pubidlist></xrefbib></bibl><bibl id="B20"><title><p>Reactivation of an inactive human X chromosome: evidence for X inactivation by DNA methylation.</p></title><aug><au><snm>Mohandas</snm><fnm>T</fnm></au><au><snm>Sparkes</snm><fnm>RS</fnm></au><au><snm>Shapiro</snm><fnm>LJ</fnm></au></aug><source>Science</source><pubdate>1981</pubdate><volume>211</volume><fpage>393</fpage><lpage>396</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.6164095</pubid><pubid idtype="pmpid" link="fulltext">6164095</pubid></pubidlist></xrefbib></bibl><bibl id="B21"><title><p>In vivo footprint and methylation analysis by PCR-aided genomic sequencing: comparison of active and inactive X chromosomal DNA at the CpG island and promoter of human PGK-1.</p></title><aug><au><snm>Pfeifer</snm><fnm>GP</fnm></au><au><snm>Tanguay</snm><fnm>RL</fnm></au><au><snm>Steigerwald</snm><fnm>SD</fnm></au><au><snm>Riggs</snm><fnm>AD</fnm></au></aug><source>Genes Dev</source><pubdate>1990</pubdate><volume>4</volume><fpage>1277</fpage><lpage>1287</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gad.4.8.1277</pubid><pubid idtype="pmpid" link="fulltext">2227409</pubid></pubidlist></xrefbib></bibl><bibl id="B22"><title><p>DXZ4 chromatin adopts an opposing conformation to that of the surrounding chromosome and acquires a novel inactive X-specific role involving CTCF and antisense transcripts.</p></title><aug><au><snm>Chadwick</snm><fnm>BP</fnm></au></aug><source>Genome Res</source><pubdate>2008</pubdate><volume>18</volume><fpage>1259</fpage><lpage>1269</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gr.075713.107</pubid><pubid idtype="pmcid">2493436</pubid><pubid idtype="pmpid" link="fulltext">18456864</pubid></pubidlist></xrefbib></bibl><bibl id="B23"><title><p>Differentially methylated forms of histone H3 show unique association patterns with inactive human X 
chromosomes.</p></title><aug><au><snm>Boggs</snm><fnm>BA</fnm></au><au><snm>Cheung</snm><fnm>P</fnm></au><au><snm>Heard</snm><fnm>E</fnm></au><au><snm>Spector</snm><fnm>DL</fnm></au><au><snm>Chinault</snm><fnm>AC</fnm></au><au><snm>Allis</snm><fnm>CD</fnm></au></aug><source>Nat Genet</source><pubdate>2002</pubdate><volume>30</volume><fpage>73</fpage><lpage>76</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/ng787</pubid><pubid idtype="pmpid" link="fulltext">11740495</pubid></pubidlist></xrefbib></bibl><bibl id="B24"><title><p>Histone H3 lysine 9 methylation is an epigenetic imprint of facultative heterochromatin.</p></title><aug><au><snm>Peters</snm><fnm>AH</fnm></au><au><snm>Mermoud</snm><fnm>JE</fnm></au><au><snm>O&apos;Carroll</snm><fnm>D</fnm></au><au><snm>Pagani</snm><fnm>M</fnm></au><au><snm>Schweizer</snm><fnm>D</fnm></au><au><snm>Brockdorff</snm><fnm>N</fnm></au><au><snm>Jenuwein</snm><fnm>T</fnm></au></aug><source>Nat Genet</source><pubdate>2002</pubdate><volume>30</volume><fpage>77</fpage><lpage>80</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/ng789</pubid><pubid idtype="pmpid" link="fulltext">11740497</pubid></pubidlist></xrefbib></bibl><bibl id="B25"><title><p>Cell cycle-dependent localization of macroH2A in chromatin of the inactive X chromosome.</p></title><aug><au><snm>Chadwick</snm><fnm>BP</fnm></au><au><snm>Willard</snm><fnm>HF</fnm></au></aug><source>J Cell Biol</source><pubdate>2002</pubdate><volume>157</volume><fpage>1113</fpage><lpage>1123</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1083/jcb.200112074</pubid><pubid idtype="pmcid">2173542</pubid><pubid idtype="pmpid" link="fulltext">12082075</pubid></pubidlist></xrefbib></bibl><bibl id="B26"><title><p>An exceptionally conserved transcriptional repressor, CTCF, employs different combinations of zinc fingers to bind diverged promoter sequences of avian and mammalian c-myc 
oncogenes.</p></title><aug><au><snm>Filippova</snm><fnm>GN</fnm></au><au><snm>Fagerlie</snm><fnm>S</fnm></au><au><snm>Klenova</snm><fnm>EM</fnm></au><au><snm>Myers</snm><fnm>C</fnm></au><au><snm>Dehner</snm><fnm>Y</fnm></au><au><snm>Goodwin</snm><fnm>G</fnm></au><au><snm>Neiman</snm><fnm>PE</fnm></au><au><snm>Collins</snm><fnm>SJ</fnm></au><au><snm>Lobanenkov</snm><fnm>VV</fnm></au></aug><source>Mol Cell Biol</source><pubdate>1996</pubdate><volume>16</volume><fpage>2802</fpage><lpage>2813</lpage><xrefbib><pubidlist><pubid idtype="pmcid">231272</pubid><pubid idtype="pmpid" link="fulltext">8649389</pubid></pubidlist></xrefbib></bibl><bibl id="B27"><title><p>Chromatin of the Barr body: histone and non-histone proteins associated with or excluded from the inactive X chromosome.</p></title><aug><au><snm>Chadwick</snm><fnm>BP</fnm></au><au><snm>Willard</snm><fnm>HF</fnm></au></aug><source>Hum Mol Genet</source><pubdate>2003</pubdate><volume>12</volume><fpage>2167</fpage><lpage>2178</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/hmg/ddg229</pubid><pubid idtype="pmpid" link="fulltext">12915472</pubid></pubidlist></xrefbib></bibl><bibl id="B28"><title><p>The macrosatellite DXZ4 mediates CTCF-dependent long-range intrachromosomal interactions on the human inactive X chromosome.</p></title><aug><au><snm>Horakova</snm><fnm>AH</fnm></au><au><snm>Moseley</snm><fnm>SC</fnm></au><au><snm>McLaughlin</snm><fnm>CR</fnm></au><au><snm>Tremblay</snm><fnm>DC</fnm></au><au><snm>Chadwick</snm><fnm>BP</fnm></au></aug><source>Hum Mol Genet</source><pubdate>2012</pubdate><note>doi: 10.1093/hmg/dds270</note></bibl><bibl id="B29"><title><p>A top-down analysis of Xa- and Xi-territories reveals differences of higher order structure at &#8805;20 Mb genomic length 
scales.</p></title><aug><au><snm>Teller</snm><fnm>K</fnm></au><au><snm>Illner</snm><fnm>D</fnm></au><au><snm>Thamm</snm><fnm>S</fnm></au><au><snm>Casas-Delucchi</snm><fnm>CS</fnm></au><au><snm>Versteeg</snm><fnm>R</fnm></au><au><snm>Indemans</snm><fnm>M</fnm></au><au><snm>Cremer</snm><fnm>T</fnm></au><au><snm>Cremer</snm><fnm>M</fnm></au></aug><source>Nucleus</source><pubdate>2011</pubdate><volume>2</volume><fpage>465</fpage><lpage>477</lpage><xrefbib><pubidlist><pubid idtype="doi">10.4161/nucl.2.5.17862</pubid><pubid idtype="pmpid" link="fulltext">21970989</pubid></pubidlist></xrefbib></bibl><bibl id="B30"><title><p>Characterization of DXZ4 conservation in primates implies important functional roles for CTCF binding, array expression and tandem repeat organization on the X chromosome.</p></title><aug><au><snm>McLaughlin</snm><fnm>CR</fnm></au><au><snm>Chadwick</snm><fnm>BP</fnm></au></aug><source>Genome Biol</source><pubdate>2011</pubdate><volume>12</volume><fpage>R37</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/gb-2011-12-4-r37</pubid><pubid idtype="pmcid">3218863</pubid><pubid idtype="pmpid" link="fulltext">21489251</pubid></pubidlist></xrefbib></bibl><bibl id="B31"><title><p>Gracefully ageing at 50, X-chromosome inactivation becomes a paradigm for RNA and chromatin control.</p></title><aug><au><snm>Lee</snm><fnm>JT</fnm></au></aug><source>Nat Rev Mol Cell Biol</source><pubdate>2011</pubdate><volume>12</volume><fpage>815</fpage><lpage>826</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nrm3231</pubid><pubid idtype="pmpid" link="fulltext">22108600</pubid></pubidlist></xrefbib></bibl><bibl id="B32"><title><p>Eutherian mammals use diverse strategies to initiate X-chromosome inactivation during 
development.</p></title><aug><au><snm>Okamoto</snm><fnm>I</fnm></au><au><snm>Patrat</snm><fnm>C</fnm></au><au><snm>Thepot</snm><fnm>D</fnm></au><au><snm>Peynot</snm><fnm>N</fnm></au><au><snm>Fauque</snm><fnm>P</fnm></au><au><snm>Daniel</snm><fnm>N</fnm></au><au><snm>Diabangouaya</snm><fnm>P</fnm></au><au><snm>Wolf</snm><fnm>JP</fnm></au><au><snm>Renard</snm><fnm>JP</fnm></au><au><snm>Duranthon</snm><fnm>V</fnm></au><au><snm>Heard</snm><fnm>E</fnm></au></aug><source>Nature</source><pubdate>2011</pubdate><volume>472</volume><fpage>370</fpage><lpage>374</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature09872</pubid><pubid idtype="pmpid" link="fulltext">21471966</pubid></pubidlist></xrefbib></bibl><bibl id="B33"><title><p>X-inactivation profile reveals extensive variability in X-linked gene expression in females.</p></title><aug><au><snm>Carrel</snm><fnm>L</fnm></au><au><snm>Willard</snm><fnm>HF</fnm></au></aug><source>Nature</source><pubdate>2005</pubdate><volume>434</volume><fpage>400</fpage><lpage>404</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature03479</pubid><pubid idtype="pmpid" link="fulltext">15772666</pubid></pubidlist></xrefbib></bibl><bibl id="B34"><title><p>Global survey of escape from X inactivation by RNA-sequencing in mouse.</p></title><aug><au><snm>Yang</snm><fnm>F</fnm></au><au><snm>Babak</snm><fnm>T</fnm></au><au><snm>Shendure</snm><fnm>J</fnm></au><au><snm>Disteche</snm><fnm>CM</fnm></au></aug><source>Genome Res</source><pubdate>2010</pubdate><volume>20</volume><fpage>614</fpage><lpage>622</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gr.103200.109</pubid><pubid idtype="pmcid">2860163</pubid><pubid idtype="pmpid" link="fulltext">20363980</pubid></pubidlist></xrefbib></bibl><bibl id="B35"><title><p>Ensembl Genome Browser.</p></title><url>http://ensembl.org</url></bibl><bibl id="B36"><title><p>UCSC Genome Browser.</p></title><url>http://genome.ucsc.edu</url></bibl><bibl id="B37"><title><p>Human/mouse homology 
relationships.</p></title><aug><au><snm>DeBry</snm><fnm>RW</fnm></au><au><snm>Seldin</snm><fnm>MF</fnm></au></aug><source>Genomics</source><pubdate>1996</pubdate><volume>33</volume><fpage>337</fpage><lpage>351</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1006/geno.1996.0209</pubid><pubid idtype="pmpid" link="fulltext">8660993</pubid></pubidlist></xrefbib></bibl><bibl id="B38"><title><p>Macrosatellite epigenetics: the two faces of DXZ4 and D4Z4.</p></title><aug><au><snm>Chadwick</snm><fnm>BP</fnm></au></aug><source>Chromosoma</source><pubdate>2009</pubdate><volume>118</volume><fpage>675</fpage><lpage>681</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1007/s00412-009-0233-5</pubid><pubid idtype="pmpid" link="fulltext">19690880</pubid></pubidlist></xrefbib></bibl><bibl id="B39"><title><p>Repetitive DNA and next-generation sequencing: computational challenges and solutions.</p></title><aug><au><snm>Treangen</snm><fnm>TJ</fnm></au><au><snm>Salzberg</snm><fnm>SL</fnm></au></aug><source>Nat Rev Genet</source><pubdate>2012</pubdate><volume>13</volume><fpage>36</fpage><lpage>46</lpage></bibl><bibl id="B40"><title><p>The product of the mouse Xist gene is a 15 kb inactive X-specific transcript containing no conserved ORF and located in the nucleus.</p></title><aug><au><snm>Brockdorff</snm><fnm>N</fnm></au><au><snm>Ashworth</snm><fnm>A</fnm></au><au><snm>Kay</snm><fnm>GF</fnm></au><au><snm>McCabe</snm><fnm>VM</fnm></au><au><snm>Norris</snm><fnm>DP</fnm></au><au><snm>Cooper</snm><fnm>PJ</fnm></au><au><snm>Swift</snm><fnm>S</fnm></au><au><snm>Rastan</snm><fnm>S</fnm></au></aug><source>Cell</source><pubdate>1992</pubdate><volume>71</volume><fpage>515</fpage><lpage>526</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/0092-8674(92)90519-I</pubid><pubid idtype="pmpid" link="fulltext">1423610</pubid></pubidlist></xrefbib></bibl><bibl id="B41"><title><p>The human XIST gene: analysis of a 17 kb inactive X-specific RNA that contains conserved repeats and is highly 
localized within the nucleus.</p></title><aug><au><snm>Brown</snm><fnm>CJ</fnm></au><au><snm>Hendrich</snm><fnm>BD</fnm></au><au><snm>Rupert</snm><fnm>JL</fnm></au><au><snm>Lafreniere</snm><fnm>RG</fnm></au><au><snm>Xing</snm><fnm>Y</fnm></au><au><snm>Lawrence</snm><fnm>J</fnm></au><au><snm>Willard</snm><fnm>HF</fnm></au></aug><source>Cell</source><pubdate>1992</pubdate><volume>71</volume><fpage>527</fpage><lpage>542</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/0092-8674(92)90520-M</pubid><pubid idtype="pmpid" link="fulltext">1423611</pubid></pubidlist></xrefbib></bibl><bibl id="B42"><title><p>A user's guide to the Encyclopedia of DNA Elements (ENCODE).</p></title><aug><au><snm>Consortium</snm><fnm>TEP</fnm></au></aug><source>PLoS Biol</source><pubdate>2011</pubdate><volume>9</volume><fpage>e1001046</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.pbio.1001046</pubid><pubid idtype="pmcid">3079585</pubid><pubid idtype="pmpid" link="fulltext">21526222</pubid></pubidlist></xrefbib></bibl><bibl id="B43"><title><p>Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing.</p></title><aug><au><snm>Robertson</snm><fnm>G</fnm></au><au><snm>Hirst</snm><fnm>M</fnm></au><au><snm>Bainbridge</snm><fnm>M</fnm></au><au><snm>Bilenky</snm><fnm>M</fnm></au><au><snm>Zhao</snm><fnm>Y</fnm></au><au><snm>Zeng</snm><fnm>T</fnm></au><au><snm>Euskirchen</snm><fnm>G</fnm></au><au><snm>Bernier</snm><fnm>B</fnm></au><au><snm>Varhol</snm><fnm>R</fnm></au><au><snm>Delaney</snm><fnm>A</fnm></au><au><snm>Thiessen</snm><fnm>N</fnm></au><au><snm>Griffith</snm><fnm>OL</fnm></au><au><snm>He</snm><fnm>A</fnm></au><au><snm>Marra</snm><fnm>M</fnm></au><au><snm>Snyder</snm><fnm>M</fnm></au><au><snm>Jones</snm><fnm>S</fnm></au></aug><source>Nat Methods</source><pubdate>2007</pubdate><volume>4</volume><fpage>651</fpage><lpage>657</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nmeth1068</pubid><pubid idtype="pmpid" 
link="fulltext">17558387</pubid></pubidlist></xrefbib></bibl><bibl id="B44"><title><p>High-resolution profiling of histone methylations in the human genome.</p></title><aug><au><snm>Barski</snm><fnm>A</fnm></au><au><snm>Cuddapah</snm><fnm>S</fnm></au><au><snm>Cui</snm><fnm>K</fnm></au><au><snm>Roh</snm><fnm>TY</fnm></au><au><snm>Schones</snm><fnm>DE</fnm></au><au><snm>Wang</snm><fnm>Z</fnm></au><au><snm>Wei</snm><fnm>G</fnm></au><au><snm>Chepelev</snm><fnm>I</fnm></au><au><snm>Zhao</snm><fnm>K</fnm></au></aug><source>Cell</source><pubdate>2007</pubdate><volume>129</volume><fpage>823</fpage><lpage>837</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.cell.2007.05.009</pubid><pubid idtype="pmpid" link="fulltext">17512414</pubid></pubidlist></xrefbib></bibl><bibl id="B45"><title><p>Analysis of the vertebrate insulator protein CTCF-binding sites in the human genome.</p></title><aug><au><snm>Kim</snm><fnm>TH</fnm></au><au><snm>Abdullaev</snm><fnm>ZK</fnm></au><au><snm>Smith</snm><fnm>AD</fnm></au><au><snm>Ching</snm><fnm>KA</fnm></au><au><snm>Loukinov</snm><fnm>DI</fnm></au><au><snm>Green</snm><fnm>RD</fnm></au><au><snm>Zhang</snm><fnm>MQ</fnm></au><au><snm>Lobanenkov</snm><fnm>VV</fnm></au><au><snm>Ren</snm><fnm>B</fnm></au></aug><source>Cell</source><pubdate>2007</pubdate><volume>128</volume><fpage>1231</fpage><lpage>1245</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.cell.2006.12.048</pubid><pubid idtype="pmcid">2572726</pubid><pubid idtype="pmpid" link="fulltext">17382889</pubid></pubidlist></xrefbib></bibl><bibl id="B46"><title><p>Genome-scale mapping of DNase I sensitivity in vivo using tiling DNA 
microarrays.</p></title><aug><au><snm>Sabo</snm><fnm>PJ</fnm></au><au><snm>Kuehn</snm><fnm>MS</fnm></au><au><snm>Thurman</snm><fnm>R</fnm></au><au><snm>Johnson</snm><fnm>BE</fnm></au><au><snm>Johnson</snm><fnm>EM</fnm></au><au><snm>Cao</snm><fnm>H</fnm></au><au><snm>Yu</snm><fnm>M</fnm></au><au><snm>Rosenzweig</snm><fnm>E</fnm></au><au><snm>Goldy</snm><fnm>J</fnm></au><au><snm>Haydock</snm><fnm>A</fnm></au><au><snm>Weaver</snm><fnm>M</fnm></au><au><snm>Shafer</snm><fnm>A</fnm></au><au><snm>Lee</snm><fnm>K</fnm></au><au><snm>Neri</snm><fnm>F</fnm></au><au><snm>Humbert</snm><fnm>R</fnm></au><au><snm>Singer</snm><fnm>MA</fnm></au><au><snm>Richmond</snm><fnm>TA</fnm></au><au><snm>O Dorschner</snm><fnm>M</fnm></au><au><snm>McArthur</snm><fnm>M</fnm></au><au><snm>Hawrylycz</snm><fnm>M</fnm></au><au><snm>Green</snm><fnm>RD</fnm></au><au><snm>Navas</snm><fnm>PA</fnm></au><au><snm>Noble</snm><fnm>WS</fnm></au><au><snm>Stamatoyannopoulos</snm><fnm>JA</fnm></au></aug><source>Nat Methods</source><pubdate>2006</pubdate><volume>3</volume><fpage>511</fpage><lpage>518</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nmeth890</pubid><pubid idtype="pmpid" link="fulltext">16791208</pubid></pubidlist></xrefbib></bibl><bibl id="B47"><title><p>Genome-wide CTCF distribution in vertebrates defines equivalent sites that aid the identification of disease-associated genes.</p></title><aug><au><snm>Martin</snm><fnm>D</fnm></au><au><snm>Pantoja</snm><fnm>C</fnm></au><au><snm>Fernandez Minan</snm><fnm>A</fnm></au><au><snm>Valdes-Quezada</snm><fnm>C</fnm></au><au><snm>Molto</snm><fnm>E</fnm></au><au><snm>Matesanz</snm><fnm>F</fnm></au><au><snm>Bogdanovic</snm><fnm>O</fnm></au><au><snm>de la 
Calle-Mustienes</snm><fnm>E</fnm></au><au><snm>Dominguez</snm><fnm>O</fnm></au><au><snm>Taher</snm><fnm>L</fnm></au><au><snm>Furlan-Magaril</snm><fnm>M</fnm></au><au><snm>Alcina</snm><fnm>A</fnm></au><au><snm>Canon</snm><fnm>S</fnm></au><au><snm>Fedetz</snm><fnm>M</fnm></au><au><snm>Blasco</snm><fnm>MA</fnm></au><au><snm>Pereira</snm><fnm>PS</fnm></au><au><snm>Ovcharenko</snm><fnm>I</fnm></au><au><snm>Recillas-Targa</snm><fnm>F</fnm></au><au><snm>Montoliu</snm><fnm>L</fnm></au><au><snm>Manzanares</snm><fnm>M</fnm></au><au><snm>Guigo</snm><fnm>R</fnm></au><au><snm>Serrano</snm><fnm>M</fnm></au><au><snm>Casares</snm><fnm>F</fnm></au><au><snm>Gomez-Skarmeta</snm><fnm>JL</fnm></au></aug><source>Nat Struct Mol Biol</source><pubdate>2011</pubdate><volume>18</volume><fpage>708</fpage><lpage>714</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nsmb.2059</pubid><pubid idtype="pmcid">3196567</pubid><pubid idtype="pmpid" link="fulltext">21602820</pubid></pubidlist></xrefbib></bibl><bibl id="B48"><title><p>CTCF mediates methylation-sensitive enhancer-blocking activity at the H19/Igf2 locus.</p></title><aug><au><snm>Hark</snm><fnm>AT</fnm></au><au><snm>Schoenherr</snm><fnm>CJ</fnm></au><au><snm>Katz</snm><fnm>DJ</fnm></au><au><snm>Ingram</snm><fnm>RS</fnm></au><au><snm>Levorse</snm><fnm>JM</fnm></au><au><snm>Tilghman</snm><fnm>SM</fnm></au></aug><source>Nature</source><pubdate>2000</pubdate><volume>405</volume><fpage>486</fpage><lpage>489</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/35013106</pubid><pubid idtype="pmpid" link="fulltext">10839547</pubid></pubidlist></xrefbib></bibl><bibl id="B49"><title><p>The expected equilibrium of the CpG dinucleotide in vertebrate genomes under a mutation model.</p></title><aug><au><snm>Sved</snm><fnm>J</fnm></au><au><snm>Bird</snm><fnm>A</fnm></au></aug><source>Proc Natl Acad Sci USA</source><pubdate>1990</pubdate><volume>87</volume><fpage>4692</fpage><lpage>4696</lpage><xrefbib><pubidlist><pubid 
idtype="doi">10.1073/pnas.87.12.4692</pubid><pubid idtype="pmcid">54183</pubid><pubid idtype="pmpid" link="fulltext">2352943</pubid></pubidlist></xrefbib></bibl><bibl id="B50"><title><p>Primate CpG islands are maintained by heterogeneous evolutionary regimes involving minimal selection.</p></title><aug><au><snm>Cohen</snm><fnm>NM</fnm></au><au><snm>Kenigsberg</snm><fnm>E</fnm></au><au><snm>Tanay</snm><fnm>A</fnm></au></aug><source>Cell</source><pubdate>2011</pubdate><volume>145</volume><fpage>773</fpage><lpage>786</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.cell.2011.04.024</pubid><pubid idtype="pmpid" link="fulltext">21620139</pubid></pubidlist></xrefbib></bibl><bibl id="B51"><title><p>Role of histone H3 lysine 27 methylation in X inactivation.</p></title><aug><au><snm>Plath</snm><fnm>K</fnm></au><au><snm>Fang</snm><fnm>J</fnm></au><au><snm>Mlynarczyk-Evans</snm><fnm>SK</fnm></au><au><snm>Cao</snm><fnm>R</fnm></au><au><snm>Worringer</snm><fnm>KA</fnm></au><au><snm>Wang</snm><fnm>H</fnm></au><au><snm>de la Cruz</snm><fnm>CC</fnm></au><au><snm>Otte</snm><fnm>AP</fnm></au><au><snm>Panning</snm><fnm>B</fnm></au><au><snm>Zhang</snm><fnm>Y</fnm></au></aug><source>Science</source><pubdate>2003</pubdate><volume>300</volume><fpage>131</fpage><lpage>135</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1084274</pubid><pubid idtype="pmpid" link="fulltext">12649488</pubid></pubidlist></xrefbib></bibl><bibl id="B52"><title><p>Establishment of histone H3 methylation on the inactive X chromosome requires transient recruitment of Eed-Enx1 polycomb group 
complexes.</p></title><aug><au><snm>Silva</snm><fnm>J</fnm></au><au><snm>Mak</snm><fnm>W</fnm></au><au><snm>Zvetkova</snm><fnm>I</fnm></au><au><snm>Appanah</snm><fnm>R</fnm></au><au><snm>Nesterova</snm><fnm>TB</fnm></au><au><snm>Webster</snm><fnm>Z</fnm></au><au><snm>Peters</snm><fnm>AH</fnm></au><au><snm>Jenuwein</snm><fnm>T</fnm></au><au><snm>Otte</snm><fnm>AP</fnm></au><au><snm>Brockdorff</snm><fnm>N</fnm></au></aug><source>Dev Cell</source><pubdate>2003</pubdate><volume>4</volume><fpage>481</fpage><lpage>495</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/S1534-5807(03)00068-6</pubid><pubid idtype="pmpid" link="fulltext">12689588</pubid></pubidlist></xrefbib></bibl><bibl id="B53"><title><p>YY1 associates with the macrosatellite DXZ4 on the inactive X chromosome and binds with CTCF to a hypomethylated form in some male carcinomas.</p></title><aug><au><snm>Moseley</snm><fnm>SC</fnm></au><au><snm>Rizkallah</snm><fnm>R</fnm></au><au><snm>Tremblay</snm><fnm>DC</fnm></au><au><snm>Anderson</snm><fnm>BR</fnm></au><au><snm>Hurt</snm><fnm>MM</fnm></au><au><snm>Chadwick</snm><fnm>BP</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2012</pubdate><volume>40</volume><fpage>1596</fpage><lpage>1608</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkr964</pubid><pubid idtype="pmcid">3287207</pubid><pubid idtype="pmpid" link="fulltext">22064860</pubid></pubidlist></xrefbib></bibl><bibl id="B54"><title><p>Preferential inactivation of the paternally derived X chromosome in the extraembryonic membranes of the mouse.</p></title><aug><au><snm>Takagi</snm><fnm>N</fnm></au><au><snm>Sasaki</snm><fnm>M</fnm></au></aug><source>Nature</source><pubdate>1975</pubdate><volume>256</volume><fpage>640</fpage><lpage>642</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/256640a0</pubid><pubid idtype="pmpid">1152998</pubid></pubidlist></xrefbib></bibl><bibl id="B55"><title><p>WebLogo: a sequence logo 
generator.</p></title><aug><au><snm>Crooks</snm><fnm>GE</fnm></au><au><snm>Hon</snm><fnm>G</fnm></au><au><snm>Chandonia</snm><fnm>JM</fnm></au><au><snm>Brenner</snm><fnm>SE</fnm></au></aug><source>Genome Res</source><pubdate>2004</pubdate><volume>14</volume><fpage>1188</fpage><lpage>1190</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gr.849004</pubid><pubid idtype="pmcid">419797</pubid><pubid idtype="pmpid" link="fulltext">15173120</pubid></pubidlist></xrefbib></bibl><bibl id="B56"><title><p>JASPAR, the open access database of transcription factor-binding profiles: new content and tools in the 2008 update.</p></title><aug><au><snm>Bryne</snm><fnm>JC</fnm></au><au><snm>Valen</snm><fnm>E</fnm></au><au><snm>Tang</snm><fnm>MH</fnm></au><au><snm>Marstrand</snm><fnm>T</fnm></au><au><snm>Winther</snm><fnm>O</fnm></au><au><snm>da Piedade</snm><fnm>I</fnm></au><au><snm>Krogh</snm><fnm>A</fnm></au><au><snm>Lenhard</snm><fnm>B</fnm></au><au><snm>Sandelin</snm><fnm>A</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2008</pubdate><volume>36</volume><fpage>D102</fpage><lpage>106</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkn449</pubid><pubid idtype="pmcid">2238834</pubid><pubid idtype="pmpid" link="fulltext">18006571</pubid></pubidlist></xrefbib></bibl><bibl id="B57"><title><p>Five-vertebrate ChIP-seq reveals the evolutionary dynamics of transcription factor 
binding.</p></title><aug><au><snm>Schmidt</snm><fnm>D</fnm></au><au><snm>Wilson</snm><fnm>MD</fnm></au><au><snm>Ballester</snm><fnm>B</fnm></au><au><snm>Schwalie</snm><fnm>PC</fnm></au><au><snm>Brown</snm><fnm>GD</fnm></au><au><snm>Marshall</snm><fnm>A</fnm></au><au><snm>Kutter</snm><fnm>C</fnm></au><au><snm>Watt</snm><fnm>S</fnm></au><au><snm>Martinez-Jimenez</snm><fnm>CP</fnm></au><au><snm>Mackay</snm><fnm>S</fnm></au><au><snm>Talianidis</snm><fnm>I</fnm></au><au><snm>Flicek</snm><fnm>P</fnm></au><au><snm>Odom</snm><fnm>DT</fnm></au></aug><source>Science</source><pubdate>2010</pubdate><volume>328</volume><fpage>1036</fpage><lpage>1040</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1186176</pubid><pubid idtype="pmcid">3008766</pubid><pubid idtype="pmpid" link="fulltext">20378774</pubid></pubidlist></xrefbib></bibl><bibl id="B58"><title><p>The ETS-domain transcription factors Elk-1 and SAP-1 exhibit differential DNA binding specificities.</p></title><aug><au><snm>Shore</snm><fnm>P</fnm></au><au><snm>Sharrocks</snm><fnm>AD</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>1995</pubdate><volume>23</volume><fpage>4698</fpage><lpage>4706</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/23.22.4698</pubid><pubid idtype="pmcid">307446</pubid><pubid idtype="pmpid" link="fulltext">8524663</pubid></pubidlist></xrefbib></bibl><bibl id="B59"><title><p>Transcriptional regulation by DNA methylation.</p></title><aug><au><snm>Poetsch</snm><fnm>AR</fnm></au><au><snm>Plass</snm><fnm>C</fnm></au></aug><source>Cancer Treat Rev</source><pubdate>2011</pubdate><volume>37</volume><issue>Suppl 1</issue><fpage>S8</fpage><lpage>12</lpage><xrefbib><pubid idtype="pmpid" link="fulltext">21601364</pubid></xrefbib></bibl><bibl id="B60"><title><p>Impaired energy homeostasis in C/EBP alpha knockout 
mice.</p></title><aug><au><snm>Wang</snm><fnm>ND</fnm></au><au><snm>Finegold</snm><fnm>MJ</fnm></au><au><snm>Bradley</snm><fnm>A</fnm></au><au><snm>Ou</snm><fnm>CN</fnm></au><au><snm>Abdelsayed</snm><fnm>SV</fnm></au><au><snm>Wilde</snm><fnm>MD</fnm></au><au><snm>Taylor</snm><fnm>LR</fnm></au><au><snm>Wilson</snm><fnm>DR</fnm></au><au><snm>Darlington</snm><fnm>GJ</fnm></au></aug><source>Science</source><pubdate>1995</pubdate><volume>269</volume><fpage>1108</fpage><lpage>1112</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.7652557</pubid><pubid idtype="pmpid" link="fulltext">7652557</pubid></pubidlist></xrefbib></bibl><bibl id="B61"><title><p>Characterization of SAP-1, a protein recruited by serum response factor to the c-fos serum response element.</p></title><aug><au><snm>Dalton</snm><fnm>S</fnm></au><au><snm>Treisman</snm><fnm>R</fnm></au></aug><source>Cell</source><pubdate>1992</pubdate><volume>68</volume><fpage>597</fpage><lpage>612</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/0092-8674(92)90194-H</pubid><pubid idtype="pmpid" link="fulltext">1339307</pubid></pubidlist></xrefbib></bibl><bibl id="B62"><title><p>Serum response factor binding sites differ in three human cell types.</p></title><aug><au><snm>Cooper</snm><fnm>SJ</fnm></au><au><snm>Trinklein</snm><fnm>ND</fnm></au><au><snm>Nguyen</snm><fnm>L</fnm></au><au><snm>Myers</snm><fnm>RM</fnm></au></aug><source>Genome Res</source><pubdate>2007</pubdate><volume>17</volume><fpage>136</fpage><lpage>144</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gr.5875007</pubid><pubid idtype="pmcid">1781345</pubid><pubid idtype="pmpid" link="fulltext">17200232</pubid></pubidlist></xrefbib></bibl><bibl id="B63"><title><p>A high-resolution map of human evolutionary constraint using 29 
mammals.</p></title><aug><au><snm>Lindblad-Toh</snm><fnm>K</fnm></au><au><snm>Garber</snm><fnm>M</fnm></au><au><snm>Zuk</snm><fnm>O</fnm></au><au><snm>Lin</snm><fnm>MF</fnm></au><au><snm>Parker</snm><fnm>BJ</fnm></au><au><snm>Washietl</snm><fnm>S</fnm></au><au><snm>Kheradpour</snm><fnm>P</fnm></au><au><snm>Ernst</snm><fnm>J</fnm></au><au><snm>Jordan</snm><fnm>G</fnm></au><au><snm>Mauceli</snm><fnm>E</fnm></au><au><snm>Ward</snm><fnm>LD</fnm></au><au><snm>Lowe</snm><fnm>CB</fnm></au><au><snm>Holloway</snm><fnm>AK</fnm></au><au><snm>Clamp</snm><fnm>M</fnm></au><au><snm>Gnerre</snm><fnm>S</fnm></au><au><snm>Alfoldi</snm><fnm>J</fnm></au><au><snm>Beal</snm><fnm>K</fnm></au><au><snm>Chang</snm><fnm>J</fnm></au><au><snm>Clawson</snm><fnm>H</fnm></au><au><snm>Cuff</snm><fnm>J</fnm></au><au><snm>Di Palma</snm><fnm>F</fnm></au><au><snm>Fitzgerald</snm><fnm>S</fnm></au><au><snm>Flicek</snm><fnm>P</fnm></au><au><snm>Guttman</snm><fnm>M</fnm></au><au><snm>Hubisz</snm><fnm>MJ</fnm></au><au><snm>Jaffe</snm><fnm>DB</fnm></au><au><snm>Jungreis</snm><fnm>I</fnm></au><au><snm>Kent</snm><fnm>WJ</fnm></au><au><snm>Kostka</snm><fnm>D</fnm></au><au><snm>Lara</snm><fnm>M</fnm></au><etal/></aug><source>Nature</source><pubdate>2011</pubdate><volume>478</volume><fpage>476</fpage><lpage>482</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature10530</pubid><pubid idtype="pmcid">3207357</pubid><pubid idtype="pmpid" link="fulltext">21993624</pubid></pubidlist></xrefbib></bibl><bibl id="B64"><title><p>Complex gene conversion events in germline mutation at human minisatellites.</p></title><aug><au><snm>Jeffreys</snm><fnm>AJ</fnm></au><au><snm>Tamaki</snm><fnm>K</fnm></au><au><snm>MacLeod</snm><fnm>A</fnm></au><au><snm>Monckton</snm><fnm>DG</fnm></au><au><snm>Neil</snm><fnm>DL</fnm></au><au><snm>Armour</snm><fnm>JA</fnm></au></aug><source>Nat Genet</source><pubdate>1994</pubdate><volume>6</volume><fpage>136</fpage><lpage>145</lpage><xrefbib><pubidlist><pubid 
idtype="doi">10.1038/ng0294-136</pubid><pubid idtype="pmpid" link="fulltext">8162067</pubid></pubidlist></xrefbib></bibl><bibl id="B65"><title><p>Epigenetic regulation of heterochromatic DNA stability.</p></title><aug><au><snm>Peng</snm><fnm>JC</fnm></au><au><snm>Karpen</snm><fnm>GH</fnm></au></aug><source>Curr Opin Genet Dev</source><pubdate>2008</pubdate><volume>18</volume><fpage>204</fpage><lpage>211</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.gde.2008.01.021</pubid><pubid idtype="pmcid">2814359</pubid><pubid idtype="pmpid" link="fulltext">18372168</pubid></pubidlist></xrefbib></bibl><bibl id="B66"><title><p>CpG mutation rates in the human genome are highly dependent on local GC content.</p></title><aug><au><snm>Fryxell</snm><fnm>KJ</fnm></au><au><snm>Moon</snm><fnm>WJ</fnm></au></aug><source>Mol Biol Evol</source><pubdate>2005</pubdate><volume>22</volume><fpage>650</fpage><lpage>658</lpage><xrefbib><pubid idtype="pmpid" link="fulltext">15537806</pubid></xrefbib></bibl><bibl id="B67"><title><p>Plastin 3 is a protective modifier of autosomal recessive spinal muscular atrophy.</p></title><aug><au><snm>Oprea</snm><fnm>GE</fnm></au><au><snm>Krober</snm><fnm>S</fnm></au><au><snm>McWhorter</snm><fnm>ML</fnm></au><au><snm>Rossoll</snm><fnm>W</fnm></au><au><snm>Muller</snm><fnm>S</fnm></au><au><snm>Krawczak</snm><fnm>M</fnm></au><au><snm>Bassell</snm><fnm>GJ</fnm></au><au><snm>Beattie</snm><fnm>CE</fnm></au><au><snm>Wirth</snm><fnm>B</fnm></au></aug><source>Science</source><pubdate>2008</pubdate><volume>320</volume><fpage>524</fpage><lpage>527</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1155085</pubid><pubid idtype="pmpid" link="fulltext">18440926</pubid></pubidlist></xrefbib></bibl><bibl id="B68"><title><p>Topological domains in mammalian genomes identified by analysis of chromatin 
interactions.</p></title><aug><au><snm>Dixon</snm><fnm>JR</fnm></au><au><snm>Selvaraj</snm><fnm>S</fnm></au><au><snm>Yue</snm><fnm>F</fnm></au><au><snm>Kim</snm><fnm>A</fnm></au><au><snm>Li</snm><fnm>Y</fnm></au><au><snm>Shen</snm><fnm>Y</fnm></au><au><snm>Hu</snm><fnm>M</fnm></au><au><snm>Liu</snm><fnm>JS</fnm></au><au><snm>Ren</snm><fnm>B</fnm></au></aug><source>Nature</source><pubdate>2012</pubdate><volume>485</volume><fpage>376</fpage><lpage>380</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature11082</pubid><pubid idtype="pmpid" link="fulltext">22495300</pubid></pubidlist></xrefbib></bibl><bibl id="B69"><aug><au><snm>Nagy</snm><fnm>A</fnm></au></aug><source>Manipulating the Mouse Embryo: A Laboratory Manual</source><publisher>Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press</publisher><edition>3</edition><pubdate>2003</pubdate></bibl><bibl id="B70"><title><p>Bisulfite sequencing data presentation and compilation (BDPC) web server - a useful tool for DNA methylation analysis.</p></title><aug><au><snm>Rohde</snm><fnm>C</fnm></au><au><snm>Zhang</snm><fnm>Y</fnm></au><au><snm>Jurkowski</snm><fnm>TP</fnm></au><au><snm>Stamerjohanns</snm><fnm>H</fnm></au><au><snm>Reinhardt</snm><fnm>R</fnm></au><au><snm>Jeltsch</snm><fnm>A</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2008</pubdate><volume>36</volume><fpage>e34</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkn083</pubid><pubid idtype="pmcid">2275153</pubid><pubid idtype="pmpid" link="fulltext">18296484</pubid></pubidlist></xrefbib></bibl><bibl id="B71"><title><p>YASS: enhancing the sensitivity of DNA similarity search.</p></title><aug><au><snm>Noe</snm><fnm>L</fnm></au><au><snm>Kucherov</snm><fnm>G</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2005</pubdate><volume>33</volume><fpage>W540</fpage><lpage>543</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gki478</pubid><pubid idtype="pmcid">1160238</pubid><pubid idtype="pmpid" 
link="fulltext">15980530</pubid></pubidlist></xrefbib></bibl><bibl id="B72"><title><p>MUSCLE: multiple sequence alignment with high accuracy and high throughput.</p></title><aug><au><snm>Edgar</snm><fnm>RC</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2004</pubdate><volume>32</volume><fpage>1792</fpage><lpage>1797</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkh340</pubid><pubid idtype="pmcid">390337</pubid><pubid idtype="pmpid" link="fulltext">15034147</pubid></pubidlist></xrefbib></bibl><bibl id="B73"><title><p>Multiple sequence alignment with the Clustal series of programs.</p></title><aug><au><snm>Chenna</snm><fnm>R</fnm></au><au><snm>Sugawara</snm><fnm>H</fnm></au><au><snm>Koike</snm><fnm>T</fnm></au><au><snm>Lopez</snm><fnm>R</fnm></au><au><snm>Gibson</snm><fnm>TJ</fnm></au><au><snm>Higgins</snm><fnm>DG</fnm></au><au><snm>Thompson</snm><fnm>JD</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2003</pubdate><volume>31</volume><fpage>3497</fpage><lpage>3500</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkg500</pubid><pubid idtype="pmcid">168907</pubid><pubid idtype="pmpid" link="fulltext">12824352</pubid></pubidlist></xrefbib></bibl></refgrp>
</bm>
</art>