<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>gb-2006-7-11-r112</ui>
   <ji>GBJ</ji>
   <fm>
      <dochead>Research</dochead>
      <bibl>
         <title>
            <p>Recurrent insertion and duplication generate networks of transposable element sequences in the <it>Drosophila melanogaster </it>genome</p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>Bergman</snm>
               <mi>M</mi>
               <fnm>Casey</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <email>casey.bergman@manchester.ac.uk</email>
            </au>
            <au id="A2">
               <snm>Quesneville</snm>
               <fnm>Hadi</fnm>
               <insr iid="I3"/>
               <email>hq@ccr.jussieu.fr</email>
            </au>
            <au id="A3">
               <snm>Anxolab&#233;h&#232;re</snm>
               <fnm>Dominique</fnm>
               <insr iid="I4"/>
               <email>anxo@ccr.jussieu.fr</email>
            </au>
            <au id="A4">
               <snm>Ashburner</snm>
               <fnm>Michael</fnm>
               <insr iid="I1"/>
               <email>ma11@gen.cam.ac.uk</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK</p>
            </ins>
            <ins id="I2">
               <p>Faculty of Life Sciences, University of Manchester, Manchester M13 9PT, UK</p>
            </ins>
            <ins id="I3">
               <p>Laboratoire de Bioinformatique et G&#233;nomique, Institut Jacques Monod, place Jussieu, 75251 Paris cedex 05, France</p>
            </ins>
            <ins id="I4">
               <p>Laboratoire Dynamique du G&#233;nome et &#201;volution, Institut Jacques Monod, place Jussieu, 75251 Paris cedex 05, France</p>
            </ins>
         </insg>
         <source>Genome Biology</source>
         <issn>1465-6906</issn>
         <pubdate>2006</pubdate>
         <volume>7</volume>
         <issue>11</issue>
         <fpage>R112</fpage>
         <url>http://genomebiology.com/2006/7/11/R112</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">17134480</pubid>
               <pubid idtype="doi">10.1186/gb-2006-7-11-r112</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>31</day>
               <month>7</month>
               <year>2006</year>
            </date>
         </rec>
         <revrec>
            <date>
               <day>13</day>
               <month>11</month>
               <year>2006</year>
            </date>
         </revrec>
         <acc>
            <date>
               <day>29</day>
               <month>11</month>
               <year>2006</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>29</day>
               <month>11</month>
               <year>2006</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2006</year>
         <collab>Bergman et al.; licensee BioMed Central Ltd.</collab>
         <note>This is an open access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <shorttitle>
         <p>Networks of transposable elements in fly</p>
      </shorttitle>
      <shortabs>
         <p>An analysis of high-resolution transposable element annotations in Drosophila melanogaster suggests the existence of a global surveillance system against the majority of transposable element families in the fly.</p>
      </shortabs>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>The recent availability of genome sequences has provided unparalleled insights into the broad-scale patterns of transposable element (TE) sequences in eukaryotic genomes. Nevertheless, the difficulties that TEs pose for genome assembly and annotation have prevented detailed, quantitative inferences about the contribution of TEs to genome sequences.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>Using a high-resolution annotation of TEs in the Release 4 genome sequence, we revise estimates of TE abundance in <it>Drosophila melanogaster</it>. We show that TEs are non-randomly distributed within regions of high and low TE abundance, and that pericentromeric regions with high TE abundance are mosaics of distinct regions of extreme and normal TE density. Comparative analysis revealed that this punctate pattern evolves jointly by transposition and duplication, but not by inversion of TE-rich regions from unsequenced heterochromatin. Analysis of genome-wide patterns of TE nesting revealed a 'nesting network' that includes virtually all of the known TE families in the genome. Numerous directed cycles exist among TE families in the nesting network, implying concurrent or overlapping periods of transpositional activity.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>Rapid restructuring of the genomic landscape by transposition and duplication has recently added hundreds of kilobases of TE sequence to pericentromeric regions in <it>D. melanogaster</it>. These events create ragged transitions between unique and repetitive sequences in the zone between euchromatic and beta-heterochromatic regions. Complex relationships of TE nesting in beta-heterochromatic regions raise the possibility of a co-suppression network that may act as a global surveillance system against the majority of TE families in <it>D. melanogaster</it>.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="BMC" subtype="man_spc_id" id="30010002">Bioinformatics</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010008">Evolution</classification>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Nearly all eukaryotic genomes contain a substantial fraction of middle repetitive, transposable element (TE) sequences interspersed with the unique sequences encoding genes and <it>cis</it>-regulatory elements. The broad-scale patterns of TE abundance and distribution in various model organisms have become increasingly well understood with the recent availability of essentially complete genome sequences (for example, <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>). Despite these general advances, however, a detailed understanding of the evolutionary forces that control the abundance and distribution of TEs remains elusive, owing in part to the dynamic nature of this component of the genome as well as to the inherent problems that TE sequences present for genome assembly and annotation.</p>
         <p>As with all unfinished whole-genome shotgun assemblies, uncertainty in the assembly of repetitive DNA in the first two releases of the <it>Drosophila melanogaster </it>genome sequence posed difficulties for analysis of TE sequences <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp>. The improved assembly of repetitive regions in the <it>D. melanogaster </it>Release 3 genome sequence presented the first opportunity to study TEs in a finished whole-genome shotgun sequence <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B9">9</abbr></abbrgrp>, revealing the true challenge that these sequences pose for systematic annotation <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp>. With further improvements in the Release 4 genome sequence made possible by the efforts of the Berkeley <it>Drosophila </it>Genome Project <abbrgrp><abbr bid="B12">12</abbr></abbrgrp> (especially in regions of high TE density where several gaps have been closed), we are now in a position to establish more stable trends in TE abundance for <it>D. melanogaster</it>. In addition to having access to improved genome sequence data, we have recently developed an improved TE annotation pipeline that uses the combined evidence of multiple computational methods to predict 'TE models' in genome sequences <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. We have shown that this pipeline identifies a large number of predicted TEs that were omitted from the Release 3 genome annotations, and subsequently applied this system to the <it>D. melanogaster </it>Release 4 sequence <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. Here we analyze the results of this effort in detail, which allows an extremely high-resolution view of the structure and location of TEs in one of the highest-quality metazoan genome sequences currently available.</p>
         <p>We first revised baseline estimates of TE abundance in the <it>Drosophila </it>genome sequence, based on the fact that TEs show a strikingly non-random distribution across the genome. We then used this baseline to identify specific regions of extremely high TE density in the genome sequence. This analysis showed that regions of the genome broadly known to have high TE abundance, such as pericentromeric regions and the fourth chromosome, are in fact often characterized by distinctly localized regions of extremely high TE density interrupted by regions of lower TE density. Comparative sequence analysis showed that this punctate pattern is unlikely to have arisen in the <it>D. melanogaster </it>genome by inversion of TE-rich heterochromatic sequences, but can evolve <it>in situ </it>by the joint action of recurrent transposition and duplication. Finally, we analyzed in detail the patterns of TE nesting in the genome sequence, taking advantage of the improved joining of fragments from the same TE insertion event in our new annotation. We framed the process of TE nesting as a directed graph and borrowed techniques from network analysis to study genome-wide patterns of TE nesting. This work demonstrates the added value of high-resolution annotations for understanding how TEs impact genome organization and evolution, and sets the stage for the interpretation of TE-rich heterochromatic regions currently being sequenced by the <it>Drosophila </it>Heterochromatin Genome Project <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>.</p>
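         <p>The directed-graph framing of TE nesting mentioned above can be illustrated with a minimal sketch. It is not the analysis pipeline used in this study: the edge direction (inserted family pointing to host family), the function names, and the family labels famA, famB and famC are all assumptions introduced purely for illustration. A directed cycle in such a graph means that each family on the cycle has been found inserted into the next one, which is the kind of pattern interpreted here as concurrent or overlapping transpositional activity.</p>

```python
from collections import defaultdict


def build_nesting_graph(nesting_events):
    """Build a directed graph with one edge (inserted family -> host family)
    per nesting event. The edge direction is an assumption of this sketch."""
    graph = defaultdict(set)
    for inserted, host in nesting_events:
        graph[inserted].add(host)
        graph[host]  # touch the key so hosts without outgoing edges are nodes too
    return graph


def has_directed_cycle(graph):
    """Depth-first search with three colours; True if any directed cycle exists."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY  # node is on the current search path
        for nxt in graph[node]:
            if colour[nxt] == GREY:
                return True  # back edge to the current path: a cycle
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK  # fully explored, no cycle through this node
        return False

    return any(colour[n] == WHITE and visit(n) for n in list(graph))


# Hypothetical nesting events (placeholder family names, not data from the paper).
events = [("famA", "famB"), ("famB", "famC"), ("famC", "famA")]
print(has_directed_cycle(build_nesting_graph(events)))
```

         <p>In this toy example famA is nested in famB, famB in famC, and famC in famA, so the three hypothetical families form a directed cycle; removing any one of the three events leaves an acyclic graph.</p>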
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>Abundance and distribution of TEs in the Release 4 genome sequence</p>
            </st>
            <p>Using a recently completed combined-evidence annotation of the Release 4 genome sequence <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>, we revised estimates of the overall abundance of TE sequences in <it>D. melanogaster </it>(Table <tblr tid="T1">1</tblr>) from those based on the Release 3 sequence <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. Excluding foreign elements based on query sequences from other species (see Materials and methods), the estimated number of TEs in the <it>D. melanogaster </it>Release 4 genome sequence (<it>n </it>= 5,390) is over three-fold higher than in Release 3 (<it>n </it>= 1,572). In contrast, the amount of sequence annotated as TE increased by only approximately 44% in Release 4 (6.51 Mb, 5.50% of genome) relative to Release 3 (4.51 Mb, 3.86% of genome). (We note that the proportion of the Release 4 genome estimated here as TE is calculated as the sum of non-redundant annotation spans including unique sequences inserted into TEs; this procedure differs slightly from our previous estimates for Release 4, which only included sequences strictly homologous to TE query sequences <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>.) The discrepant changes in these two metrics of TE abundance across releases result from the fact that almost all new TEs in Release 4 are small fragments and/or annotations of the highly abundant but degenerated <it>INE-1 </it>element (also known as <it>DINE-1 </it>or DNAREP1_DM) <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>, a family that was omitted from the Release 3 annotation. The inclusion of these new small fragments is also reflected in the fact that the proportion of TEs estimated to be full-length (defined as within &#177; 3% of the length of the canonical element, including the length of inserted sequences) has declined from 30.5% in Release 3 to 9.83% in Release 4. The number of TEs involved in nests (<it>n </it>= 785) has more than doubled in Release 4 relative to Release 3 because of newly annotated sequences and improved joining of TE fragments belonging to the same insertion, although the estimated proportion of TEs involved in nests (14.6%) in Release 4 has decreased relative to Release 3 as a consequence of the increased total number of TEs annotated.</p>
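            <p>The summary ratios quoted in this paragraph can be recomputed directly from the counts given in the text and in the genome-wide 'Total' row of Table 1. The short sketch below does only that arithmetic; the variable names are ours and carry no meaning beyond this illustration.</p>

```python
# Counts and sequence totals as quoted in the text (Release 3 versus Release 4).
r3 = {"n_te": 1572, "te_mb": 4.51}
r4 = {"n_te": 5390, "te_mb": 6.51, "n_full_length": 530, "n_nested": 785}

# Fold change in the number of annotated TEs between releases.
fold_increase = r4["n_te"] / r3["n_te"]

# Relative increase in the amount of sequence annotated as TE.
pct_more_sequence = 100 * (r4["te_mb"] - r3["te_mb"]) / r3["te_mb"]

# Share of Release 4 TEs that are full-length or involved in nests.
pct_full_length = 100 * r4["n_full_length"] / r4["n_te"]
pct_nested = 100 * r4["n_nested"] / r4["n_te"]

print(f"fold increase in TE count:     {fold_increase:.2f}")
print(f"increase in TE sequence (%):   {pct_more_sequence:.1f}")
print(f"full-length TEs, Release 4 (%): {pct_full_length:.2f}")
print(f"nested TEs, Release 4 (%):      {pct_nested:.2f}")
```

            <p>Run as written, the recomputed values agree with the figures stated above: a more than three-fold rise in TE count, a rise of roughly 44% in annotated TE sequence, 9.83% full-length TEs and about 14.6% of TEs involved in nests.</p>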
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Abundance of <it>D. melanogaster </it>TEs annotated in the Release 4 genome sequence by genomic region</p>
               </caption>
               <tblbdy cols="10">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Class</p>
                     </c>
                     <c ca="center">
                        <p>Total bp TE</p>
                     </c>
                     <c ca="center">
                        <p>% TE</p>
                     </c>
                     <c ca="center">
                        <p>No. of TEs</p>
                     </c>
                     <c ca="center">
                        <p>No. of TEs per Mbp</p>
                     </c>
                     <c ca="center">
                        <p>No. of full-length TEs</p>
                     </c>
                     <c ca="center">
                        <p>% TE full length</p>
                     </c>
                     <c ca="center">
                        <p>No. of nested TEs</p>
                     </c>
                     <c ca="center">
                        <p>% TE nested</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="10">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Genome</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                     <c ca="center">
                        <p>3,896,903</p>
                     </c>
                     <c ca="center">
                        <p>3.29</p>
                     </c>
                     <c ca="center">
                        <p>1,321</p>
                     </c>
                     <c ca="center">
                        <p>11.16</p>
                     </c>
                     <c ca="center">
                        <p>325</p>
                     </c>
                     <c ca="center">
                        <p>24.60</p>
                     </c>
                     <c ca="center">
                        <p>327</p>
                     </c>
                     <c ca="center">
                        <p>24.75</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Non-LTR</p>
                     </c>
                     <c ca="center">
                        <p>1,502,997</p>
                     </c>
                     <c ca="center">
                        <p>1.27</p>
                     </c>
                     <c ca="center">
                        <p>1,019</p>
                     </c>
                     <c ca="center">
                        <p>8.61</p>
                     </c>
                     <c ca="center">
                        <p>121</p>
                     </c>
                     <c ca="center">
                        <p>11.87</p>
                     </c>
                     <c ca="center">
                        <p>197</p>
                     </c>
                     <c ca="center">
                        <p>19.33</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>TIR</p>
                     </c>
                     <c ca="center">
                        <p>559,234</p>
                     </c>
                     <c ca="center">
                        <p>0.47</p>
                     </c>
                     <c ca="center">
                        <p>752</p>
                     </c>
                     <c ca="center">
                        <p>6.35</p>
                     </c>
                     <c ca="center">
                        <p>57</p>
                     </c>
                     <c ca="center">
                        <p>7.58</p>
                     </c>
                     <c ca="center">
                        <p>157</p>
                     </c>
                     <c ca="center">
                        <p>20.88</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>
                           <it>INE-1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>490,996</p>
                     </c>
                     <c ca="center">
                        <p>0.41</p>
                     </c>
                     <c ca="center">
                        <p>2,238</p>
                     </c>
                     <c ca="center">
                        <p>18.91</p>
                     </c>
                     <c ca="center">
                        <p>26</p>
                     </c>
                     <c ca="center">
                        <p>1.16</p>
                     </c>
                     <c ca="center">
                        <p>91</p>
                     </c>
                     <c ca="center">
                        <p>4.07</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>
                           <it>FB</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>60,509</p>
                     </c>
                     <c ca="center">
                        <p>0.05</p>
                     </c>
                     <c ca="center">
                        <p>60</p>
                     </c>
                     <c ca="center">
                        <p>0.51</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1.67</p>
                     </c>
                     <c ca="center">
                        <p>13</p>
                     </c>
                     <c ca="center">
                        <p>21.67</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="center">
                        <p>6,510,639</p>
                     </c>
                     <c ca="center">
                        <p>5.50</p>
                     </c>
                     <c ca="center">
                        <p>5,390</p>
                     </c>
                     <c ca="center">
                        <p>45.54</p>
                     </c>
                     <c ca="center">
                        <p>530</p>
                     </c>
                     <c ca="center">
                        <p>9.83</p>
                     </c>
                     <c ca="center">
                        <p>785</p>
                     </c>
                     <c ca="center">
                        <p>14.56</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Non-pericentromeric</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                     <c ca="center">
                        <p>2,510,569</p>
                     </c>
                     <c ca="center">
                        <p>2.42</p>
                     </c>
                     <c ca="center">
                        <p>515</p>
                     </c>
                     <c ca="center">
                        <p>4.96</p>
                     </c>
                     <c ca="center">
                        <p>250</p>
                     </c>
                     <c ca="center">
                        <p>48.54</p>
                     </c>
                     <c ca="center">
                        <p>80</p>
                     </c>
                     <c ca="center">
                        <p>15.53</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Non-LTR</p>
                     </c>
                     <c ca="center">
                        <p>646,020</p>
                     </c>
                     <c ca="center">
                        <p>0.62</p>
                     </c>
                     <c ca="center">
                        <p>336</p>
                     </c>
                     <c ca="center">
                        <p>3.24</p>
                     </c>
                     <c ca="center">
                        <p>80</p>
                     </c>
                     <c ca="center">
                        <p>22.92</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>2.68</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>TIR</p>
                     </c>
                     <c ca="center">
                        <p>151,997</p>
                     </c>
                     <c ca="center">
                        <p>0.15</p>
                     </c>
                     <c ca="center">
                        <p>214</p>
                     </c>
                     <c ca="center">
                        <p>2.06</p>
                     </c>
                     <c ca="center">
                        <p>25</p>
                     </c>
                     <c ca="center">
                        <p>11.68</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>5.61</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>
                           <it>INE-1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>106,597</p>
                     </c>
                     <c ca="center">
                        <p>0.10</p>
                     </c>
                     <c ca="center">
                        <p>660</p>
                     </c>
                     <c ca="center">
                        <p>6.36</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>0.76</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>1.21</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>
                           <it>FB</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>28,125</p>
                     </c>
                     <c ca="center">
                        <p>0.03</p>
                     </c>
                     <c ca="center">
                        <p>23</p>
                     </c>
                     <c ca="center">
                        <p>0.22</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>4.35</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>13.04</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="center">
                        <p>3,443,308</p>
                     </c>
                     <c ca="center">
                        <p>3.32</p>
                     </c>
                     <c ca="center">
                        <p>1,748</p>
                     </c>
                     <c ca="center">
                        <p>16.85</p>
                     </c>
                     <c ca="center">
                        <p>361</p>
                     </c>
                     <c ca="center">
                        <p>20.48</p>
                     </c>
                     <c ca="center">
                        <p>112</p>
                     </c>
                     <c ca="center">
                        <p>6.41</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Pericentromeric</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                     <c ca="center">
                        <p>1,324,428</p>
                     </c>
                     <c ca="center">
                        <p>9.94</p>
                     </c>
                     <c ca="center">
                        <p>776</p>
                     </c>
                     <c ca="center">
                        <p>58.24</p>
                     </c>
                     <c ca="center">
                        <p>70</p>
                     </c>
                     <c ca="center">
                        <p>9.02</p>
                     </c>
                     <c ca="center">
                        <p>241</p>
                     </c>
                     <c ca="center">
                        <p>31.06</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Non-LTR</p>
                     </c>
                     <c ca="center">
                        <p>802,040</p>
                     </c>
                     <c ca="center">
                        <p>6.02</p>
                     </c>
                     <c ca="center">
                        <p>623</p>
                     </c>
                     <c ca="center">
                        <p>46.75</p>
                     </c>
                     <c ca="center">
                        <p>42</p>
                     </c>
                     <c ca="center">
                        <p>6.58</p>
                     </c>
                     <c ca="center">
                        <p>169</p>
                     </c>
                     <c ca="center">
                        <p>27.13</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>TIR</p>
                     </c>
                     <c ca="center">
                        <p>323,226</p>
                     </c>
                     <c ca="center">
                        <p>2.43</p>
                     </c>
                     <c ca="center">
                        <p>436</p>
                     </c>
                     <c ca="center">
                        <p>32.72</p>
                     </c>
                     <c ca="center">
                        <p>29</p>
                     </c>
                     <c ca="center">
                        <p>6.65</p>
                     </c>
                     <c ca="center">
                        <p>115</p>
                     </c>
                     <c ca="center">
                        <p>26.38</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>
                           <it>INE-1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>300,615</p>
                     </c>
                     <c ca="center">
                        <p>2.26</p>
                     </c>
                     <c ca="center">
                        <p>1,234</p>
                     </c>
                     <c ca="center">
                        <p>92.61</p>
                     </c>
                     <c ca="center">
                        <p>17</p>
                     </c>
                     <c ca="center">
                        <p>1.38</p>
                     </c>
                     <c ca="center">
                        <p>71</p>
                     </c>
                     <c ca="center">
                        <p>5.75</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>
                           <it>FB</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>27,773</p>
                     </c>
                     <c ca="center">
                        <p>0.21</p>
                     </c>
                     <c ca="center">
                        <p>32</p>
                     </c>
                     <c ca="center">
                        <p>2.40</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0.00</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>28.13</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="center">
                        <p>2,778,082</p>
                     </c>
                     <c ca="center">
                        <p>20.85</p>
                     </c>
                     <c ca="center">
                        <p>3,101</p>
                     </c>
                     <c ca="center">
                        <p>232.72</p>
                     </c>
                     <c ca="center">
                        <p>158</p>
                     </c>
                     <c ca="center">
                        <p>5.06</p>
                     </c>
                     <c ca="center">
                        <p>605</p>
                     </c>
                     <c ca="center">
                        <p>19.51</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Chromosome 4</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                     <c ca="center">
                        <p>61,906</p>
                     </c>
                     <c ca="center">
                        <p>4.83</p>
                     </c>
                     <c ca="center">
                        <p>30</p>
                     </c>
                     <c ca="center">
                        <p>23.41</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>16.67</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>20.00</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Non-LTR</p>
                     </c>
                     <c ca="center">
                        <p>54,937</p>
                     </c>
                     <c ca="center">
                        <p>4.29</p>
                     </c>
                     <c ca="center">
                        <p>60</p>
                     </c>
                     <c ca="center">
                        <p>46.82</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>5.00</p>
                     </c>
                     <c ca="center">
                        <p>19</p>
                     </c>
                     <c ca="center">
                        <p>31.67</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>TIR</p>
                     </c>
                     <c ca="center">
                        <p>84,011</p>
                     </c>
                     <c ca="center">
                        <p>6.55</p>
                     </c>
                     <c ca="center">
                        <p>102</p>
                     </c>
                     <c ca="center">
                        <p>79.59</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>2.94</p>
                     </c>
                     <c ca="center">
                        <p>30</p>
                     </c>
                     <c ca="center">
                        <p>29.41</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>
                           <it>INE-1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>83,784</p>
                     </c>
                     <c ca="center">
                        <p>6.54</p>
                     </c>
                     <c ca="center">
                        <p>344</p>
                     </c>
                     <c ca="center">
                        <p>268.41</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>1.16</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>3.49</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>
                           <it>FB</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>4,611</p>
                     </c>
                     <c ca="center">
                        <p>0.36</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>3.90</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0.00</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>20.00</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="center">
                        <p>289,249</p>
                     </c>
                     <c ca="center">
                        <p>22.57</p>
                     </c>
                     <c ca="center">
                        <p>541</p>
                     </c>
                     <c ca="center">
                        <p>422.12</p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                     <c ca="center">
                        <p>2.77</p>
                     </c>
                     <c ca="center">
                        <p>68</p>
                     </c>
                     <c ca="center">
                        <p>12.57</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>Overall abundance was partitioned into pericentromeric and non-pericentromeric regions as described in the text. Full-length elements were defined as those within &#177; 3% of the length of the canonical element. Both inner and outer components of a TE nest were counted as nested.</p>
               </tblfn>
            </tbl>
            <p>The major patterns of TE abundance identified in previous releases of the <it>D. melanogaster </it>genome sequence <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr></abbrgrp> are also observed in Release 4, suggesting that these trends are stable features of the <it>D. melanogaster </it>genomic landscape. As shown in Figure <figr fid="F1">1</figr>, both the pericentromeric regions of the major chromosome arms and the entirety of chromosome 4 have higher densities of TE insertions, relative to non-pericentromeric regions <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B7">7</abbr><abbr bid="B15">15</abbr></abbrgrp>. Densities over the non-pericentromeric regions are roughly equal, with no general increase in TE density in telomeric regions (Figure <figr fid="F1">1</figr>) <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B15">15</abbr></abbrgrp>, excluding TEs that are directly involved in telomere structure/function or in the subtelomeric arrays (see below). There is no general decrease in the abundance of TEs on the X chromosome <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B15">15</abbr></abbrgrp>, as would be expected if TE insertions generate deleterious recessive mutations <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. Long terminal repeat (LTR) retrotransposons occupy the greatest proportion of the genome sequence (3.29%), as has been observed previously <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B7">7</abbr></abbrgrp>, but the current annotation reveals that the <it>INE-1 </it>family is the most numerous category of TEs (<it>n </it>= 2,238) in the <it>D. melanogaster </it>genome <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. (Throughout this work, we abbreviate non-LTR retrotransposon as 'non-LTR'; this class is referred to as <it>LINE</it>-like in <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B7">7</abbr></abbrgrp>.) 
<it>INE-1 </it>has previously been suggested to be a retrotransposon on the basis of homology to the <it>D. virilis Penelope </it>element <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>; however, we found that this reported homology between <it>Penelope </it>and <it>INE-1 </it>is spurious and restricted to flanking sequences in GenBank:<ext-link ext-link-type="gen" ext-link-id="U49102">U49102</ext-link> (see also <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>). Based on the percentage of genome sequence occupied, our analysis indicates that the distribution of <it>INE-1 </it>most closely resembles that of the terminal inverted repeat (TIR) transposon class of TEs (Table <tblr tid="T1">1</tblr>), supporting the conclusion that <it>INE-1 </it>is a TIR element based on structural features of an improved consensus sequence <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>.</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Distribution of TEs along the <it>D. melanogaster </it>Release 4 chromosome arms</p>
               </caption>
               <text>
                  <p>Distribution of TEs along the <it>D. melanogaster </it>Release 4 chromosome arms. Numbers of TEs per 50 Kb window are plotted as a function of position along a chromosome arm. Abundance for all families excluding <it>INE-1 </it>is shown in black in the main and inset panels; abundance for the <it>INE-1 </it>family is shown in blue in the inset panels. Positions of the cytologically estimated boundaries between euchromatin and heterochromatin in pericentromeric regions are shown as red triangles. Positions of genetically estimated boundaries between high and reduced recombination, and between reduced and null recombination, in pericentromeric regions are shown as green and orange triangles, respectively. Filled circles indicate centromeric regions that are currently not included in the Release 4 genome sequence. HDRs on the major chromosome arms are numbered in purple.</p>
               </text>
               <graphic file="gb-2006-7-11-r112-1"/>
            </fig>
            <p>This set of 5,390 TEs defined 4,684 TE-free regions (TFRs) <abbrgrp><abbr bid="B20">20</abbr></abbrgrp> in the Release 4 genome sequence; 94.5% (111.9 Mb of 118.4 Mb) of the Release 4 genome sequence can be found in TFRs, with 89.8% (106.2 Mb) and 56.1% (66.4 Mb) of the genome found in TFRs of greater than 10 Kb (<it>n </it>= 1,393) and 100 Kb (<it>n </it>= 357), respectively. The longest TFR in <it>D. melanogaster </it>is 855,890 base pairs (bp) long, spans positions 14,374,883-15,230,772 on chromosome 2R, contains 106 genes, and is over 10 times longer than the longest TFR in the human genome <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. The mean TFR length of 23,878 bp is consistent with the genome-wide minimum estimate of the distance between middle-repetitive interspersed repeats (>13 Kb) based on reassociation kinetics <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>; however, the median TFR length of 1,992 bp is much smaller. According to an adjusted Kolmogorov-Smirnov test, the distribution of TFR lengths departs significantly from an exponential distribution parameterized on this mean length (D = 0.4513, p &lt; 0.001); this test is based on the maximal difference between observed and expected cumulative distributions and accounts for the fact that the rate parameter for the exponential distribution has been estimated from the data <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. Similar results are obtained if the rate parameter for the exponential is calculated from the number of TE insertions divided by the total length of TFRs (as in <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>), whether including (adjusted Kolmogorov-Smirnov test, D = 0.4719, p &lt; 0.001) or excluding (adjusted Kolmogorov-Smirnov test, D = 0.4456, p &lt; 0.001) TEs nested in other TEs. These results are not simply a consequence of high TE density in pericentromeric regions (see below) and demonstrate that TEs are non-randomly distributed at the level of the complete <it>D. 
melanogaster </it>genome sequence, confirming previous results <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B15">15</abbr></abbrgrp>. We note that TFRs in the <it>D. melanogaster </it>genome are likely to vary among individuals since most TE insertions are not fixed in the species <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>; however, these results should be representative of other strains to the extent that the TE composition of the genome sequence reflects general properties of the species <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>.</p>
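            <p>The TFR analysis above can be illustrated with a minimal Python sketch. It is not the code used to produce the reported statistics: the function names (te_free_regions, ks_exponential, adjusted_ks_p_value) are hypothetical, TE coordinates are assumed to be 1-based inclusive intervals on a single sequence, and the adjustment for estimating the rate parameter from the data is approximated here by Monte Carlo simulation, which is an assumption and may differ from the adjusted test cited above.</p>

```python
import math
import random


def te_free_regions(te_intervals, seq_length):
    """Lengths of the gaps left between TE intervals (1-based, inclusive).

    Overlapping or nested TE intervals are merged implicitly by tracking
    the first base that is not yet covered by any TE.
    """
    gaps = []
    cursor = 1  # first base not yet covered by a TE
    for start, end in sorted(te_intervals):
        if start > cursor:
            gaps.append(start - cursor)
        cursor = max(cursor, end + 1)
    if seq_length >= cursor:
        gaps.append(seq_length - cursor + 1)
    return gaps


def ks_exponential(lengths):
    """Kolmogorov-Smirnov distance D between the empirical distribution
    and an exponential whose mean is estimated from the same data."""
    xs = sorted(lengths)
    n = len(xs)
    mean = sum(xs) / n
    d = 0.0
    for i, x in enumerate(xs):
        cdf = 1.0 - math.exp(-x / mean)
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


def adjusted_ks_p_value(lengths, n_sim=500, seed=0):
    """Monte Carlo p-value for D when the rate is estimated from the data.

    Because D is unchanged by rescaling the data, samples drawn from an
    exponential with rate 1 give the null distribution of D.
    """
    rng = random.Random(seed)
    n = len(lengths)
    observed = ks_exponential(lengths)
    hits = 0
    for _ in range(n_sim):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        if ks_exponential(sample) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_sim + 1)
```

            <p>For example, te_free_regions([(11, 20), (15, 30), (41, 50)], 60) yields three TFRs of 10 bp each, and passing a list of TFR lengths to adjusted_ks_p_value returns the statistic D together with a simulated p-value whose resolution is limited by the number of simulations.</p>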
         </sec>
         <sec>
            <st>
               <p>Pericentromeric regions, non-pericentromeric regions and the fourth chromosome differ drastically in TE content</p>
            </st>
            <p>Since non-random distribution of TEs can lead to differences of more than one order of magnitude in TE abundance between pericentromeric and non-pericentromeric regions <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B15">15</abbr><abbr bid="B24">24</abbr></abbrgrp>, overall genome-wide summary statistics do not accurately reflect TE abundance for any region of the genome sequence. To account for this heterogeneity, we attempted to partition the major chromosome arms into regions of high (pericentromeric) and low (non-pericentromeric) TE density using an independent criterion that is not based on TE content. Our primary goal here was to estimate the TE content in non-pericentromeric regions of the genome as accurately as possible, to understand baseline levels of TE abundance throughout the majority of the genome. Initially, we investigated a partition based on the cytologically defined boundaries between euchromatin and &#946;-heterochromatin estimated in Hoskins <it>et al</it>. <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. As shown in Figure <figr fid="F1">1</figr> (red triangles), the cytologically defined limits of the euchromatin/&#946;-heterochromatin boundaries correspond almost exactly to the most distal pericentromeric region of high TE density on chromosome arms 3L and 3R. However, on chromosome arms 2L, 2R and X the most distal pericentromeric regions of extreme TE density are up to 2 Mb from the estimated euchromatin/&#946;-heterochromatin boundary. Thus, using this cytological criterion to partition the genome into regions of high and low TE density still leads to an over-estimate of the true TE abundance for the majority of the genome.</p>
            <p>We next evaluated whether genetically defined regions of different recombination rates estimated by Charlesworth <abbrgrp><abbr bid="B26">26</abbr></abbrgrp> could partition the genome into high and low TE density regions. For all chromosome arms (excluding the fourth chromosome), we found that the estimated boundaries between 'reduced' and 'null' (that is, very low) recombination rates in pericentromeric regions (Figure <figr fid="F1">1</figr>, orange triangles) were located extremely close to the cytologically defined boundaries between euchromatin and &#946;-heterochromatin. Thus, partitioning the genome at the boundary between reduced and null recombination rates would bias estimates of TE abundance in the same way as the cytological criterion above. In contrast, the estimated transitions between 'high' and 'reduced' recombination rates in pericentromeric regions (Figure <figr fid="F1">1</figr>, green triangles) are approximately 1 to 2 Mb distal to estimated euchromatin/&#946;-heterochromatin boundaries for all major chromosome arms. Virtually all regions with high TE density were included in the 11% of the genome sequence labeled under this definition as 'pericentromeric' (Figure <figr fid="F1">1</figr>), and, therefore, this partition was used to estimate TE abundance in different regions of the <it>D. melanogaster </it>genome. Because our aim was to estimate the TE content in non-pericentromeric regions as a baseline to identify regions of extremely high TE content elsewhere in the genome, the inclusion of some low TE content regions in pericentromeric regions on chromosome arms 3L and 3R using this partition should not bias estimates of the background TE abundance throughout the euchromatin.</p>
            <sec>
               <st>
                  <p>Non-pericentromeric regions</p>
               </st>
               <p>A 'typical' region of the <it>D. melanogaster </it>Release 4 genome sequence (that is, the 88% of the genome in non-pericentromeric, high recombination regions on the major chromosome arms) contains approximately 3.32% TE sequences, with an average of 16.9 TEs per Mb (Table <tblr tid="T1">1</tblr>). Previous estimates based on Releases 1 and 2 are not meaningful because of assembly errors <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B15">15</abbr></abbrgrp>, and those based on Releases 3 and 4 were computed across the entire genome <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B10">10</abbr></abbrgrp>; thus, the current figures represent the first unbiased estimates of TE content for the majority of the <it>D. melanogaster </it>genome sequence. As observed in previous releases of the <it>D. melanogaster </it>genome sequence <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B7">7</abbr></abbrgrp>, the rank order of abundance of major TE classes in non-pericentromeric regions is: LTR elements (2.42%, 4.96/Mb) > non-LTR elements (0.62%, 3.24/Mb) > TIR elements (0.15%, 2.06/Mb). <it>INE-1 </it>elements account for only 0.10% of a typical region of the <it>D. melanogaster </it>genome, but contribute 6.36 TEs/Mb. Approximately 20.5% of the TEs in non-pericentromeric regions are estimated to be full-length (&#177; 3% of the canonical element including the length of inserted sequences), although this value will undoubtedly change with different definitions of what constitutes a full-length element. Virtually every TE in non-pericentromeric regions exists as an individual insertion, with only 6.41% involved in nests of TEs inserted into other TEs. The majority of TE families (97/121, 80.2%) present in the genome sequence have copies in non-pericentromeric regions.</p>
            </sec>
            <sec>
               <st>
                  <p>Pericentromeric regions</p>
               </st>
               <p>In stark contrast, the 11% of the genome sequence in pericentromeric, low-recombination regions on major chromosome arms contains 57.5% (<it>n </it>= 3,101) of the 5,390 TEs annotated and 42.7% (2.78 Mb) of the 6.51 Mb of sequence annotated as TE. On average, pericentromeric regions are composed of 20.9% TE sequences, with 233 TEs/Mb (Table <tblr tid="T1">1</tblr>). Overall, there is approximately 6-fold enrichment in amount of DNA and a 14-fold increase in TE density in pericentromeric regions relative to non-pericentromeric regions. It must be noted, however, that average values of TE content for pericentromeric regions are more variable than for non-pericentromeric regions, because of heterogeneity both within a given pericentromeric region (Figure <figr fid="F1">1</figr>, see below) and among pericentromeric regions on different chromosome arms. For example, the pericentromeric region of chromosome arm 3R had a much lower TE density than other chromosome arms, perhaps relating to the lack of &#946;-heterochromatic sequences in polytene chromosomes at the base of this chromosome arm <abbrgrp><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr></abbrgrp>. TE abundance in the pericentromeric region of the X chromosome is likely to be underestimated because of an unsized and unsequenced physical gap in cytological division 20 <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B12">12</abbr></abbrgrp>, which is embedded in a region of extremely high TE density. Because of these effects and the inclusion of some low TE content regions on 3L and 3R that arise from our use of the high-reduced recombination rate boundary (see above), estimates of TE abundance in pericentromeric regions should be treated as approximate. The rank order of abundance for the major classes of TEs is the same in the pericentromeric regions as in non-pericentromeric regions (% TE sequence: LTR > non-LTR > TIR > <it>INE-1</it>; number of TEs/Mb: <it>INE-1 </it>> LTR > non-LTR > TIR). 
The proportion of full-length TEs was four-fold lower in pericentromeric regions (5.1%) than in non-pericentromeric regions, and the proportion involved in nests was three-fold higher (19.5%) (see Table <tblr tid="T1">1</tblr>). Virtually all TE families (118/121, 97.5%) present in the genome sequence have copies in pericentromeric regions.</p>
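               <p>The fold differences quoted above follow directly from the summary values in the text and Table 1, as the following minimal Python sketch shows. The dictionary layout and the helper name fold_change are illustrative assumptions; only the numerical inputs are taken from the text.</p>

```python
# Summary values quoted in the text and Table 1.
non_pericentromeric = {
    "pct_te": 3.32,         # percent of sequence annotated as TE
    "te_per_mb": 16.9,      # TEs per Mb
    "pct_full_length": 20.5,
    "pct_nested": 6.41,
}
pericentromeric = {
    "pct_te": 20.85,
    "te_per_mb": 232.72,
    "pct_full_length": 5.06,
    "pct_nested": 19.51,
}


def fold_change(region, baseline, key):
    """Ratio of a summary value in a region to the same value in the baseline."""
    return region[key] / baseline[key]


# Enrichment in pericentromeric relative to non-pericentromeric regions.
dna_fold = fold_change(pericentromeric, non_pericentromeric, "pct_te")
density_fold = fold_change(pericentromeric, non_pericentromeric, "te_per_mb")
nested_fold = fold_change(pericentromeric, non_pericentromeric, "pct_nested")

# Full-length elements are depleted, so the ratio is taken the other way round.
full_length_fold = fold_change(non_pericentromeric, pericentromeric, "pct_full_length")

print(round(dna_fold, 1), round(density_fold, 1))
print(round(full_length_fold, 1), round(nested_fold, 1))
```

               <p>The four ratios correspond to the approximately 6-fold, 14-fold, four-fold and three-fold differences described in the text.</p>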
            </sec>
            <sec>
               <st>
                  <p>Chromosome 4</p>
               </st>
               <p>Like pericentromeric regions, the fourth chromosome has a much higher TE abundance than is typical of the genome as a whole: although the fourth chromosome is only 1% of the genome sequence, approximately 10% of TEs annotated are found on chromosome 4. Overall, there is approximately 7-fold enrichment in amount of DNA and a 25-fold increase in TE density on the fourth chromosome relative to regions of normal TE abundance. Important differences in TE abundance between pericentromeric regions and the fourth chromosome were also observed <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B7">7</abbr></abbrgrp> (Table <tblr tid="T1">1</tblr>). Relative to pericentromeric regions, the fourth chromosome has a higher number of TEs per unit of physical distance (422 TEs/Mb), but a similar proportion of genome sequence annotated as TE (22.6%). As noted previously <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B7">7</abbr></abbrgrp>, the rank order of abundance of the major TE classes on chromosome 4 differs from the rest of the genome, with TIR elements as the most abundant class of TE (% TE sequence: TIR ~ <it>INE-1 </it>> LTR > non-LTR; number of TEs/Mb: <it>INE-1 </it>> TIR > non-LTR > LTR). To test the robustness of this pattern, we removed the most numerous family from each of the major TE classes on the fourth chromosome: LTR, <it>297 </it>(<it>n </it>= 3); non-LTR, <it>Cr1a </it>(<it>n </it>= 17); TIR, <it>1360 </it>(<it>n </it>= 62). In the absence of these three highly abundant families, the rank order percent TE sequence (<it>INE-1 </it>> LTR > non-LTR > TIR) and number of TEs/Mb (<it>INE-1 </it>> TIR ~ non-LTR > LTR) change for the fourth chromosome. 
This result indicates that patterns of abundance by class on the fourth chromosome are heavily influenced by a few highly abundant families, suggesting that <it>Cr1a </it>in addition to <it>INE-1 </it>and <it>1360 </it>may play an important role in defining the unusual features of this chromosome <abbrgrp><abbr bid="B18">18</abbr><abbr bid="B29">29</abbr></abbrgrp>. Fewer TEs on the fourth chromosome are full-length (2.77%) relative to pericentromeric regions, and a lower proportion of TEs are involved in nests (12.6%). Less than half of all TE families (55/121, 45.5%) present in the genome sequence have copies on the fourth chromosome.</p>
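               <p>The robustness check on TE numbers described above can be reproduced from the chromosome 4 counts in Table 1 and the family sizes given in the text. The following Python sketch is illustrative only: the variable and function names are hypothetical, and because all classes share the same chromosome length, ranking by raw counts is assumed to be equivalent to ranking by TEs/Mb.</p>

```python
# Number of TEs per class on chromosome 4 (Table 1).
counts = {"LTR": 30, "non-LTR": 60, "TIR": 102, "INE-1": 344}

# Most numerous family in each major class on chromosome 4 (from the text).
top_family = {"LTR": ("297", 3), "non-LTR": ("Cr1a", 17), "TIR": ("1360", 62)}


def rank_order(class_counts):
    """Class names sorted from most to least numerous."""
    return sorted(class_counts, key=class_counts.get, reverse=True)


# Counts after removing the most numerous family from each major class.
# INE-1 is a single family and is left unchanged.
reduced = {
    cls: n - top_family[cls][1] if cls in top_family else n
    for cls, n in counts.items()
}

print(rank_order(counts))
print(reduced)
print(rank_order(reduced))
```

               <p>After removal, non-LTR and TIR elements differ by only three copies (43 versus 40), which is why they are treated as approximately equal in the rank order given in the text.</p>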
               <p>Clear differences were also observed in the distribution of TFRs in these three genomic compartments. Consistent with TE densities, non-pericentromeric regions have on average the largest uninterrupted regions of unique sequence (mean 60,320 bp; median 29,280 bp; <it>n </it>= 1,663), relative to pericentromeric regions (mean 4,147 bp; median 726 bp; <it>n </it>= 2,541) and the fourth chromosome (mean 2,067 bp; median 1,150 bp; <it>n </it>= 480). Nevertheless, separate analyses of TFR distributions within each compartment revealed non-random distribution of TEs based on mean TFR lengths in non-pericentromeric regions (adjusted Kolmogorov-Smirnov test, D = 0.1627, p &lt; 0.001), pericentromeric regions (adjusted Kolmogorov-Smirnov test, D = 0.3501, p &lt; 0.001) and chromosome 4 (adjusted Kolmogorov-Smirnov test, D = 0.1541, p &lt; 0.001). We note that the finding of a non-random distribution of TEs in non-pericentromeric regions of the genome sequence differs from previous conclusions based on cytological estimates <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. Our results indicate that the non-random distribution of TEs across the entire genome is not explained solely by overall differences in TE abundance between genomic compartments and suggest that the mechanisms that determine the location of TE insertions, such as gene density and ectopic recombination <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B15">15</abbr><abbr bid="B31">31</abbr></abbrgrp>, may be decoupled from overall TE abundance.</p>
            </sec>
         </sec>
         <sec>
            <st>
               <p>Localized regions of extremely high TE density</p>
            </st>
            <p>With this improved calibration of the background TE abundance that is typical of the major chromosome arms, we sought to identify specific regions of the genome with an extremely high local TE density (we abbreviate such high-density regions as HDRs). We omitted <it>INE-1 </it>from this analysis to prevent this very abundant family from dominating the overall genomic trends. Additionally, since it has been postulated that <it>INE-1 </it>underwent a burst of transposition prior to speciation and has subsequently become immobilized <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B32">32</abbr></abbrgrp>, <it>INE-1 </it>elements are predicted to be fixed (barring subsequent deletion). As such, their distribution in the sequenced strain should represent a more stable baseline of ancestral TE content to compare with other more recently active TE families. We identified 24 HDRs containing 10 or more (non-<it>INE-1</it>) TEs in a 50 Kb window, a cut-off of roughly 20-fold higher density of TEs than the majority of the genome (Figure <figr fid="F1">1</figr>, Table <tblr tid="T2">2</tblr>). Two HDRs have been previously reported: HDR8 at cytological division 38 <abbrgrp><abbr bid="B33">33</abbr></abbrgrp> and HDR3 at cytological division 20A, which is likely to be fixed in <it>D. melanogaster </it><abbrgrp><abbr bid="B34">34</abbr></abbrgrp>.</p>
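            <p>The window-based criterion described above can be sketched in Python as follows. This is a minimal illustration and not the procedure used to define the HDRs in Table 2: the function name high_density_windows is hypothetical, windows are assumed to be fixed and non-overlapping, each TE is assigned to a window by its 1-based start coordinate, and <it>INE-1 </it>elements are assumed to have been removed from the input beforehand.</p>

```python
from collections import Counter

WINDOW = 50_000  # 50 Kb window, as in the text
MIN_TES = 10     # 10 or more TEs per window, as in the text


def high_density_windows(te_starts, window=WINDOW, min_tes=MIN_TES):
    """Return (window_start, window_end, count) for each window that
    contains at least min_tes TE start positions, in genomic order."""
    per_window = Counter((pos - 1) // window for pos in te_starts)
    return [
        (idx * window + 1, (idx + 1) * window, n)
        for idx, n in sorted(per_window.items())
        if n >= min_tes
    ]


# Cut-off expressed as a fold increase over the non-pericentromeric
# background density without INE-1 (16.9 - 6.36 TEs per Mb, from the text).
cutoff_per_mb = MIN_TES / WINDOW * 1_000_000
cutoff_fold = cutoff_per_mb / (16.9 - 6.36)
```

            <p>With these inputs, the cut-off corresponds to about 200 TEs/Mb, or roughly 19 times the background density, in line with the roughly 20-fold figure given in the text. Adjacent windows that pass the threshold would need to be merged in a further step to obtain contiguous regions.</p>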
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Regions with extreme TE density in the <it>D. melanogaster </it>Release 4 genome sequence</p>
               </caption>
               <tblbdy cols="10">
                  <r>
                     <c ca="left">
                        <p>HDR</p>
                     </c>
                     <c ca="left">
                        <p>Chromosome</p>
                     </c>
                     <c ca="center">
                        <p>Start</p>
                     </c>
                     <c ca="center">
                        <p>End</p>
                     </c>
                     <c ca="center">
                        <p>No. of families</p>
                     </c>
                     <c ca="center">
                        <p>No. of TEs</p>
                     </c>
                     <c ca="center">
                        <p>No. nested</p>
                     </c>
                     <c ca="center">
                        <p>Duplicated TEs</p>
                     </c>
                     <c ca="center">
                        <p>Collinear</p>
                     </c>
                     <c ca="center">
                        <p>Genes</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="10">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>X</p>
                     </c>
                     <c ca="center">
                        <p>19,744,508</p>
                     </c>
                     <c ca="center">
                        <p>19,790,060</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>22</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>2 (8)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>X</p>
                     </c>
                     <c ca="center">
                        <p>20,958,143</p>
                     </c>
                     <c ca="center">
                        <p>20,988,686</p>
                     </c>
                     <c ca="center">
                        <p>13</p>
                     </c>
                     <c ca="center">
                        <p>18</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>3*</p>
                     </c>
                     <c ca="left">
                        <p>X</p>
                     </c>
                     <c ca="center">
                        <p>21,332,555</p>
                     </c>
                     <c ca="center">
                        <p>21,366,773</p>
                     </c>
                     <c ca="center">
                        <p>13</p>
                     </c>
                     <c ca="center">
                        <p>14</p>
                     </c>
                     <c ca="center">
                        <p>13</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>4<sup>&#8224;</sup></p>
                     </c>
                     <c ca="left">
                        <p>X</p>
                     </c>
                     <c ca="center">
                        <p>21,434,542</p>
                     </c>
                     <c ca="center">
                        <p>21,663,556</p>
                     </c>
                     <c ca="center">
                        <p>42</p>
                     </c>
                     <c ca="center">
                        <p>104</p>
                     </c>
                     <c ca="center">
                        <p>39</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>5</p>
                     </c>
                     <c ca="left">
                        <p>X</p>
                     </c>
                     <c ca="center">
                        <p>21,726,082</p>
                     </c>
                     <c ca="center">
                        <p>21,780,371</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>X</p>
                     </c>
                     <c ca="center">
                        <p>21,883,728</p>
                     </c>
                     <c ca="center">
                        <p>21,974,732</p>
                     </c>
                     <c ca="center">
                        <p>16</p>
                     </c>
                     <c ca="center">
                        <p>21</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>7</p>
                     </c>
                     <c ca="left">
                        <p>X</p>
                     </c>
                     <c ca="center">
                        <p>22,085,438</p>
                     </c>
                     <c ca="center">
                        <p>22,224,390</p>
                     </c>
                     <c ca="center">
                        <p>19</p>
                     </c>
                     <c ca="center">
                        <p>38</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>Base</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>8</p>
                     </c>
                     <c ca="left">
                        <p>2L</p>
                     </c>
                     <c ca="center">
                        <p>20,100,865</p>
                     </c>
                     <c ca="center">
                        <p>20,210,447</p>
                     </c>
                     <c ca="center">
                        <p>27</p>
                     </c>
                     <c ca="center">
                        <p>61</p>
                     </c>
                     <c ca="center">
                        <p>18</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>9<sup>&#8225;&#167;</sup></p>
                     </c>
                     <c ca="left">
                        <p>2L</p>
                     </c>
                     <c ca="center">
                        <p>21,312,749</p>
                     </c>
                     <c ca="center">
                        <p>21,403,782</p>
                     </c>
                     <c ca="center">
                        <p>20</p>
                     </c>
                     <c ca="center">
                        <p>29</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>7 (3)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>10<sup>&#8225;&#167;</sup></p>
                     </c>
                     <c ca="left">
                        <p>2L</p>
                     </c>
                     <c ca="center">
                        <p>21,527,053</p>
                     </c>
                     <c ca="center">
                        <p>21,725,165</p>
                     </c>
                     <c ca="center">
                        <p>36</p>
                     </c>
                     <c ca="center">
                        <p>55</p>
                     </c>
                     <c ca="center">
                        <p>17</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>10 (1)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>11</p>
                     </c>
                     <c ca="left">
                        <p>2L</p>
                     </c>
                     <c ca="center">
                        <p>22,064,386</p>
                     </c>
                     <c ca="center">
                        <p>22,407,834</p>
                     </c>
                     <c ca="center">
                        <p>61</p>
                     </c>
                     <c ca="center">
                        <p>157</p>
                     </c>
                     <c ca="center">
                        <p>52</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>Base</p>
                     </c>
                     <c ca="center">
                        <p>19 (1)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>12*</p>
                     </c>
                     <c ca="left">
                        <p>2R</p>
                     </c>
                     <c ca="center">
                        <p>387</p>
                     </c>
                     <c ca="center">
                        <p>1,185,590</p>
                     </c>
                     <c ca="center">
                        <p>103</p>
                     </c>
                     <c ca="center">
                        <p>571</p>
                     </c>
                     <c ca="center">
                        <p>156</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>Base</p>
                     </c>
                     <c ca="center">
                        <p>45</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>13<sup>&#167;</sup></p>
                     </c>
                     <c ca="left">
                        <p>2R</p>
                     </c>
                     <c ca="center">
                        <p>1,744,145</p>
                     </c>
                     <c ca="center">
                        <p>2,011,104</p>
                     </c>
                     <c ca="center">
                        <p>42</p>
                     </c>
                     <c ca="center">
                        <p>92</p>
                     </c>
                     <c ca="center">
                        <p>46</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>14</p>
                     </c>
                     <c ca="left">
                        <p>3L</p>
                     </c>
                     <c ca="center">
                        <p>22,910,473</p>
                     </c>
                     <c ca="center">
                        <p>23,771,865</p>
                     </c>
                     <c ca="center">
                        <p>91</p>
                     </c>
                     <c ca="center">
                        <p>411</p>
                     </c>
                     <c ca="center">
                        <p>128</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>Base</p>
                     </c>
                     <c ca="center">
                        <p>17</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>15</p>
                     </c>
                     <c ca="left">
                        <p>3R</p>
                     </c>
                     <c ca="center">
                        <p>310,015</p>
                     </c>
                     <c ca="center">
                        <p>436,430</p>
                     </c>
                     <c ca="center">
                        <p>22</p>
                     </c>
                     <c ca="center">
                        <p>37</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>16*</p>
                     </c>
                     <c ca="left">
                        <p>3R</p>
                     </c>
                     <c ca="center">
                        <p>8,294,200</p>
                     </c>
                     <c ca="center">
                        <p>8,327,684</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>38</p>
                     </c>
                     <c ca="center">
                        <p>33</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>17</p>
                     </c>
                     <c ca="left">
                        <p>3R</p>
                     </c>
                     <c ca="center">
                        <p>27,888,358</p>
                     </c>
                     <c ca="center">
                        <p>27,905,053</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>20</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>Tip</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>18</p>
                     </c>
                     <c ca="left">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>46,860</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>14</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>Base</p>
                     </c>
                     <c ca="center">
                        <p>2 (2)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>19</p>
                     </c>
                     <c ca="left">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>201,177</p>
                     </c>
                     <c ca="center">
                        <p>269,428</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>16</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>20</p>
                     </c>
                     <c ca="left">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>303,028</p>
                     </c>
                     <c ca="center">
                        <p>348,412</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>21</p>
                     </c>
                     <c ca="left">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>433,967</p>
                     </c>
                     <c ca="center">
                        <p>496,527</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>20</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>22</p>
                     </c>
                     <c ca="left">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>926,385</p>
                     </c>
                     <c ca="center">
                        <p>997,041</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>18</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>23</p>
                     </c>
                     <c ca="left">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>1,163,173</p>
                     </c>
                     <c ca="center">
                        <p>1,281,586</p>
                     </c>
                     <c ca="center">
                        <p>18</p>
                     </c>
                     <c ca="center">
                        <p>44</p>
                     </c>
                     <c ca="center">
                        <p>13</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>Tip</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>HDRs were defined as having 10 or more non-<it>INE-1 </it>TEs in a 50 Kb window. Numbers of distinct families, numbers of TEs, numbers of TEs involved in nests, and the presence of duplicated TEs all exclude <it>INE-1</it>. A plus indicates that unique sequences flanking a HDR are in the collinear orientation in the <it>D. yakuba </it>genome. Orthologous regions could not be obtained for both flanking regions for HDRs at the tip or base of chromosome arms. Numbers of genes include coding and non-coding genes, with numbers of pseudogenes indicated in parentheses. *Likely to be fixed in <it>D. melanogaster</it>. <sup>&#8224;</sup>Physical gap present in HDR. <sup>&#8225;</sup>HDRs 9 and 10 flank the <it>Histone </it>gene cluster and likely represent a single HDR. <sup>&#167;</sup>'Weak points' in polytene chromosomes.</p>
               </tblfn>
            </tbl>
            <p>As expected, nearly all HDRs are located in pericentromeric regions or on chromosome 4, consistent with the general observation that heterochromatic and/or low-recombination rate regions of the genome sequence have high TE densities (see above) <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B7">7</abbr><abbr bid="B15">15</abbr></abbrgrp>. Three HDRs (1, 16, 17) on the major chromosome arms are located in regions not defined as pericentromeric; however, HDR1 on the X-chromosome is found very close to the boundary demarcating these regions and could probably be classified as pericentromeric. HDRs total 4.27 Mb of sequence and, therefore, comprise only 3.6% of the genome, but contain one-third (1,822/5,390; 33.8%) of annotated TEs. Interestingly, one of the most extreme regions of localized TE density in the <it>D. melanogaster </it>genome sequence (HDR4) contains the insertion site for a <it>P</it>-element-induced allele (<it>flam</it><sup><it>py</it>+(<it>P</it>)</sup>) of the as-yet-uncharacterized gene <it>flamenco </it><abbrgrp><abbr bid="B35">35</abbr></abbrgrp>, one of the few genetic loci shown to regulate the activity of transposable elements in <it>Drosophila </it><abbrgrp><abbr bid="B36">36</abbr></abbrgrp>. HDR4 (which includes the physical gap in cytological division 20) occupies over 230 Kb of DNA and contains at least 104 TEs and 6 genes, including <it>DIP1</it>, which has been excluded as the gene responsible for the <it>flamenco </it>mutation <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>. We note that the <it>COM </it>locus, which also lies in 20A2-3 and is known to regulate the <it>ZAM </it>and <it>Idefix </it>families of LTR elements, is genetically separable from <it>flamenco </it><abbrgrp><abbr bid="B37">37</abbr></abbrgrp> and is, therefore, unlikely to correspond to the same region.</p>
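            <p>The summary figures quoted above can be rechecked directly from Table 2. The following sketch transcribes the start, end and TE count of each HDR from the table and assumes that coordinates are inclusive; the names <it>TABLE2 </it>and <it>summarize </it>are illustrative only.</p>

```python
# (HDR label, start, end, number of non-INE-1 TEs), transcribed from Table 2.
TABLE2 = [
    ("1", 19_744_508, 19_790_060, 22),
    ("2", 20_958_143, 20_988_686, 18),
    ("3", 21_332_555, 21_366_773, 14),
    ("4", 21_434_542, 21_663_556, 104),
    ("5", 21_726_082, 21_780_371, 12),
    ("6", 21_883_728, 21_974_732, 21),
    ("7", 22_085_438, 22_224_390, 38),
    ("8", 20_100_865, 20_210_447, 61),
    ("9", 21_312_749, 21_403_782, 29),
    ("10", 21_527_053, 21_725_165, 55),
    ("11", 22_064_386, 22_407_834, 157),
    ("12", 387, 1_185_590, 571),
    ("13", 1_744_145, 2_011_104, 92),
    ("14", 22_910_473, 23_771_865, 411),
    ("15", 310_015, 436_430, 37),
    ("16", 8_294_200, 8_327_684, 38),
    ("17", 27_888_358, 27_905_053, 20),
    ("18", 1, 46_860, 14),
    ("19", 201_177, 269_428, 16),
    ("20", 303_028, 348_412, 10),
    ("21", 433_967, 496_527, 20),
    ("22", 926_385, 997_041, 18),
    ("23", 1_163_173, 1_281_586, 44),
]
TOTAL_ANNOTATED_TES = 5_390  # denominator quoted in the text


def summarize(rows, total_tes=TOTAL_ANNOTATED_TES):
    """Return (total span in bp, TEs inside HDRs, percent of annotated TEs)."""
    # Assumption: start and end are inclusive coordinates.
    span_bp = sum(end - start + 1 for _, start, end, _ in rows)
    tes_in_hdrs = sum(n for _, _, _, n in rows)
    return span_bp, tes_in_hdrs, 100.0 * tes_in_hdrs / total_tes


span_bp, tes_in_hdrs, pct_tes = summarize(TABLE2)
print(f"{span_bp / 1e6:.2f} Mb, {tes_in_hdrs} TEs, {pct_tes:.1f}% of annotated TEs")
```

            <p>With the table values as transcribed, the totals agree with the 4.27 Mb, 1,822 TEs and 33.8% reported in the text.</p>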
            <p>Two exceptional HDRs are found on chromosome arm 3R. HDR16 contains a set of duplicated, nested TEs in the intergenic region between <it>Hsp70Ba </it>and <it>Hsp70Bb </it>in division 87C (Figure <figr fid="F2">2a</figr>). This region contains the <it>&#945;&#946; </it>repeat <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>, which our results indicate corresponds to a duplicated nest of <it>Dm88 </it>and <it>invader1 </it>sequences (see also <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B39">39</abbr></abbrgrp>). The fact that the <it>&#945;&#946; </it>repeat is composed of TE sequences, as predicted by Hackett and Lis <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>, explains the observation that components of the <it>&#945;&#946; </it>repeat are dispersed in multiple heterochromatic locations <abbrgrp><abbr bid="B40">40</abbr></abbrgrp> and share homology with 'clustered, scrambled' arrangements of middle repetitive DNA located elsewhere in the genome <abbrgrp><abbr bid="B41">41</abbr></abbrgrp>. This region also contains the non-coding RNA gene known as the <it>&#945;&#947;</it>-<it>element</it>, which is transcribed in response to heat shock <abbrgrp><abbr bid="B38">38</abbr><abbr bid="B42">42</abbr></abbrgrp> and is a chimeric transcript composed of <it>Dm88 </it>and <it>invader1 </it>sequences emanating from a fragment of the <it>Hsp70 </it>promoter <abbrgrp><abbr bid="B43">43</abbr></abbrgrp>. It is likely that the unusually high abundance of TE insertions in this region has arisen in part because of the unusual chromatin architecture of heat-shock promoters <abbrgrp><abbr bid="B44">44</abbr><abbr bid="B45">45</abbr></abbrgrp>. The peculiarity of this region is underscored by the fact that the <it>&#945;&#946; </it>repeat has evolved since the divergence of <it>D. melanogaster </it>from its sister species <it>D. simulans </it><abbrgrp><abbr bid="B42">42</abbr><abbr bid="B46">46</abbr></abbrgrp>, yet appears to be fixed in <it>D. melanogaster </it><abbrgrp><abbr bid="B47">47</abbr></abbrgrp>.</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Example regions of extreme TE density</p>
               </caption>
               <text>
                  <p>Example regions of extreme TE density. <b>(a) </b>Structure of HDR16 in the <it>Hsp70B </it>region showing tandem arrays of an <it>invader1&#8594;Dm88 </it>nest interrupted by <it>1360 </it>and <it>micropia </it>insertions and flanked by <it>S-element </it>insertions. Duplicate <it>Hsp70 </it>genes are shown at the bottom of the panel along with the non-coding RNA <it>&#945;&#947;-element</it>. <b>(b) </b>Structure of HDR1 showing tandem arrays of clustered <it>jockey</it>+<it>Rt1c </it>and <it>Stalker4</it>+<it>invader3 </it>elements interrupted by <it>invader2</it>, <it>F-element </it>and <it>mdg3 </it>insertions. This region also generates eight <it>CG32821</it>-like gene duplicates. Note that colors for TE families differ in (a,b).</p>
               </text>
               <graphic file="gb-2006-7-11-r112-2"/>
            </fig>
            <p>The second exceptional HDR (17) on chromosome arm 3R corresponds to a tandemly duplicated array of <it>invader4 </it>elements embedded within the sub-telomeric mini-satellites called telomere-associated sequences ('TAS'). We found that TAS repeats from chromosome arm 2R <abbrgrp><abbr bid="B48">48</abbr></abbrgrp> and the original TAS repeat derived from the <it>Dp1187 </it>X-minichromosome <abbrgrp><abbr bid="B49">49</abbr></abbrgrp> also contain <it>invader4 </it>sequences (results not shown), although no homology to <it>invader4 </it>(or any other TE) is observed in the TAS repeats derived from chromosome arms 2L or 3L <abbrgrp><abbr bid="B48">48</abbr><abbr bid="B50">50</abbr></abbrgrp>, suggesting that TE sequences are not functionally constitutive components of TAS repeats. The presence of mobile TE sequences in TAS repeats may explain the non-telomeric hybridization signals of TAS probes in the chromocenter and basal euchromatic locations <abbrgrp><abbr bid="B49">49</abbr></abbrgrp>. No HDRs are observed at the ends of other chromosome arms, despite the fact that, in <it>Drosophila</it>, the retrotransposons <it>Het-A</it>, <it>TART </it>and <it>TAHRE </it>function as telomeric repeats to ensure proper integrity of the chromosome ends <abbrgrp><abbr bid="B51">51</abbr><abbr bid="B52">52</abbr><abbr bid="B53">53</abbr></abbrgrp>. In the Release 4 sequence, only the X chromosome and fourth chromosome <abbrgrp><abbr bid="B9">9</abbr></abbrgrp> terminate with small clusters of telomeric TE sequences.</p>
         </sec>
         <sec>
            <st>
               <p>Mechanisms that generate localized regions of high TE density</p>
            </st>
            <p>Surprisingly, the improved resolution provided by our new annotation showed that TE density is not uniformly high in pericentromeric regions, nor is TE density simply an increasing function of proximity to centromeric regions (Figure <figr fid="F1">1</figr>, inset panels). This is especially true for chromosome arms X, 2L and 2R, where pericentromeric HDRs are interspersed with regions of normal TE density, creating a ragged, punctate increase in TE abundance in the direction of the centromere. Chromosome 4 also exhibits discrete regions of different TE density (Table <tblr tid="T2">2</tblr>), despite a higher overall level of TE abundance. Some HDRs (for example, 1, 8, 13, 16) clearly occur in regions of low <it>INE-1 </it>density, which suggests a recent origin for the high TE density in these regions, assuming that <it>INE-1 </it>represents the ancestral TE distribution at the time of its major burst of activity prior to the split of <it>D. melanogaster </it>from its sister species <it>D. simulans </it><abbrgrp><abbr bid="B16">16</abbr><abbr bid="B32">32</abbr></abbrgrp>. Other HDRs (9, 10, 15 and those on the fourth chromosome) co-occur with regions of high <it>INE-1 </it>density, suggesting these regions of the genome have permitted a high density of TEs, at least as far back as the ancestor of the <it>D. melanogaster </it>species subgroup <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B32">32</abbr></abbrgrp>. This is also likely to hold true for HDRs 11, 12 and 14 at the bases of chromosome arms 2L, 2R and 3L, where non-<it>INE-1 </it>TEs occupy virtually all of the sequence, creating an apparent negative association with <it>INE-1 </it>density.</p>
            <p>What evolutionary mechanisms cause such a localized pattern of extreme TE density? Clearly, transposition is the ultimate source of all TE insertions in the genome, and accordingly HDRs typically contain a mix of different TE families and nested elements (Table <tblr tid="T2">2</tblr>), both hallmarks of recurrent transposition. However, it is possible that other mechanisms of genome evolution - such as inversion or duplication - might have contributed to the origin of HDRs. To investigate whether this punctate pattern of HDRs arose from chromosomal inversions that bring TE-rich, heterochromatic DNA into euchromatic regions, we extracted orthologous regions from the <it>D. yakuba </it>genome sequence and assayed whether the unique sequences flanking HDRs are collinear in the two species. We found that unique sequences flanking HDRs were collinear for 15 of the 16 HDRs (93.8%) that are internal to the ends of the chromosome arms, for which both flanking sequences can unambiguously be identified (Table <tblr tid="T2">2</tblr>, Figure <figr fid="F3">3a,b</figr>). Intriguingly, HDR 13 does occur in the same region as an inversion breakpoint between <it>D. melanogaster </it>and <it>D. yakuba</it>, but outgroup analyses place this inversion event on the <it>D. yakuba </it>lineage, not the <it>D. melanogaster </it>lineage (JM Ranz, D Maurin, YS Chan, LW Hillier, J Roote, M Ashburner and CM Bergman, personal communication). Thus, we found no evidence indicating that inversions carrying TE-rich DNA from heterochromatic regions generate HDRs, but remarkably we did find evidence that a region of the <it>D. melanogaster </it>genome that permits a high TE density can tolerate inversion breakpoints in other <it>Drosophila </it>lineages. It is important to note, however, that the majority of HDRs do not correspond to inversion breakpoint regions and <it>vice versa</it>.</p>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Comparative sequence analysis of two regions of extreme TE density</p>
               </caption>
               <text>
                  <p>Comparative sequence analysis of two regions of extreme TE density. <b>(a,b) </b>Pairwise comparison of <it>D. melanogaster </it>HDRs with the orthologous segments from the <it>D. yakuba </it>genome. <b>(c,d) </b>Self-comparison of <it>D. melanogaster </it>HDRs. Note the collinearity of the flanking sequences between species (a,b) and the presence of complex duplicated sequences (c,d).</p>
               </text>
               <graphic file="gb-2006-7-11-r112-3"/>
            </fig>
            <p>We did, however, find a relatively high incidence of duplicated sequences in HDRs, suggesting that tandem or segmental duplication plays an important role in the genesis of TE-rich regions of the genome: 13 of 23 HDRs show evidence of duplication (Table <tblr tid="T2">2</tblr>, Figures <figr fid="F2">2</figr> and <figr fid="F3">3c,d</figr>). Duplications in HDRs can contain multiple TEs from different families, often nested, sometimes with different copies of the duplicated region containing additional TE insertions (Figure <figr fid="F2">2</figr>). Duplications in HDRs also amplified cellular genes as well as TE sequences: for example, eight partial and complete duplicates of the gene <it>CG32821 </it>are present in HDR1 (Figure <figr fid="F2">2b</figr>). HDRs may also include retrotransposed gene duplicates, such as the <it>Mgst1</it>-like <it>CG12628 </it><abbrgrp><abbr bid="B54">54</abbr></abbrgrp>, which is found in a nest of TEs in HDR11. The series of events leading to tandem duplication of TEs in HDRs is often highly complex, with repeat structures present at different scales (Figure <figr fid="F3">3c,d</figr>). Duplication of TE sequences could also be observed in other regions of the genome with lower TE density, such as duplication of <it>Rt1c </it>elements interspersed between the <it>SDIC </it>gene duplicates <abbrgrp><abbr bid="B55">55</abbr><abbr bid="B56">56</abbr></abbrgrp>. A more thorough analysis of the interplay between TEs and segmental duplications will be the subject of a separate study (A-S Fiston, D Anxolabehere and H Quesneville, personal communication).</p>
         </sec>
         <sec>
            <st>
               <p>A graph-based approach to analyze patterns of TE nesting</p>
            </st>
            <p>Regions of extremely high TE density typically contain a high proportion of TEs inserted into other TEs, and our new annotation allowed us to examine patterns of TE nesting in greater detail than has previously been possible. Few methods exist to analyze TE nesting, partly because of limitations in accurately joining fragments of a TE insertion that become separated in the genome by a subsequent nested TE insertion, and partly because analysis of TE nesting is complicated by the redundancies inherent in complex nesting relationships. For example, if one TE (A) is nested within a second (B) that is in turn nested within a third (C), simply analyzing overlapping ranges of TEs in the genome will erroneously yield three nesting events (A&#8594;B, A&#8594;C, and B&#8594;C), when only two occurred historically (A&#8594;B and B&#8594;C). We found that complex nesting relationships could best be analyzed by identifying 'primary' nesting relationships (A&#8594;B and B&#8594;C in the example above) and assembling these simple binary events into more complex nesting relationships, applying concepts from network analysis to describe and quantify patterns of TE nesting. In this formulation of the problem, TE nesting relationships are represented as a graph having TEs as nodes and transposition events as directed edges. The directed nature of this graph captures both the spatial relationships of nested TEs in the genome and the temporal relationships implied by TE nesting, since the outer TE in a nest must have existed in the genome prior to the insertion of the inner TE <abbrgrp><abbr bid="B57">57</abbr></abbrgrp>. This 'nesting graph' is amenable to standard computation and can be recast in several forms, since each annotated TE node can be analyzed at the individual, family or class level. 
(We chose not to analyze the degree distribution of nesting graphs for 'small-world' properties because of biases resulting from duplicated nests, and because the subgraphs in the sequenced portion of the genome may not reflect properties of the entire nesting graph <abbrgrp><abbr bid="B58">58</abbr></abbrgrp>.)</p>
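            <p>The difference between naive range overlap and 'primary' nesting can be illustrated with a minimal Python sketch. It is not the procedure used for the annotation: it assumes that each TE has already been joined into a single span with distinct coordinates, and the names <it>Span</it>, <it>all_nesting_pairs </it>and <it>primary_nesting_edges </it>are hypothetical. Edges are written (inner, outer), matching the A&#8594;B notation above.</p>
            <p><![CDATA[
```python
from typing import Dict, List, Tuple

# (start, end) of a joined TE insertion; a hypothetical, simplified representation
Span = Tuple[int, int]


def contains(outer: Span, inner: Span) -> bool:
    """True if `inner` lies entirely within `outer` and the two spans differ."""
    return outer != inner and outer[0] <= inner[0] and inner[1] <= outer[1]


def all_nesting_pairs(tes: Dict[str, Span]) -> List[Tuple[str, str]]:
    """Every (inner, outer) pair found by range overlap, redundant pairs included."""
    return sorted(
        (inner, outer)
        for inner, inner_span in tes.items()
        for outer, outer_span in tes.items()
        if contains(outer_span, inner_span)
    )


def primary_nesting_edges(tes: Dict[str, Span]) -> List[Tuple[str, str]]:
    """Keep only the edge from each TE to the smallest TE that contains it."""
    edges = []
    for inner, inner_span in tes.items():
        containers = [
            (outer_span[1] - outer_span[0], outer)
            for outer, outer_span in tes.items()
            if contains(outer_span, inner_span)
        ]
        if containers:
            # The shortest containing span is the immediate (primary) outer TE.
            edges.append((inner, min(containers)[1]))
    return sorted(edges)


# Toy coordinates for the example in the text: A within B within C.
toy = {"A": (40, 50), "B": (30, 70), "C": (10, 100)}
print(all_nesting_pairs(toy))      # includes the redundant A-in-C pair
print(primary_nesting_edges(toy))  # only the two primary events
```
]]></p>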
            <p>At the individual TE level, nesting relationships form a sparse, acyclic graph of 785 nodes and 491 edges that provides a detailed overview of the global pattern of TE nesting in the <it>D. melanogaster </it>genome (Figure <figr fid="F4">4</figr>). These 785 TEs (14.6% of all 5,390 TEs annotated) are found in 294 distinct nests, which can be calculated from the number of sink TEs (nodes in the graph that have an out-degree of zero). These 294 nests are formed by 448 source TEs (nodes in the graph with an in-degree of zero), and 43 TEs that act as internal nodes in the graph (with both non-zero in-degree and out-degree). The vast majority of TE nests in <it>D. melanogaster </it>(263/294, 89.5%) are composed of simple 'primary' nests with a maximal path length of one, consisting mainly of one (203/263, 77.2%) or sometimes more than one (60/263, 22.8%) inner TE nesting into a single outer TE. Of the 31 nests with more complex nesting relationships, 25 have a maximal path length of two ('secondary' nests), and only 6 have a maximal path length of three ('tertiary' nests) (Figure <figr fid="F4">4</figr>). Relative to the proportion of the genome in each compartment, nests are highly enriched in pericentromeric regions (215/294, 73.1%), as well as on the fourth chromosome (27/294, 9.2%), but rare in non-pericentromeric regions (52/294, 17.7%).</p>
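            <p>As an illustration of how these summary statistics follow from the graph, the following Python sketch classifies nodes by degree and computes the maximal path length of each nest. It is a sketch only: the edge list is a toy example, not data from the annotation, and the function names are hypothetical. Edges are (inner, outer) pairs, so a sink is an outermost TE.</p>
            <p><![CDATA[
```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]  # (inner TE, outer TE)


def classify_nodes(edges: List[Edge]) -> Dict[str, Set[str]]:
    """Split nodes into sources (in-degree 0), sinks (out-degree 0) and internal nodes."""
    out_degree: Dict[str, int] = defaultdict(int)
    in_degree: Dict[str, int] = defaultdict(int)
    nodes: Set[str] = set()
    for inner, outer in edges:
        out_degree[inner] += 1
        in_degree[outer] += 1
        nodes.update((inner, outer))
    sinks = {n for n in nodes if out_degree[n] == 0}
    sources = {n for n in nodes if in_degree[n] == 0}
    return {"sources": sources, "sinks": sinks, "internal": nodes - sources - sinks}


def max_path_lengths(edges: List[Edge]) -> Dict[str, int]:
    """Longest chain of nesting events that ends at each sink (outermost TE)."""
    inner_of: Dict[str, List[str]] = defaultdict(list)
    has_outer: Set[str] = set()
    for inner, outer in edges:
        inner_of[outer].append(inner)
        has_outer.add(inner)
    memo: Dict[str, int] = {}

    def depth(node: str) -> int:
        if node not in memo:
            memo[node] = max(
                (1 + depth(child) for child in inner_of.get(node, [])), default=0
            )
        return memo[node]

    sinks = set(inner_of) - has_outer
    return {sink: depth(sink) for sink in sorted(sinks)}


# Toy graph: two primary nests (around b and h) and one secondary nest (around f).
toy_edges = [("a", "b"), ("c", "b"), ("d", "e"), ("e", "f"), ("g", "h")]
groups = classify_nodes(toy_edges)
print({name: sorted(members) for name, members in groups.items()})
print(max_path_lengths(toy_edges))
```
]]></p>
            <p>If each inner TE has a single primary outer TE, the graph is a forest and the number of edges equals the number of nodes minus the number of sinks; the counts reported above are consistent with this (785 - 294 = 491).</p>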
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Global nesting graph at the level of individual TEs</p>
               </caption>
               <text>
                  <p>Global nesting graph at the level of individual TEs. Nesting relationships among TEs are depicted as a directed, acyclic graph. Nodes (blue circles) represent individual TEs and directed edges (green arrows) represent transposition events that create primary nesting relationships, with complex nesting events represented as connected components of the graph. The majority of nests in the genome are characterized by one or more primary nesting relationships, while some larger nests are composed of secondary or tertiary nesting relationships. The largest nest (*) currently annotated in the genome is found on chromosome 2R at coordinates 1,763,561-1,829,561. The second largest nest (**) currently annotated in the genome has been described in detail previously by Maside <it>et al</it>. [34] and is found on chromosome X at coordinates 21,366,773-21,333,853.</p>
               </text>
               <graphic file="gb-2006-7-11-r112-4"/>
            </fig>
            <p>The nesting graph at the individual level provides details about the structures of all TE nests in the genome, but since individual TEs are members of distinct families, nesting relationships can also be analyzed at the family level by relabeling nodes with family identifiers and collapsing redundant edges. Recasting the same set of TEs as a nesting graph at the family level provides a novel means to study the physical proximity and historical co-existence of all TE families at a global level. Nesting relationships at the family level form a highly connected cyclic graph of 110 nodes and 334 edges (Figure <figr fid="F5">5</figr>), involving the vast majority (90.9%) of the 121 TE families represented in the Release 4 genome annotation. This result implies that nested TEs provide paths of sequences that connect virtually all families, and that a large diversity of novel chimeric sequences between different families exists in the junction regions between TEs in nests. Most TE families (80/110, 73%) are internal nodes in the graph acting as both inner and outer components of nests, with only 22 source families and 8 sink families. The majority of families (97/110, 88%) also form nested relationships with more than one other family. Fifteen families have members that transpose into another member of the same family, forming self-loops (or cycles) in the graph. Self-nests require a genomic copy to be present into which another family member can insert, and are consistent with multiple bursts of transposition for a given family or a burst that extends over multiple host generations.</p>
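            <p>The relabeling step can be sketched in Python as follows. This is an illustrative sketch under simple assumptions, not the authors' implementation: the TE identifiers, the family labels (famX, famY, famZ) and the function names are all hypothetical. It also shows how an acyclic individual-level graph can yield cycles at the family level when nests from different genomic locations are combined.</p>
            <p><![CDATA[
```python
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str]  # (inner, outer)


def collapse_to_families(edges: Iterable[Edge], family_of: Dict[str, str]) -> Set[Edge]:
    """Relabel each TE by its family; the set removes redundant family-level edges."""
    return {(family_of[inner], family_of[outer]) for inner, outer in edges}


def self_nesting_families(family_edges: Iterable[Edge]) -> List[str]:
    """Families in which one member is nested within another member (self-loops)."""
    return sorted({inner for inner, outer in family_edges if inner == outer})


# Hypothetical individual-level nests at four separate genomic locations.
toy_edges = [("te1", "te2"), ("te3", "te4"), ("te5", "te6"), ("te7", "te8")]
toy_family = {
    "te1": "famX", "te2": "famY",  # famX nested in famY
    "te3": "famX", "te4": "famY",  # same family-level event, redundant
    "te5": "famY", "te6": "famX",  # reverse direction at another location
    "te7": "famZ", "te8": "famZ",  # self-nest
}

family_edges = collapse_to_families(toy_edges, toy_family)
print(sorted(family_edges))
print(self_nesting_families(family_edges))
```
]]></p>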
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>Global nesting graph at the level of TE families</p>
               </caption>
               <text>
                  <p>Global nesting graph at the level of TE families. Nodes (blue circles) represent TE families and directed edges (green arrows) represent observed instances of primary nesting relationships. Redundant edges that arise from the different instances in the genome of the same primary nesting event are not shown. Essentially all families of TEs form a single connected component. Note that cycles within and between families at the family level are formed from nests of individuals from different genomic locations.</p>
               </text>
               <graphic file="gb-2006-7-11-r112-5"/>
            </fig>
            <p>Directed cycles other than those from self-nests were also observed in the family-level nesting graph, clearly indicating either continuous or discrete periods of overlapping transpositional activity for different TE families in the lineage leading to <it>D. melanogaster</it>. Exhaustive enumeration detected 12 distinct cycles of length two (A&#8594;B&#8594;A), and 43 distinct cycles of length three (A&#8594;B&#8594;C&#8594;A), in the family-level nesting graph, with tens of thousands of distinct cycles of length less than ten. The complexity of the family-level nesting graph is such that it is not feasible to enumerate all cycles in reasonable time; however, a set of independent cycles that do not use the same edge can be extracted efficiently. Figure <figr fid="F6">6</figr> shows the set of edge-disjoint cycles of length greater than three in the family-level nesting graph, and provides examples of the complex periods of contemporaneous TE activity that must be invoked to explain the global pattern of nesting at the family level. These procedures detect many novel examples of nesting among families, in addition to classical examples such as <it>NOF&#8594;FB </it>nesting <abbrgrp><abbr bid="B59">59</abbr><abbr bid="B60">60</abbr></abbrgrp>.</p>
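            <p>Short cycles and a set of edge-disjoint cycles can be extracted along the following lines. This Python sketch is one possible approach, not necessarily the procedure used here: it enumerates cycles of length two and three directly, and extracts edge-disjoint cycles greedily by finding one cycle with a depth-first search, removing its edges and repeating. The greedy set need not be the largest possible set. The toy family labels (P, Q, R, S) and all function names are hypothetical.</p>
            <p><![CDATA[
```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Set, Tuple

Edge = Tuple[str, str]  # (inner family, outer family)


def build_adjacency(edges: Iterable[Edge]) -> Dict[str, Set[str]]:
    adjacency: Dict[str, Set[str]] = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency.setdefault(b, set())  # keep nodes that have no outgoing edge
    return adjacency


def two_cycles(edges: Iterable[Edge]) -> List[Tuple[str, str]]:
    """Unordered family pairs that nest in both directions (A->B->A)."""
    es = set(edges)
    return sorted({tuple(sorted((a, b))) for a, b in es if a != b and (b, a) in es})


def three_cycles(edges: Iterable[Edge]) -> List[Tuple[str, str, str]]:
    """Directed cycles A->B->C->A, each reported once starting at its smallest label."""
    es = set(edges)
    adjacency = build_adjacency(es)
    found = set()
    for a, b in es:
        if a == b:
            continue
        for c in adjacency[b]:
            if c != a and c != b and (c, a) in es:
                found.add(min((a, b, c), (b, c, a), (c, a, b)))
    return sorted(found)


def find_cycle(adjacency: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return the nodes of one directed cycle, or None if the graph is acyclic."""
    state: Dict[str, str] = {}  # "active" while on the DFS path, "done" afterwards
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        state[node] = "active"
        path.append(node)
        for nxt in sorted(adjacency[node]):
            if state.get(nxt) == "active":
                return path[path.index(nxt):]
            if nxt not in state:
                found = visit(nxt)
                if found is not None:
                    return found
        state[node] = "done"
        path.pop()
        return None

    for start in sorted(adjacency):
        if start not in state:
            found = visit(start)
            if found is not None:
                return found
    return None


def edge_disjoint_cycles(edges: Iterable[Edge]) -> List[List[str]]:
    """Greedily peel off cycles that share no edge; self-loops are ignored."""
    remaining = {(a, b) for a, b in edges if a != b}
    cycles: List[List[str]] = []
    while True:
        cycle = find_cycle(build_adjacency(remaining))
        if cycle is None:
            return cycles
        cycles.append(cycle)
        for i, node in enumerate(cycle):
            remaining.discard((node, cycle[(i + 1) % len(cycle)]))


toy_edges = [("P", "Q"), ("Q", "P"), ("Q", "R"), ("R", "S"), ("S", "Q"), ("S", "S")]
print(two_cycles(toy_edges))
print(three_cycles(toy_edges))
print(edge_disjoint_cycles(toy_edges))
```
]]></p>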
            <fig id="F6">
               <title>
                  <p>Figure 6</p>
               </title>
               <caption>
                  <p>Directed cycles in the family-level TE nesting graph</p>
               </caption>
               <text>
                  <p>Directed cycles in the family-level TE nesting graph. Shown are the set of edge-disjoint directed cycles of path length greater than three. Nodes (blue circles) represent TE families and directed edges (green arrows) represent observed instances of primary nesting relationships. Note that many thousands of distinct directed cycles that share edges in common can be enumerated in the family-level nesting graph in addition to those shown here.</p>
               </text>
               <graphic file="gb-2006-7-11-r112-6"/>
            </fig>
            <p>The complexity of nesting among TEs observed at the individual and family levels simplifies at the class level (Table <tblr tid="T3">3</tblr>). The nesting graph at the class level is complete save for events involving the rare Foldback (<it>FB</it>) class of TEs, with instances of all possible types of nesting between LTR, non-LTR, TIR and <it>INE-1 </it>elements observed in the genome. The most frequent type of nesting event at the class level is LTR&#8594;LTR (151/491, 30.8%) and LTR elements form both the most frequent inner (233/491, 47.5%) and outer (207/491, 42.2%) components of nests, extending the finding based on Release 3 that LTR elements are most often involved in nests or clusters <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. The rank order of abundance for both inner and outer members of nests is LTR > non-LTR > TIR > <it>INE-1 </it>> <it>FB</it>, which follows the trend for amount of TE sequence in the genome by class rather than number of TEs, indicating that target size influences class-level nesting patterns (Table <tblr tid="T1">1</tblr>). The observed number of nests for pairwise combinations of classes departs significantly from the random expectation based on the marginal counts of inner and outer nests for each class (&#967;<sup>2 </sup>= 144.9, 16 degrees of freedom (df), p &lt; 10<sup>-16</sup>). Non-random patterns of nesting are also observed when the analysis is restricted to the three major classes of TEs (&#967;<sup>2 </sup>= 81.8, 4 df, p &lt; 10<sup>-16</sup>), suggesting that neither the low <it>FB </it>counts nor undue influence by <it>INE-1 </it>causes this pattern. The non-random pattern of nesting appears in large part to be the result of preferences for TEs to nest in other TEs of the same class, which may represent some sort of 'homing effect' mediated through protein complexes shared by the TEs belonging to the same class. 
Alternatively, the non-random pattern of nesting among TE classes may also result from complex historical factors, including the total amount of pre-existing TE sequence present in the genome as targets for insertion, and/or a non-random order of transpositional events among families and classes of TEs.</p>
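            <p>The test of independence described above can be sketched in Python using the counts from Table 3. This is a minimal illustration of a Pearson chi-square statistic computed from the marginal totals, not the software used for the analysis; the function name is hypothetical, and no p-value is computed. Rows are outer classes and columns are inner classes.</p>
            <p><![CDATA[
```python
from typing import Sequence, Tuple


def chi_square_independence(table: Sequence[Sequence[int]]) -> Tuple[float, int]:
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of inner and outer class.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof


# Counts from Table 3; row and column order: LTR, non-LTR, TIR, INE-1, FB.
table3 = [
    [151, 39, 12, 4, 1],
    [46, 46, 10, 17, 0],
    [28, 29, 32, 18, 6],
    [8, 19, 8, 9, 3],
    [0, 4, 0, 1, 0],
]
# Sub-table restricted to the three major classes (LTR, non-LTR, TIR).
major = [row[:3] for row in table3[:3]]

chi2_all, dof_all = chi_square_independence(table3)
chi2_major, dof_major = chi_square_independence(major)
print(f"all five classes: chi2 = {chi2_all:.1f}, df = {dof_all}")
print(f"three major classes: chi2 = {chi2_major:.1f}, df = {dof_major}")
```
]]></p>
            <p>Several expected counts involving <it>FB </it>are small, which is one reason the restricted three-class comparison is informative.</p>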
            <tbl id="T3">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>Patterns of nesting among different classes of TE in the <it>D. melanogaster </it>Release 4 genome sequence</p>
               </caption>
               <tblbdy cols="7">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>LTR</p>
                     </c>
                     <c ca="center">
                        <p>Non-LTR</p>
                     </c>
                     <c ca="center">
                        <p>TIR</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>INE-1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>FB</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>Outer total</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                     <c ca="center">
                        <p>151</p>
                     </c>
                     <c ca="center">
                        <p>39</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>207</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Non-LTR</p>
                     </c>
                     <c ca="center">
                        <p>46</p>
                     </c>
                     <c ca="center">
                        <p>46</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>17</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>119</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>TIR</p>
                     </c>
                     <c ca="center">
                        <p>28</p>
                     </c>
                     <c ca="center">
                        <p>29</p>
                     </c>
                     <c ca="center">
                        <p>32</p>
                     </c>
                     <c ca="center">
                        <p>18</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>113</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>INE-1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>19</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>47</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>FB</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Inner total</p>
                     </c>
                     <c ca="center">
                        <p>233</p>
                     </c>
                     <c ca="center">
                        <p>137</p>
                     </c>
                     <c ca="center">
                        <p>62</p>
                     </c>
                     <c ca="center">
                        <p>49</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
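                  <p>Observed numbers of nesting events in each of the 25 possible categories formed by 5 classes of TEs (LTR, non-LTR, TIR, <it>INE-1 </it>and <it>FB</it>), from 491 total nesting events. Rows give the class of the outer TE and columns give the class of the inner TE.</p>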
               </tblfn>
            </tbl>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <sec>
            <st>
               <p>Organization of TEs in &#946;-heterochromatic regions</p>
            </st>
            <p>The nature of the transition zone between euchromatin and heterochromatin in <it>D. melanogaster </it>has been the subject of much controversy, in part because heterochromatic regions (as defined in mitotic chromosomes) can be further subdivided into &#945;-heterochromatin and &#946;-heterochromatin <abbrgrp><abbr bid="B61">61</abbr></abbrgrp>. &#946;-heterochromatic regions are cytologically visible in polytene chromosomes, although their banding pattern is 'diffuse' or 'mesh-like,' suggesting under-replication relative to the finely banded euchromatic regions (reviewed in <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>). Under-replicated regions are observed elsewhere in polytene chromosomes and co-localize with regions referred to as 'weak points' or 'intercalary heterochromatin' that form ectopic contacts and are subject to chromosome breakage <abbrgrp><abbr bid="B62">62</abbr><abbr bid="B63">63</abbr></abbrgrp>. The amount and degree of polytenization in &#946;-heterochromatic regions is subject to both environmental and genetic factors <abbrgrp><abbr bid="B64">64</abbr></abbrgrp>, as most conclusively shown by the appearance of several large banded regions in the chromocenter of salivary gland chromosomes of the <it>Su</it>(<it>UR</it>) mutant <abbrgrp><abbr bid="B65">65</abbr></abbrgrp>. Charlesworth <it>et al</it>. estimate that 10% of the <it>D. melanogaster </it>genome is composed of &#946;-heterochromatin <abbrgrp><abbr bid="B24">24</abbr></abbrgrp> and large amounts of &#946;-heterochromatic DNA are found in pericentromeric regions of most (but not all) chromosome arms <abbrgrp><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr></abbrgrp>, a fraction of which is captured in the Release 4 genome sequence (Figure <figr fid="F1">1</figr>).</p>
            <p>Analysis of the first draft of the <it>D. melanogaster </it>genome sequence offered the first glimpse of the contiguous molecular organization of &#946;-heterochromatin, and suggested that "there is no clear boundary between heterochromatin and euchromatin" but rather that the transition is characterized by "a gradual increase in the density of transposable elements and other repeats" <abbrgrp><abbr bid="B66">66</abbr></abbrgrp>. The view that the &#946;-heterochromatic regions exhibit a gradual increase in TE density has been subsequently reiterated <abbrgrp><abbr bid="B25">25</abbr><abbr bid="B67">67</abbr></abbrgrp>, although our results call this view into question for three of the five major chromosome arms. Far from a gradual transition, our high-resolution TE annotation provides evidence for discretely localized regions of extremely high TE density at the base of chromosome arms X, 2L and 2R overlain on a background of increased TE abundance, such that the increase in TE content is not monotonic in the direction of the centromere. This result represents the inverse of, and provides an explanation for, previous observations that the distribution of genes on these chromosome arms alternates between low and high density in the centromere proximal direction <abbrgrp><abbr bid="B66">66</abbr><abbr bid="B67">67</abbr></abbrgrp>. We note that the alternating pattern of high and low TE (versus unique) sequences reported here in &#946;-heterochromatic regions differs from the 'islands' of complex (TE) sequences surrounded by 'seas' of satellite DNA observed deeper in &#945;-heterochromatic regions <abbrgrp><abbr bid="B68">68</abbr></abbrgrp>.</p>
            <p>How does the ragged, punctate pattern of TE density affect the interpretation of the transition zone between euchromatin and heterochromatin? If TE density is directly responsible for heterochromatin formation, then such discrete regions of extreme TE density may argue against a gradual transition between euchromatin and heterochromatin, and would support the model of Lifschytz <abbrgrp><abbr bid="B69">69</abbr></abbrgrp>, who suggested several distinct transitions between euchromatin and heterochromatin in cytological divisions 19-20 of the X chromosome. However, as noted by Yamamoto <it>et al</it>. <abbrgrp><abbr bid="B70">70</abbr></abbrgrp>, the interpretation of multiple, discrete transitions between euchromatin and heterochromatin by Lifschytz <abbrgrp><abbr bid="B69">69</abbr></abbrgrp> was based indirectly on the distribution of X-ray induced deletions, rather than direct cytological evidence for heterochromatic properties. We integrated our annotation of HDRs in the Release 4 genome sequence with the genetic map of Lifschytz <abbrgrp><abbr bid="B69">69</abbr></abbrgrp> and found a striking correspondence between our HDRs and 'hotspots' for X-ray induced deletions in his analysis. Based on the few complementation groups that can be mapped to the genome (<it>A112 </it>= <it>Hlc</it>, <it>M67 </it>= <it>fog</it>, and <it>X-3 </it>= <it>stn</it>), we hypothesize that our HDRs 2, 3, 4, 5, 6, and 7 correspond to hotspot intervals 18, 12, 11, 9, 7, and 6, respectively, in <abbrgrp><abbr bid="B69">69</abbr></abbrgrp> (Figure <figr fid="F7">7</figr>). The major hotspot for X-ray induced breakage in this region (interval 11 in 20A) most likely corresponds to HDR4, which we find to be the region of highest TE density in the genome sequence. 
Together, these results suggest that the Lifschytz <abbrgrp><abbr bid="B69">69</abbr></abbrgrp> data may simply reflect preferential breakpoint use in TE-rich regions devoid of genic function, rather than multiple distinct transitions between euchromatin and heterochromatin. These conclusions support those of Ashburner <it>et al</it>. <abbrgrp><abbr bid="B71">71</abbr></abbrgrp>, who showed that the distribution of rearrangement breakpoints in the <it>Adh </it>region correlates with the amount of DNA in an interval rather than with any property of the sequence itself.</p>
            <fig id="F7">
               <title>
                  <p>Figure 7</p>
               </title>
               <caption>
                  <p>HDRs are hotspots for X-ray induced deletion</p>
               </caption>
               <text>
                  <p>HDRs are hotspots for X-ray induced deletion. Alignment of the genetic map adapted from Figure 1 of Lifschytz [69] and the Release 4 genome annotation in the interval from <it>Hlc </it>(= <it>A112</it>) to <it>fog </it>(= <it>M67</it>) shows a one-to-one correspondence between HDRs 3, 4, 5 and 6 with X-ray hotspot intervals 12, 11, 9 and 7, respectively. Additional HDRs and X-ray hotspots discussed in the text are omitted for clarity.</p>
               </text>
               <graphic file="gb-2006-7-11-r112-7"/>
            </fig>
            <p>Further evidence that discrete regions of extreme TE density outside of &#946;-heterochromatic regions may have unusual cytological properties can be found on chromosome 2. Discrete HDRs can be observed in the vicinity of the <it>Histone </it>cluster in 39E (HDRs 9+10) and just distal to the major tRNA cluster at 42A (HDR13). Both of these regions are known to be 'weak points' in polytene chromosomes, which form breaks and ectopic contacts with other weak points in the genome that are alleviated by the <it>Su</it>(<it>UR</it>) mutation, suggesting that these regions are under-replicated in polytene chromosomes <abbrgrp><abbr bid="B65">65</abbr></abbrgrp>. These observations, together with the generally poor banding patterns in high TE density pericentromeric regions and on the fourth chromosome, suggest that high TE density may directly interfere with the process of polytenization, either through stalling replication forks <abbrgrp><abbr bid="B72">72</abbr></abbrgrp> or through DNA elimination <abbrgrp><abbr bid="B73">73</abbr></abbrgrp>. Thus, high TE densities may not be directly responsible for heterochromatin formation <it>per se</it>, but may simply inhibit the ability to detect <it>bona fide </it>euchromatic regions that are TE dense, at least in salivary gland polytene chromosomes. The formation of large blocks of TE-rich, banded material deep in heterochromatic regions in under-replication suppressing strains like <it>Su</it>(<it>UR</it>) supports this view <abbrgrp><abbr bid="B65">65</abbr><abbr bid="B74">74</abbr></abbrgrp>. Moreover, if regions of high TE density affect polytenization, ectopic contact among 'weak points' may occur <it>via </it>homology between sequences of the same TE family. 
Additionally, the inherent mobility of TEs provides a mechanism to explain differences in the presence or absence of &#946;-heterochromatin on homologous chromosome arms among <it>Drosophila </it>species <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>.</p>
         </sec>
         <sec>
            <st>
               <p>Origin of 'clustered scrambled repeats'</p>
            </st>
            <p>Although the predominant organization of middle repetitive DNA such as TE sequences in <it>D. melanogaster </it>is characterized by individu