<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>gb-2006-7-10-r97</ui>
   <ji>GBJ</ji>
   <fm>
      <dochead>Software</dochead>
      <bibl>
         <title>
            <p>Unraveling transcriptional control and <it>cis</it>-regulatory codes using the software suite GeneACT</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Cheung</snm>
               <mnm>Hiu</mnm>
               <fnm>Tom</fnm>
               <insr iid="I1"/>
               <email>Tom.Cheung@Colorado.edu</email>
            </au>
            <au id="A2">
               <snm>Kwan</snm>
               <mnm>Lam</mnm>
               <fnm>Yin</fnm>
               <insr iid="I2"/>
               <email>Kwan.y@dharmacon.com</email>
            </au>
            <au id="A3">
               <snm>Hamady</snm>
               <fnm>Micah</fnm>
               <insr iid="I2"/>
               <email>Hamady@Colorado.edu</email>
            </au>
            <au id="A4" ca="yes">
               <snm>Liu</snm>
               <fnm>Xuedong</fnm>
               <insr iid="I1"/>
               <email>xuedong.liu@colorado.edu</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Chemistry and Biochemistry, University of Colorado, 215 UCB, Boulder, Colorado 80309, USA</p>
            </ins>
            <ins id="I2">
               <p>Department of Computer Science, University of Colorado, 430 UCB, Boulder, Colorado 80309, USA</p>
            </ins>
         </insg>
         <source>Genome Biology</source>
         <issn>1465-6906</issn>
         <pubdate>2006</pubdate>
         <volume>7</volume>
         <issue>10</issue>
         <fpage>R97</fpage>
         <url>http://genomebiology.com/2006/7/10/R97</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">17064417</pubid>
               <pubid idtype="doi">10.1186/gb-2006-7-10-r97</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>16</day>
               <month>6</month>
               <year>2006</year>
            </date>
         </rec>
         <revrec>
            <date>
               <day>18</day>
               <month>9</month>
               <year>2006</year>
            </date>
         </revrec>
         <acc>
            <date>
               <day>25</day>
               <month>10</month>
               <year>2006</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>25</day>
               <month>10</month>
               <year>2006</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2006</year>
         <collab>Cheung et al.; licensee BioMed Central Ltd.</collab>
         <note>This is an open access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <shorttitle>
         <p><it>Cis</it>-regulatory code browser</p>
      </shorttitle>
      <shortabs>
         <p>GeneACT, a new software suite for detecting evolutionarily conserved transcription factor binding sites or microRNA target sites in differentially expressed genes from DNA microarray data, is described.</p>
      </shortabs>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <p>Deciphering gene regulatory networks requires the systematic identification of functional <it>cis</it>-acting regulatory elements. We present a suite of web-based bioinformatics tools, called GeneACT <url>http://promoter.colorado.edu</url>, that can rapidly detect evolutionarily conserved transcription factor binding sites or microRNA target sites that are either unique or over-represented in differentially expressed genes from DNA microarray data. GeneACT provides graphic visualization and extraction of common regulatory sequence elements in the promoters and 3'-untranslated regions that are conserved across multiple mammalian species.</p>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="BMC" subtype="man_spc_id" id="30010002">Bioinformatics</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010010">Genome studies</classification>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Rationale</p>
         </st>
         <p>Cell type-specific and tissue-specific gene expression patterns are primarily governed by the <it>cis</it>-regulatory sequence elements embedded in the noncoding regions of the genome. These <it>cis</it>-regulatory elements are often recognized in a sequence-specific manner by regulatory proteins or nucleic acids, which regulate the expression of the corresponding gene. In particular, activation and repression of gene transcription typically involve the binding of transcription factors to their cognate binding sites. The levels of mRNA transcript can also be modulated by microRNAs (miRNAs), which tend to bind specific sequences in the 3'-untranslated region (UTR) of the transcript. Identification and characterization of <it>cis</it>-regulatory sequence elements that control gene expression are crucial to our understanding of the molecular basis of cell proliferation and differentiation.</p>
         <p>Until recently, identification of <it>cis</it>-regulatory sequences was conducted experimentally on an individual gene basis, using time-consuming procedures such as promoter cloning, chromatin immunoprecipitation (ChIP) assays, and reporter gene assays using truncated and/or mutated DNA sequences. Given that hundreds of transcription factors regulate the expression of thousands of genes in the human genome, more high-throughput procedures are desired. The sequencing of several genomes, DNA microarray assays, and the rise of bioinformatics represent major steps forward in this regard.</p>
         <p>Sequencing of the human, mouse, and rat genomes has made it possible to perform genome-wide analyses of regulatory sequence motifs across these species. Such a comparative genomics analysis is powerful because functional transcription factor binding sites are likely to be under stronger evolutionary constraints than random DNA sequences. Therefore, reliable and effective identification of regulatory elements could be achieved using interspecies sequence alignments of orthologous genes <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr></abbrgrp>. Indeed, cross-species conservation has been employed to predict conserved transcription factor binding sites and to annotate promoters in mammals <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr></abbrgrp>. In these cases, the comparative genomics information improved the accuracy of predicting biologically relevant transcription factor binding sites.</p>
         <p>DNA microarray technology is used to profile relative mRNA transcript levels between samples exposed to different experimental conditions. DNA microarrays represent a high-throughput, genome-wide experimental platform that enables analyses of differential gene expression. Differences in transcript levels could be caused by several mechanisms, most notably the differential activities of transcription factors and miRNA. The interpretation of DNA microarray results requires deciphering which transcription factors and/or miRNA are likely to mediate the observed changes in transcript levels. We expect that co-expressed genes may share similar <it>cis</it>-acting regulatory elements, which suggests that such elements are likely to be over-represented in co-regulated genes more than would be expected by random chance. Flanking sequences for each gene are known from sequencing efforts, and many of the sequences to which individual transcription factors tend to bind have been determined experimentally and catalogued in databases such as the Transcription Factor Database (TFD) <abbrgrp><abbr bid="B10">10</abbr></abbrgrp> and TRANSFAC <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>; therefore, the systematic, high-throughput prediction of specific <it>cis</it>-regulatory mechanisms important in a given biologic context is now possible. Indeed, a number of computational programs have been developed to reveal transcription factor binding sites that are statistically over-represented in co-regulated genes <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr></abbrgrp>.</p>
         <p>Several deficiencies exist in currently available software for predicting <it>cis</it>-regulatory elements. Most importantly, no currently available program incorporates search tools for both transcription factor and miRNA binding sites. Recent studies suggest that differential miRNA expression could be responsible for differential mRNA expression observed in DNA microarray data <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr></abbrgrp>. Therefore, it is imperative to investigate both transcription factor binding sites and miRNA binding sites in order to gain a more comprehensive understanding of the molecular basis of differential gene expression patterns. Second, an integrated web-based <it>cis</it>-acting element browser for rapid identification of over-represented potential transcription factor binding sites and putative miRNA target sites has yet to be developed. The lack of an easy-to-navigate graphical web interface has hindered verification of computational predictions by experimental biologists, who may find the existing interfaces difficult to use.</p>
         <p>In this report we describe a suite of web-based, open source bioinformatics software tools (GeneACT) that graphically display transcription factor binding sites and miRNA target sites in the regulatory regions of the human, mouse, and rat genomes. In addition, we present a unique method to quickly identify transcription factor binding sites or miRNA target elements that are over-represented in differentially expressed genes based on DNA microarray data. Thus, GeneACT enables the identification of putative <it>cis</it>-acting elements that are evolutionarily conserved across species for a specified set of genes, which can be used to unravel transcriptional regulatory networks that are likely to be involved in differential gene expression.</p>
      </sec>
      <sec>
         <st>
            <p>Development of GeneACT</p>
         </st>
         <p>GeneACT, an overview of which is given in Figure <figr fid="F1">1</figr>, is a suite of web-based bioinformatics tools comprising four search interfaces: differential binding site search (DBSS), potential binding site search (PBSS), genomic sequence retrieval, and TFD search. All tools are designed to characterize the regulatory regions of a specified set of genes using comparative genomics. Genomic sequence data from human (May 2004 release), mouse (May 2004 release), and rat (June 2003 release) were downloaded from the National Center for Biotechnology Information (NCBI) ftp site <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>. TFD <abbrgrp><abbr bid="B19">19</abbr></abbrgrp> and ortholog information (NCBI HomoloGene build 37.2) <abbrgrp><abbr bid="B20">20</abbr></abbrgrp> were also downloaded from the NCBI ftp sites and employed as described below.</p>
         <fig id="F1">
            <title>
               <p>Figure 1</p>
            </title>
            <caption>
               <p>Overview of the GeneACT architecture and method</p>
            </caption>
            <text>
               <p>Overview of the GeneACT architecture and method.</p>
            </text>
            <graphic file="gb-2006-7-10-r97-1"/>
         </fig>
         <p>Detailed documentation and tutorials for each of the tools in GeneACT can be found on the GeneACT website <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. GeneACT is written mainly in Java and uses Tomcat as the web server. The web front end communicates with the back end via JavaServer Pages (JSP). Genomic and pre-processed data are stored in a PostgreSQL database.</p>
      </sec>
      <sec>
         <st>
            <p>Differential binding site search</p>
         </st>
         <p>Pre-processing of sequence data underlying the GeneACT tools was carried out as follows. DBSS, the interface of which is shown in Figure <figr fid="F2">2</figr>, offers a choice of three searchable regions. The first region is denoted 'upstream of start codon', and to facilitate this search we stored the occupancies of all the binding sites in our regulatory sequence database (approximately 7000 known binding sites) in each gene found in a HomoloGene group that spans all three species up to 10,000 base pairs (bp) upstream from the start codon. We define a conserved binding site as one that is found in each of the three species within the search region, and only those binding sites that are conserved are stored for DBSS. Although promoters are frequently found near the 5'-UTRs, it is often the case that regulatory regions can be thousands of base pairs away from the transcriptional start site (for example, distal enhancers) <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr></abbrgrp>. As a result, we extended our search region up to 10,000 bp away from the start codon in order to cover the region of the 5'-UTR and regions that might contain these distal enhancers.</p>
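         <p>The conservation rule described above (a binding site is kept only if it is found in the search region of each of the three species) can be illustrated with a minimal Python sketch. It is not the GeneACT implementation; the function name, the data layout, and the example site names are assumptions made for illustration only.</p>

```python
from typing import Dict, Set


def conserved_sites(sites_by_species: Dict[str, Set[str]]) -> Set[str]:
    """Return the binding-site names found in every species' search region."""
    site_sets = list(sites_by_species.values())
    if not site_sets:
        return set()
    # A site is conserved only if it is present in all of the species.
    return set.intersection(*site_sets)


# Hypothetical occupancies for the orthologs of one HomoloGene group.
example = {
    "human": {"E2F_CS", "AP-1", "site_X"},
    "mouse": {"E2F_CS", "site_X"},
    "rat": {"E2F_CS", "site_X", "site_Y"},
}
print(sorted(conserved_sites(example)))  # ['E2F_CS', 'site_X']
```

         <p>In this sketch a site is identified by name only; whether positional agreement between species is also required is not specified here and is left out.</p>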
         <fig id="F2">
            <title>
               <p>Figure 2</p>
            </title>
            <caption>
               <p>Web interface of the differential binding site search</p>
            </caption>
            <text>
               <p>Web interface of the differential binding site search. Gene IDs from control gene set (unchanged in DNA microarray data) and regulated gene set (upregulated or downregulated from microarray data) are pasted into respective windows. The threshold of binding site ratio is defined by the user. The user can specify a range of interest with three choices of regions (upstream from the transcription start site, upstream from the start codon, or downstream from the stop codon). TF, transcription factor.</p>
            </text>
            <graphic file="gb-2006-7-10-r97-2"/>
         </fig>
         <p>The second option for searchable region is 'downstream of stop codon'. Similar pre-processing was done for the downstream region from -2000 to +100 (2000 bp downstream of the transcript end) with respect to the stop codon. All incidences of transcription factor binding sites spanning all three species were also stored for this region. Finally, we offer a search option dedicated to detecting the occurrences of miRNA binding sites. In this case, the 3'-UTRs, defined as the region between the stop codon and the polyA signal, were extracted from the genome assemblies, and we employed miRanda <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>, which is an algorithm for finding miRNA target sites in 3'-UTRs <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>. This algorithm is based on a modified version of the Smith-Waterman algorithm <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. Instead of building an alignment based on matching nucleotides, it scores the complementarity of nucleotides; it also allows G:U 'wobble' pairs, which are important for RNA:RNA duplex formation <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>. In addition, free energy is calculated using the Vienna library to estimate the energetics of the RNA:RNA complexes. These features make the algorithm a preferred choice in searching for miRNA recognition sites because miRNAs form imperfect base pairs with the target mRNA <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>. To provide more stringent search results, we deposited into our database only the mature miRNA sequences from the miRBase database <abbrgrp><abbr bid="B29">29</abbr></abbrgrp> that are absolutely conserved in all three species.</p>
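         <p>The idea of scoring complementarity rather than identity, with G:U wobble pairs permitted, can be illustrated with the following Python sketch. It is deliberately much simpler than miRanda: it scores a single ungapped duplex, performs no local alignment, and computes no free energy. The function name and the score values are assumptions chosen for illustration.</p>

```python
# Allowed pairs, written as (miRNA base, target base).
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def complementarity_score(mirna: str, target_site: str,
                          wc_score: int = 2, wobble_score: int = 1,
                          mismatch_score: int = -1) -> int:
    """Score an ungapped, antiparallel miRNA:target duplex.

    Both sequences are written 5'->3' in RNA letters; the target is
    reversed so that each miRNA base faces the base it would pair with.
    The score values are illustrative assumptions, not miRanda's.
    """
    if len(mirna) != len(target_site):
        raise ValueError("sequences must have equal length in this sketch")
    score = 0
    for m, t in zip(mirna.upper(), reversed(target_site.upper())):
        if (m, t) in WATSON_CRICK:
            score += wc_score
        elif (m, t) in WOBBLE:
            score += wobble_score
        else:
            score += mismatch_score
    return score


print(complementarity_score("UGAGG", "CCUCA"))  # perfect duplex: 10
print(complementarity_score("UGAGG", "CUUCA"))  # one G:U wobble: 9
```

         <p>Giving a wobble pair a lower score than a Watson-Crick pair, but a higher score than a mismatch, lets imperfect duplexes still rank as candidate sites.</p>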
         <p>3'-UTRs from all three mammalian genomes were extracted and, using the approach developed by Enright and coworkers <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>, each genome was pre-processed individually for potential miRNA target sites. To qualify as a potential target site, a miRNA target site must be found in each of the three genomes. Furthermore, it is speculated that multiple occurrences of the same miRNA target sequence in the 3'-UTR of a given mRNA increase the probability of it being regulated by that miRNA. Therefore, we introduced customizable searches by filtering the target sites into three categories based on the number of conserved matches found. In the first case, at least one conserved match must be present in the 3'-UTR of the target mRNA. For the second and the third cases, at least two or three conserved matches of the same miRNA, respectively, must be present in the same target mRNA 3'-UTR. Users can access the database via the GeneACT web interface <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>.</p>
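         <p>The three stringency categories (at least one, two, or three conserved matches of the same miRNA in a 3'-UTR) amount to a threshold filter. The Python sketch below assumes that the number of conserved matches per gene and miRNA has already been computed; the data layout and the gene and miRNA names are hypothetical.</p>

```python
from typing import Dict, List, Tuple

# (gene, miRNA) -> number of conserved target-site matches in the 3'-UTR.
MatchCounts = Dict[Tuple[str, str], int]


def filter_targets(counts: MatchCounts,
                   min_matches: int = 1) -> List[Tuple[str, str]]:
    """Keep (gene, miRNA) pairs with at least min_matches conserved matches."""
    if min_matches not in (1, 2, 3):
        raise ValueError("this sketch supports the categories 1, 2, and 3")
    return sorted(pair for pair, n in counts.items() if n >= min_matches)


# Hypothetical pre-processed counts.
counts = {
    ("GENE_A", "miR-x"): 1,
    ("GENE_B", "miR-x"): 3,
    ("GENE_B", "miR-y"): 2,
}
print(filter_targets(counts, min_matches=2))
# [('GENE_B', 'miR-x'), ('GENE_B', 'miR-y')]
```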
      </sec>
      <sec>
         <st>
            <p>Potential binding site search</p>
         </st>
         <p>In order to display the presence of consensus transcription factor binding site sequences on a promoter that spans multiple species, we developed a novel Scalable Vector Graphics (SVG)-based graphical interface to display this information in a promoter-oriented way. Using the PBSS, regulatory regions of genes in multiple species along with the consensus TFD binding site information can be quickly visualized. The interface of PBSS is shown in Figure <figr fid="F3">3a</figr>. PBSS takes as input a set of NCBI Entrez gene IDs or gene names and the selected region to visualize. PBSS automatically retrieves the specified region for each gene in the input set based on the corresponding genome annotation. There are three specific regions that can be searched: the regulatory region of a gene upstream of the transcription start site, upstream from the start codon, and downstream from the stop codon. Alternatively, custom sequences can be specified. Along with the use of TFD, users can also enter arbitrary binding site sequences using IUPAC (International Union of Pure and Applied Chemistry) degenerate codes. If the 'across genomes' option is selected, then only the binding sites that span the selected genomes are reported. In addition to the SVG graphical display, users can also choose to generate tab-delimited text, which can be readily imported into other programs such as Microsoft Excel. A sample SVG graphical output for the gene <it>CDC2 </it>(cell division cycle 2) is shown in Figure <figr fid="F3">3b</figr>.</p>
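         <p>Matching a user-supplied IUPAC degenerate sequence against a DNA sequence can be sketched in Python by translating each code into a regular-expression character class. This is an illustration only, not the PBSS code: the function names are assumptions, only the forward strand is scanned, and the example sequence is made up. The pattern TTTSGCGCS is the E2F4/DP consensus listed in Table 1.</p>

```python
import re
from typing import List, Tuple

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(pattern: str) -> str:
    """Translate an IUPAC pattern into a regular expression string."""
    return "".join(IUPAC[base] for base in pattern.upper())


def find_sites(pattern: str, sequence: str) -> List[Tuple[int, str]]:
    """Return (0-based start, matched text) for every match.

    A lookahead is used so that overlapping matches are also reported.
    """
    regex = re.compile("(?=(" + iupac_to_regex(pattern) + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence.upper())]


print(find_sites("TTTSGCGCS", "AATTTCGCGCGAA"))  # [(2, 'TTTCGCGCG')]
```

         <p>A pattern containing a letter outside the IUPAC table raises a KeyError in this sketch; a fuller version would also scan the reverse complement strand.</p>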
         <fig id="F3">
            <title>
               <p>Figure 3</p>
            </title>
            <caption>
               <p>Web interface of the potential binding site search</p>
            </caption>
            <text>
               <p>Web interface of the potential binding site search. <b>(a) </b>Web interface of potential binding site search. Gene IDs can be input in the form of either gene names (synonyms supported) or NCBI Entrez gene ID. There are currently three species to choose from (human, mouse, and rat), and the user can choose to display only the binding site sequences that are conserved across genomes or to display all binding sites regardless of conservation across species. The user can specify a range of interest with three choices of regions (upstream from the transcription start site, upstream from the start codon, or downstream from the stop codon). In addition to binding sites in the Transcription Factor Database (TFD), the user can input binding site sequences using standard IUB/IUPAC nucleic acid codes. For output, the user can choose either visualization in the promoter browser or a text file. <b>(b) </b>Visualization of the <it>CDC2 </it>upstream region using GeneACT promoter browser. <it>CDC2 </it>upstream region (-500 to +100 base pairs) is shown, where +1 is the transcription start site. Only binding site sequences that are conserved across all genomes are shown. Chromosomal locations of the binding site sequences and the full sequences are available in text file format via the 'download result' and 'download FASTA file' links. <b>(c) </b>Visualization of E2 promoter binding factor (E2F)-binding sites in the <it>CDC2 </it>upstream region. The region is the same as that shown in Figure 3b, with only the E2F sites highlighted. Other binding sites were suppressed by the toggle.</p>
            </text>
            <graphic file="gb-2006-7-10-r97-3"/>
         </fig>
         <p>The SVG graphical display of the regulatory regions of genes, presented in a regulatory motif-oriented fashion for each species, offers several benefits (Figure <figr fid="F3">3b</figr>). One major advantage is that it provides dynamic controls such that the user can switch the display for each binding site on and off and change the range of the location. Furthermore, moving the cursor over an individual binding site displays additional information, such as the binding site sequence pattern and the location of the binding site. Interestingly, the <it>CDC2 </it>motifs are conserved around the -150 bp region, and two of the conserved binding sites are sites for E2 promoter binding factors (E2Fs). In Figure <figr fid="F3">3c</figr>, the same region is displayed with only the E2F-binding sites highlighted. Indeed, this regulatory region has been cloned by Zhu and coworkers <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>, and the region was shown to be responsive to E2Fs. Using the GeneACT promoter browser, the arrangement of the binding sites across genomes can be easily visualized. Based on this analysis, the user can identify a potential regulatory region faster and in a more informed fashion than with the traditional method of arbitrary sequential deletion analysis. The ease of use and clear presentation should be an attractive feature for experimental biologists.</p>
      </sec>
      <sec>
         <st>
            <p>Genomic sequence retrieval and Transcription Factor Database search</p>
         </st>
         <p>GeneACT also provides other tools to make promoter analysis easier. The genomic sequence retrieval tool allows the user to retrieve genomic sequences in a FASTA format using relative position with respect to the transcription start site, start codon, or stop codon. When the input has more than one gene name or gene ID, sequences are returned in a concatenated FASTA file. Information about the sequence such as the chromosomal location, gene name, synonyms, and gene ID are printed in the header of the FASTA file. For the genes that are annotated to be on the reverse complement strand, this tool returns the sequence on the reverse complement strand.</p>
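         <p>The strand handling described above can be sketched in Python as follows: for a gene annotated on the reverse strand, the retrieved sequence is reverse complemented before it is written as a FASTA record. The function names, the strand encoding, and the header layout are assumptions made for illustration and are not taken from GeneACT.</p>

```python
# Translation table for complementing DNA, preserving letter case.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def fasta_record(header: str, seq: str, strand: str = "+",
                 width: int = 60) -> str:
    """Format one FASTA record, reverse complementing '-' strand genes."""
    if strand == "-":
        seq = reverse_complement(seq)
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + header + "\n" + "\n".join(lines) + "\n"


# Hypothetical header fields: gene name, gene ID, chromosomal location.
print(fasta_record("GENE_A|id=0000|chrN:100-103", "ATGC", strand="-"), end="")
```

         <p>Records for several genes can then be concatenated into a single FASTA file by joining the returned strings.</p>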
         <p>TFD search can be used to query the TFD dataset by binding site sequence or transcription factor name (Figure <figr fid="F4">4</figr>). In addition to transcription factor binding sites, miRNA-binding sequences are important for regulation of gene expression. To keep the database contents up to date, the user can submit putative novel binding site sequences via this tool. All submissions will be curated and deposited into our database. These new binding sites will then be included in the next round of pre-processing for DBSS such that they will be available for searches within all tools in GeneACT. In this way, GeneACT will remain relevant to the current literature. For the most in-depth information on how to use GeneACT, help documentation is available on the website <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>.</p>
         <fig id="F4">
            <title>
               <p>Figure 4</p>
            </title>
            <caption>
               <p>Search transcription factor binding site database</p>
            </caption>
            <text>
               <p>Search transcription factor binding site database. <b>(a) </b>Custom transcription factor database based on Transcription Factor Database (TFD). Database can be queried by sequence and name. New entries into the database can be added by the system administrator. <b>(b) </b>Display of the search result of a transcription factor binding site. The literature information of the binding site is shown.</p>
            </text>
            <graphic file="gb-2006-7-10-r97-4"/>
         </fig>
      </sec>
      <sec>
         <st>
            <p>Mining gene expression data using differential binding site search</p>
         </st>
         <p>The use of microarrays to elucidate genome-wide gene expression patterns is now standard practice. These microarray experiments generate large sets of differentially expressed genes, but the actual mechanism that controls the differential gene expression cannot readily be deduced using this technique alone. To ascertain the <it>cis</it>-regulatory elements that could mediate the differential gene expression patterns, we developed the DBSS tool to explore the distributions of regulatory sequence elements between the differentially expressed genes compared with those of the control genes. A corollary to the importance of <it>cis</it>-acting regulatory elements to generating differential gene expression patterns is that some of the co-expressed genes may share a common subset of these elements, and the observed frequency of these elements in the upregulated or downregulated gene set should be greater than in the unchanged gene set.</p>
         <p>DBSS tracks the frequencies of <it>cis</it>-acting elements conserved in human, mouse, and rat in a given set of genes and reveals the over-represented <it>cis</it>-acting elements in comparison with a control gene set. DBSS takes as input two sets of genes: a control set and a regulated set. To identify over-represented transcription factor binding sites in the regulated set, the regulatory regions of each gene in both sets are searched for transcription factor binding sites that are conserved across all three genomes. At present, we have pre-processed each gene that has ortholog information in NCBI HomoloGene for the -10,000 bp to +100 bp region relative to the start codon and the -2000 bp to +100 bp region relative to the stop codon for the purposes of looking for enriched transcription factor binding sites. Restricting the binding sites solely to those that span multiple genomes is intended to reduce background noise. However, certain short degenerate binding site sequences may still appear as false positives. Thus, we use the control set of genes to reduce the false-positive rate further, because these types of binding sites are expected to appear with high frequency in the control set as well.</p>
         <p>Specifically, the DBSS calculates the frequency at which each binding site occurs in genes from both the regulated set and the control set. The fold change in frequency of each binding site between the regulated and control gene sets is calculated in order to find binding sites that are enriched in the regulated set. For binding sites that do not contribute to the regulation of a particular gene, we expect there to be no relative change in frequency. These binding sites are then filtered from the results by specifying a lower bound for the 'binding site ratio' option on the search interface. For example, to keep only the binding sites that have at least three times the frequency in the regulated set versus the control set, one would specify a lower bound of three. By looking at the binding sites that have a large ratio (fold change) between the regulated set genes and control set genes, the binding site sequences that are potentially important to the regulation of a given system under specific conditions or treatments can quickly be determined. In this way, the regulatory mechanism of how the transcription factors regulate a given system can be inferred from the enriched binding site sequences.</p>
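         <p>The ratio calculation and lower-bound filter can be sketched in Python as shown below. Two assumptions are made here that are not stated explicitly in the text: that each frequency is a count divided by the number of genes in its set, and that a site absent from the control set receives an infinite ratio. The function names are hypothetical. With the set sizes used later in this article (670 regulated and 612 control genes), this normalization reproduces the ratios listed in Table 1 for counts of 9 versus 1 and 8 versus 1.</p>

```python
from typing import Dict, Tuple


def binding_site_ratio(regulated_hits: int, regulated_total: int,
                       control_hits: int, control_total: int) -> float:
    """Fold change in per-gene frequency, regulated set over control set."""
    if regulated_total == 0 or control_total == 0:
        raise ValueError("both gene sets must be non-empty")
    if control_hits == 0:
        return float("inf")  # assumption: absent from the control set
    return (regulated_hits / regulated_total) / (control_hits / control_total)


def enriched_sites(hits: Dict[str, Tuple[int, int]], regulated_total: int,
                   control_total: int,
                   lower_bound: float = 3.0) -> Dict[str, float]:
    """Keep the sites whose ratio is at least the user-specified lower bound."""
    ratios = {
        name: binding_site_ratio(reg, regulated_total, ctrl, control_total)
        for name, (reg, ctrl) in hits.items()
    }
    return {name: r for name, r in ratios.items() if r >= lower_bound}


# (regulated count, control count); the last entry is hypothetical.
hits = {
    "E2F4/DP_consensus": (9, 1),
    "element_II_rs-4": (8, 1),
    "hypothetical_site": (5, 5),
}
for name, ratio in enriched_sites(hits, 670, 612).items():
    print(name, round(ratio, 3))
# E2F4/DP_consensus 8.221
# element_II_rs-4 7.307
```

         <p>A site that is equally common in both sets, such as the hypothetical third entry, has a ratio close to one and is removed by the filter.</p>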
      </sec>
      <sec>
         <st>
            <p>Discovering potential transcription factor participants in a system using differential binding site search</p>
         </st>
         <p>To test whether mining of DNA microarray datasets using DBSS can generate novel insights into the key transcription factors operating in differential gene expression, we downloaded a microarray dataset (GSE1692) deposited in the NCBI Gene Expression Omnibus <abbrgrp><abbr bid="B32">32</abbr></abbrgrp> database by Cam and coworkers <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>. Those investigators examined cell cycle-dependent gene expression in T98G glioblastoma cells. They performed gene expression and ChIP-chip analyses of asynchronous cells compared with quiescent cells prepared by removal of serum for 3 days. To analyze the same dataset independently, we first performed <it>t</it>-tests for each gene in this dataset and set our threshold at <it>P </it>&lt; 0.05 to define genes that were differentially expressed; there were a total of 670 genes in this regulated gene set. We chose the genes that had <it>P </it>> 0.7 as our controls; there were a total of 612 genes in this control gene set. The actual <it>P </it>values for individual genes are reported in Additional data file 1. Using the DBSS, we analyzed the promoter regions of these genes in the -10,000 bp to +100 bp region relative to the start codon and filtered the results to those binding sites with at least a threefold change in frequency. As shown in Table <tblr tid="T1">1</tblr>, E2F-related binding sites dominated the list of search results, suggesting that the E2F family of transcription factors may be involved in the observed difference in gene expression profiles between quiescent and proliferating cells. Indeed, our results were in good agreement with those of Cam and coworkers <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>.</p>
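         <p>The selection of the two gene sets from per-gene <it>P </it>values can be sketched in Python as shown below. The sketch assumes that the <it>P </it>values have already been computed by a <it>t</it>-test; the function name, the gene identifiers, and the example values are hypothetical. Genes with intermediate <it>P </it>values fall into neither set.</p>

```python
from typing import Dict, List, Tuple


def partition_genes(p_values: Dict[str, float],
                    regulated_below: float = 0.05,
                    control_above: float = 0.7
                    ) -> Tuple[List[str], List[str]]:
    """Split genes into a regulated set and a control set by P value."""
    regulated: List[str] = []
    control: List[str] = []
    for gene, p in sorted(p_values.items()):
        if regulated_below > p:      # P below 0.05: differentially expressed
            regulated.append(gene)
        elif p > control_above:      # P above 0.7: unchanged control
            control.append(gene)
        # Genes in between are assigned to neither set.
    return regulated, control


# Hypothetical P values.
p_values = {"GENE_A": 0.01, "GENE_B": 0.30, "GENE_C": 0.95, "GENE_D": 0.049}
print(partition_genes(p_values))
# (['GENE_A', 'GENE_D'], ['GENE_C'])
```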
         <tbl id="T1" hint_layout="double">
            <title>
               <p>Table 1</p>
            </title>
            <caption>
               <p>Binding site sequences that are enriched in quiescent T98G cells versus asynchronous T98G cells from DBSS</p>
            </caption>
            <tblbdy cols="6">
               <r>
                  <c ca="left">
                     <p>Name of binding site</p>
                  </c>
                  <c ca="left">
                     <p>Transcription factor</p>
                  </c>
                  <c ca="left">
                     <p>Sequence</p>
                  </c>
                  <c ca="left">
                     <p>Ratio</p>
                  </c>
                  <c ca="left">
                     <p>Regulated gene frequency</p>
                  </c>
                  <c ca="left">
                     <p>Control gene frequency</p>
                  </c>
               </r>
               <r>
                  <c cspan="6">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p><sup>a</sup>E2F4/DP_consensus</p>
                  </c>
                  <c ca="left">
                     <p>E2F4/DP</p>
                  </c>
                  <c ca="left">
                     <p>TTTSGCGCS</p>
                  </c>
                  <c ca="left">
                     <p>8.221</p>
                  </c>
                  <c ca="left">
                     <p>9</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>element_II_rs-4</p>
                  </c>
                  <c ca="left">
                     <p>element_II_rs-4</p>
                  </c>
                  <c ca="left">
                     <p>TTTCGCG</p>
                  </c>
                  <c ca="left">
                     <p>7.307</p>
                  </c>
                  <c ca="left">
                     <p>8</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p><sup>a</sup>E2F_CS</p>
                  </c>
                  <c ca="left">
                     <p>E2F</p>
                  </c>
                  <c ca="left">
                     <p>TTTTSSCGS</p>
                  </c>
                  <c ca="left">
                     <p>7.307</p>
                  </c>
                  <c ca="left">
                     <p>8</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>AP-1-erk1</p>
                  </c>
                  <c ca="left">
                     <p>AP-1</p>
                  </c>
                  <c ca="left">
                     <p>CAGACTAA</p>
                  </c>
                  <c ca="left">
                     <p>6.394</p>
                  </c>
                  <c ca="left">
                     <p>7</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>m4-AP-1_site</p>
                  </c>
                  <c ca="left">
                     <p>AP-1</p>
                  </c>
                  <c ca="left">
                     <p>GTGAGTAA</p>
                  </c>
                  <c ca="left">
                     <p>5.481</p>
                  </c>
                  <c ca="left">
                     <p>6</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>E1A-BS4</p>
                  </c>
                  <c ca="left">
                     <p>E1A-BS4</p>
                  </c>
                  <c ca="left">
                     <p>GTCAAAGT</p>
                  </c>
                  <c ca="left">
                     <p>5.481</p>
                  </c>
                  <c ca="left">
                     <p>6</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p><sup>a</sup>E2F_site_(2)</p>
                  </c>
                  <c ca="left">
                     <p>E2F</p>
                  </c>
                  <c ca="left">
                     <p>TTTGGCGC</p>
                  </c>
                  <c ca="left">
                     <p>5.481</p>
                  </c>
                  <c ca="left">
                     <p>6</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>E1A-BS5</p>
                  </c>
                  <c ca="left">
                     <p>E1A-BS5</p>
                  </c>
                  <c ca="left">
                     <p>TCTCAGGTG</p>
                  </c>
                  <c ca="left">
                     <p>5.481</p>
                  </c>
                  <c ca="left">
                     <p>6</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>epsilon-NRA-FP2</p>
                  </c>
                  <c ca="left">
                     <p>undefined</p>
                  </c>
                  <c ca="left">
                     <p>GAGATACC</p>
                  </c>
                  <c ca="left">
                     <p>5.481</p>
                  </c>
                  <c ca="left">
                     <p>6</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>HC5</p>
                  </c>
                  <c ca="left">
                     <p>HC5</p>
                  </c>
                  <c ca="left">
                     <p>CCGAAAC</p>
                  </c>
                  <c ca="left">
                     <p>4.567</p>
                  </c>
                  <c ca="left">
                     <p>10</p>
                  </c>
                  <c ca="left">
                     <p>2</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>TB3</p>
                  </c>
                  <c ca="left">
                     <p>NF-IL-6</p>
                  </c>
                  <c ca="left">
                     <p>AACTGGAAA</p>
                  </c>
                  <c ca="left">
                     <p>4.567</p>
                  </c>
                  <c ca="left">
                     <p>5</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>GCN4_CS1</p>
                  </c>
                  <c ca="left">
                     <p>GCN4</p>
                  </c>
                  <c ca="left">
                     <p>ATGASTCAT</p>
                  </c>
                  <c ca="left">
                     <p>4.567</p>
                  </c>
                  <c ca="left">
                     <p>10</p>
                  </c>
                  <c ca="left">
                     <p>2</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>MyoD-MLC_(1)</p>
                  </c>
                  <c ca="left">
                     <p>MyoD</p>
                  </c>
                  <c ca="left">
                     <p>CCAGCTGGC</p>
                  </c>
                  <c ca="left">
                     <p>4.567</p>
                  </c>
                  <c ca="left">
                     <p>5</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Sp1-t-PA</p>
                  </c>
                  <c ca="left">
                     <p>Sp1</p>
                  </c>
                  <c ca="left">
                     <p>ACCCCGCCC</p>
                  </c>
                  <c ca="left">
                     <p>4.567</p>
                  </c>
                  <c ca="left">
                     <p>10</p>
                  </c>
                  <c ca="left">
                     <p>2</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>CArG_CS</p>
                  </c>
                  <c ca="left">
                     <p>SRF</p>
                  </c>
                  <c ca="left">
                     <p>CCAWATWWGG</p>
                  </c>
                  <c ca="left">
                     <p>4.567</p>
                  </c>
                  <c ca="left">
                     <p>5</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>RC1/RC2-CYC</p>
                  </c>
                  <c ca="left">
                     <p>RC1/RC2</p>
                  </c>
                  <c ca="left">
                     <p>TGACCGA</p>
                  </c>
                  <c ca="left">
                     <p>4.567</p>
                  </c>
                  <c ca="left">
                     <p>5</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>DHFR-undefined-site-1</p>
                  </c>
                  <c ca="left">
                     <p>DHFR-undefined-site-1</p>
                  </c>
                  <c ca="left">
                     <p>GGATTGGC</p>
                  </c>
                  <c ca="left">
                     <p>4.110</p>
                  </c>
                  <c ca="left">
                     <p>9</p>
                  </c>
                  <c ca="left">
                     <p>2</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>TopoII_RS</p>
                  </c>
                  <c ca="left">
                     <p>Topoisomerase II</p>
                  </c>
                  <c ca="left">
                     <p>RNYNNCNNGYNGKTNYCY</p>
                  </c>
                  <c ca="left">
                     <p>4.110</p>
                  </c>
                  <c ca="left">
                     <p>9</p>
                  </c>
                  <c ca="left">
                     <p>2</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>element_II-rs-1</p>
                  </c>
                  <c ca="left">
                     <p>element_II-rs-1</p>
                  </c>
                  <c ca="left">
                     <p>GGCGTAA</p>
                  </c>
                  <c ca="left">
                     <p>3.654</p>
                  </c>
                  <c ca="left">
                     <p>4</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>C/EBP-TTRS3</p>
                  </c>
                  <c ca="left">
                     <p>C/EBP</p>
                  </c>
                  <c ca="left">
                     <p>TCTTACTC</p>
                  </c>
                  <c ca="left">
                     <p>3.654</p>
                  </c>
                  <c ca="left">
                     <p>8</p>
                  </c>
                  <c ca="left">
                     <p>2</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Sp1-Vdac2</p>
                  </c>
                  <c ca="left">
                     <p>Sp1</p>
                  </c>
                  <c ca="left">
                     <p>CCTCGCCTC</p>
                  </c>
                  <c ca="left">
                     <p>3.654</p>
                  </c>
                  <c ca="left">
                     <p>4</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>glide/gcm_CS</p>
                  </c>
                  <c ca="left">
                     <p>glide/gcm</p>
                  </c>
                  <c ca="left">
                     <p>ATRCGGGY</p>
                  </c>
                  <c ca="left">
                     <p>3.654</p>
                  </c>
                  <c ca="left">
                     <p>4</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>spB-4bp</p>
                  </c>
                  <c ca="left">
                     <p>STAT3</p>
                  </c>
                  <c ca="left">
                     <p>TTCCGGAA</p>
                  </c>
                  <c ca="left">
                     <p>3.654</p>
                  </c>
                  <c ca="left">
                     <p>4</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>C/EBP-AT-Site-C.2</p>
                  </c>
                  <c ca="left">
                     <p>C/EBP</p>
                  </c>
                  <c ca="left">
                     <p>TCTTAAGC</p>
                  </c>
                  <c ca="left">
                     <p>3.654</p>
                  </c>
                  <c ca="left">
                     <p>8</p>
                  </c>
                  <c ca="left">
                     <p>2</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>PUT2_UAS2; PUT2_UAS.2</p>
                  </c>
                  <c ca="left">
                     <p>PUT3</p>
                  </c>
                  <c ca="left">
                     <p>GAAGCCGA</p>
                  </c>
                  <c ca="left">
                     <p>3.654</p>
                  </c>
                  <c ca="left">
                     <p>4</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>NFkB_CS2</p>
                  </c>
                  <c ca="left">
                     <p>NF-kB</p>
                  </c>
                  <c ca="left">
                     <p>RGGGRMTYYCC</p>
                  </c>
                  <c ca="left">
                     <p>3.349</p>
                  </c>
                  <c ca="left">
                     <p>11</p>
                  </c>
                  <c ca="left">
                     <p>3</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p><sup>a</sup>E2F_site_(3)</p>
                  </c>
                  <c ca="left">
                     <p>E2F</p>
                  </c>
                  <c ca="left">
                     <p>TTGGCGC</p>
                  </c>
                  <c ca="left">
                     <p>3.288</p>
                  </c>
                  <c ca="left">
                     <p>18</p>
                  </c>
                  <c ca="left">
                     <p>5</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>NF-E2_CS</p>
                  </c>
                  <c ca="left">
                     <p>NF-E2_CS</p>
                  </c>
                  <c ca="left">
                     <p>TGACTCAGC</p>
                  </c>
                  <c ca="left">
                     <p>3.197</p>
                  </c>
                  <c ca="left">
                     <p>7</p>
                  </c>
                  <c ca="left">
                     <p>2</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p><sup>a</sup>E2F_CS.2</p>
                  </c>
                  <c ca="left">
                     <p>E2F</p>
                  </c>
                  <c ca="left">
                     <p>SCGSGAAAA</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>7</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>AluA</p>
                  </c>
                  <c ca="left">
                     <p>AluA</p>
                  </c>
                  <c ca="left">
                     <p>GGAGGCTGAGGCA</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>6</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p><sup>a</sup>E2F_CS.1</p>
                  </c>
                  <c ca="left">
                     <p>E2F</p>
                  </c>
                  <c ca="left">
                     <p>TTTCGCGC</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>5</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Swi4-mdscan-motif-3</p>
                  </c>
                  <c ca="left">
                     <p>Swi6</p>
                  </c>
                  <c ca="left">
                     <p>AAACGCG</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>5</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>E-box/ATF/CREB_site</p>
                  </c>
                  <c ca="left">
                     <p>Ebox protein/ATF/CREB</p>
                  </c>
                  <c ca="left">
                     <p>GTGACGCA</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>5</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>GCN4-his3-189</p>
                  </c>
                  <c ca="left">
                     <p>GCN4</p>
                  </c>
                  <c ca="left">
                     <p>ATGACTCAT</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>4</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>GCF-beta-actin_(2)</p>
                  </c>
                  <c ca="left">
                     <p>GCF</p>
                  </c>
                  <c ca="left">
                     <p>GCGCGGGCCG</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>4</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Sp1-XIST_(1)</p>
                  </c>
                  <c ca="left">
                     <p>Sp1</p>
                  </c>
                  <c ca="left">
                     <p>GGCCACGCC</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>4</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>rMT-III-motif-9</p>
                  </c>
                  <c ca="left">
                     <p>undefined</p>
                  </c>
                  <c ca="left">
                     <p>CAGGCACCT</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>4</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>DBP-CS</p>
                  </c>
                  <c ca="left">
                     <p>DBP</p>
                  </c>
                  <c ca="left">
                     <p>RTTAYGTAAR</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>4</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>CDF1_RS</p>
                  </c>
                  <c ca="left">
                     <p>CDF1</p>
                  </c>
                  <c ca="left">
                     <p>CTAAATAC</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>4</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>alphaA-crystallin-PE2A</p>
                  </c>
                  <c ca="left">
                     <p>AP-1</p>
                  </c>
                  <c ca="left">
                     <p>CTGACTCAC</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>4</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p><sup>a</sup>E2F-myc</p>
                  </c>
                  <c ca="left">
                     <p>E2F</p>
                  </c>
                  <c ca="left">
                     <p>GCGGGAAAA</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>4</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
            </tblbdy>
            <tblfn>
               <p>A selected list is shown here; see Additional data file 1 for the full list. Only binding site sequences with a greater than threefold change in frequency of occurrence are shown. <sup>a</sup>E2F-binding sites are highlighted in grey. A ratio of 'N/A' denotes binding site sequences that were found in only one of the two gene sets (control or regulated), so that no ratio could be calculated. DBSS, differential binding site search; E2F, E2 promoter binding factor.</p>
            </tblfn>
         </tbl>
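         <p>The ratios in Table 1 are numerically consistent with the quotient of the two hit counts after each count is divided by the size of its gene set (670 regulated genes and 612 control genes). This normalization is inferred from the tabulated values rather than stated in the text, so the following sketch should be read as an assumption; the function names are hypothetical, and the filter handles enrichment in the regulated set only.</p>

```python
N_REGULATED = 670  # size of the regulated gene set (P below 0.05)
N_CONTROL = 612    # size of the control gene set (P above 0.7)


def enrichment_ratio(regulated_hits, control_hits,
                     n_regulated=N_REGULATED, n_control=N_CONTROL):
    """Return the size-normalized frequency ratio, or None when the site
    is absent from the control set (reported as 'N/A' in Table 1)."""
    if control_hits == 0:
        return None
    return (regulated_hits / n_regulated) / (control_hits / n_control)


def passes_fold_filter(ratio, min_fold=3.0):
    """Keep sites enriched more than min_fold, or found only in the regulated set."""
    return ratio is None or ratio > min_fold


# Hit counts taken from Table 1.
for name, reg, ctrl in [("E2F4/DP_consensus", 9, 1),
                        ("NFkB_CS2", 11, 3),
                        ("E2F_CS.2", 7, 0)]:
    ratio = enrichment_ratio(reg, ctrl)
    label = "N/A" if ratio is None else f"{ratio:.3f}"
    print(name, label, passes_fold_filter(ratio))
```

         <p>With these counts the sketch gives 8.221 for E2F4/DP_consensus and 3.349 for NFkB_CS2, matching the tabulated ratios, and reports 'N/A' for E2F_CS.2.</p>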
         <p>To demonstrate independently that some of the genes in our list that are predicted to contain over-represented E2F binding sites are indeed bound by E2F1 or E2F4 <it>in vivo</it>, we conducted a ChIP assay. We used E2F1 and E2F4 antibodies to analyze the occupancies of these two transcription factors on five different promoters in both synchronized and quiescent T98G cells. Briefly, our ChIP methodology was as follows. Approximately 1 &#215; 10<sup>7</sup> T98G cells were fixed with formaldehyde (1% final concentration) at room temperature for 10 min. Fixation was stopped by the addition of glycine for 5 min. Cells were washed once with ice-cold phosphate-buffered saline supplemented with protease inhibitors (1 &#956;g/ml phenylmethylsulfonyl fluoride, 1 &#956;g/ml aprotinin, 1 &#956;g/ml pepstatin). Cells were scraped and pelleted in the same buffer. Cell pellets were lysed in 0.5 ml lysis buffer (1% sodium dodecyl sulfate; 10 mmol/l EDTA; 50 mmol/l Tris-HCl [pH 8.0]). Soluble chromatin was prepared by sonication of the cell lysates. Subsequent immunoprecipitation and analysis were performed essentially according to the method described by Lambert and coworkers <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>, except that antibodies against E2F1 (sc-193; Santa Cruz Biotechnology, Santa Cruz, CA, USA) and E2F4 (sc-1082; Santa Cruz Biotechnology) were used; 0.1% of total input chromatin was used in the polymerase chain reactions in the input lane.</p>
         <p>As shown in Figure <figr fid="F5">5</figr>, all five promoters are indeed targeted by E2F1 or E2F4, although the pattern of binding varies among these five genes. Whereas our ChIP data on <it>DHFR</it>, <it>CDC6</it>, <it>CDC25A</it>, and <it>MCM3 </it>are consistent with published results, binding of E2F1 and E2F4 to <it>DUSP4 </it>is a novel finding. Thus, based on the results of DBSS, we can gain biological insights similar to those obtained by ChIP-chip analysis.</p>
         <fig id="F5">
            <title>
               <p>Figure 5</p>
            </title>
            <caption>
               <p>E2F1 and E2F4 occupancies in different promoter regions predicted by differential binding site search</p>
            </caption>
            <text>
               <p>E2F1 and E2F4 occupancies in different promoter regions predicted by differential binding site search. A chromatin immunoprecipitation experiment was performed as described in the text. Mock experiments were done using no antibodies (No Ab) and served as a negative control for the experiment. The input lane represents polymerase chain reactions using 0.1% of total input chromatin. E2F, E2 promoter binding factor.</p>
            </text>
            <graphic file="gb-2006-7-10-r97-5"/>
         </fig>
         <p>To demonstrate the visualization capabilities of GeneACT, we use the example of serum response factor (SRF), whose binding sites were highly enriched in the regulated gene set. The increased presence of SRF binding sites implies that genes containing this site might be regulated by SRF when cells enter G<sub>1 </sub>from G<sub>0</sub>. Indeed, one of the differentially expressed genes that contributes to the SRF ranking, namely <it>EGR1</it>, has been independently shown to be activated by SRF <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>. Genes that contain either E2F or SRF binding sites are listed in Additional data file 3. The location of the putative E2F-binding sites can easily be tracked down using the GeneACT graphical interface of PBSS. The promoter regions (-600 bp to +100 bp) of <it>MCM5 </it>(Figure <figr fid="F6">6a</figr>) and <it>DHFR </it>(Figure <figr fid="F7">7a</figr>) are shown in the promoter browser using PBSS. Figures <figr fid="F6">6b</figr> and <figr fid="F7">7b</figr> highlight just the E2F binding sites conserved in these promoter regions, respectively. Taken together, our results suggest that DBSS in GeneACT can be a simple but very useful tool to gain novel insights from microarray data quickly.</p>
         <fig id="F6">
            <title>
               <p>Figure 6</p>
            </title>
            <caption>
               <p>Graphic display of transcription factor binding sites in the MCM5 promoter region</p>
            </caption>
            <text>
               <p>Graphic display of transcription factor binding sites in the MCM5 promoter region. <b>(a) </b>Visualization of the <it>MCM5 </it>upstream region. The <it>MCM5 </it>upstream region (-600 to +100 base pairs) is shown, where +1 is the transcription start site. Only binding site sequences that are conserved across genomes are shown. <b>(b) </b>The same region as shown in panel a, with only the E2F sites highlighted.</p>
            </text>
            <graphic file="gb-2006-7-10-r97-6"/>
         </fig>
         <fig id="F7">
            <title>
               <p>Figure 7</p>
            </title>
            <caption>
               <p>Graphic display of transcription factor binding sites in the DHFR promoter region</p>
            </caption>
            <text>
               <p>Graphic display of transcription factor binding sites in the DHFR promoter region. <b>(a) </b>Visualization of the <it>DHFR </it>upstream region. The same parameters were used as for <it>MCM5 </it>(see Figure 6). <b>(b) </b>The same region as shown in panel a, with only the E2F sites highlighted. E2F, E2 promoter binding factor.</p>
            </text>
            <graphic file="gb-2006-7-10-r97-7"/>
         </fig>
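         <p>Several consensus sequences in Table 1, such as TTTSGCGCS, use IUPAC degenerate nucleotide codes. The sketch below shows one way in which such a consensus could be located in a promoter sequence and reported relative to the transcription start site (+1). It is an illustration only: it scans the forward strand, does not check conservation across genomes, and is not the PBSS implementation; the function names and the toy sequence are hypothetical.</p>

```python
import re

# IUPAC nucleotide codes used by the consensus sequences in Table 1.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Compile a consensus into a regex; the lookahead reports overlapping hits."""
    body = "".join(IUPAC[base] for base in consensus.upper())
    return re.compile("(?=(" + body + "))")


def find_sites(promoter, consensus, upstream):
    """Return (position, matched sequence) pairs on the forward strand.

    `upstream` is the number of bases before the transcription start site.
    Positions follow the convention that the start site is +1 and the base
    before it is -1 (there is no position 0).
    """
    hits = []
    for match in consensus_to_regex(consensus).finditer(promoter.upper()):
        offset = match.start() - upstream
        position = offset + 1 if offset >= 0 else offset
        hits.append((position, match.group(1)))
    return hits


# Toy promoter (hypothetical): 15 upstream bases, then 14 bases from +1 onward.
toy_promoter = "AAAAAAAAAA" + "TTTCGCGCG" + "AAAAAAAAAA"
print(find_sites(toy_promoter, "TTTSGCGCS", upstream=15))
```

         <p>For the region displayed in Figures 6 and 7, the same function would be called with 600 upstream bases on a sequence that extends to +100.</p>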
      </sec>
      <sec>
         <st>
            <p>Discovering potential microRNA participants in a system using differential binding site search</p>
         </st>
         <p>If the abundance of mRNA is regulated by miRNA, then we would expect the expression levels of miRNAs and their authentic targets to be anti-correlated. Accordingly, computational identification of over-represented miRNA target sites shared among co-regulated genes from DNA microarray data should, in theory, provide valuable leads to uncover the biologically relevant miRNAs responsible for differential gene expression. To test this hypothesis in a well characterized system, we downloaded and analyzed the dataset created by Lim and coworkers <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. That study aimed to identify the targets of miR-1 and miR-124 in HeLa cells by overexpressing each of these two miRNAs independently and then profiling mRNA transcript levels by DNA microarray analysis. They found that 96 and 174 annotated genes were downregulated by miR-1 and miR-124, respectively. If over-representation of miRNA target sites among co-regulated genes can be exploited to unravel the controlling miRNAs in differential gene expression, then searching the list of 96 or 174 genes using the 3'-UTR search function of the DBSS tool is expected to reveal over-representation of miR-1 or miR-124 target sites, respectively, among these two groups of genes. miR-1 and miR-124 are noted for their tissue specificity in mammals. miR-1 is known to be preferentially expressed in heart and skeletal muscle, whereas miR-124 is known to be preferentially expressed in brain <abbrgrp><abbr bid="B36">36</abbr><abbr bid="B37">37</abbr></abbrgrp>. Because they are tissue-specific miRNAs, we used each of the datasets as a control for the other.</p>
         <p>The results are summarized in Table <tblr tid="T2">2</tblr> and Additional data file 4. As predicted, miR-124 target sites ranked near the top of the search results when the regulated gene set was taken from the miR-124 overexpression experiment. miR-1 had to be excluded from our analysis because no orthologous mature miR-1 sequence was available for rat, and so it is not discussed further. We note that target sites for many other miRNAs were enriched in addition to the miR-124 target sites. This implies that genes downregulated by miR-124 also contain target sites for other miRNAs. It is possible that multiple miRNAs act on overlapping sets of genes, including those downregulated by miR-124 in the HeLa cell line. Recapturing miR-124 from the DBSS search in GeneACT, using the corresponding gene list determined by DNA microarray analysis, suggests that this approach can help to identify the miRNAs responsible, at least in part, for a given expression profile.</p>
         <tbl id="T2" hint_layout="double">
            <title>
               <p>Table 2</p>
            </title>
            <caption>
               <p>Summary of the search results for the miRNA target sites enriched in HeLa cells transfected with miR-124 versus miR-1</p>
            </caption>
            <tblbdy cols="5">
               <r>
                  <c ca="left">
                     <p>miRNA name</p>
                  </c>
                  <c ca="left">
                     <p>Sequence</p>
                  </c>
                  <c ca="left">
                     <p>Ratio</p>
                  </c>
                  <c ca="left">
                     <p>Number of miR-124 target sites</p>
                  </c>
                  <c ca="left">
                     <p>Number of miR-1 target sites (control)</p>
                  </c>
               </r>
               <r>
                  <c cspan="5">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>miR-185</p>
                  </c>
                  <c ca="left">
                     <p>UGGAGAGAAAGGCAGUUC</p>
                  </c>
                  <c ca="left">
                     <p>3.5409836</p>
                  </c>
                  <c ca="left">
                     <p>6</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p><sup><ul>a</ul></sup>miR-124a</p>
                  </c>
                  <c ca="left">
                     <p>UUAAGGCACGCGGUGAAUGCCA</p>
                  </c>
                  <c ca="left">
                     <p>3.5409836</p>
                  </c>
                  <c ca="left">
                     <p>12</p>
                  </c>
                  <c ca="left">
                     <p>2</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>miR-145</p>
                  </c>
                  <c ca="left">
                     <p>GUCCAGUUUUCCCAGGAAUCCCUU</p>
                  </c>
                  <c ca="left">
                     <p>3.5409836</p>
                  </c>
                  <c ca="left">
                     <p>6</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>miR-22</p>
                  </c>
                  <c ca="left">
                     <p>AAGCUGCCAGUUGAAGAACUGU</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>8</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>miR-337</p>
                  </c>
                  <c ca="left">
                     <p>UCCAGCUCCUAUAUGAUGCCUUU</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>7</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>miR-150</p>
                  </c>
                  <c ca="left">
                     <p>UCUCCCAACCCUUGUACCAGUG</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>6</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>miR-141</p>
                  </c>
                  <c ca="left">
                     <p>UAACACUGUCUGGUAAAGAUGG</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>6</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>miR-19b</p>
                  </c>
                  <c ca="left">
                     <p>UGUGCAAAUCCAUGCAAAACUGA</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>4</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>miR-200a</p>
                  </c>
                  <c ca="left">
                     <p>UAACACUGUCUGGUAACGAUGU</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>3</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>miR-9</p>
                  </c>
                  <c ca="left">
                     <p>UCUUUGGUUAUCUAGCUGUAUGA</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>3</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>miR-31</p>
                  </c>
                  <c ca="left">
                     <p>GGCAAGAUGCUGGCAUAGCUG</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>3</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
            </tblbdy>
            <tblfn>
               <p>Shown here is a selected list of the most enriched miRNA target sites when HeLa cells are transfected with miR-124; see Additional data file 4 for the full list. <sup>a</sup>miR-124 target sites are highlighted. A ratio of -1 indicates that the target site was absent from the control (miR-1) gene list. DBSS, differential binding site search; miRNA, microRNA.</p>
            </tblfn>
         </tbl>
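         <p>As an illustration of how the ratio column in Table <tblr tid="T2">2</tblr> can be read, the following minimal sketch computes an enrichment ratio normalized by gene-list size. The function name enrichment_ratio and the two list sizes are assumptions introduced here for illustration and are not taken from GeneACT. The tabulated ratios equal the raw count ratio multiplied by a constant (3.5409836 divided by 6, or about 0.590), which would be consistent with dividing each count by the size of its gene list; that interpretation is itself an assumption. The -1 return value mirrors the table entries for sites that are absent from the control list.</p>

```python
def enrichment_ratio(regulated_sites, control_sites, n_regulated, n_control):
    """Return the per-gene site frequency in the regulated list divided by
    the per-gene site frequency in the control list.

    Returns -1 when the site is absent from the control list, mirroring
    the sentinel value shown in the ratio column of the table.
    """
    if control_sites == 0:
        return -1.0
    regulated_freq = regulated_sites / n_regulated
    control_freq = control_sites / n_control
    return regulated_freq / control_freq


# Hypothetical gene-list sizes, chosen only so that n_control / n_regulated
# equals 36/61, the scaling constant implied by the tabulated ratios.
N_REGULATED = 122
N_CONTROL = 72

mir124a = enrichment_ratio(12, 2, N_REGULATED, N_CONTROL)
mir185 = enrichment_ratio(6, 1, N_REGULATED, N_CONTROL)
mir22 = enrichment_ratio(8, 0, N_REGULATED, N_CONTROL)
```

         <p>Under these assumed list sizes, the miR-124a and miR-185 rows both evaluate to approximately 3.54, and the miR-22 row evaluates to -1, in line with the tabulated values.</p>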
      </sec>
      <sec>
         <st>
            <p>Predicting microRNA participants in skeletal muscle differentiation</p>
         </st>
         <p>Myogenic differentiation is a process that leads to the fusion of muscle precursor cells (myoblasts) into multinucleated myofibers in the animal. The C2C12 myoblast cell line serves as a good <it>in vitro </it>model for studying skeletal muscle differentiation because these cells differentiate terminally into myotubes when serum is withdrawn from the culture medium <abbrgrp><abbr bid="B38">38</abbr><abbr bid="B39">39</abbr></abbrgrp>. To understand the potential involvement of miRNAs in regulating skeletal muscle differentiation and to test our tool further, we used DBSS to analyze a C2C12 differentiation microarray dataset available from NCBI GEO. In this dataset, C2C12 differentiation was studied from day 0 to day 10 of serum withdrawal <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>. Our control genes were those that were upregulated at all time points compared with undifferentiated control myoblasts. We hypothesized that these genes are less likely to be regulated by miRNAs, because they are upregulated throughout the time course whereas miRNAs act to downregulate mRNA expression. To perform the analysis, we compared the cells at day 2 of differentiation with those at day 0 (Additional data files 5 and 6).</p>
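         <p>The selection of regulated and control gene lists described above can be sketched as follows. This is a minimal illustration under stated assumptions: the data layout (a mapping from gene name to log2 expression ratios relative to day 0), the function name split_gene_lists, the zero threshold, and the toy values are all hypothetical and are not taken from the GEO dataset or from GeneACT.</p>

```python
def split_gene_lists(log_ratios, test_day, threshold=0.0):
    """Partition genes into a regulated list and a control list.

    log_ratios maps gene name -> {day: log2(expression / day-0 expression)}.
    Control: genes whose log ratio is above threshold on every day.
    Regulated: remaining genes whose log ratio on test_day is below
    -threshold (that is, downregulated relative to day 0).
    """
    regulated, control = [], []
    for gene, series in sorted(log_ratios.items()):
        if all(value > threshold for value in series.values()):
            control.append(gene)
        elif -series[test_day] > threshold:
            regulated.append(gene)
    return regulated, control


# Toy values for illustration only; these are not real measurements.
toy = {
    "geneA": {2: -1.5, 4: -1.0, 10: -0.8},
    "geneB": {2: 0.9, 4: 1.4, 10: 2.1},
    "geneC": {2: 0.3, 4: -0.2, 10: 0.1},
}
regulated, control = split_gene_lists(toy, test_day=2)
```

         <p>In this toy example, geneA falls into the regulated list, geneB into the control list, and geneC into neither, because it is not upregulated at every time point and is not downregulated at day 2.</p>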
         <p>The results are summarized in Table <tblr tid="T3">3</tblr>. Our <it>in silico </it>analysis of the C2C12 microarray gene expression profile using DBSS implied that at least 14 miRNA target sites are over-represented in mRNAs that are downregulated during myogenic differentiation in C2C12 cells, suggesting that some of these miRNAs may be differentially expressed during myogenic differentiation and contribute to the mRNA expression profile. Recently, Chen and colleagues <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> investigated miRNA expression profiles during C2C12 differentiation using a miRNA microarray. Their miRNA array expression data revealed that miR-133a, miR-206, and miR-130a ranked at the top of the list of the few miRNAs that were upregulated upon myogenic differentiation. In comparing our <it>in silico </it>predictions with their experimental results, we found that our analysis recaptured miR-133a, miR-206, and miR-130a target sites among those enriched in differentially expressed genes. Therefore, a differential miRNA target site search can generate predictions consistent with experimental results in this system.</p>
         <tbl id="T3" hint_layout="double">
            <title>
               <p>Table 3</p>
            </title>
            <caption>
               <p>Summary of the results for the miRNA target sites that are enriched in C2C12 myogenic differentiation (day 0 versus day 2) from DBSS</p>
            </caption>
            <tblbdy cols="5">
               <r>
                  <c ca="left">
                     <p>miRNA name</p>
                  </c>
                  <c ca="left">
                     <p>Sequence</p>
                  </c>
                  <c ca="left">
                     <p>Ratio</p>
                  </c>
                  <c ca="left">
                     <p>Regulated frequency</p>
                  </c>
                  <c ca="left">
                     <p>Control frequency</p>
                  </c>
               </r>
               <r>
                  <c cspan="5">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>One conserved site</p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-206</p>
                  </c>
                  <c ca="left">
                     <p>UGGAAUGUAAGGAAGUGUGUGG</p>
                  </c>
                  <c ca="left">
                     <p>4.075358</p>
                  </c>
                  <c ca="left">
                     <p>26</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-7</p>
                  </c>
                  <c ca="left">
                     <p>UGGAAGACUAGUGAUUUUGUUG</p>
                  </c>
                  <c ca="left">
                     <p>3.1348907</p>
                  </c>
                  <c ca="left">
                     <p>20</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-301</p>
                  </c>
                  <c ca="left">
                     <p>CAGUGCAAUAGUAUUGUCAAAGC</p>
                  </c>
                  <c ca="left">
                     <p>3.1348907</p>
                  </c>
                  <c ca="left">
                     <p>20</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-23a</p>
                  </c>
                  <c ca="left">
                     <p>AUCACAUUGCCAGGGAUUUCC</p>
                  </c>
                  <c ca="left">
                     <p>2.8214017</p>
                  </c>
                  <c ca="left">
                     <p>18</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-138</p>
                  </c>
                  <c ca="left">
                     <p>AGCUGGUGUUGUGAAUC</p>
                  </c>
                  <c ca="left">
                     <p>2.8214017</p>
                  </c>
                  <c ca="left">
                     <p>18</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-211</p>
                  </c>
                  <c ca="left">
                     <p>UUCCCUUUGUCAUCCUUCGCCU</p>
                  </c>
                  <c ca="left">
                     <p>2.2727958</p>
                  </c>
                  <c ca="left">
                     <p>58</p>
                  </c>
                  <c ca="left">
                     <p>4</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-29c</p>
                  </c>
                  <c ca="left">
                     <p>UAGCACCAUUUGAAAUCGGU</p>
                  </c>
                  <c ca="left">
                     <p>2.037679</p>
                  </c>
                  <c ca="left">
                     <p>13</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-133a</p>
                  </c>
                  <c ca="left">
                     <p>UUGGUCCCCUUCAACCAGCUGU</p>
                  </c>
                  <c ca="left">
                     <p>1.9788998</p>
                  </c>
                  <c ca="left">
                     <p>101</p>
                  </c>
                  <c ca="left">
                     <p>8</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-194</p>
                  </c>
                  <c ca="left">
                     <p>UGUAACAGCAACUCCAUGUGGA</p>
                  </c>
                  <c ca="left">
                     <p>1.9593067</p>
                  </c>
                  <c ca="left">
                     <p>25</p>
                  </c>
                  <c ca="left">
                     <p>2</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-30c</p>
                  </c>
                  <c ca="left">
                     <p>UGUAAACAUCCUACACUCUCAGC</p>
                  </c>
                  <c ca="left">
                     <p>1.8809344</p>
                  </c>
                  <c ca="left">
                     <p>12</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-130a</p>
                  </c>
                  <c ca="left">
                     <p>CAGUGCAAUGUUAAAAGGGCAU</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>18</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-216</p>
                  </c>
                  <c ca="left">
                     <p>UAAUCUCAGCUGGCAACUGUG</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>12</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-196b</p>
                  </c>
                  <c ca="left">
                     <p>UAGGUAGUUUCCUGUUGUUGG</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>10</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-140</p>
                  </c>
                  <c ca="left">
                     <p>AGUGGUUUUACCCUAUGGUAG</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>10</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c cspan="5">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Two conserved sites</p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-324-5p</p>
                  </c>
                  <c ca="left">
                     <p>CGCAUCCCCUAGGGCAUUGGUGU</p>
                  </c>
                  <c ca="left">
                     <p>2.5079126</p>
                  </c>
                  <c ca="left">
                     <p>32</p>
                  </c>
                  <c ca="left">
                     <p>2</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-193</p>
                  </c>
                  <c ca="left">
                     <p>AACUGGCCUACAAAGUCCCAG</p>
                  </c>
                  <c ca="left">
                     <p>2.037679</p>
                  </c>
                  <c ca="left">
                     <p>13</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-150</p>
                  </c>
                  <c ca="left">
                     <p>UCUCCCAACCCUUGUACCAGUG</p>
                  </c>
                  <c ca="left">
                     <p>2.037679</p>
                  </c>
                  <c ca="left">
                     <p>13</p>
                  </c>
                  <c ca="left">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-125b</p>
                  </c>
                  <c ca="left">
                     <p>UCCCUGAGACCCUAACUUGUGA</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>18</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-133a</p>
                  </c>
                  <c ca="left">
                     <p>UUGGUCCCCUUCAACCAGCUGU</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>17</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-152</p>
                  </c>
                  <c ca="left">
                     <p>UCAGUGCAUGACAGAACUUGGG</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>11</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-204</p>
                  </c>
                  <c ca="left">
                     <p>UUCCCUUUGUCAUCCUAUGCCU</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>9</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-17</p>
                  </c>
                  <c ca="left">
                     <p>CAAAGUGCUUACAGUGCAGGUAGU</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>8</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-211</p>
                  </c>
                  <c ca="left">
                     <p>UUCCCUUUGUCAUCCUUCGCCU</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>8</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c cspan="5">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Three conserved sites</p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-296</p>
                  </c>
                  <c ca="left">
                     <p>AGGGCCCCCCCUCAAUCCUGU</p>
                  </c>
                  <c ca="left">
                     <p>2.037679</p>
                  </c>
                  <c ca="left">
                     <p>26</p>
                  </c>
                  <c ca="left">
                     <p>2</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-34a</p>
                  </c>
                  <c ca="left">
                     <p>UGGCAGUGUCUUAGCUGGUUGUU</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>11</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-125a</p>
                  </c>
                  <c ca="left">
                     <p>UCCCUGAGACCCUUUAACCUGUG</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>7</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-330</p>
                  </c>
                  <c ca="left">
                     <p>GCAAAGCACACGGCCUGCAGAGA</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>7</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-345</p>
                  </c>
                  <c ca="left">
                     <p>UGCUGACUCCUAGUCCAGGGC</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>6</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-320</p>
                  </c>
                  <c ca="left">
                     <p>AAAAGCUGGGUUGAGAGGGCGAA</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>6</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-125b</p>
                  </c>
                  <c ca="left">
                     <p>UCCCUGAGACCCUAACUUGUGA</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>5</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-449</p>
                  </c>
                  <c ca="left">
                     <p>UGGCAGUGUAUUGUUAGCUGGU</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>5</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-150</p>
                  </c>
                  <c ca="left">
                     <p>UCUCCCAACCCUUGUACCAGUG</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>5</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-133a</p>
                  </c>
                  <c ca="left">
                     <p>UUGGUCCCCUUCAACCAGCUGU</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>5</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-199a</p>
                  </c>
                  <c ca="left">
                     <p>CCCAGUGUUCAGACUACCUGUUC</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>4</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-193</p>
                  </c>
                  <c ca="left">
                     <p>AACUGGCCUACAAAGUCCCAG</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>4</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
               <r>
                  <c indent="1" ca="left">
                     <p>miR-205</p>
                  </c>
                  <c ca="left">
                     <p>UCCUUCAUUCCACCGGAGUCUG</p>
                  </c>
                  <c ca="left">
                     <p>-1</p>
                  </c>
                  <c ca="left">
                     <p>4</p>
                  </c>
                  <c ca="left">
                     <p>0</p>
                  </c>
               </r>
            </tblbdy>
            <tblfn>
               <p>Selected lists are shown here; see Additional data file 6 for the full lists. Findings are subdivided into results for one conserved site, results for two conserved sites, and results for three conserved sites. In all cases, miR-133a remained as one of the top hits. If a given miRNA target site is absent in the control gene list, then we listed the ratio as 'N/A', which denotes the unique presence of this particular target site in the regulated gene list. DBSS, differential binding site search; miRNA, microRNA.</p>
            </tblfn>
         </tbl>
         <p>It has previously been demonstrated <it>in vitro </it>that more than two miRNA target sites in a given 3'-UTR seem to boost the efficacy of miRNA-mediated gene repression <abbrgrp><abbr bid="B41">41</abbr></abbrgrp>. To test whether requiring at least two or three conserved sites on any one mRNA improves the accuracy of predicting the miRNA participants in the skeletal muscle differentiation dataset, we compared the predictions obtained under this multiple-site requirement with the results of the microRNA microarray experiment. As shown in Table <tblr tid="T3">3</tblr>, introducing this additional constraint did not bring the predictions into closer agreement with the experimental results. Therefore, it remains to be determined whether multiplicity of miRNA target sites in an mRNA can be used as a reliable criterion for predicting the authenticity of miRNA targets.</p>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>GeneACT was developed to display and analyze regulatory regions across human, mouse and rat genomes, and it enables identification of putative <it>cis</it>-acting elements that are evolutionarily conserved across species for all orthologous genes. A comparative, web-based, graphically oriented promoter browser was developed for the public domain. Using the DBSS, insights can be gained into which transcription factors might be involved in a particular system. GeneACT enables integration of <it>cis</it>-regulatory sequences identified by a comparative genomics approach with microarray expression profiling data to explore the underlying gene expression regulatory networks.</p>
         <p>To illustrate the uniqueness of GeneACT, we compared it with existing software. The comparison is summarized in Table <tblr tid="T4">4</tblr>. Three distinct features separate GeneACT from related programs. First, among the tools compared, GeneACT is the only open source online software that allows identification of over-represented miRNA target sites from a list of genes of interest.</p>
         <tbl id="T4" hint_layout="double">
            <title>
               <p>Table 4</p>
            </title>
            <caption>
               <p>Comparison of GeneACT with similar tools in the public domain</p>
            </caption>
            <tblbdy cols="7">
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>GeneACT [30]</p>
                  </c>
                  <c ca="left">
                     <p>oPOSSUM [13]</p>
                  </c>
                  <c ca="left">
                     <p>OTFBS [52]</p>
                  </c>
                  <c ca="left">
                     <p>Clover [53]</p>
                  </c>
                  <c ca="left">
                     <p>Whole Genome rVISTA beta [3]</p>
                  </c>
                  <c ca="left">
                     <p>CR&#200;ME [15]</p>
                  </c>
               </r>
               <r>
                  <c cspan="7">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Graphical user interface</p>
                  </c>
                  <c ca="left">
                     <p>Web based</p>
                  </c>
                  <c ca="left">
                     <p>Web based</p>
                  </c>
                  <c ca="left">
                     <p>Web based</p>
                  </c>
                  <c ca="left">
                     <p>Command line tool for Linux, UNIX and Mac OS X</p>
                  </c>
                  <c ca="left">
                     <p>Web based</p>
                  </c>
                  <c ca="left">
                     <p>Web based</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Genomic display</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
                  <c ca="left">
                     <p>No</p>
                  </c>
                  <c ca="left">
                     <p>No</p>
                  </c>
                  <c ca="left">
                     <p>No</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
                  <c ca="left">
                     <p>No</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Type</p>
                  </c>
                  <c ca="left">
                     <p>Promoter specific</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>Gene specific</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Display motifs on custom sequences</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Allow custom sequence input</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Source of the transcription factor binding sites</p>
                  </c>
                  <c ca="left">
                     <p>TFD</p>
                  </c>
                  <c ca="left">
                     <p>JASPAR</p>
                  </c>
                  <c ca="left">
                     <p>TRANSFAC</p>
                  </c>
                  <c ca="left">
                     <p>JASPAR</p>
                  </c>
                  <c ca="left">
                     <p>TRANSFAC</p>
                  </c>
                  <c ca="left">
                     <p>TRANSFAC</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Number of binding sites in the database</p>
                  </c>
                  <c ca="left">
                     <p>>7000</p>
                  </c>
                  <c ca="left">
                     <p>111</p>
                  </c>
                  <c ca="left">
                     <p>>500</p>
                  </c>
                  <c ca="left">
                     <p>111</p>
                  </c>
                  <c ca="left">
                     <p>>500</p>
                  </c>
                  <c ca="left">
                     <p>>500</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Search for over-represented binding sites</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>In promoter region</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>In 3'-UTR</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
                  <c ca="left">
                     <p>No</p>
                  </c>
                  <c ca="left">
                     <p>No</p>
                  </c>
                  <c ca="left">
                     <p>No</p>
                  </c>
                  <c ca="left">
                     <p>No</p>
                  </c>
                  <c ca="left">
                     <p>No</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Across genomes</p>
                  </c>
                  <c ca="left">
                     <p>Human, mouse and rat</p>
                  </c>
                  <c ca="left">
                     <p>Human and mouse</p>
                  </c>
                  <c ca="left">
                     <p>No</p>
                  </c>
                  <c ca="left">
                     <p>No</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
                  <c ca="left">
                     <p>Human</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Species supported (using Locuslink, Entrez gene ID, Ensembl ID, etc.)</p>
                  </c>
                  <c ca="left">
                     <p>Human, mouse and rat</p>
                  </c>
                  <c ca="left">
                     <p>Human and mouse</p>
                  </c>
                  <c ca="left">
                     <p>No</p>
                  </c>
                  <c ca="left">
                     <p>No</p>
                  </c>
                  <c ca="left">
                     <p>Mouse only</p>
                  </c>
                  <c ca="left">
                     <p>Human</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Number of genes/sequences allowed</p>
                  </c>
                  <c ca="left">
                     <p>Unlimited IDs</p>
                  </c>
                  <c ca="left">
                     <p>Unlimited IDs</p>
                  </c>
                  <c ca="left">
                     <p>200 sequences</p>
                  </c>
                  <c ca="left">
                     <p>Not tested</p>
                  </c>
                  <c ca="left">
                     <p>Unlimited IDs</p>
                  </c>
                  <c ca="left">
                     <p>Unlimited IDs</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Allow custom control gene set</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
                  <c ca="left">
                     <p>No</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
                  <c ca="left">
                     <p>No</p>
                  </c>
                  <c ca="left">
                     <p>No</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Correctly predicted E2F4 binding sites in the T98G dataset</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
                  <c ca="left">
                     <p>No</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Identified E2F-binding sites in the T98G dataset</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Search for potential miRNAs involved in the dataset</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
                  <c ca="left">
                     <p>No</p>
                  </c>
                  <c ca="left">
                     <p>No</p>
                  </c>
                  <c ca="left">
                     <p>No</p>
                  </c>
                  <c ca="left">
                     <p>No</p>
                  </c>
                  <c ca="left">
                     <p>No</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Source for the miRNA seed sequences</p>
                  </c>
                  <c ca="left">
                     <p>miRBase</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Across genomes</p>
                  </c>
                  <c ca="left">
                     <p>Human, mouse and rat</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Species supported</p>
                  </c>
                  <c ca="left">
                     <p>Human, mouse and rat</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Number of genes/sequences allowed</p>
                  </c>
                  <c ca="left">
                     <p>Unlimited</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Allow custom control gene set</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
                  <c ca="left">
                     <p>N/A</p>
                  </c>
               </r>
            </tblbdy>
            <tblfn>
               <p>Only programs that have web interfaces and allow unlimited gene IDs were tested using the T98G dataset. The website addresses for the programs evaluated are given in the references provided in the top row. N/A indicates that a category is not available. E2F, E2 promoter binding factor; miRNA, microRNA; UTR, untranslated region.</p>
            </tblfn>
         </tbl>
         <p>Second, GeneACT employs the TFD database and pattern matching for <it>in silico </it>annotation or prediction of potential transcription factor binding sites. Virtually all other programs make use of the position weight matrix (PWM)-based TRANSFAC <abbrgrp><abbr bid="B11">11</abbr></abbrgrp> and related JASPAR databases <abbrgrp><abbr bid="B42">42</abbr></abbrgrp>. Because transcription factors tend to bind short and degenerate sequences, the PWM-based approach provides better definition of transcription factor binding properties based on binding affinity. This method has proved to be very effective for <it>in silico </it>prediction of prokaryotic transcription factor binding sites <abbrgrp><abbr bid="B43">43</abbr><abbr bid="B44">44</abbr></abbrgrp>. However, there are significant limitations for a PWM-based approach for analysis of mammalian transcription factor binding sites <abbrgrp><abbr bid="B45">45</abbr><abbr bid="B46">46</abbr></abbrgrp>. A PWM assumes that the recognition sequence is of fixed length and each base contributes independently to the total binding energy of the transcription factor/DNA complex. In mammalian systems, binding affinity may not be a reliable predictor for biologically relevant binding sites <abbrgrp><abbr bid="B46">46</abbr></abbrgrp>. One of the major features of transcriptional regulation in eukaryotic systems is combinatorial control featuring two or more transcription factors binding synergistically to their target sites <abbrgrp><abbr bid="B47">47</abbr><abbr bid="B48">48</abbr></abbrgrp>. The context of the binding site is often more important than individual binding sites. We chose to use the TFD database because almost all of the transcription factor binding sites documented in the database were defined experimentally (for example, by reporter assays). The TFD contains more than 7000 characterized binding sites from a variety of biologic contexts. 
These binding sites have been naturally selected for function during evolution. Thus, using TFD in our <it>in silico </it>analysis provides an alternative and perhaps more relevant approach to identification of putative transcription factor binding sites in the flanking regions of genes of interest. Given the findings from a number of comparative studies that no single transcription factor binding site discovery program is superior and that using multiple independent programs improves the performance of prediction <abbrgrp><abbr bid="B49">49</abbr></abbrgrp>, GeneACT is a valuable addition to existing tools.</p>
         <p>The third and final distinct feature that separates GeneACT from other related programs is that the output of GeneACT is geared toward easy visualization and pattern recognition. It is designed to be a simple, freely available tool for experimental biologists to navigate promoter regions and discover the significance of a given DNA sequence based on comparative genomic analysis and DNA microarray data. Extensive tutorials and help documents are available on our website help page to guide users through different tools on this site. A major feature of GeneACT is the miRNA target site search capability. This is crucial, given that up to one-third of human genes could be targeted for regulation by miRNA <abbrgrp><abbr bid="B50">50</abbr></abbrgrp>, in addition to regulation by transcription factors. It is therefore important to investigate both transcription factors and miRNAs when searching for critical genes that may be responsible for differential gene expression. By integrating both transcription factor binding sites and miRNA target sites into DBSS, we provide a more comprehensive analysis of DNA microarray datasets. Indeed, we showed that GeneACT accurately predicted the involvement of E2F during cell cycle progression and involvement of certain miRNAs during muscle cell differentiation from DNA microarray datasets.</p>
         <p>The quality of predictions of critical <it>cis</it>-regulatory elements involved in differential gene expression depends heavily on the reliability of transcription factor recognition and miRNA target site prediction. Accurate computational prediction of miRNA target sites is still a very challenging task because of insufficient experimental data <abbrgrp><abbr bid="B51">51</abbr></abbrgrp>. For example, it is not clear whether the length of the 3'-UTR where the putative miRNA target sites reside contributes to the efficacy of gene repression. A definitive answer to this question is likely to dictate how to factor the length of the 3'-UTR into reliable prediction scores.</p>
         <p>GeneACT is open source online software and is relatively easy to upgrade. We expect that DBSS will improve significantly as miRNA target site prediction and transcription factor binding site recognition become more reliable. Moreover, we plan to add further genomes to GeneACT as they become available. In the meantime, researchers interested in other species can use GeneACT by taking advantage of the input sequence feature and/or input binding site feature of PBSS. In this way, we expect researchers from diverse fields to find a valuable resource in GeneACT.</p>
      </sec>
      <sec>
         <st>
            <p>Additional data files</p>
         </st>
         <p>The following additional data are available with the online version of this paper. Additional data file <supplr sid="S1">1</supplr> is a table containing the original DNA microarray data generated by Cam and coworkers <abbrgrp><abbr bid="B33">33</abbr></abbrgrp> used for DBSS; the gene IDs of regulated and control gene sets used for the search are listed. Additional data file <supplr sid="S2">2</supplr> is a table containing the full list that is summarized in Table <tblr tid="T1">1</tblr>. Additional data file <supplr sid="S3">3</supplr> is a table containing cell cycle regulated genes containing E2F or SRF binding sites. Additional data file <supplr sid="S4">4</supplr> is a table containing the full list that is summarized in Table <tblr tid="T2">2</tblr>; the gene IDs of the miR-124 dataset used for the search are listed. Additional data file <supplr sid="S5">5</supplr> is a table containing the original DNA microarray data generated by Tomczak and coworkers <abbrgrp><abbr bid="B40">40</abbr></abbrgrp> used for the DBSS; gene IDs of regulated and control gene sets used for the search are listed. Additional data file <supplr sid="S6">6</supplr> is a table containing the full lists summarized in Table <tblr tid="T3">3</tblr>.</p>
         <suppl id="S1">
            <title>
               <p>Additional data file 1</p>
            </title>
            <caption>
               <p>Table containing the original DNA microarray data generated by Cam and coworkers <abbrgrp><abbr bid="B33">33</abbr></abbrgrp> used for DBSS</p>
            </caption>
            <text>
               <p>Table containing the original DNA microarray data generated by Cam and coworkers <abbrgrp><abbr bid="B33">33</abbr></abbrgrp> used for DBSS; the gene IDs of regulated and control gene sets used for the search are listed.</p>
            </text>
            <file name="gb-2006-7-10-r97-S1.xls">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S2">
            <title>
               <p>Additional data file 2</p>
            </title>
            <caption>
               <p>Table containing the full list that is summarized in Table <tblr tid="T1">1</tblr></p>
            </caption>
            <text>
               <p>Table containing the full list that is summarized in Table <tblr tid="T1">1</tblr>.</p>
            </text>
            <file name="gb-2006-7-10-r97-S2.xls">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S3">
            <title>
               <p>Additional data file 3</p>
            </title>
            <caption>
               <p>Table containing cell cycle regulated genes containing E2F or SRF binding sites</p>
            </caption>
            <text>
               <p>Table containing cell cycle regulated genes containing E2F or SRF binding sites.</p>
            </text>
            <file name="gb-2006-7-10-r97-S3.xls">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S4">
            <title>
               <p>Additional data file 4</p>
            </title>
            <caption>
               <p>Table containing the full list that is summarized in Table <tblr tid="T2">2</tblr></p>
            </caption>
            <text>
               <p>Table containing the full list that is summarized in Table <tblr tid="T2">2</tblr>; the gene IDs of the miR-124 dataset used for the search are listed.</p>
            </text>
            <file name="gb-2006-7-10-r97-S4.xls">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S5">
            <title>
               <p>Additional data file 5</p>
            </title>
            <caption>
               <p>Table containing the original DNA microarray data generated by Tomczak and coworkers <abbrgrp><abbr bid="B40">40</abbr></abbrgrp> used for the DBSS</p>
            </caption>
            <text>
               <p>Table containing the original DNA microarray data generated by Tomczak and coworkers <abbrgrp><abbr bid="B40">40</abbr></abbrgrp> used for the DBSS; gene IDs of regulated and control gene sets used for the search are listed.</p>
            </text>
            <file name="gb-2006-7-10-r97-S5.xls">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S6">
            <title>
               <p>Additional data file 6</p>
            </title>
            <caption>
               <p>Table containing the full lists summarized in Table <tblr tid="T3">3</tblr></p>
            </caption>
            <text>
               <p>Table containing the full lists summarized in Table <tblr tid="T3">3</tblr>.</p>
            </text>
            <file name="gb-2006-7-10-r97-S6.xls">
               <p>Click here for file</p>
            </file>
         </suppl>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We thank Rob Knight, Natalie Ahn, Jim Goodrich, Leslie Leinwand, and members of the Liu laboratory for helpful discussions. We thank David Clarke and Kristen Barthel for critical reading and editing of the manuscript and Genevieve Hudak, Mai Sasaki, Jinhua Zhang, and Steve Smithwick for the early stage development of the GeneACT project. Tom H Cheung was supported by a predoctoral training grant from NHLBI (5T32HL07851). This work is supported by grants from NIH (CA095527), US Army Breast Cancer Research Program (DAMD17-02-1-0350), and WM Keck Foundation Initiative in RNA science at the University of Colorado to Xuedong Liu.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Conserved noncoding sequences are reliable guides to regulatory elements.</p>
            </title>
            <aug>
               <au>
                  <snm>Hardison</snm>
                  <fnm>RC</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2000</pubdate>
            <volume>16</volume>
            <fpage>369</fpage>
            <lpage>372</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(00)02081-3</pubid>
                  <pubid idtype="pmpid" link="fulltext">10973062</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Genomic strategies to identify mammalian regulatory sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Pennacchio</snm>
                  <fnm>LA</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>EM</fnm>
               </au>
            </aug>
            <source>Nat Rev Genet</source>
            <pubdate>2001</pubdate>
            <volume>2</volume>
            <fpage>100</fpage>
            <lpage>109</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35052548</pubid>
                  <pubid idtype="pmpid" link="fulltext">11253049</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>rVista for comparative sequence-based discovery of functional transcription factor binding sites.</p>
            </title>
            <aug>
               <au>
                  <snm>Loots</snm>
                  <fnm>GG</fnm>
               </au>
               <au>
                  <snm>Ovcharenko</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Pachter</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Dubchak</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>EM</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>832</fpage>
            <lpage>839</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">186580</pubid>
                  <pubid idtype="pmpid" link="fulltext">11997350</pubid>
                  <pubid idtype="doi">10.1101/gr.225502</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Systematic discovery of regulatory motifs i