<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>gb-2009-10-1-202</ui>
   <ji>GBJ</ji>
   <fm>
      <dochead>Review</dochead>
      <bibl>
         <title>
            <p>Integrating sequence, evolution and functional genomics in regulatory genomics</p>
         </title>
         <aug>
            <au ca="yes" id="A1">
               <snm>Vingron</snm>
               <fnm>Martin</fnm>
               <insr iid="I1"/>
               <email>vingron@molgen.mpg.de</email>
            </au>
            <au id="A2">
               <snm>Brazma</snm>
               <fnm>Alvis</fnm>
               <insr iid="I2"/>
            </au>
            <au id="A3">
               <snm>Coulson</snm>
               <fnm>Richard</fnm>
               <insr iid="I2"/>
            </au>
            <au id="A4">
               <snm>van Helden</snm>
               <fnm>Jacques</fnm>
               <insr iid="I3"/>
            </au>
            <au id="A5">
               <snm>Manke</snm>
               <fnm>Thomas</fnm>
               <insr iid="I1"/>
            </au>
            <au id="A6">
               <snm>Palin</snm>
               <fnm>Kimmo</fnm>
               <insr iid="I4"/>
            </au>
            <au id="A7">
               <snm>Sand</snm>
               <fnm>Olivier</fnm>
               <insr iid="I3"/>
            </au>
            <au id="A8">
               <snm>Ukkonen</snm>
               <fnm>Esko</fnm>
               <insr iid="I5"/>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Computational Molecular Biology, Max-Planck-Institut f&#252;r molekulare Genetik, Ihnestrasse 73, D-14195 Berlin, Germany</p>
            </ins>
            <ins id="I2">
               <p>Microarray Group, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK</p>
            </ins>
            <ins id="I3">
               <p>BiGRe - Universit&#233; Libre de Bruxelles, Campus Plaine, Bvd du Triomphe - CP263, B-1050 Bruxelles, Belgium</p>
            </ins>
            <ins id="I4">
               <p>Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK</p>
            </ins>
            <ins id="I5">
               <p>Helsinki Institute for Information Technology, Helsinki University of Technology and University of Helsinki, 00014 Helsinki, Finland</p>
            </ins>
         </insg>
         <source>Genome Biology</source>
         <issn>1465-6906</issn>
         <pubdate>2009</pubdate>
         <volume>10</volume>
         <issue>1</issue>
         <fpage>202</fpage>
         <url>http://genomebiology.com/2009/10/1/202</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">19226437</pubid>
               <pubid idtype="doi">10.1186/gb-2009-10-1-202</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <pub>
            <date>
               <day>30</day>
               <month>01</month>
               <year>2009</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2009</year>
         <collab>BioMed Central Ltd</collab>
      </cpyrt>
      <shorttitle>
         <p>Integrating sequence, evolution and functional genomics in regulatory genomics</p>
      </shorttitle>
      <shortabs>
         <p>Finding transcription factor binding sites in regulatory regions of the genome</p>
      </shortabs>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <p>With genome analysis expanding from the study of genes to the study of gene regulation, 'regulatory genomics' utilizes sequence information, evolution and functional genomics measurements to unravel how regulatory information is encoded in the genome.</p>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification id="BioSapiens" subtype="theme_series_title" type="BMC">BioSapiens</classification>
         <classification id="BioSapiens" subtype="theme_series_editor" type="BMC"/>
         <classification id="30010002" subtype="man_spc_id" type="BMC">Bioinformatics</classification>
         <classification id="30010013" subtype="man_spc_id" type="BMC">Methods</classification>
         <classification id="30010010" subtype="man_spc_id" type="BMC">Genome studies</classification>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p/>
         </st>
         <p>Sequencing and functional genomics have not only led to a better understanding of genes and their expression collectively, but have also renewed interest in how transcriptional regulation is encoded in the 'noncoding' part of the genome. For many years, the state of the art had been to collect observed transcription factor binding sites (TFBSs) in DNA, use them to build a description of the factor's binding motif, for example in the form of a positional weight matrix <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr></abbrgrp>, and then scan a putative regulatory region for hits to this motif. Promoter-prediction programs were, and still are, used to scan genomic sequence linearly for putative markers of promoters, such as CpG islands, TATA box motifs, Inr, and so on <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>. Following experimental advances, many more elements of this 'linear' code for transcription - such as histone positioning, histone modifications and DNA methylation - are currently being studied. Large-scale sequencing has enabled the comparative study of genomes, which in turn helps identify regulatory sequences. Functional genomics and, in particular, gene-expression data are showing us the consequences of transcriptional activation and have propelled the quest to find regulatory sequences shared between coexpressed groups of genes. This review will attempt to summarize the past few years' progress in integrating these approaches for the purpose of identifying regulatory sequence elements and their function.</p>
      </sec>
      <sec>
         <st>
            <p>Regulatory feature description with positional weight matrices</p>
         </st>
         <p>Positional weight matrices (PWMs) have for many years been the workhorse of TFBS annotation. A set of experimentally determined binding sites for a transcription factor is aligned and the distribution of bases in each position of the binding site yields the weights in the PWM. There are two major databases for eukaryotic PWMs: TRANSFAC <abbrgrp><abbr bid="B5">5</abbr></abbrgrp> and JASPAR <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. PWMs and their application are reviewed in <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>. Two operations in conjunction with PWMs are important. First, coming up with the alignment for the PWM may be non-trivial. For example, in a landmark paper, Bucher <abbrgrp><abbr bid="B8">8</abbr></abbrgrp> derived a PWM for the TATA box by extracting the relevant sequence alignment from a set of promoters he had collected <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. Algorithms like the one used are still being improved and will be briefly summarized in the section Motif discovery.</p>
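         <p>The construction of a PWM from aligned sites can be sketched as follows. This is a minimal illustration rather than the procedure of any particular database or program: the function name <it>build_pwm</it>, the pseudocount, the uniform background and the four toy sites are all assumptions made for the example.</p>

```python
import math


def build_pwm(sites, pseudocount=1.0, background=None):
    """Turn equal-length aligned sites into per-position log2-odds weights.

    All names and defaults here are illustrative assumptions.
    """
    bases = "ACGT"
    if background is None:
        # Assumed uniform background; real genomes are rarely uniform.
        background = {b: 0.25 for b in bases}
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all aligned sites must have the same length")
    total = len(sites) + pseudocount * len(bases)
    pwm = []
    for pos in range(length):
        column = [s[pos] for s in sites]
        weights = {}
        for b in bases:
            # The pseudocount avoids log(0) for bases never seen in a column.
            freq = (column.count(b) + pseudocount) / total
            weights[b] = math.log2(freq / background[b])
        pwm.append(weights)
    return pwm


# Hypothetical TATA-like sites, used only to exercise the function.
sites = ["TATAAA", "TATAAA", "TATATA", "TATAAG"]
pwm = build_pwm(sites)
print(round(pwm[0]["T"], 3), round(pwm[0]["A"], 3))
```

         <p>Each column of the result holds one weight per base; positions where the toy sites agree receive a high weight for the shared base and negative weights for the others.</p>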
         <p>Second, because the PWM is meant to be a descriptor for a TFBS, a method is required to identify the predicted binding sites. Scanning a sequence with a PWM in search of predicted binding sites seems simple. Yet because of the notorious lack of information content in the individual binding motif, a search over a long sequence region will inevitably turn up large numbers of probably false-positive results. Computationally, the program MATCH <abbrgrp><abbr bid="B10">10</abbr></abbrgrp> uses predefined cutoffs for determination of binding sites, whereas patser <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>, ProfileStats <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>, and matrix-scan <abbrgrp><abbr bid="B13">13</abbr></abbrgrp> determine the statistics of PWM matching under different background models. Weight-matrix-based biophysical models of transcription-factor binding <abbrgrp><abbr bid="B14">14</abbr></abbrgrp> constitute an alternative to the <it>ad hoc </it>definition of a matching score. They allow the design of learning algorithms <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr></abbrgrp> and can be validated against experimental measurements <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>.</p>
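         <p>A cutoff-based scan of the kind described above might look like the following sketch. It is not the implementation of MATCH, patser or any other program named here: the three-column matrix, its weights, the cutoff and the function names are hypothetical, and only the forward strand is scanned.</p>

```python
def score_window(pwm, window):
    """Sum the per-position weights of the bases in one window."""
    return sum(column[base] for column, base in zip(pwm, window))


def scan(pwm, sequence, cutoff):
    """Return (start, window, score) for every window scoring at or above cutoff."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = score_window(pwm, window)
        if score >= cutoff:
            hits.append((start, window, score))
    return hits


# Hypothetical three-column matrix favouring the word TAT.
toy_pwm = [
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.0},
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.0},
]
hits = scan(toy_pwm, "GGTATCCTATA", cutoff=3.0)
print(hits)
```

         <p>Lowering the cutoff in this toy example quickly admits windows that match only partially, which is the false-positive behaviour discussed in the text.</p>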
         <p>When selecting the most appropriate algorithm for predicting binding sites, one needs to distinguish between two application scenarios: predicting target genes for a factor with a given PWM, or predicting which factors, represented by their PWMs, might bind upstream of a gene. For the first task, a biophysical approach like TRAP <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>, which determines the likelihood that a factor binds somewhere in a given region, such as a promoter, appears to be most appropriate. After proper statistical normalization, this algorithm can also determine a ranking of which factors might bind in a given sequence region <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. The program matrix-scan from the RSAT package <abbrgrp><abbr bid="B13">13</abbr></abbrgrp> is most appropriate for determining actual binding sites for a factor, while also supporting the detection of regions enriched in putative binding sites.</p>
         <p>The false-positive problem, however, is inherent and to remedy it more information is needed. This is usually provided through better motif descriptions, combination of motifs into <it>cis</it>-regulatory modules, and by considering the evolutionary conservation of binding sites.</p>
      </sec>
      <sec>
         <st>
            <p>Phylogenetic footprinting recognizes regulatory motifs in evolution</p>
         </st>
         <p>A powerful remedy for the many false-positive annotations is to require evolutionary conservation in noncoding regions across several genomes. Because of the selective pressure exerted on their regulatory function, <it>cis</it>-acting elements are likely to be more conserved than the surrounding noncoding sequences. The expression 'phylogenetic footprints' was proposed by Tagle <it>et al. </it><abbrgrp><abbr bid="B20">20</abbr></abbrgrp> to denote anciently conserved <it>cis</it>-regulatory elements. Today, phylogenetic footprinting refers to conserved regulatory patterns in orthologous genes, or in regions that are deemed orthologous. Duret and Bucher <abbrgrp><abbr bid="B21">21</abbr></abbrgrp> pointed out the utility of evolutionary sequence conservation for the identification of regulatory elements. With the ever-increasing number of genome sequences, phylogenetic footprint detection seems to be a key approach to deciphering regulatory mechanisms. Its utility is twofold: evolutionary conservation helps in defining a regulatory pattern and helps in reinforcing binding-site predictions. Figure <figr fid="F1">1</figr> illustrates the general approach from motif discovery to phylogenetic footprinting.</p>
         <fig id="F1">
            <title>
               <p>Figure 1</p>
            </title>
            <caption>
               <p>Illustration of the flow of information in regulatory region annotation</p>
            </caption>
            <text>
               <p><b>Illustration of the flow of information in regulatory region annotation</b>. Given a coexpressed group of genes, one can use <b>(a) </b>motif discovery or <b>(b) </b>search for known motifs in the upstream regions (motif building). Weight matrices can be used to scan sequences in a variety of ways: <b>(c) </b>the sites predicted by scanning the mouse <it>Hspa1b </it>promoter with the TRANSFAC matrix M01023 (left) and the conservation of those sites in vertebrate promoters (right).</p>
            </text>
            <graphic file="gb-2009-10-1-202-1"/>
         </fig>
      </sec>
      <sec>
         <st>
            <p>Complex features: <it>cis</it>-regulatory modules</p>
         </st>
         <p>In 1998, by aligning large genome fragments between mouse and human, Fickett and Wasserman observed that noncoding regions contain conserved fragments of a few kilobases that are enriched in <it>cis</it>-acting elements <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr></abbrgrp>. Inspection of aligned regulatory regions typically yields a picture where ungapped conserved elements (describable by a PWM) occur at slightly different spacing in a set of sequences. Such a cluster of elements is frequently called a <it>cis</it>-regulatory module (CRM). An early application of the concept can be found in Gailus-Durner <it>et al. </it><abbrgrp><abbr bid="B24">24</abbr></abbrgrp>, where promoters were characterized by their arrangement of binding sites. Again, the problem of identifying CRMs comes in different flavors. In a single sequence the clustering of predicted TFBSs may lead to the definition of a CRM. This was shown in the work by Berman <it>et al. </it><abbrgrp><abbr bid="B25">25</abbr></abbrgrp>, who identified distal enhancer elements in the genome of <it>Drosophila melanogaster </it>by locating a sequence window with a high number of TFBSs. Similar windowed counting methods have been used with good success in the analysis of <it>Drosophila </it>early development <abbrgrp><abbr bid="B26">26</abbr><abbr bid="B27">27</abbr></abbrgrp>. Improved probabilistic variants search for clusters that are significant according to some statistical model <abbrgrp><abbr bid="B28">28</abbr><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr><abbr bid="B31">31</abbr><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr></abbrgrp>. A complementary approach uses probabilistic models that consider the likelihoods of binding sites and their distances <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr><abbr bid="B36">36</abbr><abbr bid="B37">37</abbr></abbrgrp>.</p>
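         <p>The windowed counting idea can be illustrated with a short sketch. It is a simplified stand-in, not the method of any cited study: the function name <it>dense_windows</it>, the window size, the minimum hit count and the hit coordinates are assumptions chosen for the example.</p>

```python
def dense_windows(hit_positions, window_size, min_hits):
    """Report windows that start at a predicted site and contain many sites.

    Returns (start, end, count) tuples, with end exclusive, for every window
    of window_size bases holding at least min_hits predicted sites.
    """
    positions = sorted(hit_positions)
    windows = []
    right = 0
    for left, start in enumerate(positions):
        # Move the right pointer past every site inside [start, start + window_size).
        while right != len(positions) and window_size > positions[right] - start:
            right += 1
        count = right - left
        if count >= min_hits:
            windows.append((start, start + window_size, count))
    return windows


# Hypothetical coordinates of predicted binding sites along one sequence.
predicted_sites = [100, 150, 180, 220, 5000, 5400, 9000]
clusters = dense_windows(predicted_sites, window_size=200, min_hits=3)
print(clusters)
```

         <p>Overlapping windows are reported separately here; a practical tool would typically merge them and assess the counts against a background expectation, as the probabilistic variants mentioned above do.</p>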
         <p>The problem of finding CRMs from a single sequence is naturally extended to multiple sequences. The resulting methods work on the binding sites that lie on aligned DNA; hence they depend on the correctness of the alignments, which can be unreliable around the patchily conserved CRMs <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>. The multiple sequence methods can also be divided into combinatorial and probabilistic ones <abbrgrp><abbr bid="B37">37</abbr><abbr bid="B39">39</abbr><abbr bid="B40">40</abbr><abbr bid="B41">41</abbr><abbr bid="B42">42</abbr></abbrgrp>. The enhancer element locator (EEL) <abbrgrp><abbr bid="B43">43</abbr><abbr bid="B44">44</abbr></abbrgrp> is a recent CRM prediction tool that takes an orthologous pair of genes from two organisms (say, human and mouse) and searches through the DNA flanking the two genes to locate conserved clusters of TFBSs. The binding sites that might belong to such clusters are computationally determined, using some given collection of PWMs. The clusters are evaluated and ranked using a scoring scheme that gives bonuses and penalties as implied by the underlying biochemically motivated model of CRM structure and evolution. EEL finds the clusters with highest total score using an algorithm that is similar to the Smith-Waterman algorithm for local alignments. The significance analysis of the EEL scores can be based on generalized Karlin-Altschul statistics for local alignment scores <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>. Entire genomes with up to 1 Mbp flanking regions for orthologous pairs of genes can be analyzed by EEL in reasonable time. Some EEL predictions have successfully been verified to have biological function in the mouse using <it>in situ </it>hybridizations and transgenic reporter assays <abbrgrp><abbr bid="B43">43</abbr><abbr bid="B46">46</abbr></abbrgrp>. The method of Blanco <it>et al. </it><abbrgrp><abbr bid="B47">47</abbr></abbrgrp> aligns sequences of binding sites in much the same way as EEL, but their alignment is global instead of local. The global alignment is not able to locate novel CRMs from long sequences but requires exact knowledge about the location of the regulatory element. Another recent method <abbrgrp><abbr bid="B48">48</abbr></abbrgrp> combines binding-site annotation and DNA alignment into a single procedure.</p>
      </sec>
      <sec>
         <st>
            <p>Grouping genes into expression clusters</p>
         </st>
         <p>Whereas phylogenetic footprinting aims at delineating conserved patterns in orthologous regulatory elements, another logic says that coexpressed groups of genes from one organism might share regulatory elements, which mediate the coexpression. Clearly, in a first step the coexpressed groups of genes need to be determined, typically from microarray-generated gene-expression data. Many methods for this purpose have been proposed, starting with the work of Eisen <it>et al. </it><abbrgrp><abbr bid="B49">49</abbr></abbrgrp>, who used hierarchical clustering. The literature on clustering is extensive, and specialized algorithms for gene-expression data have been proposed by Sharan and Shamir <abbrgrp><abbr bid="B50">50</abbr></abbrgrp>, and Tamayo <it>et al. </it><abbrgrp><abbr bid="B51">51</abbr></abbrgrp> among others. Graphical methods have also proved useful in identifying clusters and associations <abbrgrp><abbr bid="B52">52</abbr><abbr bid="B53">53</abbr></abbrgrp>.</p>
         <p>A number of databases are available to retrieve or submit microarray data. In particular, the National Center for Biotechnology Information (NCBI) runs the Gene Expression Omnibus (GEO) <abbrgrp><abbr bid="B54">54</abbr></abbrgrp>, the European Bioinformatics Institute (EBI) maintains the ArrayExpress (AE) database <abbrgrp><abbr bid="B55">55</abbr></abbrgrp> and Stanford University hosts the Stanford Microarray Database (SMD) <abbrgrp><abbr bid="B56">56</abbr></abbrgrp>. Based on AE, the ArrayExpress Warehouse <abbrgrp><abbr bid="B55">55</abbr></abbrgrp> was established, allowing queries based on a range of gene annotations, including gene symbols, GO terms and disease associations. Coexpressed clusters can be defined by determining which expression profiles in an experiment are significantly correlated with a 'seed gene', which supplies a blueprint for the cluster.</p>
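         <p>Seeding a cluster with one gene can be sketched as below. The sketch assumes Pearson correlation and replaces the significance test mentioned above with a fixed, arbitrary threshold; the gene names, expression values and function names are invented for the illustration and do not reflect the ArrayExpress Warehouse interface.</p>

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (neither constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def seeded_cluster(profiles, seed, min_r=0.9):
    """Return the genes whose profile correlates with the seed at or above min_r.

    min_r is an assumed stand-in for a proper significance threshold.
    """
    seed_profile = profiles[seed]
    return sorted(
        gene
        for gene, profile in profiles.items()
        if gene != seed and pearson(seed_profile, profile) >= min_r
    )


# Invented expression profiles across four conditions.
profiles = {
    "TF1": [1.0, 2.0, 3.0, 4.0],
    "geneA": [2.1, 3.9, 6.2, 8.0],
    "geneB": [4.0, 3.0, 2.0, 1.0],
    "geneC": [1.0, 3.0, 1.0, 3.0],
}
cluster = seeded_cluster(profiles, seed="TF1")
print(cluster)
```

         <p>With real data the threshold would be derived from a significance calculation that accounts for the number of conditions and the number of profiles tested.</p>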
         <p>The inclusion of protein links in AE opens up the possibility of identifying human and mouse genes encoding transcription factors. This allows the definition of a group of expression profiles that is not only coherently expressed but is also seeded by a transcription factor. Then, for the transcription factor, all the probe sets significantly correlated to its expression are pooled if they also exhibit significant differential expression for the same experimental factor. Probe sets present on the metazoan Affymetrix microarrays stored in the warehouse have been mapped to Ensembl gene entries along with their functional annotations, facilitating the linking of results from protein sequence searches to the expression data stored in the warehouse. An application of this mapping information is the identification of transcription factor expression profiles, initiating the generation of clusters of coexpressed genes.</p>
      </sec>
      <sec>
         <st>
            <p>Motif discovery</p>
         </st>
         <p>Whereas phylogenetic footprinting spots conserved, probably orthologous, patterns, a coexpressed group lets one ask whether the genes in the group are also co-regulated. Motifs that might be responsible for this assumed co-regulation can then be searched for. In one approach, no assumptions about known transcription factors binding in the promoter regions of the genes are made - motifs are searched for <it>de novo</it>. An early example of such a method was used by Bucher <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>. Today, discovered motifs can <it>a posteriori </it>be compared with databases of known motifs, to check if they are likely to be bound by known transcription factors, or whether the motifs are completely novel. The motif-discovery problem can be addressed by various algorithmic approaches, which take as input a set of sequences and return <it>de novo </it>predicted motifs. The approaches can roughly be subdivided into two classes: matrix-based or string-based pattern discovery.</p>
         <p>Matrix-based pattern discovery algorithms evaluate a large number of possible alignments between fragments of the input sequences, and attempt to return the alignment (summarized as a position-specific scoring matrix) that optimizes some scoring function. Historically, the first approach was developed by Gary Stormo's group: their program Consensus relies on a greedy algorithm, which progressively incorporates sequences to build matrices with maximal information content <abbrgrp><abbr bid="B11">11</abbr><abbr bid="B57">57</abbr></abbrgrp>. The Gibbs sampling strategy, initially developed to detect protein domains <abbrgrp><abbr bid="B58">58</abbr><abbr bid="B59">59</abbr></abbrgrp>, has been adapted to discovery of transcription factor binding motifs in promoters of coexpressed genes <abbrgrp><abbr bid="B60">60</abbr><abbr bid="B61">61</abbr><abbr bid="B62">62</abbr></abbrgrp>. More recent versions of Gibbs sampling <abbrgrp><abbr bid="B63">63</abbr><abbr bid="B64">64</abbr></abbrgrp> support background models based on Markov chains, which take into account the higher-order dependencies between adjacent residues in biological sequences. The program MEME implements an expectation-maximization algorithm using multiple starting seeds in order to sample a large number of possible motifs <abbrgrp><abbr bid="B65">65</abbr><abbr bid="B66">66</abbr></abbrgrp>. String-based pattern discovery is based on the statistical detection of over-represented oligonucleotides <abbrgrp><abbr bid="B67">67</abbr><abbr bid="B68">68</abbr><abbr bid="B69">69</abbr><abbr bid="B70">70</abbr><abbr bid="B71">71</abbr><abbr bid="B72">72</abbr><abbr bid="B73">73</abbr><abbr bid="B74">74</abbr><abbr bid="B75">75</abbr></abbrgrp> or of dyads - that is, spaced pairs of oligonucleotides <abbrgrp><abbr bid="B76">76</abbr></abbrgrp>. Many of these algorithms have been compared and tested by Tompa <it>et al. </it><abbrgrp><abbr bid="B77">77</abbr></abbrgrp> and many have also been adapted to detect phylogenetic footprints in promoters of orthologous genes, with the programs Footprinter <abbrgrp><abbr bid="B78">78</abbr></abbrgrp>, PhyloCon <abbrgrp><abbr bid="B79">79</abbr></abbrgrp>, PhyME <abbrgrp><abbr bid="B80">80</abbr></abbrgrp>, OrthoMEME <abbrgrp><abbr bid="B81">81</abbr></abbrgrp>, PhyloGibbs <abbrgrp><abbr bid="B82">82</abbr></abbrgrp> and Footprint-analysis <abbrgrp><abbr bid="B83">83</abbr></abbrgrp>. A problem common to these approaches is the need to specify a theoretical background model, which usually does not fully capture the complexity and heterogeneity of real promoter sequences.</p>
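         <p>The string-based idea of looking for over-represented oligonucleotides can be sketched crudely as follows. The sketch compares observed counts in a foreground set with counts expected from a background set and ranks words by their ratio; this ratio is an assumed simplification that stands in for the statistical tests used by the cited programs, and all sequences and names are invented.</p>

```python
from collections import Counter


def kmer_counts(sequences, k):
    """Count every overlapping word of length k on the given strand."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def over_represented(foreground, background, k, min_ratio=2.0, pseudocount=1.0):
    """Rank words by observed/expected count; expected comes from the background set."""
    fg = kmer_counts(foreground, k)
    bg = kmer_counts(background, k)
    fg_total = sum(fg.values())
    bg_total = sum(bg.values())
    n_words = 4 ** k
    results = []
    for word, observed in fg.items():
        # Smoothed background frequency, so unseen words do not give zero.
        bg_freq = (bg[word] + pseudocount) / (bg_total + pseudocount * n_words)
        expected = bg_freq * fg_total
        ratio = observed / expected
        if ratio >= min_ratio:
            results.append((word, observed, expected, ratio))
    return sorted(results, key=lambda r: r[3], reverse=True)


# Invented upstream sequences sharing the word ACGCGT, plus unrelated background.
foreground = ["TTACGCGTAA", "GGACGCGTCC", "CAACGCGTTG"]
background = ["ATATATATAT", "GGCCGGCCGG", "TTGACCATGA"]
top = over_represented(foreground, background, k=4)
print([row[0] for row in top[:3]])
```

         <p>As the text notes, the outcome depends heavily on the background: with a background this small almost every foreground word appears enriched, which is why realistic background models matter.</p>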
      </sec>
      <sec>
         <st>
            <p>Where to search?</p>
         </st>
         <p>Patterns discovered among the regulatory regions of a coexpressed cluster of genes will typically be PWMs, like those derived from known TFBSs. The catch, however, is that inspection of PWMs for real binding sites has taught us not to expect well defined patterns with high information content. On the other hand, patterns as badly defined as the real ones can be easily extracted from any set of upstream regions if only the region chosen is large enough. This demonstrates an inherent limitation of the ability to identify patterns from co-regulated genes. Nevertheless, in yeast there has been considerable success with <it>de novo </it>identification of motifs <abbrgrp><abbr bid="B84">84</abbr></abbrgrp>. Pattern identification in <it>Drosophila </it>has profited greatly from the many sequences that are now available in conjunction with fairly well defined enhancer regions made up of clearly discernible regulatory elements. Vertebrate regulatory sequences, however, seem to be much harder to identify. Difficulties stem from the lack of knowledge about how to narrow down the sequence regions in which to look for patterns, and probably from the patchy nature of vertebrate CRMs.</p>
         <p>There are several ways of narrowing down the sequence regions to be searched for regulatory patterns. First, systematic identification of complete cDNAs, together with new technologies such as cap-analysis gene expression (CAGE) tags, has led to highly accurate identification of human and mouse transcription start sites <abbrgrp><abbr bid="B85">85</abbr></abbrgrp>. Therefore, focusing on a promoter sequence is less guesswork today than it was a few years ago. One price to be paid, though, is the increased complexity of promoter definition resulting from the insight that alternative promoters for a gene are more the rule than the exception. It thus seems appropriate to study several promoters per gene. Second, the notion of an enhancer used to be biologically defined, but with more and more complete genomic sequences available, enhancers tend to be identified with highly conserved noncoding regions. Systematic mapping of DNase I hypersensitive sites is also contributing to pinpointing enhancer regions. However, with the identification of transcriptional start sites that are far upstream from the translational start, it is becoming increasingly difficult to distinguish clearly between an enhancer and a promoter.</p>
      </sec>
      <sec>
         <st>
            <p>CRMs and coexpressed groups</p>
         </st>
         <p>Even when one knows where to look for regulatory modules, identification of a CRM that might be responsible for co-regulation of a group of genes is hard, and methods development is a very active field of research. In general, available methods put the emphasis either on <it>de novo </it>pattern discovery or on a dictionary-based approach. A dictionary of patterns may contain PWMs from the existing databases, patterns identified through systematic phylogenetic footprinting <abbrgrp><abbr bid="B86">86</abbr><abbr bid="B87">87</abbr></abbrgrp>, or the output from <it>de novo </it>pattern identification. The dictionary serves to search for associations between sequence motifs and gene clusters. Often, this association is formalized as a statistical over-representation: that is, that the genes in the clusters contain a certain motif more often than expected.</p>
         <p>The prototypic approach to over-representation is the use of the hypergeometric distribution to quantify the probability that within a large set two subsets have an overlap exceeding a certain size. Hughes <it>et al. </it><abbrgrp><abbr bid="B62">62</abbr></abbrgrp> apply this to the predicted target genes containing a certain motif and clusters of genes. Clearly, if the overlap between the two sets is large, this hints at a biological role for the motif. Dieterich <it>et al. </it><abbrgrp><abbr bid="B88">88</abbr></abbrgrp> analyze human cell-cycle data and quantify the occurrence of motifs upstream of genes that peak in particular phases of the cell cycle. Many more procedures try to solve the problem of finding novel CRMs similar to a dictionary of CRMs <abbrgrp><abbr bid="B89">89</abbr><abbr bid="B90">90</abbr><abbr bid="B91">91</abbr><abbr bid="B92">92</abbr></abbrgrp>. Good results have been achieved in flies, and to some extent in mammals <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B27">27</abbr><abbr bid="B90">90</abbr><abbr bid="B93">93</abbr><abbr bid="B94">94</abbr></abbrgrp>. Bussemaker <abbrgrp><abbr bid="B95">95</abbr></abbrgrp> has pioneered linear models to explain expression data in terms of the occurrence of motifs in the upstream region of the genes. A recent paper by Bulyk and co-workers <abbrgrp><abbr bid="B96">96</abbr></abbrgrp> relates over-represented CRMs to their target genes in human myogenic differentiation. Other algorithms identify common CRMs from a set of co-regulated genes without assuming a given dictionary <abbrgrp><abbr bid="B33">33</abbr><abbr bid="B97">97</abbr><abbr bid="B98">98</abbr><abbr bid="B99">99</abbr><abbr bid="B100">100</abbr><abbr bid="B101">101</abbr></abbrgrp>.</p>
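         <p>The hypergeometric calculation behind this kind of over-representation test can be written down in a few lines. The sketch below is a generic upper-tail probability, not the exact procedure of any cited study; the function name and the gene counts in the example are assumptions.</p>

```python
from math import comb


def hypergeom_tail(population, motif_genes, cluster_size, overlap):
    """Probability of an overlap at least this large arising by chance.

    population   -- total number of genes considered
    motif_genes  -- genes predicted to carry the motif
    cluster_size -- genes in the coexpressed cluster
    overlap      -- cluster genes that also carry the motif
    """
    upper = min(motif_genes, cluster_size)
    total = comb(population, cluster_size)
    tail = sum(
        comb(motif_genes, k) * comb(population - motif_genes, cluster_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 6000 genes, 200 with the motif, a cluster of 50,
# of which 10 carry the motif.
p_value = hypergeom_tail(6000, 200, 50, 10)
print(p_value)
```

         <p>A small probability indicates that the motif occurs in the cluster more often than expected by chance; when many motifs and clusters are tested, a correction for multiple testing would also be needed.</p>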
         <p>Some methods find CRMs by distinguishing the regulatory DNA regions from neutrally evolved sequences and from sequences conserved for a reason other than transcriptional regulation <abbrgrp><abbr bid="B102">102</abbr><abbr bid="B103">103</abbr><abbr bid="B104">104</abbr><abbr bid="B105">105</abbr></abbrgrp>. These methods are universal in the sense that they do not need prior transcription factor binding information. The output provides the putative regulatory sequences but gives no clue of the transcription factors binding them. The tissue specificity of the regulators can be tested in the wet lab or predicted using other computational methods <abbrgrp><abbr bid="B106">106</abbr><abbr bid="B107">107</abbr></abbrgrp>.</p>
         <p>The availability of gene-expression data in conjunction with chromatin immunoprecipitation and DNA microarray (ChIP-chip) and sequence data has led to many attempts to construct entire gene regulatory networks, rather than predicting particular regulatory connections. These efforts are known as 'reverse engineering' of networks, but are beyond the scope of this review.</p>
         <p>What are the eventual chances of producing systematic genomic annotation of regulatory elements in the human genome? Impressive progress is clearly being made. The ORegAnno database <abbrgrp><abbr bid="B108">108</abbr></abbrgrp> collects binding-site information for many organisms and the ENCODE project <abbrgrp><abbr bid="B85">85</abbr></abbrgrp> has provided a plethora of regulatory information, such as transcription start sites, DNase I hypersensitive sites, histone-modification data and more, much of which is available in the Ensembl Regulatory Build <abbrgrp><abbr bid="B109">109</abbr></abbrgrp>. The extension of ENCODE to the entire human genome and to model organisms will provide significantly more of this kind of information. However, the obvious discrepancy between the large number of transcription factors and the comparatively small number of PWMs in the databases makes it clear that the best we can hope for at the moment is an annotation with regulatory elements. The actual binding factor may remain unknown, although the sequence element can be used as a predictor for expression in a certain tissue or under certain conditions. Extensive transcription factor binding experiments such as ChIP-chip, ChIP followed by DNA sequencing (ChIP-seq), protein-binding arrays, and DNA adenine methyltransferase tagging of TFBSs (DamID) will hopefully help in establishing better links between experimental data and computationally derived data. Likewise, the predicted CRMs constitute a huge resource of hypotheses waiting to be tested experimentally.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>Support from EU FP6 NoE BioSapiens (contract number LSHG-CT-2003-503265) is gratefully acknowledged.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Methods for calculating the probabilities of finding patterns in sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Staden</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Comput Appl Biosci</source>
            <pubdate>1989</pubdate>
            <volume>5</volume>
            <fpage>89</fpage>
            <lpage>96</lpage>
            <xrefbib>
               <pubid idtype="pmpid">2720468</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>DNA binding sites: representation and discovery.</p>
            </title>
            <aug>
               <au>
                  <snm>Stormo</snm>
                  <fnm>GD</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2000</pubdate>
            <volume>16</volume>
            <fpage>16</fpage>
            <lpage>23</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10812473</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Performance assessment of promoter predictions on ENCODE regions in the EGASP experiment.</p>
            </title>
            <aug>
               <au>
                  <snm>Bajic</snm>
                  <fnm>VB</fnm>
               </au>
               <au>
                  <snm>Brent</snm>
                  <fnm>MR</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>RH</fnm>
               </au>
               <au>
                  <snm>Frankish</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Harrow</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Ohler</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Solovyev</snm>
                  <fnm>VV</fnm>
               </au>
               <au>
                  <snm>Tan</snm>
                  <fnm>SL</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <issue>Suppl 1</issue>
            <fpage>S3.1</fpage>
            <lpage>S3.13</lpage>
         </bibl>
         <bibl id="B4">
            <title>
               <p>The RNA polymerase II core promoter.</p>
            </title>
            <aug>
               <au>
                  <snm>Smale</snm>
                  <fnm>ST</fnm>
               </au>
               <au>
                  <snm>Kadonaga</snm>
                  <fnm>JT</fnm>
               </au>
            </aug>
            <source>Annu Rev Biochem</source>
            <pubdate>2003</pubdate>
            <volume>72</volume>
            <fpage>449</fpage>
            <lpage>479</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12651739</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>TRANSFAC: transcriptional regulation, from patterns to profiles.</p>
            </title>
            <aug>
               <au>
                  <snm>Matys</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Fricke</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Geffers</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>G&#246;ssling</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Haubrock</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hehl</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hornischer</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Karas</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Kel</snm>
                  <fnm>AE</fnm>
               </au>
               <au>
                  <snm>Kel-Margoulis</snm>
                  <fnm>OV</fnm>
               </au>
               <au>
                  <snm>Kloos</snm>
                  <fnm>DU</fnm>
               </au>
               <au>
                  <snm>Land</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Lewicki-Potapov</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Michael</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>M&#252;nch</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Reuter</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Rotert</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Saxel</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Scheer</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Thiele</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Wingender</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>374</fpage>
            <lpage>378</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">165555</pubid>
                  <pubid idtype="pmpid" link="fulltext">12520026</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>A new generation of JASPAR, the open-access repository for transcription factor binding site profiles.</p>
            </title>
            <aug>
               <au>
                  <snm>Vlieghe</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Sandelin</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>De Bleser</snm>
                  <fnm>PJ</fnm>
               </au>
               <au>
                  <snm>Vleminckx</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Wasserman</snm>
                  <fnm>WW</fnm>
               </au>
               <au>
                  <snm>van Roy</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Lenhard</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <issue>34 Database</issue>
            <fpage>D95</fpage>
            <lpage>D97</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1347477</pubid>
                  <pubid idtype="pmpid" link="fulltext">16381983</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Computational prediction of transcription-factor binding site locations.</p>
            </title>
            <aug>
               <au>
                  <snm>Bulyk</snm>
                  <fnm>ML</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2003</pubdate>
            <volume>5</volume>
            <fpage>201</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">395725</pubid>
                  <pubid idtype="pmpid" link="fulltext">14709165</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Weight matrix descriptions of four eukaryotic RNA polymerase II promoter elements derived from 502 unrelated promoter sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Bucher</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1990</pubdate>
            <volume>212</volume>
            <fpage>563</fpage>
            <lpage>578</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">2329577</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>[Description of eukaryotic promoters in the EPD database].</p>
            </title>
            <aug>
               <au>
                  <snm>Bucher</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Mol Biol (Mosk)</source>
            <pubdate>1997</pubdate>
            <volume>31</volume>
            <fpage>616</fpage>
            <lpage>625</lpage>
            <note>In Russian.</note>
            <xrefbib>
               <pubid idtype="pmpid">9340489</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>MATCH: A tool for searching transcription factor binding sites in DNA sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Kel</snm>
                  <fnm>AE</fnm>
               </au>
               <au>
                  <snm>G&#246;ssling</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Reuter</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Cheremushkin</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Kel-Margoulis</snm>
                  <fnm>OV</fnm>
               </au>
               <au>
                  <snm>Wingender</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>3576</fpage>
            <lpage>3579</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">169193</pubid>
                  <pubid idtype="pmpid" link="fulltext">12824369</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Identifying DNA and protein patterns with statistically significant alignments of multiple sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Hertz</snm>
                  <fnm>GZ</fnm>
               </au>
               <au>
                  <snm>Stormo</snm>
                  <fnm>GD</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>1999</pubdate>
            <volume>15</volume>
            <fpage>563</fpage>
            <lpage>577</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10487864</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>On the power of profiles for transcription factor binding site detection.</p>
            </title>
            <aug>
               <au>
                  <snm>Rahmann</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>M&#252;ller</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Vingron</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Stat Appl Genet Mol Biol</source>
            <pubdate>2003</pubdate>
            <volume>2</volume>
            <fpage>Article7</fpage>
            <xrefbib>
               <pubid idtype="pmpid">16646785</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Using RSAT to scan genome sequences for transcription factor binding sites and <it>cis </it>-regulatory modules.</p>
            </title>
            <aug>
               <au>
                  <snm>Turatsinze</snm>
                  <fnm>JV</fnm>
               </au>
               <au>
                  <snm>Thomas-Chollier</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Defrance</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>van Helden</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Nat Protoc</source>
            <pubdate>2008</pubdate>
            <volume>3</volume>
            <fpage>1578</fpage>
            <lpage>1588</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">18802439</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Physical constraints and functional characteristics of transcription factor-DNA interaction.</p>
            </title>
            <aug>
               <au>
                  <snm>Gerland</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Moroz</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Hwa</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2002</pubdate>
            <volume>99</volume>
            <fpage>12015</fpage>
            <lpage>12020</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">129390</pubid>
                  <pubid idtype="pmpid" link="fulltext">12218191</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Statistical mechanical modeling of genome-wide transcription factor occupancy data by MatrixREDUCE.</p>
            </title>
            <aug>
               <au>
                  <snm>Foat</snm>
                  <fnm>BC</fnm>
               </au>
               <au>
                  <snm>Morozov</snm>
                  <fnm>AV</fnm>
               </au>
               <au>
                  <snm>Bussemaker</snm>
                  <fnm>HJ</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2006</pubdate>
            <volume>22</volume>
            <fpage>e141</fpage>
            <lpage>149</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16873464</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Precise physical models of protein-DNA interaction from high-throughput data.</p>
            </title>
            <aug>
               <au>
                  <snm>Kinney</snm>
                  <fnm>JB</fnm>
               </au>
               <au>
                  <snm>Tkacik</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Callan</snm>
                  <fnm>CG</fnm>
                  <suf>Jr</suf>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2007</pubdate>
            <volume>104</volume>
            <fpage>501</fpage>
            <lpage>506</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1766414</pubid>
                  <pubid idtype="pmpid" link="fulltext">17197415</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>A biophysical approach to transcription factor binding site discovery.</p>
            </title>
            <aug>
               <au>
                  <snm>Djordjevic</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sengupta</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Shraiman</snm>
                  <fnm>BI</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2003</pubdate>
            <volume>13</volume>
            <fpage>2381</fpage>
            <lpage>2390</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">403756</pubid>
                  <pubid idtype="pmpid" link="fulltext">14597652</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Predicting transcription factor affinities to DNA from a biophysical model.</p>
            </title>
            <aug>
               <au>
                  <snm>Roider</snm>
                  <fnm>HG</fnm>
               </au>
               <au>
                  <snm>Kanhere</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Manke</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Vingron</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2007</pubdate>
            <volume>23</volume>
            <fpage>134</fpage>
            <lpage>141</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">17098775</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Statistical modeling of transcription factor binding affinities predicts regulatory interactions.</p>
            </title>
            <aug>
               <au>
                  <snm>Manke</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Roider</snm>
                  <fnm>HG</fnm>
               </au>
               <au>
                  <snm>Vingron</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>PLoS Comput Biol</source>
            <pubdate>2008</pubdate>
            <volume>4</volume>
            <fpage>e1000039</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2266803</pubid>
                  <pubid idtype="pmpid" link="fulltext">18369429</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Embryonic epsilon and gamma globin genes of a prosimian primate (<it>Galago crassicaudatus</it>). Nucleotide and amino acid sequences, developmental regulation and phylogenetic footprints.</p>
            </title>
            <aug>
               <au>
                  <snm>Tagle</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Koop</snm>
                  <fnm>BF</fnm>
               </au>
               <au>
                  <snm>Goodman</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Slightom</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Hess</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Jones</snm>
                  <fnm>RT</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1988</pubdate>
            <volume>203</volume>
            <fpage>439</fpage>
            <lpage>455</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">3199442</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Searching for regulatory elements in human non-coding sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Duret</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Bucher</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Curr Opin Struct Biol</source>
            <pubdate>1997</pubdate>
            <volume>7</volume>
            <fpage>399</fpage>
            <lpage>406</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9204283</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Human-mouse genome comparisons to locate regulatory sites.</p>
            </title>
            <aug>
               <au>
                  <snm>Wasserman</snm>
                  <fnm>WW</fnm>
               </au>
               <au>
                  <snm>Palumbo</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Thompson</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Fickett</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>CE</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2000</pubdate>
            <volume>26</volume>
            <fpage>225</fpage>
            <lpage>228</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11017083</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Identification of regulatory regions which confer muscle-specific gene expression.</p>
            </title>
            <aug>
               <au>
                  <snm>Wasserman</snm>
                  <fnm>WW</fnm>
               </au>
               <au>
                  <snm>Fickett</snm>
                  <fnm>JW</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1998</pubdate>
            <volume>278</volume>
            <fpage>167</fpage>
            <lpage>181</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9571041</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Experimental data of a single promoter can be used for <it>in silico </it>detection of genes with related regulation in the absence of sequence similarity.</p>
            </title>
            <aug>
               <au>
                  <snm>Gailus-Durner</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Scherf</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Werner</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Mamm Genome</source>
            <pubdate>2001</pubdate>
            <volume>12</volume>
            <fpage>67</fpage>
            <lpage>72</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11178746</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Exploiting transcription factor binding site clustering to identify <it>cis </it>-regulatory modules involved in pattern formation in the <it>Drosophila </it>genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Berman</snm>
                  <fnm>BP</fnm>
               </au>
               <au>
                  <snm>Nibu</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Pfeiffer</snm>
                  <fnm>BD</fnm>
               </au>
               <au>
                  <snm>Tomancak</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Celniker</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Levine</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>GM</fnm>
               </au>
               <au>
                  <snm>Eisen</snm>
                  <fnm>MB</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2002</pubdate>
            <volume>99</volume>
            <fpage>757</fpage>
            <lpage>762</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">117378</pubid>
                  <pubid idtype="pmpid" link="fulltext">11805330</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Genome-wide analysis of clustered Dorsal binding sites identifies putative target genes in the Drosophila embryo.</p>
            </title>
            <aug>
               <au>
                  <snm>Markstein</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Markstein</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Markstein</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Levine</snm>
                  <fnm>MS</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2002</pubdate>
            <volume>99</volume>
            <fpage>763</fpage>
            <lpage>768</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">117379</pubid>
                  <pubid idtype="pmpid" link="fulltext">11752406</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Computation-based discovery of related transcriptional regulatory modules and motifs using an experimentally validated combinatorial model.</p>
            </title>
            <aug>
               <au>
                  <snm>Halfon</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Grad</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>GM</fnm>
               </au>
               <au>
                  <snm>Michelson</snm>
                  <fnm>AM</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>1019</fpage>
            <lpage>1028</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">186630</pubid>
                  <pubid idtype="pmpid" link="fulltext">12097338</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Genes regulated cooperatively by one or more transcription factors and their identification in whole eukaryotic genomes.</p>
            </title>
            <aug>
               <au>
                  <snm>Wagner</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>1999</pubdate>
            <volume>15</volume>
            <fpage>776</fpage>
            <lpage>784</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10705431</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>SCORE: a computational approach to the identification of cis-regulatory modules and target genes in whole-genome sequence data. Site clustering over random expectation.</p>
            </title>
            <aug>
               <au>
                  <snm>Rebeiz</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Reeves</snm>
                  <fnm>NL</fnm>
               </au>
               <au>
                  <snm>Posakony</snm>
                  <fnm>JW</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2002</pubdate>
            <volume>99</volume>
            <fpage>9888</fpage>
            <lpage>9893</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">125053</pubid>
                  <pubid idtype="pmpid" link="fulltext">12107285</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Identification of functional clusters of transcription factor binding motifs in genome sequences: the MSCAN algorithm.</p>
            </title>
            <aug>
               <au>
                  <snm>Johansson</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Alkema</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Wasserman</snm>
                  <fnm>WW</fnm>
               </au>
               <au>
                  <snm>Lagergren</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>Suppl 1</issue>
            <fpage>i169</fpage>
            <lpage>i176</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12855453</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Searching for statistically significant regulatory modules.</p>
            </title>
            <aug>
               <au>
                  <snm>Bailey</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Noble</snm>
                  <fnm>WS</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>Suppl 2</issue>
            <fpage>ii16</fpage>
            <lpage>ii25</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">14534166</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>CREME: a framework for identifying cis-regulatory modules in human-mouse conserved segments.</p>
            </title>
            <aug>
               <au>
                  <snm>Sharan</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Ovcharenko</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Ben-Hur</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Karp</snm>
                  <fnm>RM</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>Suppl 1</issue>
            <fpage>i283</fpage>
            <lpage>i291</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12855471</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Identification of sparsely distributed clusters of cis-regulatory elements in sets of co-expressed genes.</p>
            </title>
            <aug>
               <au>
                  <snm>Kreiman</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32</volume>
            <fpage>2889</fpage>
            <lpage>2900</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">419615</pubid>
                  <pubid idtype="pmpid" link="fulltext">15155858</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Cluster-Buster: finding dense clusters of motifs in DNA sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Frith</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Weng</snm>
                  <fnm>Z</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>3666</fpage>
            <lpage>3668</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">168947</pubid>
                  <pubid idtype="pmpid" link="fulltext">12824389</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>A discriminative model for identifying spatial <it>cis</it>-regulatory modules.</p>
            </title>
            <aug>
               <au>
                  <snm>Segal</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Sharan</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>J Comput Biol</source>
            <pubdate>2005</pubdate>
            <volume>12</volume>
            <fpage>822</fpage>
            <lpage>834</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16108719</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Computational detection of genomic <it>cis</it>-regulatory modules applied to body patterning in the early Drosophila embryo.</p>
            </title>
            <aug>
               <au>
                  <snm>Rajewsky</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Vergassola</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Gaul</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Siggia</snm>
                  <fnm>ED</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <fpage>30</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">139975</pubid>
                  <pubid idtype="pmpid" link="fulltext">12398796</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>A probabilistic method to detect regulatory modules.</p>
            </title>
            <aug>
               <au>
                  <snm>Sinha</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>van Nimwegen</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Siggia</snm>
                  <fnm>ED</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>Suppl 1</issue>
            <fpage>i292</fpage>
            <lpage>i301</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12855472</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>rVista for comparative sequence-based discovery of functional transcription factor binding sites.</p>
            </title>
            <aug>
               <au>
                  <snm>Loots</snm>
                  <fnm>GG</fnm>
               </au>
               <au>
                  <snm>Ovcharenko</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Pachter</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Dubchak</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>EM</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>832</fpage>
            <lpage>839</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">186580</pubid>
                  <pubid idtype="pmpid" link="fulltext">11997350</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>Computational detection of <it>cis</it>-regulatory modules.</p>
            </title>
            <aug>
               <au>
                  <snm>Aerts</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Van Loo</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Thijs</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Moreau</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>De Moor</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>Suppl 2</issue>
            <fpage>ii5</fpage>
            <lpage>ii14</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">14534164</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>TFBScluster: a resource for the characterization of transcriptional regulatory networks.</p>
            </title>
            <aug>
               <au>
                  <snm>Donaldson</snm>
                  <fnm>IJ</fnm>
               </au>
               <au>
                  <snm>Chapman</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Gottgens</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <fpage>3058</fpage>
            <lpage>3059</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15855248</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>Modulefinder: a tool for computational discovery of cis regulatory modules.</p>
            </title>
            <aug>
               <au>
                  <snm>Philippakis</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>He</snm>
                  <fnm>FS</fnm>
               </au>
               <au>
                  <snm>Bulyk</snm>
                  <fnm>ML</fnm>
               </au>
            </aug>
            <source>Pac Symp Biocomput</source>
            <pubdate>2005</pubdate>
            <fpage>519</fpage>
            <lpage>530</lpage>
            <xrefbib>
               <pubid idtype="pmpid">15759656</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B42">
            <title>
               <p>Genome-wide computational prediction of transcriptional regulatory modules reveals new insights into human gene expression.</p>
            </title>
            <aug>
               <au>
                  <snm>Blanchette</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Bataille</snm>
                  <fnm>AR</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Poitras</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Lagani&#232;re</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Lef&#232;bvre</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Deblois</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Gigu&#232;re</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Ferretti</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Bergeron</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Coulombe</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Robert</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2006</pubdate>
            <volume>16</volume>
            <fpage>656</fpage>
            <lpage>668</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1457048</pubid>
                  <pubid idtype="pmpid" link="fulltext">16606704</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>Genome-wide prediction of mammalian enhancers based on analysis of transcription-factor binding affinity.</p>
            </title>
            <aug>
               <au>
                  <snm>Hallikas</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Palin</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Sinjushina</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Rautiainen</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Partanen</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Ukkonen</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Taipale</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>2006</pubdate>
            <volume>124</volume>
            <fpage>47</fpage>
            <lpage>59</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16413481</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>Locating potential enhancer elements by comparative genomics using the EEL software.</p>
            </title>
            <aug>
               <au>
                  <snm>Palin</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Taipale</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Ukkonen</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Nat Protoc</source>
            <pubdate>2006</pubdate>
            <volume>1</volume>
            <fpage>368</fpage>
            <lpage>374</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">17406258</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B45">
            <title>
               <p>Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes.</p>
            </title>
            <aug>
               <au>
                  <snm>Karlin</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Altschul</snm>
                  <fnm>SF</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1990</pubdate>
            <volume>87</volume>
            <fpage>2264</fpage>
            <lpage>2268</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">53667</pubid>
                  <pubid idtype="pmpid" link="fulltext">2315319</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B46">
            <title>
               <p>Genomic characterization of Gli-activator targets in sonic hedgehog-mediated neural patterning.</p>
            </title>
            <aug>
               <au>
                  <snm>Vokes</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Ji</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>McCuine</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Tenzen</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Giles</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Zhong</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Longabaugh</snm>
                  <fnm>WJ</fnm>
               </au>
               <au>
                  <snm>Davidson</snm>
                  <fnm>EH</fnm>
               </au>
               <au>
                  <snm>Wong</snm>
                  <fnm>WH</fnm>
               </au>
               <au>
                  <snm>McMahon</snm>
                  <fnm>AP</fnm>
               </au>
            </aug>
            <source>Development</source>
            <pubdate>2007</pubdate>
            <volume>134</volume>
            <fpage>1977</fpage>
            <lpage>1989</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">17442700</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>Transcription factor map alignment of promoter regions.</p>
            </title>
            <aug>
               <au>
                  <snm>Blanco</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Messeguer</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>TF</fnm>
               </au>
               <au>
                  <snm>Guigo</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>PLoS Comput Biol</source>
            <pubdate>2006</pubdate>
            <volume>2</volume>
            <fpage>e49</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1464811</pubid>
                  <pubid idtype="pmpid" link="fulltext">16733547</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B48">
            <title>
               <p>Simultaneous alignment and annotation of cis-regulatory regions.</p>
            </title>
            <aug>
               <au>
                  <snm>Bais</snm>
                  <fnm>AS</fnm>
               </au>
               <au>
                  <snm>Grossmann</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Vingron</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2007</pubdate>
            <volume>23</volume>
            <fpage>e44</fpage>
            <lpage>e49</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">17237103</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B49">
            <title>
               <p>Cluster analysis and display of genome-wide expression patterns.</p>
            </title>
            <aug>
               <au>
                  <snm>Eisen</snm>
                  <fnm>MB</fnm>
               </au>
               <au>
                  <snm>Spellman</snm>
                  <fnm>PT</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>PO</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1998</pubdate>
            <volume>95</volume>
            <fpage>14863</fpage>
            <lpage>14868</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">24541</pubid>
                  <pubid idtype="pmpid" link="fulltext">9843981</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B50">
            <title>
               <p>CLICK and EXPANDER: a system for clustering and visualizing gene expression data.</p>
            </title>
            <aug>
               <au>
                  <snm>Sharan</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Maron-Katz</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Shamir</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <fpage>1787</fpage>
            <lpage>1799</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">14512350</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B51">
            <title>
               <p>Interpreting patterns of gene expression with self-organizing maps: methods and application to hematopoietic differentiation.</p>
            </title>
            <aug>
               <au>
                  <snm>Tamayo</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Slonim</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Mesirov</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Zhu</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Kitareewan</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Dmitrovsky</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
               <au>
                  <snm>Golub</snm>
                  <fnm>TR</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1999</pubdate>
            <volume>96</volume>
            <fpage>2907</fpage>
            <lpage>2912</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">15868</pubid>
                  <pubid idtype="pmpid" link="fulltext">10077610</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B52">
            <title>
               <p>Singular value decomposition for genome-wide expression data processing and modeling.</p>
            </title>
            <aug>
               <au>
                  <snm>Alter</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>PO</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2000</pubdate>
            <volume>97</volume>
            <fpage>10101</fpage>
            <lpage>10106</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">27718</pubid>
                  <pubid idtype="pmpid" link="fulltext">10963673</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B53">
            <title>
               <p>Correspondence analysis applied to microarray data.</p>
            </title>
            <aug>
               <au>
                  <snm>Fellenberg</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Hauser</snm>
                  <fnm>NC</fnm>
               </au>
               <au>
                  <snm>Brors</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Neutzner</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Hoheisel</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Vingron</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2001</pubdate>
            <volume>98</volume>
            <fpage>10781</fpage>
            <lpage>10786</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">58552</pubid>
                  <pubid idtype="pmpid" link="fulltext">11535808</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B54">
            <title>
               <p>NCBI GEO: mining tens of millions of expression profiles - database and tools update.</p>
            </title>
            <aug>
               <au>
                  <snm>Barrett</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Troup</snm>
                  <fnm>DB</fnm>
               </au>
               <au>
                  <snm>Wilhite</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Ledoux</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Rudnev</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Evangelista</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Kim</snm>
                  <fnm>IF</fnm>
               </au>
               <au>
                  <snm>Soboleva</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Tomashevsky</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Edgar</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2007</pubdate>
            <volume>35</volume>
            <issue>Database issue</issue>
            <fpage>D760</fpage>
            <lpage>D765</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1669752</pubid>
                  <pubid idtype="pmpid" link="fulltext">17099226</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B55">
            <title>
               <p>ArrayExpress - a public database of microarray experiments and gene expression profiles.</p>
            </title>
            <aug>
               <au>
                  <snm>Parkinson</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Kapushesky</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Shojatalab</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Abeygunawardena</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Coulson</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Farne</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Holloway</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Kolesnykov</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Lilja</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Lukk</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Mani</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Rayner</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Sharma</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>William</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Sarkans</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Brazma</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2007</pubdate>
            <volume>35</volume>
            <issue>Database issue</issue>
            <fpage>D747</fpage>
            <lpage>D750</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1716725</pubid>
                  <pubid idtype="pmpid" link="fulltext">17132828</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B56">
            <title>
               <p>The Stanford Microarray Database: implementation of new analysis tools and open source release of software.</p>
            </title>
            <aug>
               <au>
                  <snm>Demeter</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Beauheim</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Gollub</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Hernandez-Boussard</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Jin</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Maier</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Matese</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Nitzberg</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Wymore</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Zachariah</snm>
                  <fnm>ZK</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>PO</fnm>
               </au>
               <au>
                  <snm>Sherlock</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Ball</snm>
                  <fnm>CA</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2007</pubdate>
            <volume>35</volume>
            <issue>Database issue</issue>
            <fpage>D766</fpage>
            <lpage>D770</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1781111</pubid>
                  <pubid idtype="pmpid" link="fulltext">17182626</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B57">
            <title>
               <p>Identification of consensus patterns in unaligned DNA sequences known to be functionally related.</p>
            </title>
            <aug>
               <au>
                  <snm>Hertz</snm>
                  <fnm>GZ</fnm>
               </au>
               <au>
                  <snm>Hartzell</snm>
                  <fnm>GW</fnm>
               </au>
               <au>
                  <snm>Stormo</snm>
                  <fnm>GD</fnm>
               </au>
            </aug>
            <source>CABIOS</source>
            <pubdate>1990</pubdate>
            <volume>6</volume>
            <fpage>81</fpage>
            <lpage>92</lpage>
            <xrefbib>
               <pubid idtype="pmpid">2193692</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B58">
            <title>
               <p>Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment.</p>
            </title>
            <aug>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>CE</fnm>
               </au>
               <au>
                  <snm>Altschul</snm>
                  <fnm>SF</fnm>
               </au>
               <au>
                  <snm>Boguski</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Neuwald</snm>
                  <fnm>AF</fnm>
               </au>
               <au>
                  <snm>Wootton</snm>
                  <fnm>JC</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1993</pubdate>
            <volume>262</volume>
            <fpage>208</fpage>
            <lpage>214</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">8211139</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B59">
            <title>
               <p>Gibbs motif sampling: detection of bacterial outer membrane protein repeats.</p>
            </title>
            <aug>
               <au>
                  <snm>Neuwald</snm>
                  <fnm>AF</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>CE</fnm>
               </au>
            </aug>
            <source>Protein Sci</source>
            <pubdate>1995</pubdate>
            <volume>4</volume>
            <fpage>1618</fpage>
            <lpage>1632</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2143180</pubid>
                  <pubid idtype="pmpid" link="fulltext">8520488</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B60">
            <title>
               <p>Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantitation.</p>
            </title>
            <aug>
               <au>
                  <snm>Roth</snm>
                  <fnm>FP</fnm>
               </au>
               <au>
                  <snm>Hughes</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Estep</snm>
                  <fnm>PW</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>GM</fnm>
               </au>
            </aug>
            <source>Nat Biotechnol</source>
            <pubdate>1998</pubdate>
            <volume>16</volume>
            <fpage>939</fpage>
            <lpage>945</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9788350</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B61">
            <title>
               <p>Systematic determination of genetic network architecture.</p>
            </title>
            <aug>
               <au>
                  <snm>Tavazoie</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Hughes</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Campbell</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Cho</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>GM</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>1999</pubdate>
            <volume>22</volume>
            <fpage>281</fpage>
            <lpage>285</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10391217</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B62">
            <title>
               <p>Computational identification of cis-regulatory elements associated with groups of functionally related genes in <it>Saccharomyces cerevisiae</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Hughes</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Estep</snm>
                  <fnm>PW</fnm>
               </au>
               <au>
                  <snm>Tavazoie</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>GM</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2000</pubdate>
            <volume>296</volume>
            <fpage>1205</fpage>
            <lpage>1214</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10698627</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B63">
            <title>
               <p>A higher-order background model improves the detection of promoter regulatory elements by Gibbs sampling.</p>
            </title>
            <aug>
               <au>
                  <snm>Thijs</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Lescot</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Marchal</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Rombauts</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>De Moor</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Rouz&#233;</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Moreau</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2001</pubdate>
            <volume>17</volume>
            <fpage>1113</fpage>
            <lpage>1122</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11751219</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B64">
            <title>
               <p>BioProspector: discovering conserved DNA motifs in upstream regulatory regions of co-expressed genes.</p>
            </title>
            <aug>
               <au>
                  <snm>Liu</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Brutlag</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>JS</fnm>
               </au>
            </aug>
            <source>Pac Symp Biocomput</source>
            <pubdate>2001</pubdate>
            <fpage>127</fpage>
            <lpage>138</lpage>
            <xrefbib>
               <pubid idtype="pmpid">11262934</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B65">
            <title>
               <p>Fitting a mixture model by expectation maximization to discover motifs in biopolymers.</p>
            </title>
            <aug>
               <au>
                  <snm>Bailey</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Elkan</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Proc Int Conf Intell Syst Mol Biol</source>
            <pubdate>1994</pubdate>
            <volume>2</volume>
            <fpage>28</fpage>
            <lpage>36</lpage>
            <xrefbib>
               <pubid idtype="pmpid">7584402</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B66">
            <title>
               <p>The value of prior knowledge in discovering motifs with MEME.</p>
            </title>
            <aug>
               <au>
                  <snm>Bailey</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Elkan</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Proc Int Conf Intell Syst Mol Biol</source>
            <pubdate>1995</pubdate>
            <volume>3</volume>
            <fpage>21</fpage>
            <lpage>29</lpage>
            <xrefbib>
               <pubid idtype="pmpid">7584439</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B67">
            <title>
               <p>Extracting regulatory sites from the upstream region of yeast genes by computational analysis of oligonucleotide frequencies.</p>
            </title>
            <aug>
               <au>
                  <snm>van Helden</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Andr&#233;</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Collado-Vides</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1998</pubdate>
            <volume>281</volume>
            <fpage>827</fpage>
            <lpage>842</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9719638</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B68">
            <title>
               <p>Exceptional motifs in different Markov chain models for a statistical analysis of DNA sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Schbath</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Prum</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>de Turckheim</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>J Comput Biol</source>
            <pubdate>1995</pubdate>
            <volume>2</volume>
            <fpage>417</fpage>
            <lpage>437</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8521272</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B69">
            <title>
               <p>Predicting gene regulatory elements in silico on a genomic scale.</p>
            </title>
            <aug>
               <au>
                  <snm>Brazma</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Jonassen</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Vilo</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Ukkonen</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>1998</pubdate>
            <volume>8</volume>
            <fpage>1202</fpage>
            <lpage>1215</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">310790</pubid>
                  <pubid idtype="pmpid" link="fulltext">9847082</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B70">
            <title>
               <p>Approaches to the automatic discovery of patterns in biosequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Brazma</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Jonassen</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Eidhammer</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Gilbert</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>J Comput Biol</source>
            <pubdate>1998</pubdate>
            <volume>5</volume>
            <fpage>279</fpage>
            <lpage>305</lpage>
            <xrefbib>
               <pubid idtype="pmpid">9672833</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B71">
            <title>
               <p>Algorithms for phylogenetic footprinting.</p>
            </title>
            <aug>
               <au>
                  <snm>Blanchette</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Schwikowski</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Tompa</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>J Comput Biol</source>
            <pubdate>2002</pubdate>
            <volume>9</volume>
            <fpage>211</fpage>
            <lpage>223</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12015878</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B72">
            <title>
               <p>An exact algorithm to identify motifs in orthologous sequences from multiple species.</p>
            </title>
            <aug>
               <au>
                  <snm>Blanchette</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Schwikowski</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Tompa</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Proc Int Conf Intell Syst Mol Biol</source>
            <pubdate>2000</pubdate>
            <volume>8</volume>
            <fpage>37</fpage>
            <lpage>45</lpage>
            <xrefbib>
               <pubid idtype="pmpid">10977064</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B73">
            <title>
               <p>An exact method for finding short motifs in sequences, with application to the ribosome binding site problem.</p>
            </title>
            <aug>
               <au>
                  <snm>Tompa</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Proc Int Conf Intell Syst Mol Biol</source>
            <pubdate>1999</pubdate>
            <fpage>262</fpage>
            <lpage>271</lpage>
            <xrefbib>
               <pubid idtype="pmpid">10786309</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B74">
            <title>
               <p>Regulatory element detection using a probabilistic segmentation model.</p>
            </title>
            <aug>
               <au>
                  <snm>Bussemaker</snm>
                  <fnm>HJ</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Siggia</snm>
                  <fnm>ED</fnm>
               </au>
            </aug>
            <source>Proc Int Conf Intell Syst Mol Biol</source>
            <pubdate>2000</pubdate>
            <volume>8</volume>
            <fpage>67</fpage>
            <lpage>74</lpage>
            <xrefbib>
               <pubid idtype="pmpid">10977067</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B75">
            <title>
               <p>Promoter sequences and algorithmical methods for identifying them.</p>
            </title>
            <aug>
               <au>
                  <snm>Vanet</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Marsan</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Sagot</snm>
                  <fnm>MF</fnm>
               </au>
            </aug>
            <source>Res Microbiol</source>
            <pubdate>1999</pubdate>
            <volume>150</volume>
            <fpage>779</fpage>
            <lpage>799</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10673015</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B76">
            <title>
               <p>Discovering regulatory elements in non-coding sequences by analysis of spaced dyads.</p>
            </title>
            <aug>
               <au>
                  <snm>van Helden</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Rios</snm>
                  <fnm>AF</fnm>
               </au>
               <au>
                  <snm>Collado-Vides</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2000</pubdate>
            <volume>28</volume>
            <fpage>1808</fpage>
            <lpage>1818</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">102821</pubid>
                  <pubid idtype="pmpid" link="fulltext">10734201</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B77">
            <title>
               <p>Assessing computational tools for the discovery of transcription factor binding sites.</p>
            </title>
            <aug>
               <au>
                  <snm>Tompa</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Bailey</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>GM</fnm>
               </au>
               <au>
                  <snm>De Moor</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Eskin</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Favorov</snm>
                  <fnm>AV</fnm>
               </au>
               <au>
                  <snm>Frith</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Fu</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Kent</snm>
                  <fnm>WJ</fnm>
               </au>
               <au>
                  <snm>Makeev</snm>
                  <fnm>VJ</fnm>
               </au>
               <au>
                  <snm>Mironov</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Noble</snm>
                  <fnm>WS</fnm>
               </au>
               <au>
                  <snm>Pavesi</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Pesole</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>R&#233;gnier</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Simonis</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Sinha</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Thijs</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>van Helden</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Vandenbogaert</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Weng</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Workman</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Ye</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Zhu</snm>
                  <fnm>Z</fnm>
               </au>
            </aug>
            <source>Nat Biotechnol</source>
            <pubdate>2005</pubdate>
            <volume>23</volume>
            <fpage>137</fpage>
            <lpage>144</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15637633</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B78">
            <title>
               <p>FootPrinter: a program designed for phylogenetic footprinting.</p>
            </title>
            <aug>
               <au>
                  <snm>Blanchette</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Tompa</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>3840</fpage>
            <lpage>3842</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">169012</pubid>
                  <pubid idtype="pmpid" link="fulltext">12824433</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B79">
            <title>
               <p>Combining phylogenetic data with co-regulated genes to identify regulatory motifs.</p>
            </title>
            <aug>
               <au>
                  <snm>Wang</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Stormo</snm>
                  <fnm>GD</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <fpage>2369</fpage>
            <lpage>2380</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">14668220</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B80">
            <title>
               <p>PhyME: a probabilistic algorithm for finding motifs in sets of orthologous sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Sinha</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Blanchette</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Tompa</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <fpage>170</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">534098</pubid>
                  <pubid idtype="pmpid" link="fulltext">15511292</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B81">
            <title>
               <p>Motif discovery in heterogeneous sequence data.</p>
            </title>
            <aug>
               <au>
                  <snm>Prakash</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Blanchette</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sinha</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Tompa</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Pac Symp Biocomput</source>
            <pubdate>2004</pubdate>
            <volume>9</volume>
            <fpage>348</fpage>
            <lpage>359</lpage>
         </bibl>
         <bibl id="B82">
            <title>
               <p>PhyloGibbs: a Gibbs sampling motif finder that incorporates phylogeny.</p>
            </title>
            <aug>
               <au>
                  <snm>Siddharthan</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Siggia</snm>
                  <fnm>ED</fnm>
               </au>
               <au>
                  <snm>van Nimwegen</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>PLoS Comput Biol</source>
            <pubdate>2005</pubdate>
            <volume>1</volume>
            <fpage>e67</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1309704</pubid>
                  <pubid idtype="pmpid" link="fulltext">16477324</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B83">
            <title>
               <p>Evaluation of phylogenetic footprint discovery for predicting bacterial cis-regulatory elements and revealing their evolution.</p>
            </title>
            <aug>
               <au>
                  <snm>Janky</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>van Helden</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2008</pubdate>
            <volume>9</volume>
            <fpage>37</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2248561</pubid>
                  <pubid idtype="pmpid" link="fulltext">18215291</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B84">
            <title>
               <p>An improved map of conserved regulatory sites for <it>Saccharomyces cerevisiae</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>MacIsaac</snm>
                  <fnm>KD</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Gordon</snm>
                  <fnm>DB</fnm>
               </au>
               <au>
                  <snm>Gifford</snm>
                  <fnm>DK</fnm>
               </au>
               <au>
                  <snm>Stormo</snm>
                  <fnm>GD</fnm>
               </au>
               <au>
                  <snm>Fraenkel</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <fpage>113</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1435934</pubid>
                  <pubid idtype="pmpid" link="fulltext">16522208</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B85">
            <title>
               <p>Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project.</p>
            </title>
            <aug>
               <au>
                  <snm>Birney</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Stamatoyannopoulos</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Dutta</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Guig&#243;</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Gingeras</snm>
                  <fnm>TR</fnm>
               </au>
               <au>
                  <snm>Margulies</snm>
                  <fnm>EH</fnm>
               </au>
               <au>
                  <snm>Weng</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Snyder</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Dermitzakis</snm>
                  <fnm>ET</fnm>
               </au>
               <au>
                  <snm>Thurman</snm>
                  <fnm>RE</fnm>
               </au>
               <au>
                  <snm>Kuehn</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Taylor</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Neph</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Koch</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Asthana</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Malhotra</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Adzhubei</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Greenbaum</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Andrews</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Flicek</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Boyle</snm>
                  <fnm>PJ</fnm>
               </au>
               <au>
                  <snm>Cao</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Carter</snm>
                  <fnm>NP</fnm>
               </au>
               <au>
                  <snm>Clelland</snm>
                  <fnm>GK</fnm>
               </au>
               <au>
                  <snm>Davis</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Day</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Dhami</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Dillon</snm>
                  <fnm>SC</fnm>
               </au>
               <au>
                  <snm>Dorschner</snm>
                  <fnm>MO</fnm>
               </au>
               <au>
                  <snm>Fiegler</snm>
                  <fnm>H</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2007</pubdate>
            <volume>447</volume>
            <fpage>799</fpage>
            <lpage>816</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2212820</pubid>
                  <pubid idtype="pmpid">17571346</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B86">
            <title>
               <p>Systematic discovery of regulatory motifs in human promoters and 3' UTRs by comparison of several mammals.</p>
            </title>
            <aug>
               <au>
                  <snm>Xie</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Lu</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Kulbokas</snm>
                  <fnm>EJ</fnm>
               </au>
               <au>
                  <snm>Golub</snm>
                  <fnm>TR</fnm>
               </au>
               <au>
                  <snm>Mootha</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Lindblad-Toh</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
               <au>
                  <snm>Kellis</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2005</pubdate>
            <volume>434</volume>
            <fpage>338</fpage>
            <lpage>345</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15735639</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B87">
            <title>
               <p>Systematic discovery of regulatory motifs in conserved regions of the human genome, including thousands of CTCF insulator sites.</p>
            </title>
            <aug>
               <au>
                  <snm>Xie</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Mikkelsen</snm>
                  <fnm>TS</fnm>
               </au>
               <au>
                  <snm>Gnirke</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Lindblad-Toh</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Kellis</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2007</pubdate>
            <volume>104</volume>
            <fpage>7145</fpage>
            <lpage>7150</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1852749</pubid>
                  <pubid idtype="pmpid" link="fulltext">17442748</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B88">
            <title>
               <p>Functional inference from non-random distributions of conserved predicted transcription factor binding sites.</p>
            </title>
            <aug>
               <au>
                  <snm>Dieterich</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Rahmann</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Vingron</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>20</volume>
            <issue>Suppl 1</issue>
            <fpage>i109</fpage>
            <lpage>i115</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15262788</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B89">
            <title>
               <p>A statistical model for locating regulatory regions in genomic DNA.</p>
            </title>
            <aug>
               <au>
                  <snm>Crowley</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Roeder</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Bina</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1997</pubdate>
            <volume>268</volume>
            <fpage>8</fpage>
            <lpage>14</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9149136</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B90">
            <title>
               <p>A predictive model for regulatory sequences directing liver-specific transcription.</p>
            </title>
            <aug>
               <au>
                  <snm>Krivan</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Wasserman</snm>
                  <fnm>WW</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2001</pubdate>
            <volume>11</volume>
            <fpage>1559</fpage>
            <lpage>1566</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">311083</pubid>
                  <pubid idtype="pmpid" link="fulltext">11544200</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B91">
            <title>
               <p>Toucan: deciphering the cis-regulatory logic of coregulated genes.</p>
            </title>
            <aug>
               <au>
                  <snm>Aerts</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Thijs</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Coessens</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Staes</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Moreau</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>De Moor</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>1753</fpage>
            <lpage>1764</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">152870</pubid>
                  <pubid idtype="pmpid" link="fulltext">12626717</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B92">
            <title>
               <p>Identifying synonymous regulatory elements in vertebrate genomes.</p>
            </title>
            <aug>
               <au>
                  <snm>Ovcharenko</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Nobrega</snm>
                  <fnm>MA</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <issue>33 Web Server</issue>
            <fpage>W403</fpage>
            <lpage>W407</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1160227</pubid>
                  <pubid idtype="pmpid" link="fulltext">15980499</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B93">
            <title>
               <p>A regulatory code for neurogenic gene expression in the Drosophila embryo.</p>
            </title>
            <aug>
               <au>
                  <snm>Markstein</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Zinzen</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Markstein</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Yee</snm>
                  <fnm>KP</fnm>
               </au>
               <au>
                  <snm>Erives</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Stathopoulos</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Levine</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Development</source>
            <pubdate>2004</pubdate>
            <volume>131</volume>
            <fpage>2387</fpage>
            <lpage>2394</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15128669</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B94">
            <title>
               <p>Genome-wide identification of cis-regulatory sequences controlling blood and endothelial development.</p>
            </title>
            <aug>
               <au>
                  <snm>Donaldson</snm>
                  <fnm>IJ</fnm>
               </au>
               <au>
                  <snm>Chapman</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kinston</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Landry</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Knezevic</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Piltz</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Buckley</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Green</snm>
                  <fnm>AR</fnm>
               </au>
               <au>
                  <snm>G&#246;ttgens</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Hum Mol Genet</source>
            <pubdate>2005</pubdate>
            <volume>14</volume>
            <fpage>595</fpage>
            <lpage>601</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15649946</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B95">
            <title>
               <p>Predictive modeling of genome-wide mRNA expression: from modules to molecules.</p>
            </title>
            <aug>
               <au>
                  <snm>Bussemaker</snm>
                  <fnm>HJ</fnm>
               </au>
               <au>
                  <snm>Foat</snm>
                  <fnm>BC</fnm>
               </au>
               <au>
                  <snm>Ward</snm>
                  <fnm>LD</fnm>
               </au>
            </aug>
            <source>Annu Rev Biophys Biomol Struct</source>
            <pubdate>2007</pubdate>
            <volume>36</volume>
            <fpage>329</fpage>
            <lpage>347</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">17311525</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B96">
            <title>
               <p>Systematic identification of mammalian regulatory motifs' target genes and functions.</p>
            </title>
            <aug>
               <au>
                  <snm>Warner</snm>
                  <fnm>JB</fnm>
               </au>
               <au>
                  <snm>Philippakis</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Jaeger</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>He</snm>
                  <fnm>FS</fnm>
               </au>
               <au>
                  <snm>Lin</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bulyk</snm>
                  <fnm>ML</fnm>
               </au>
            </aug>
            <source>Nat Methods</source>
            <pubdate>2008</pubdate>
            <volume>5</volume>
            <fpage>347</fpage>
            <lpage>353</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">18311145</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B97">
            <title>
               <p>CisModule: de novo discovery of cis-regulatory modules by hierarchical mixture modeling.</p>
            </title>
            <aug>
               <au>
                  <snm>Zhou</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Wong</snm>
                  <fnm>WH</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2004</pubdate>
            <volume>101</volume>
            <fpage>12114</fpage>
            <lpage>12119</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">514443</pubid>
                  <pubid idtype="pmpid" link="fulltext">15297614</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B98">
            <title>
               <p>Decoding human regulatory circuits.</p>
            </title>
            <aug>
               <au>
                  <snm>Thompson</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Palumbo</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Wasserman</snm>
                  <fnm>WW</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>CE</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2004</pubdate>
            <volume>14</volume>
            <fpage>1967</fpage>
            <lpage>1974</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">524421</pubid>
                  <pubid idtype="pmpid" link="fulltext">15466295</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B99">
            <title>
               <p>De novo cis-regulatory module elicitation for eukaryotic genomes.</p>
            </title>
            <aug>
               <au>
                  <snm>Gupta</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>JS</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2005</pubdate>
            <volume>102</volume>
            <fpage>7079</fpage>
            <lpage>7084</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1129096</pubid>
                  <pubid idtype="pmpid" link="fulltext">15883375</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B100">
            <title>
               <p>Discovering functional transcription-factor combinations in the human cell cycle.</p>
            </title>
            <aug>
               <au>
                  <snm>Zhu</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Shendure</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>GM</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2005</pubdate>
            <volume>15</volume>
            <fpage>848</fpage>
            <lpage>855</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1142475</pubid>
                  <pubid idtype="pmpid" link="fulltext">15930495</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B101">
            <title>
               <p>CoMoDis: composite motif discovery in mammalian genomes.</p>
            </title>
            <aug>
               <au>
                  <snm>Donaldson</snm>
                  <fnm>IJ</fnm>
               </au>
               <au>
                  <snm>G&#246;ttgens</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2007</pubdate>
            <volume>35</volume>
            <fpage>e1</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1702496</pubid>
                  <pubid idtype="pmpid" link="fulltext">17130158</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B102">
            <title>
               <p>Distinguishing regulatory DNA from neutral sites.</p>
            </title>
            <aug>
               <au>
                  <snm>Elnitski</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Hardison</snm>
                  <fnm>RC</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kolbe</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Eswara</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>O'Connor</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Schwartz</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Chiaromonte</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2003</pubdate>
            <volume>13</volume>
            <fpage>64</fpage>
            <lpage>72</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">430974</pubid>
                  <pubid idtype="pmpid" link="fulltext">12529307</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B103">
            <title>
               <p>ESPERR: learning strong and weak signals in genomic sequence alignments to identify functional elements.</p>
            </title>
            <aug>
               <au>
                  <snm>Taylor</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Tyekucheva</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>King</snm>
                  <fnm>DC</fnm>
               </au>
               <au>
                  <snm>Hardison</snm>
                  <fnm>RC</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Chiaromonte</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2006</pubdate>
            <volume>16</volume>
            <fpage>1596</fpage>
            <lpage>1604</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1665643</pubid>
                  <pubid idtype="pmpid" link="fulltext">17053093</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B104">
            <title>
               <p>Identifying cis-regulatory modules by combining comparative and compositional analysis of DNA.</p>
            </title>
            <aug>
               <au>
                  <snm>Pierstorff</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Bergman</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Wiehe</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2006</pubdate>
            <volume>22</volume>
            <fpage>2858</fpage>
            <lpage>2864</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">17032682</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B105">
            <title>
               <p>Close sequence comparisons are sufficient to identify human cis-regulatory elements.</p>
            </title>
            <aug>
               <au>
                  <snm>Prabhakar</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Poulin</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Shoukry</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Afzal</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Couronne</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Pennacchio</snm>
                  <fnm>LA</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2006</pubdate>
            <volume>16</volume>
            <fpage>855</fpage>
            <lpage>863</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1484452</pubid>
                  <pubid idtype="pmpid" link="fulltext">16769978</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B106">
            <title>
               <p>In vivo enhancer analysis of human conserved non-coding sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Pennacchio</snm>
                  <fnm>LA</fnm>
               </au>
               <au>
                  <snm>Ahituv</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Moses</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Prabhakar</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Nobrega</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Shoukry</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Minovitsky</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Dubchak</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Holt</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Lewis</snm>
                  <fnm>KD</fnm>
               </au>
               <au>
                  <snm>Plajzer-Frick</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Akiyama</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>De Val</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Afzal</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Black</snm>
                  <fnm>BL</fnm>
               </au>
               <au>
                  <snm>Couronne</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Eisen</snm>
                  <fnm>MB</fnm>
               </au>
               <au>
                  <snm>Visel</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>EM</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2006</pubdate>
            <volume>444</volume>
            <fpage>499</fpage>
            <lpage>502</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">17086198</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B107">
            <title>
               <p>Predicting tissue-specific enhancers in the human genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Pennacchio</snm>
                  <fnm>LA</fnm>
               </au>
               <au>
                  <snm>Loots</snm>
                  <fnm>GG</fnm>
               </au>
               <au>
                  <snm>Nobrega</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Ovcharenko</snm>
                  <fnm>I</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2007</pubdate>
            <volume>17</volume>
            <fpage>201</fpage>
            <lpage>211</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1781352</pubid>
                  <pubid idtype="pmpid" link="fulltext">17210927</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B108">
            <title>
               <p>ORegAnno: an open-access community-driven resource for regulatory annotation.</p>
            </title>
            <aug>
               <au>
                  <snm>Griffith</snm>
                  <fnm>OL</fnm>
               </au>
               <au>
                  <snm>Montgomery</snm>
                  <fnm>SB</fnm>
               </au>
               <au>
                  <snm>Bernier</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Chu</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Kasaian</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Aerts</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Mahony</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Sleumer</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Bilenky</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Haeussler</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Griffith</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Gallo</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Giardine</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Hooghe</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Van Loo</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Blanco</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Ticoll</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Lithwick</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Portales-Casamar</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Donaldson</snm>
                  <fnm>IJ</fnm>
               </au>
               <au>
                  <snm>Robertson</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Wadelius</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>De Bleser</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Vlieghe</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Halfon</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Wasserman</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Hardison</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Bergman</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Jones</snm>
                  <fnm>SJ</fnm>
               </au>
               <au>
                  <cnm>Open Regulatory Annotation Consortium</cnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2008</pubdate>
            <issue>36 Database</issue>
            <fpage>D107</fpage>
            <lpage>D113</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2239002</pubid>
                  <pubid idtype="pmpid" link="fulltext">18006570</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B109">
            <title>
               <p>Ensembl 2008.</p>
            </title>
            <aug>
               <au>
                  <snm>Flicek</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Aken</snm>
                  <fnm>BL</fnm>
               </au>
               <au>
                  <snm>Beal</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Ballester</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Caccamo</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Clarke</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Coates</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Cunningham</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Cutts</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Down</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Dyer</snm>
                  <fnm>SC</fnm>
               </au>
               <au>
                  <snm>Eyre</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Fitzgerald</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Fernandez-Banet</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Gr&#228;f</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Haider</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Hammond</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Holland</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Howe</snm>
                  <fnm>KL</fnm>
               </au>
               <au>
                  <snm>Howe</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Johnson</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Jenkinson</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>K&#228;h&#228;ri</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Keefe</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Kokocinski</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Kulesha</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Lawson</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Longden</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Megy</snm>
                  <fnm>K</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2008</pubdate>
            <issue>36 Database</issue>
            <fpage>D707</fpage>
            <lpage>D714</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2238821</pubid>
                  <pubid idtype="pmpid" link="fulltext">18000006</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>