<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>gb-2001-2-8-research0029</ui>
   <ji>GBJ</ji>
   <fm>
      <dochead>Research</dochead>
      <bibl>
         <title>
            <p>Comparison of complete nuclear receptor sets from the human, <it>Caenorhabditis elegans</it> and <it>Drosophila</it> genomes</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Maglich</snm>
               <mi>M</mi>
               <fnm>Jodi</fnm>
               <insr iid="I1"/>
            </au>
            <au id="A2">
               <snm>Sluder</snm>
               <fnm>Ann</fnm>
               <insr iid="I3"/>
            </au>
            <au id="A3">
               <snm>Guan</snm>
               <fnm>Xiaojun</fnm>
               <insr iid="I2"/>
            </au>
            <au id="A4">
               <snm>Shi</snm>
               <fnm>Yunling</fnm>
               <insr iid="I2"/>
            </au>
            <au id="A5">
               <snm>McKee</snm>
               <mi>D</mi>
               <fnm>David</fnm>
               <insr iid="I1"/>
            </au>
            <au id="A6">
               <snm>Carrick</snm>
               <fnm>Kevin</fnm>
               <insr iid="I2"/>
            </au>
            <au id="A7">
               <snm>Kamdar</snm>
               <fnm>Kim</fnm>
               <insr iid="I4"/>
            </au>
            <au id="A8">
               <snm>Willson</snm>
               <mi>M</mi>
               <fnm>Timothy</fnm>
               <insr iid="I1"/>
            </au>
            <au id="A9" ca="yes">
               <snm>Moore</snm>
               <mi>T</mi>
               <fnm>John</fnm>
               <insr iid="I1"/>
               <email>jtm36008@gsk.com</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Nuclear Receptor Discovery Research</p>
            </ins>
            <ins id="I2">
               <p>Cellular Genomics, GlaxoSmithKline, Research Triangle Park, NC 27709, USA</p>
            </ins>
            <ins id="I3">
               <p>Cambria Biosciences, Bedford, MA 01730, USA</p>
            </ins>
            <ins id="I4">
               <p>Syngenta Inc, Research Triangle Park, NC 27709, USA</p>
            </ins>
         </insg>
         <source>Genome Biology</source>
         <issn>1465-6906</issn>
         <pubdate>2001</pubdate>
         <volume>2</volume>
         <issue>8</issue>
         <fpage>research0029.1</fpage>
         <lpage>research0029.7</lpage>
         <url>http://genomebiology.com/2001/2/8/research/0029</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="doi">10.1186/gb-2001-2-8-research0029</pubid>
               <pubid idtype="pmpid">11532213</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>20</day>
               <month>4</month>
               <year>2001</year>
            </date>
         </rec>
         <revrec>
            <date>
               <day>6</day>
               <month>6</month>
               <year>2001</year>
            </date>
         </revrec>
         <acc>
            <date>
               <day>20</day>
               <month>6</month>
               <year>2001</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>24</day>
               <month>7</month>
               <year>2001</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2001</year>
         <collab>Maglich et al., licensee BioMed Central Ltd</collab>
      </cpyrt>
      <shortabs>
         <p>Using the nearly completed human genome sequence, <it>in silico</it> and experimental approaches have been combined to define the complete human nuclear receptor set. Two novel NR pseudogenes have been identified and there are fewer than 50 functional human nuclear receptors, fewer than in <it>C. elegans</it> but more than in <it>Drosophila</it>.</p>
      </shortabs>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>The availability of complete genome sequences enables all the members of a gene family to be identified without limitations imposed by temporal, spatial or quantitative aspects of mRNA expression. Using the nearly completed human genome sequence, we combined <it>in silico</it> and experimental approaches to define the complete human nuclear receptor (NR) set. This information was used to carry out a comparative genomic study of the NR superfamily.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>Our analysis of the human genome identified two novel NR sequences. Both contained stop codons within their coding regions, indicating that they are pseudogenes. One (<it>HNF4&#947;</it>-related) contained no introns and expressed no detectable mRNA, whereas the other (<it>FXR</it>-related) produced mRNA at relatively high levels in testis. If translated, the latter is predicted to encode a short, non-functional protein. Our analysis indicates that there are fewer than 50 functional human NRs, dramatically fewer than in <it>Caenorhabditis elegans</it> and about twice as many as in <it>Drosophila.</it> Using the complete human NR set we made comparisons with the NR sets of <it>C. elegans</it> and <it>Drosophila.</it> Searches for the >200 NRs unique to <it>C. elegans</it> revealed no human homologs. The comparative analysis also revealed a <it>Drosophila</it> member of NR subfamily NR3, confirming an ancient metazoan origin for this subfamily.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusions</p>
               </st>
               <p>This work provides the basis for new insights into the evolution and functional relationships of NR superfamily members.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="BMC" subtype="man_spc_id" id="30010010">Genome studies</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010008">Evolution</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010001">Biochemistry and structural biology</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010004">Cell biology</classification>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>The complete genomic sequences of four eukaryotic organisms have been reported (<it>Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae</it> and <it>Arabidopsis thaliana</it>). These genome sequences have been used to demonstrate the utility of comparative genomics in discerning evolutionary as well as possible functional relationships within protein superfamilies [<abbr bid="B1">1</abbr>,<abbr bid="B2">2</abbr>]. With the nearly complete human genome sequence now available in the human genome project database (>94% of the sequence available as of February 2001), it is possible to begin comparative genomic studies using human superfamily members.</p>
         <p>We are particularly interested in the nuclear receptor (NR) class of proteins. NRs appear to be restricted to metazoans, in which they have key roles in integrating the complexities of homeostasis and development [<abbr bid="B3">3</abbr>]. In general, NRs function as transcriptional regulators, working in concert with coactivators and corepressors to activate and suppress target gene expression [<abbr bid="B4">4</abbr>]. The transcriptional activities of about half of the NRs studied in humans are regulated by small lipophilic ligands whereas the other, so-called orphan, receptors await ligand identification. Because of their potential for modulation by exogenous compounds and their central roles in metabolism, these receptors are extremely important targets in human disease. Additionally, NRs represent possible new targets for the control of invertebrate pests in agriculture. The latter approach would be especially promising if certain classes of NRs are shown to be specific to selected invertebrate subgroups. Comparison of complete sets of NRs from evolutionarily divergent organisms should tell us more about the feasibility of these approaches as well as shed light on general phylogenetic relationships among NRs and among species.</p>
         <p>For the past year, known NR sets have presented an interesting puzzle. When the <it>C. elegans</it> genome sequence was reported, over 220 NR members were found [<abbr bid="B5">5</abbr>], and subsequent sequence releases have brought the number of predicted <it>C. elegans</it> NR genes to 270 (A.S., unpublished data). This dramatic increase over the number of currently published human NRs (48) led to speculation that the human NR set could also expand dramatically [<abbr bid="B6">6</abbr>]. Surprisingly, only 21 total NRs were found in the recently reported <it>Drosophila</it> genome sequence [<abbr bid="B7">7</abbr>]. An intriguing question now is whether the total set of human NRs will reflect the diversity seen in <it>C. elegans</it> or instead will parallel that found in <it>Drosophila.</it> We employed a combined bioinformatic/molecular biology approach to answer this question.</p>
      </sec>
      <sec>
         <st>
            <p>Results and discussion</p>
         </st>
         <p>We have developed a genomic sequence analysis pipeline utilizing BLAST searches [<abbr bid="B8">8</abbr>] followed by HMMER domain analysis [<abbr bid="B9">9</abbr>] to identify NR sequences within the human genome. Domain analysis was facilitated by the knowledge that the NR superfamily is unified by a common modular structure [<abbr bid="B9">9</abbr>]. One hallmark of the family is a DNA-binding domain (DBD), characterized by two C4-type zinc fingers and located in the amino-terminal half of the protein. A second characteristic feature, the ligand-binding domain (LBD), is found at the distal carboxyl terminus and contains a highly conserved transcriptional transactivation function (AF2) [<abbr bid="B10">10</abbr>]. The complete known complement of human NRs was used as a query set to identify candidate novel NR sequences from public human genome databases. Identified candidate sequences were followed up with more detailed bioinformatic and, when warranted, molecular biology analysis. Using this approach, we identified two novel NR sequences. The closest homologs of these sequences were FXR (NR1H4) and HNF4&#947; (NR2A2).</p>
         <p>The <it>FXR</it>-related gene sequence (<it>FXR-r</it>) was mapped <it>in silico</it> to chromosomal position 1p13.1-1p13.3, distinct from the chromosomal location of <it>FXR</it> (12q23.1-20). The predicted coding sequence of <it>FXR-r</it> was not contiguous within the genome. A total of seven intronic gaps separated the regions of coding similarity. Interestingly, the positions of the introns within <it>FXR-r</it> were at the same relative positions within the coding sequence as in <it>FXR,</it> suggesting a close evolutionary relationship between these two sequences. The predicted coding sequence of <it>FXR-r</it> displayed similarity to FXR across nearly the entire length (48% sequence identity at the amino acid level) but contained multiple stop codons (Figure <figr fid="F1">1a</figr>). The sequences of multiple stop codons were confirmed by PCR amplification and subsequent sequencing of <it>FXR-r</it> genomic DNA fragments (see Materials and methods). Sequence analysis thus indicated that this gene does not code for a functional NR and is likely to be a pseudogene. Surprisingly, real-time quantitative PCR (RTQ-PCR) detected relatively high levels of expression of <it>FXR-r</it> mRNA in testis (data not shown) indicating that this gene is a transcribed pseudogene.</p>
         <fig id="F1">
            <title>
               <p>Figure 1</p>
            </title>
            <caption>
               <p>Amino acid alignments of the novel NR sequences FXR-r and HNF4&#947;-r</p>
            </caption>
            <text>
               <p>Amino acid alignments of the novel NR sequences FXR-r and HNF4&#947;-r. <b>(a)</b> FXR-r amino acid alignment with FXR (NR1H4). The nucleotide sequences from complete and ordered clone AL358372 contained eight fragments with amino acid sequence similarity with FXR (from 5'-3' relative to FXR-r, nucleotide positions 13144-13480; 15053-15199; 21541-21672; 21781-21879; 22015-22115; 23681-23795; 24486-23795; 27041-27262). The positions of the intron gaps within FXR-r and FXR are boxed. Positions of termination codons within FXR-r are indicated by an asterisk. <b>(b)</b> HNF4&#947;-r amino acid alignment with HNF4&#947; (NR2A2). The complete and ordered clones AL449103 and AL138997 contained contiguous HNF4&#947;-r sequence. Seven frameshifts (indicated by asterisks) were required to preserve the reading frame of HNF4&#947;-r relative to HNF4&#947;. Intron locations in the <it>HNF4&#947;</it> gene are boxed.</p>
            </text>
            <graphic file="gb-2001-2-8-research0029-1"/>
         </fig>
         <p>The second novel NR gene (<it>HNF4&#947;-r</it>) was mapped <it>in silico</it> to chromosome position 13q14.11-13q14.3, unlinked to the known <it>HNF4&#947;</it> gene at position 12q12. The <it>HNF4&#947;-r</it> sequence showed similarity across nearly the entire length of the coding region of <it>HNF4&#947;</it> (71.4% sequence identity at the amino acid level). Like that of <it>FXR-r,</it> the <it>HNF4&#947;-r</it> coding sequence contained multiple stop codons (Figure <figr fid="F1">1b</figr>) and thus also appears to represent a pseudogene. Nine frame-shifts were necessary to maintain the amino acid reading frame relative to <it>HNF4&#947;.</it> The predicted <it>HNF4&#947;-r</it> sequence was confirmed by sequence analysis of human genomic DNA (see Materials and methods). The predicted coding sequence of <it>HNF4&#947;-r</it> was contiguous within the genome, consistent with possible retrotransposition into the genome [<abbr bid="B11">11</abbr>]. Unlike <it>FXR-r,</it> no expression of <it>HNF4&#947;-r</it> mRNA was detected in any of the tissues examined (data not shown).</p>
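         <p>As an illustration only, the kind of check that flags a predicted coding sequence as interrupted by stop codons can be sketched in Python. The function name premature_stops and its frame argument are assumptions introduced here; the sketch assumes a single known reading frame and the standard stop codons (TAA, TAG, TGA), and it is not the alignment-based analysis used for <it>FXR-r</it> and <it>HNF4&#947;-r</it>.</p>
         <p><![CDATA[
```python
from typing import List

# Stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def premature_stops(cds: str, frame: int = 0) -> List[int]:
    """Return 0-based codon indices of stop codons found before the last codon.

    Hypothetical helper: `cds` is a predicted coding sequence and `frame`
    is the offset (0, 1 or 2) at which translation is assumed to start.
    """
    # Normalise case and accept RNA input by mapping U to T.
    seq = cds.upper().replace("U", "T")
    # Split into complete codons only; a trailing partial codon is ignored.
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    # A stop in the final position is a normal terminator, so skip it.
    return [i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]


if __name__ == "__main__":
    # Toy sequence: ATG TAA GGG TGA has one internal stop at codon index 1.
    print(premature_stops("ATGTAAGGGTGA"))
```
]]></p>
         <p>A sequence whose result list is non-empty would be a candidate pseudogene under this simplified criterion; sequences that also need frame-shifts to stay aligned with a functional homolog, as described above, would require an alignment-aware approach instead.</p>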
         <p>Only one other NR pseudogene has been reported to date, a pseudogene related to the ERR&#945; receptor [<abbr bid="B12">12</abbr>]. The identification of <it>FXR-r</it> and <it>HNF4&#947;-r</it> brings the total human NR pseudogene number to three. Further evidence that these genes are pseudogenes includes the fact that no homologs of the <it>HNF4&#947;-r</it> and <it>FXR-r</it> genes could be found in available mouse, rat, <it>Fugu,</it> or <it>Drosophila</it> genome sequences. Pseudogene sequences would not be expected to be conserved between genomes even as closely related as human and mouse. In addition, and in contrast to their closest functional gene homologs <it>HNF4</it> and <it>FXR, HNF4&#947;-r</it> or <it>FXR-r</it> did not display conservation of their amino acid sequences relative to their nucleic acid sequences. This result is also consistent with the pseudogene characterization of these sequences.</p>
         <p><it>FXR-r</it> and <it>HNF4&#947;-r</it> were found using searches that utilized other known human NRs as query sequences. Certain <it>C. elegans</it> receptors contain LBD sequences that differ significantly from mammalian LBDs [<abbr bid="B5">5</abbr>]. It remained possible, then, that NR LBD sequences existed in the human genome that represented homologs of <it>C. elegans</it> NRs but were not identified using mammalian NRs as a query set. To address this question, we scanned the human genome with the complete sets of <it>C. elegans</it> and <it>Drosophila</it> NRs. Extensive analysis using these receptors as query sequences did not reveal a single novel mammalian homolog of these sequences (no BLAST hits with <it>p</it> value &lt; 10). From our analysis, we conclude that the human NR set will not be expanded by orthologs resembling the large number of NRs found in <it>C. elegans.</it></p>
         <p>Phylogenetic analysis of vertebrate NRs has defined six ancestral NR subfamilies [<abbr bid="B13">13</abbr>,<abbr bid="B14">14</abbr>,<abbr bid="B15">15</abbr>]. Five of the subfamilies are also represented among both the <it>C. elegans</it> and <it>Drosophila</it> NRs (Figure <figr fid="F2">2</figr>), consistent with the proposed ancient metazoan origin of these subfamilies [<abbr bid="B14">14</abbr>]. The earlier analysis identified no arthropod or nematode members of the NR3 subfamily, suggesting that this subfamily might be more recently derived and specific to the deuterostome lineage [<abbr bid="B14">14</abbr>]. More recently, however, the <it>Drosophila</it> genome sequence [<abbr bid="B1">1</abbr>] has revealed a previously unknown NR sequence (CG7404) that falls into the NR3 subfamily (Figures <figr fid="F2">2</figr>,<figr fid="F3">3</figr>). The observation that NR3 is represented in both protostomes and deuterostomes indicates that NR3, like the other major NR subfamilies, is of ancient metazoan origin. Notably, the <it>C. elegans</it> genome does not encode a member of the NR3 subfamily. It will be of interest to learn whether NR3 is represented in any nematode species, as absence of the NR3 subfamily from nematodes in general would suggest that arthropods and vertebrates share an evolutionary history that followed separation of the nematode lineage. Such an early divergence of the nematode evolutionary lineage would be in disagreement with a recent hypothesis placing nematodes and arthropods in a common evolutionary clade of molting invertebrates [<abbr bid="B16">16</abbr>] and would be more consistent with the traditional placement of nematodes in a lineage that diverged from other metazoans before separation of the major protostome and deuterostome lineages [<abbr bid="B17">17</abbr>].</p>
         <fig id="F2">
            <title>
               <p>Figure 2</p>
            </title>
            <caption>
               <p>Relationships within the completed sets of NR superfamily members from humans, <it>C. elegans</it> and <it>Drosophila</it></p>
            </caption>
            <text>
               <p>Relationships within the completed sets of NR superfamily members from humans, <it>C. elegans</it> and <it>Drosophila</it>. A neighbor-joining tree of NR DNA-binding domain sequences was generated using the paupsearch feature of the GCG 10.1 program package (1,000 bootstrap replicates, midpoint rooting); analysis methods are described in detail in Sluder <it>et al</it>. [5]. Significant bootstrap support values are indicated by slashes on the appropriate branches: (/) 50-79%; (//) 80-94%; (///) &#8805; 95%; branches also supported by parsimony analysis are marked by dots; subfamilies are boxed. All human (Hs) and <it>Drosophila</it> (Dm) NRs are included, except for the two human NRs that lack a canonical DBD (DAX-1 (NR0B1) and SHP (NR0B2)). The <it>C. elegans</it> (Ce) sequences include all members of the six major metazoan NR subfamilies as well as selected representatives of the major groupings of divergent <it>C. elegans</it> NRs evident in a larger tree containing all the nematode sequences.</p>
            </text>
            <graphic file="gb-2001-2-8-research0029-2"/>
         </fig>
         <fig id="F3">
            <title>
               <p>Figure 3</p>
            </title>
            <caption>
               <p>Amino acid alignment of <it>Drosophila</it> CG7404 with the sequences of human ERR&#945;, ERR&#946;, and ERR&#947;</p>
            </caption>
            <text>
               <p>Amino acid alignment of <it>Drosophila</it> CG7404 with the sequences of human ERR&#945;, ERR&#946;, and ERR&#947;. DmCG7404 exhibits similarity to the human ERRs in both DNA- and ligand-binding domain sequences. Black boxes indicate residues identical in at least two of the four sequences; gray boxes indicate similar residues. Expression of CG7404 as mRNA and the splicing pattern have been confirmed by the recovery and sequencing of cDNA (K.K. <it>et al</it>., unpublished data).</p>
            </text>
            <graphic file="gb-2001-2-8-research0029-3"/>
         </fig>
         <p>Clearly, dramatically divergent evolutionary pathways have shaped the NR sets in separate phylogenetic lineages. Within the six major NR subfamilies, four groups of NRs are currently known only in vertebrates (thyroid hormone receptors (TR), peroxisome proliferator-activated receptors (PPAR), retinoic acid receptors (RAR) and the steroid receptor group containing the glucocorticoid receptor (GR), mineralocorticoid receptor (MR), progesterone receptor (PR) and androgen receptor (AR)). In addition, both invertebrate genomes encode NRs that are not clearly placed in one of the six defined NR subfamilies (Figure <figr fid="F2">2</figr>). As previously noted [<abbr bid="B13">13</abbr>], the three <it>Drosophila</it> members of the Knirps group define an unusual class of NRs that lack similarity to the LBDs of the vertebrate NRs. Most of the <it>C. elegans</it> NRs (255 of 270) are diverged from those found in humans and flies (Figure <figr fid="F2">2</figr>). It is unclear whether these divergent nematode NRs represent new subfamilies [<abbr bid="B18">18</abbr>] or are highly diverged members of one or more of the six recognized subfamilies. In contrast to the situation with the insect Knirps group, analyses of potential structures indicate that the majority of the divergent <it>C. elegans</it> receptors are predicted to contain the canonical antiparallel &#945;-helix sandwich structure characteristic of ligand-regulated NRs (A. Bogan, C. Maina, J.-M. Chandonia, F. Cohen, K. Yamamoto, and A. Sluder, unpublished data). Thus, despite the extensive sequence diversity of many <it>C. elegans</it> LBDs, they are unified by a common structural fold, possibly reflecting the requirement for interaction with a core set of NR cofactors [<abbr bid="B19">19</abbr>]. Since the known structures of NR LBDs contain a hydrophobic cleft in which their endogenous hormone ligands are bound, it is likely that most, if not all, orphan receptors will be amenable to modulation by small molecules.</p>
         <p>In sum, we have found a striking difference between humans, <it>Drosophila</it> and <it>C. elegans</it> with respect to their NR sets. There remains a small possibility that the last 5% of the human genome sequence could harbor an additional novel NR sequence, but this is unlikely given that this 5% is enriched in repetitive heterochromatic sequence. Such a finding would not change the general conclusion that there are striking differences between the three genomes. Knowledge of all the members of each NR set defines the unique landscape for NR modulation and provides a basis for more detailed phylogenetic studies. Furthermore, such a whole-genome comparison of the types and numbers of genes reflects only one level of NR complexity in an organism. The impact of transcriptional and post-transcriptional processing events on total NR functional diversity in each proteome will be a subject for future studies.</p>
      </sec>
      <sec>
         <st>
            <p>Materials and methods</p>
         </st>
         <sec>
            <st>
               <p>Computational identification of human NR sequences</p>
            </st>
            <p>Protein homology searches were performed using individual members of the NR family compared against the human genome database (HTG section from GenBank) using TBLASTN [<abbr bid="B8">8</abbr>]. Results of all sequence hits were stored in an in-house Oracle database. Homologous sequences (those with a BLAST expectation value of 0.1 or lower) were then compared with the entire query NR data set using BLASTN [<abbr bid="B8">8</abbr>]. Using an automated criterion of 95% sequence identity over 200 base pairs (bp) at the nucleotide level, we could map these hits to individual members of the protein family ('bins'). Those sequence hits that could not automatically be placed in bins on the basis of the selected criteria were either mapped to other protein families (if their sequences matched a known gene in GenBank), considered as potentially novel NR sequences (when key domains were present) or deemed irrelevant (no match to anything). We then applied Hidden Markov Models (HMMs) built on the DBD and LBD domains using the HMMER package [<abbr bid="B9">9</abbr>] to these potential novel sequences. The HMMs were used to further substantiate or to rule out potential novel hits.</p>
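            <p>The binning rules described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the original Perl implementation: the Hit record, its field names and the classify function are hypothetical, and the 95% and 200 bp values are treated here as lower bounds.</p>
            <p><![CDATA[
```python
from dataclasses import dataclass
from typing import Optional

# Thresholds taken from the text; treating them as inclusive bounds is an assumption.
E_VALUE_CUTOFF = 0.1     # protein-level hits above this are not considered homologous
MIN_IDENTITY = 95.0      # percent nucleotide identity required for binning
MIN_ALIGNED_BP = 200     # aligned length (bp) required for binning


@dataclass
class Hit:
    """One genomic hit; all field names are hypothetical."""
    hit_id: str
    e_value: float               # expectation value from the protein-level search
    best_nr: Optional[str]       # closest known NR in the nucleotide comparison
    nt_identity: float           # percent identity of that nucleotide alignment
    nt_aligned_bp: int           # length of that alignment in base pairs
    known_gene: Optional[str]    # matching known non-NR gene, if any
    has_key_domain: bool         # True if a DBD or LBD was detected


def classify(hit: Hit) -> str:
    """Assign a hit to a bin, another family, the candidate list, or discard it."""
    if hit.e_value > E_VALUE_CUTOFF:
        return "not_homologous"
    if (hit.best_nr is not None
            and hit.nt_identity >= MIN_IDENTITY
            and hit.nt_aligned_bp >= MIN_ALIGNED_BP):
        return f"bin:{hit.best_nr}"
    if hit.known_gene is not None:
        return f"other_family:{hit.known_gene}"
    if hit.has_key_domain:
        return "candidate_novel_nr"
    return "irrelevant"


if __name__ == "__main__":
    # A divergent hit with NR domains but low nucleotide identity (toy values).
    print(classify(Hit("toy_hit", 1e-5, "NR1H4", 48.0, 300, None, True)))
```
]]></p>
            <p>In this sketch only hits labelled candidate_novel_nr would be passed on to the HMM-based domain analysis.</p>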
            <p>Cross-species comparisons were performed using the following analyses. First, the HMM for the NR DBD (zf-C4.hmm from Pfam) was used to search the <it>C. elegans</it> protein database (assembled in-house from GenBank) with the HMMER package [<abbr bid="B9">9</abbr>]; this search identified 252 <it>C. elegans</it> sequences. These 252 <it>C. elegans</it> sequences were compared to the human genomic sequence database by the search strategy described above. No novel human NRs were found from these searches. Second, the above process was repeated for the <it>Drosophila</it> genome and again no novel human NR sequences were found. Third, the majority of the <it>C. elegans</it> NRs were grouped into ten subfamilies based on comparative sequence analysis of the DNA-binding domain sequences (A.S., data not shown). These subfamilies were used to build ten HMMs, which were then used to search the database of predicted <it>Drosophila</it> proteins (established in-house by extraction from GenBank) with the HMMER package. The significant hits (score &lt; 0.1) were compared with the 21 known <it>Drosophila</it> NRs. No novel <it>Drosophila</it> NR sequences were found.</p>
            <p>With each new update of the human genomic database, the system initiated a new round of searching and filtering, capturing all sequences that were homologous and using a master bin list to filter out sequences that had been annotated in a previous cycle. This eliminated the need to separate out all new sequences in the database at each update. The analysis engine was written in Perl. A Java applet was used as the user's main interface to the analysis engine and the underlying Oracle database. It was embedded in an HTML page from which users could view the sequence record by linking to the in-house and public databases as well as all the sequence and domain alignments. As validation of our bioinformatic search tools, 48 of the 49 known NR sequences in the genome (including the ERR pseudogene) were found. The gene representing the CAR NR (NR1I3) is still absent from high-throughput genomic sequence databases.</p>
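            <p>The update cycle described above, in which a master bin list screens out previously annotated sequences, can be illustrated with a short Python sketch. The function name filter_new_hits and the use of plain string identifiers are assumptions made for illustration; the sketch also records a hit as annotated as soon as it is seen, which simplifies whatever review step the original system used.</p>
            <p><![CDATA[
```python
from typing import Iterable, List, Set


def filter_new_hits(current_hits: Iterable[str], annotated: Set[str]) -> List[str]:
    """Return hits not seen in an earlier cycle and add them to the master list.

    `current_hits` holds hit identifiers from the latest search round and
    `annotated` is the master bin list carried over between rounds; both
    names are hypothetical.
    """
    fresh: List[str] = []
    for hit_id in current_hits:
        if hit_id not in annotated:
            fresh.append(hit_id)      # needs annotation in this cycle
            annotated.add(hit_id)     # will be filtered out next time
    return fresh


if __name__ == "__main__":
    master = {"hitA"}                                   # annotated earlier
    print(filter_new_hits(["hitA", "hitB"], master))    # only hitB is new
```
]]></p>
            <p>Because every round re-searches the whole database and filters afterwards, no bookkeeping of which database records are new is needed, which matches the rationale given above.</p>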
         </sec>
         <sec>
            <st>
               <p>Genomic DNA analysis of novel human NR sequences</p>
            </st>
            <p>Confirmation of the <it>in silico</it> nucleotide sequences was performed by sequencing <it>FXR-r</it> and <it>HNF4&#947;-r</it> genomic DNA fragments amplified by PCR. PCR primers were designed using GenBank accession numbers AL358372 (<it>FXR-r</it>) and AL138997 (<it>HNF4&#947;-r</it>). <it>FXR-r:</it> 5'-GTT GAG TGG AAA TGT GAG AG -3' and 5'-GTA GGA ATG TTG GCA GAA TG -3', product size 566 bp; <it>HNF4&#947;-r:</it> 5'-AGC TTG ATT TGT AGC TGT TC -3' and 5'-TCC TCT CTG GGG CAG AA-3', product size 1,205 bp. These primers were used to amplify 100 ng of Clontech genomic DNA with Amplitaq Gold (PE Biosystems) under standard PCR amplification conditions. DNA sequencing was performed by the GlaxoSmithKline core sequencing facility.</p>
         </sec>
         <sec>
            <st>
               <p>mRNA expression analysis</p>
            </st>
            <p>Human tissues were derived from snap frozen surgical specimens (Duke University, Durham, NC). Total RNA was extracted using Qiagen RNeasy kits according to the supplier's protocol. Some total RNAs were obtained from commercial vendors such as Clontech (Palo Alto, CA) and Zen-Bio (Research Triangle Park, NC). To maximize the sensitivity of the quantitation of transcripts, it was critical to ensure that there was no contaminating genomic DNA. The RNA was treated with DNase using Ambion RNase-free DNase. Effectiveness of the DNase treatment was confirmed by running PCR with human &#946;-actin primers in the absence of reverse transcriptase, to ensure no signal was detected. The RNA was then quantitated using the RNA-specific Ribogreen dye (Molecular Probes), and 25 ng aliquoted into a 96-well plate format. To quantitate mRNA expression, RTQ-PCR was performed using an ABI PRISM 7700 Sequence Detection System instrument and software (PE Applied Biosystems) as described [<abbr bid="B20">20</abbr>,<abbr bid="B21">21</abbr>]. Gene-specific primers were synthesized (Keystone, Camarillo, CA) for the sequences <it>FXR-r</it> (accession number AL358372) and <it>HNF4&#947;-r</it> (accession number AL138997).</p>
         </sec>
      </sec>
   </bdy>
   <bm>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Comparative genomics of the eukaryotes.</p>
            </title>
            <aug>
               <au>
                  <snm>Rubin</snm>
                  <fnm>GM</fnm>
               </au>
               <au>
                  <snm>Yandell</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Wortman</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Gabor Miklos</snm>
                  <fnm>GL</fnm>
               </au>
               <au>
                  <snm>Nelson</snm>
                  <fnm>CR</fnm>
               </au>
               <au>
                  <snm>Hariharan</snm>
                  <fnm>IK</fnm>
               </au>
               <au>
                  <snm>Fortini</snm>
                  <fnm>ME</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>PW</fnm>
               </au>
               <au>
                  <snm>Apweiler</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Fleischmann</snm>
                  <fnm>W</fnm>
               </au>
               <etal/>
            </aug>
            <source>Science</source>
            <pubdate>2000</pubdate>
            <volume>287</volume>
            <fpage>2204</fpage>
            <lpage>2215</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.287.5461.2204</pubid>
                  <pubid idtype="pmpid" link="fulltext">10731134</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Analysis of the genome sequence of the flowering plant <it>Arabidopsis thaliana</it>.</p>
            </title>
            <aug>
               <au>
                  <cnm>The Arabidopsis Genome Initiative</cnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2000</pubdate>
            <volume>408</volume>
            <fpage>796</fpage>
            <lpage>815</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35048692</pubid>
                  <pubid idtype="pmpid" link="fulltext">11130711</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Orphan nuclear receptors: from gene to function.</p>
            </title>
            <aug>
               <au>
                  <snm>Giguere</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>Endocr Rev</source>
            <pubdate>1999</pubdate>
            <volume>20</volume>
            <fpage>689</fpage>
            <lpage>725</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10529899</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Nuclear receptor coregulators: cellular and molecular biology.</p>
            </title>
            <aug>
               <au>
                  <snm>McKenna</snm>
                  <fnm>NJ</fnm>
               </au>
               <au>
                  <snm>Lanz</snm>
                  <fnm>RB</fnm>
               </au>
               <au>
                  <snm>O'Malley</snm>
                  <fnm>BW</fnm>
               </au>
            </aug>
            <source>Endocr Rev</source>
            <pubdate>1999</pubdate>
            <volume>20</volume>
            <fpage>321</fpage>
            <lpage>344</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10368774</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>The nuclear receptor superfamily has undergone extensive proliferation and diversification in nematodes.</p>
            </title>
            <aug>
               <au>
                  <snm>Sluder</snm>
                  <fnm>AE</fnm>
               </au>
               <au>
                  <snm>Mathews</snm>
                  <fnm>SW</fnm>
               </au>
               <au>
                  <snm>Hough</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Yin</snm>
                  <fnm>VP</fnm>
               </au>
               <au>
                  <snm>Maina</snm>
                  <fnm>CV</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>1999</pubdate>
            <volume>9</volume>
            <fpage>103</fpage>
            <lpage>120</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10022975</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Nematode genome sequence dramatically extends the nuclear receptor superfamily.</p>
            </title>
            <aug>
               <au>
                  <snm>Enmark</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Gustafsson</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Trends Pharmacol Sci</source>
            <pubdate>2000</pubdate>
            <volume>21</volume>
            <fpage>85</fpage>
            <lpage>87</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0165-6147(99)01417-0</pubid>
                  <pubid idtype="pmpid" link="fulltext">10689360</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>The genome sequence of <it>Drosophila melanogaster</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Adams</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Celniker</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Holt</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Evans</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Gocayne</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Amanatides</snm>
                  <fnm>PG</fnm>
               </au>
               <au>
                  <snm>Scherer</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>PW</fnm>
               </au>
               <au>
                  <snm>Hoskins</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Galle</snm>
                  <fnm>RF</fnm>
               </au>
               <etal/>
            </aug>
            <source>Science</source>
            <pubdate>2000</pubdate>
            <volume>287</volume>
            <fpage>2185</fpage>
            <lpage>2195</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.287.5461.2185</pubid>
                  <pubid idtype="pmpid" link="fulltext">10731132</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.</p>
            </title>
            <aug>
               <au>
                  <snm>Altschul</snm>
                  <fnm>SF</fnm>
               </au>
               <au>
                  <snm>Madden</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Schaffer</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Lipman</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1997</pubdate>
            <volume>25</volume>
            <fpage>3389</fpage>
            <lpage>3402</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/25.17.3389</pubid>
                  <pubid idtype="pmpid" link="fulltext">9254694</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Hidden Markov models.</p>
            </title>
            <aug>
               <au>
                  <snm>Eddy</snm>
                  <fnm>SR</fnm>
               </au>
            </aug>
            <source>Curr Opin Struct Biol</source>
            <pubdate>1996</pubdate>
            <volume>6</volume>
            <fpage>361</fpage>
            <lpage>365</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0959-440X(96)80056-X</pubid>
                  <pubid idtype="pmpid" link="fulltext">8804822</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Identification of a conserved region required for hormone dependent transcriptional activation by steroid hormone receptors.</p>
            </title>
            <aug>
               <au>
                  <snm>Danielian</snm>
                  <fnm>PS</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Lees</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Parker</snm>
                  <fnm>MG</fnm>
               </au>
            </aug>
            <source>EMBO J</source>
            <pubdate>1992</pubdate>
            <volume>11</volume>
            <fpage>1025</fpage>
            <lpage>1033</lpage>
            <xrefbib>
               <pubid idtype="pmpid">1372244</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Generation of processed pseudogenes in murine cells.</p>
            </title>
            <aug>
               <au>
                  <snm>Tchenio</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Segal-Bendirdjian</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Heidmann</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>EMBO J</source>
            <pubdate>1993</pubdate>
            <volume>12</volume>
            <fpage>1487</fpage>
            <lpage>1497</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8385606</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Chromosomal mapping of the human and murine orphan receptors ERRalpha (ESRRA) and ERRbeta (ESRRB) and identification of a novel human ERRalpha-related pseudogene.</p>
            </title>
            <aug>
               <au>
                  <snm>Sladek</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Beatty</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Squire</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Copeland</snm>
                  <fnm>NG</fnm>
               </au>
               <au>
                  <snm>Gilbert</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Jenkins</snm>
                  <fnm>NA</fnm>
               </au>
               <au>
                  <snm>Giguere</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>Genomics</source>
            <pubdate>1997</pubdate>
            <volume>45</volume>
            <fpage>320</fpage>
            <lpage>326</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/geno.1997.4939</pubid>
                  <pubid idtype="pmpid" link="fulltext">9344655</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Evolution of the nuclear receptor gene superfamily.</p>
            </title>
            <aug>
               <au>
                  <snm>Laudet</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Hanni</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Coll</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Catzeflis</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Stehelin</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>EMBO J</source>
            <pubdate>1992</pubdate>
            <volume>11</volume>
            <fpage>1003</fpage>
            <lpage>1013</lpage>
            <xrefbib>
               <pubid idtype="pmpid">1312460</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Evolution of the nuclear receptor superfamily: early diversification from an ancestral orphan receptor.</p>
            </title>
            <aug>
               <au>
                  <snm>Laudet</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>J Mol Endocrinol</source>
            <pubdate>1997</pubdate>
            <volume>19</volume>
            <fpage>207</fpage>
            <lpage>226</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9460643</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Ligand binding was acquired during evolution of nuclear receptors.</p>
            </title>
            <aug>
               <au>
                  <snm>Escriva</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Safi</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hanni</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Langlois</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Saumitou-Laprade</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Stehelin</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Capron</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Pierce</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Laudet</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1997</pubdate>
            <volume>94</volume>
            <fpage>6803</fpage>
               <lpage>6808</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">21239</pubid>
                  <pubid idtype="pmpid" link="fulltext">9192646</pubid>
                  <pubid idtype="doi">10.1073/pnas.94.13.6803</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Evidence for a clade of nematodes, arthropods and other moulting animals.</p>
            </title>
            <aug>
               <au>
                  <snm>Aguinaldo</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Turbeville</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Linford</snm>
                  <fnm>LS</fnm>
               </au>
               <au>
                  <snm>Rivera</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Garey</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Raff</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Lake</snm>
                  <fnm>JA</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>1997</pubdate>
            <volume>387</volume>
            <fpage>489</fpage>
            <lpage>493</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/387489a0</pubid>
                  <pubid idtype="pmpid">9168109</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Evolution.</p>
            </title>
            <aug>
               <au>
                  <snm>Fitch</snm>
                  <fnm>DHA</fnm>
               </au>
               <au>
                  <snm>Thomas</snm>
                  <fnm>WK</fnm>
               </au>
            </aug>
            <source>In C. elegans II. Edited by Riddle DL, Blumenthal T, Meyer BJ and Priess JR. Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press, 1997</source>
            <fpage>815</fpage>
            <lpage>850</lpage>
         </bibl>
         <bibl id="B18">
            <title>
               <p>A unified nomenclature system for the nuclear receptor superfamily.</p>
            </title>
            <aug>
               <au>
                  <cnm>The Nuclear Receptors Committee</cnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>1999</pubdate>
            <volume>97</volume>
            <fpage>161</fpage>
            <lpage>163</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10219237</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Nuclear hormone receptor coregulators in action: diversity for shared tasks.</p>
            </title>
            <aug>
               <au>
                  <snm>Robyr</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Wolffe</snm>
                  <fnm>AP</fnm>
               </au>
               <au>
                  <snm>Wahli</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>Mol Endocrinol</source>
            <pubdate>2000</pubdate>
            <volume>14</volume>
            <fpage>329</fpage>
            <lpage>347</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10707952</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>A novel method for real time quantitative RT-PCR.</p>
            </title>
            <aug>
               <au>
                  <snm>Gibson</snm>
                  <fnm>UE</fnm>
               </au>
               <au>
                  <snm>Heid</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Williams</snm>
                  <fnm>PM</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>1996</pubdate>
            <volume>6</volume>
            <fpage>995</fpage>
            <lpage>1001</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8908519</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Real time quantitative PCR.</p>
            </title>
            <aug>
               <au>
                  <snm>Heid</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Stevens</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Livak</snm>
                  <fnm>KJ</fnm>
               </au>
               <au>
                  <snm>Williams</snm>
                  <fnm>PM</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>1996</pubdate>
            <volume>6</volume>
            <fpage>986</fpage>
            <lpage>994</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8908518</pubid>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
