<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
	<ui>gb-2004-5-4-105</ui>
	<ji>GBJ</ji>
	<fm>
		<dochead>Opinion</dochead>
		<bibl>
			<title>
				<p>The mammalian transcriptome and the function of non-coding DNA sequences</p>
			</title>
			<aug>
				<au id="A1" ca="yes">
					<snm>Shabalina</snm>
					<mi>A</mi>
					<fnm>Svetlana</fnm>
					<insr iid="I1"/>
					<email>shabalin@ncbi.nlm.nih.gov</email>
				</au>
				<au id="A2">
					<snm>Spiridonov</snm>
					<mi>A</mi>
					<fnm>Nikolay</fnm>
					<insr iid="I2"/>
				</au>
			</aug>
			<insg>
				<ins id="I1">
					<p>National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA</p>
				</ins>
				<ins id="I2">
					<p>Division of Therapeutic Proteins, Center for Drug Evaluation and Research, US Food and Drug Administration, Bethesda MD 20892, USA</p>
				</ins>
			</insg>
			<source>Genome Biology</source>
			<issn>1465-6906</issn>
			<pubdate>2004</pubdate>
			<volume>5</volume>
			<issue>4</issue>
			<fpage>105</fpage>
			<url>http://genomebiology.com/2004/5/4/105</url>
			<xrefbib>
				<pubid idtype="pmpid">15059247</pubid>
			</xrefbib>
		</bibl>
		<history>
			<pub>
				<date>
					<day>25</day>
					<month>3</month>
					<year>2004</year>
				</date>
			</pub>
		</history>
		<cpyrt>
			<year>2004</year>
			<collab>BioMed Central Ltd</collab>
		</cpyrt>
		<shorttitle>
			<p>The mammalian transcriptome and the function of non-coding DNA sequences</p>
		</shorttitle>
		<shortabs>
			<p>Many non-coding sequences transcribed from the mammalian genome are proving to have important regulatory roles, but the functions of the majority remain mysterious.</p>
		</shortabs>
		<abs>
			<sec>
				<st>
					<p>Abstract</p>
				</st>
				<p>For decades, researchers have focused most of their attention on protein-coding genes and proteins. With the completion of the human and mouse genomes and the accumulation of data on the mammalian transcriptome, the focus now shifts to non-coding DNA sequences, RNA-coding genes and their transcripts. Many non-coding transcribed sequences are proving to have important regulatory roles, but the functions of the majority remain mysterious.</p>
			</sec>
		</abs>
	</fm>
	<meta>
		<classifications>
			<classification type="BMC" subtype="man_spc_id" id="30010010">Genome studies</classification>
			<classification type="BMC" subtype="man_spc_id" id="30010016">Molecular biology</classification>
			<classification type="BMC" subtype="man_spc_id" id="30010009">Genetics</classification>
			<classification type="BMC" subtype="man_spc_id" id="30010015">Model organisms</classification>
		</classifications>
	</meta>
	<bdy>
		<sec>
			<st>
				<p/>
			</st>
			<p>Initial understanding of the profound differences between the mammalian proteome and the underlying transcriptome emerged in the 1970s, with the discovery of RNA splicing and of the complex intron-exon structure of primary RNA transcripts. The importance of non-coding DNA was not readily accepted by the scientific community at the time of the discovery of the seemingly wasteful mechanisms of RNA processing, whereby most of the primary transcript is edited out of the mature messenger RNA <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. As the biological function of non-protein-coding DNA sequences was not understood, the term 'junk DNA' was coined and applied to most of the mammalian genome. The historic achievement of sequencing and annotating the complete human genome has revealed the complex landscape of mammalian non-coding DNA <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. The subsequent sequencing of the complete genome of the mouse has not only provided a genetic platform for biomedical studies on this model mammal, but also promoted better understanding of the human genome through detailed comparative analysis <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>.</p>
			<p>Large-scale sequencing and initial analysis of mouse and human cDNA libraries has provided the first in-depth look into the mammalian transcriptome <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr></abbrgrp>. An assembly of the rat genome is also now available online <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>. These accomplishments allow researchers to address some unanswered questions using genome-wide comparisons. How many genes - separately regulated transcriptional units, encoding distinct transcripts - are there in the mammalian genome, and what is the proportion of the protein-coding and non-coding genes? What part of the mammalian genome is transcribed? What is the function of non-coding RNA transcripts and non-coding DNA regions? And what structural elements in the genomes of mammals are responsible for the increased complexity of mammals relative to other organisms?</p>
		</sec>
		<sec>
			<st>
				<p>The transcribed part of the mammalian genome</p>
			</st>
			<p>Early estimations of the level of transcription in mammals were based on the hybridization of primary nuclear transcripts to genomic DNA. The major part of the mammalian genome was found to be expressed as nuclear transcripts, from one strand or the other. Hybridization experiments demonstrated that, in rat embryos, primary nuclear transcripts contained both unique and moderately repetitive sequences transcribed from 32.8% and 32.9% of genomic DNA, respectively <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>. The most transcriptionally active rat tissue is the adult brain, where transcribed unique and moderately repetitive DNA represent 46.6% and 13.7% of the whole genome, respectively <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>. Similar results were obtained for mouse brain tissues, where 42% of the genome represented by unique sequences was found to be transcribed <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>.</p>
			<p>Maximal transcription levels are difficult to measure with hybridization experiments because not all genes may be expressed under particular physiological conditions, and also because of difficulties in the isolation of rare transcripts. Experimental determination of the transcribed part of the well-annotated genomes of <it>Escherichia coli </it>(73% <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>) and <it>Saccharomyces cerevisiae </it>(40% <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>) yielded smaller numbers than calculations based on genomic annotation for the same species (88.6% <abbrgrp><abbr bid="B12">12</abbr></abbrgrp> and 78% <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>, respectively; see Figure <figr fid="F1">1</figr>). According to a recent detailed analysis of the length of sequence occupied by the annotated genes on several chromosomes in the human genome, primary transcripts cover 42.2%, 46.5%, 43.6%, 42.4% and 51% of chromosomes 6, 7, 14, 20 and 22, respectively (reviewed in <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>). But these numbers do not represent the full transcriptional potential of the human genome.</p>
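			<p>As an illustration only, an annotation-based estimate of the transcribed fraction of a chromosome can be sketched as the merged length of annotated gene spans divided by the chromosome length. The function name, the coordinate convention (0-based, end-exclusive) and the toy coordinates below are assumptions made for this sketch; they are not taken from the cited analyses.</p>

```python
def transcribed_fraction(gene_spans, chromosome_length):
    """Return the fraction of a chromosome covered by annotated transcripts.

    gene_spans: iterable of (start, end) pairs, 0-based, end exclusive.
    Overlapping spans (for example genes on opposite strands) are merged
    so that no base is counted twice.
    """
    covered = 0
    current_start, current_end = None, None
    for start, end in sorted(gene_spans):
        if current_end is None or start > current_end:
            # Disjoint from the running span: bank it and start a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent: extend the running span.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / chromosome_length


# Hypothetical toy annotation on a 1,000-base chromosome.
toy_spans = [(100, 300), (250, 400), (600, 700)]
toy_fraction = transcribed_fraction(toy_spans, 1000)
print(f"{toy_fraction:.1%}")
```

			<p>Merging overlapping spans before summing matters because annotated genes can overlap, and counting the shared bases twice would inflate the estimate.</p>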
			<fig id="F1">
				<title>
					<p>Figure 1</p>
				</title>
				<caption>
					<p>Ratios of the protein-coding, non-coding, and untranscribed sequences in bacterial, yeast, nematode and mammalian genomes</p>
				</caption>
				<text>
					<p>Ratios of the protein-coding, non-coding, and untranscribed sequences in bacterial, yeast, nematode and mammalian genomes. Estimations of the transcribed and protein-coding parts of genomes are based on the sequence length of annotated genes <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr><abbr bid="B73">73</abbr></abbrgrp>. Estimation of the transcribed portion of the human genome is based on the sequence length occupied by the annotated genes on chromosomes 6, 7, 14, 20, and 22 <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>.</p>
				</text>
				<graphic file="gb-2004-5-4-105-1"/>
			</fig>
			<p>The annotation of the human genome mostly comprises data on identified protein-coding genes, while a substantial part of the transcriptome has not yet been identified and annotated. Whole-chromosome analysis with oligonucleotide arrays has demonstrated that the level of transcription from human chromosomes 21 and 22 is significantly higher than can be accounted for by known or predicted sequence annotations <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. The unmapped part of the mammalian transcriptome may contain numerous non-protein-coding genes, as evidenced by the high proportion of non-protein-coding transcripts in human and mouse cDNA libraries <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B6">6</abbr></abbrgrp>. Estimations of the relative complexity of heterogeneous nuclear (hn) RNA versus mature mRNA, based on analysis of the kinetics of hybridization, suggest that non-protein-coding transcripts could represent half, or more, of all transcriptional output from the genomes of eukaryotic organisms <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr></abbrgrp>. We might expect that in mammals about half of the genome is transcribed.</p>
			<p>The question of how many genes there are in the mammalian genome remains open. Pregenomic estimates of the number of human genes ranged between 30,000 and 120,000 <abbrgrp><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr></abbrgrp>. Recent analysis of the mouse transcriptome on the basis of annotation of full-length cDNA collections enabled identification of 33,409 unique full-length transcripts, with an estimated total number of independent transcriptional units in the mouse genome of around 70,000 <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. Large-scale annotation of the human genome with the UniGene assembly of individual expressed sequence tags (ESTs) and cDNAs revealed 59,500 nonredundant clusters representing putative transcriptional units <abbrgrp><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr></abbrgrp>. Thus, the total number of genes in the mammalian transcriptome could be as high as 60,000-70,000.</p>
			<sec>
				<st>
					<p>Protein-coding genes and their untranslated regions</p>
				</st>
				<p>A detailed inventory of the protein-coding genes was made upon the completion of the human and mouse genome projects <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. Overall, the mouse proteome is similar to that of the human, and about 99% of the mouse protein-coding genes have a homolog in the human genome <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. The number of protein-coding genes in the mammalian genome was calculated on the basis of known cDNAs and genes predicted by similarity to protein-coding genes in other organisms, and was extended by computer predictions that are supported by experimental evidence such as ESTs. Catalogs of human and mouse protein-coding genes contain slightly more than 22,000 genes for each species <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. Recent large-scale sequencing and analysis of the large Japanese collection of human cDNA clones added around 2,000 more new sequences to the human protein catalog <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. Current approaches to gene identification are likely to miss a substantial number of small genes, such as those encoding neuropeptides, antimicrobial peptides, and small adaptor and regulatory proteins. Taking into account the small genes that have yet to be discovered, the total number of protein-coding genes in the mammalian genome is estimated to be around 30,000 <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. This upper estimate is still surprisingly close to the number of protein-coding genes in the nematode genome (Table <tblr tid="T1">1</tblr>). The average size of mammalian protein-coding genes far exceeds the average size of the nematode and yeast protein-coding genes, however, mostly on account of the increased length of introns.</p>
				<tbl id="T1">
					<title>
						<p>Table 1</p>
					</title>
					<caption>
						<p>General features of bacterial, yeast, nematode and mammalian genomes</p>
					</caption>
					<tblbdy cols="7">
						<r>
							<c ca="left">
								<p>Species</p>
							</c>
							<c ca="center">
								<p>Genome size (Mbp)</p>
							</c>
							<c ca="center">
								<p>Repetitive sequences (%)</p>
							</c>
							<c ca="center">
								<p>Transcriptional units</p>
							</c>
							<c ca="center">
								<p>Protein-coding genes</p>
							</c>
							<c ca="center">
								<p>Introns</p>
							</c>
							<c ca="center">
								<p>References</p>
							</c>
						</r>
						<r>
							<c cspan="7">
								<hr/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>
									<it>Escherichia coli</it>
								</p>
							</c>
							<c ca="center">
								<p>4.6</p>
							</c>
							<c ca="center">
								<p>0.7</p>
							</c>
							<c ca="center">
								<p>5,471</p>
							</c>
							<c ca="center">
								<p>4,288</p>
							</c>
							<c>
								<p/>
							</c>
							<c ca="center">
								<p>
									<abbrgrp>
										<abbr bid="B12">12</abbr>
										<abbr bid="B74">74</abbr>
									</abbrgrp>
								</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>
									<it>Saccharomyces cerevisiae</it>
								</p>
							</c>
							<c ca="center">
								<p>12</p>
							</c>
							<c ca="center">
								<p>3.2</p>
							</c>
							<c ca="center">
								<p>6,682</p>
							</c>
							<c ca="center">
								<p>6,183</p>
							</c>
							<c ca="center">
								<p>233</p>
							</c>
							<c ca="center">
								<p>
									<abbrgrp>
										<abbr bid="B13">13</abbr>
									</abbrgrp>
								</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>
									<it>Caenorhabditis elegans</it>
								</p>
							</c>
							<c ca="center">
								<p>100.3</p>
							</c>
							<c ca="center">
								<p>16.5</p>
							</c>
							<c ca="center">
								<p>19,646</p>
							</c>
							<c ca="center">
								<p>18,808</p>
							</c>
							<c ca="center">
								<p>99,237</p>
							</c>
							<c ca="center">
								<p>
									<abbrgrp>
										<abbr bid="B73">73</abbr>
									</abbrgrp>
								</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>
									<it>Caenorhabditis briggsae</it>
								</p>
							</c>
							<c ca="center">
								<p>104</p>
							</c>
							<c ca="center">
								<p>22.4</p>
							</c>
							<c ca="center">
								<p>20,469</p>
							</c>
							<c ca="center">
								<p>19,507</p>
							</c>
							<c ca="center">
								<p>94,832</p>
							</c>
							<c ca="center">
								<p>
									<abbrgrp>
										<abbr bid="B73">73</abbr>
									</abbrgrp>
								</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>
									<it>Mus musculus</it>
								</p>
							</c>
							<c ca="center">
								<p>2,500</p>
							</c>
							<c ca="center">
								<p>40</p>
							</c>
							<c ca="center">
								<p>33,409</p>
							</c>
							<c ca="center">
								<p>22,011</p>
							</c>
							<c ca="center">
								<p>191,500</p>
							</c>
							<c ca="center">
								<p>
									<abbrgrp>
										<abbr bid="B3">3</abbr>
										<abbr bid="B4">4</abbr>
									</abbrgrp>
								</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>
									<it>Homo sapiens</it>
								</p>
							</c>
							<c ca="center">
								<p>2,900</p>
							</c>
							<c ca="center">
								<p>44</p>
							</c>
							<c ca="center">
								<p>25,003</p>
							</c>
							<c ca="center">
								<p>22,808</p>
							</c>
							<c ca="center">
								<p>177,000</p>
							</c>
							<c ca="center">
								<p>
									<abbrgrp>
										<abbr bid="B3">3</abbr>
										<abbr bid="B5">5</abbr>
										<abbr bid="B14">14</abbr>
									</abbrgrp>
								</p>
							</c>
						</r>
					</tblbdy>
				</tbl>
				<sec>
					<st>
						<p>The 5' and 3' untranslated regions</p>
					</st>
					<p>Gene expression in eukaryotic organisms is tightly controlled at various levels, and critical <it>cis</it>-regulatory elements for posttranscriptional control are encoded in the 5' and 3' untranslated regions (UTRs). On average, 5' and 3' UTRs are less conserved than protein-coding sequences across species, but more conserved than untranscribed sequences <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr></abbrgrp>. Highly conserved nucleotide blocks have been detected in 5' UTRs and, especially, in the 3' UTRs of orthologous genes from different mammalian orders, and even between mammals and birds or fish <abbrgrp><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr></abbrgrp>. For some genes, the conservation of UTRs exceeds the conservation of the corresponding coding regions <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. Many conserved sequence elements in UTRs have been identified as binding sites for proteins or antisense RNAs, which contribute to the regulation of nucleocytoplasmic transport, subcellular localization, translation and the stability of mRNAs <abbrgrp><abbr bid="B28">28</abbr><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr><abbr bid="B31">31</abbr></abbrgrp>. The nucleotide context around the principal functional signals, such as start and stop codons, is also an important determinant of expression level <abbrgrp><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr></abbrgrp>.</p>
					<p>According to the current, scanning model of translation initiation, the eukaryotic ribosome binds to the 5'-terminal cap of an mRNA and starts scanning the mRNA until it detects the first AUG start codon, where it initiates translation <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr></abbrgrp>. The 5' UTRs contain binding sites for components of multiprotein transcription complexes and also participate in the recruitment of the 40S ribosomal subunit and translation initiation. The length of 5' UTRs, and the presence of additional upstream start codons, may be important for regulating the basal translation level of an mRNA. It has been shown that transcripts with an optimal start codon context tend to have shorter 5' UTRs, whereas an increased length of 5' UTR correlates with a 'weak' start codon context and with the presence of additional upstream start codons <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>. A reduced level of basal translation also correlates with the presence of minor open reading frames located within 5' UTRs and upstream of the main start codon in some genes. Other sequence elements within 5' UTRs act as internal ribosome entry sites (IRESs); these elements have been found in many cellular mRNAs encoding regulatory proteins <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>.</p>
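					<p>The scanning model described above can be illustrated with a minimal sketch that reads an mRNA from its 5' end, reports the first AUG, and lists any AUG triplets lying upstream of an annotated main start codon. The function names and the toy sequence are hypothetical, and the sketch deliberately ignores start codon context, secondary structure and IRES-mediated initiation.</p>

```python
def scan_for_start(mrna):
    """Return the index of the first AUG reading 5' to 3', or -1 if none."""
    return mrna.upper().replace("T", "U").find("AUG")


def upstream_augs(mrna, main_start):
    """List positions of AUG triplets that begin 5' of the main start codon."""
    seq = mrna.upper().replace("T", "U")
    return [i for i in range(main_start) if seq[i:i + 3] == "AUG"]


# Hypothetical toy transcript: a short upstream reading frame (AUG GCU UAA)
# precedes the annotated main start codon at index 17.
toy_mrna = "GCAUGGCUUAAGCCACCAUGGCGUGA"
main_start = 17

first_aug = scan_for_start(toy_mrna)
extra_starts = upstream_augs(toy_mrna, main_start)
print(first_aug, extra_starts)
```

					<p>In this toy case the first AUG met by a scanning ribosome is not the annotated main start, which is the situation the text associates with a reduced level of basal translation.</p>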
					<p>It is widely accepted that 3' UTRs play crucial roles in transcript cleavage, polyadenylation and nuclear export, and in regulating the level of transcription and the stability of transcripts. The 3' UTRs may contain sequence elements that mediate negative posttranscriptional regulation. Increasing numbers of publications describe suppression of mRNA translation by small RNA molecules through base-pairing interactions with complementary sequence motifs within 3' UTRs <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>. It has also been shown that the turnover of mRNA is regulated by <it>cis</it>-acting AU-rich elements that promote mRNA degradation, and such motifs are found in the conserved 3' UTRs of many mRNAs encoding regulatory proteins <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>.</p>
					<p>In addition to motifs that have a negative effect on translation, 3' UTRs carry binding sites for factors involved in translation termination and the release of the synthesized polypeptide, processes that are understood much less thoroughly than the initiation of translation <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>. Binding of regulatory proteins to <it>cis</it>-acting elements within a 3' UTR can be either sequence-specific or facilitated by stem-loop structural elements formed within the mRNA. The importance of the secondary structure of the 3' UTR is exemplified by the family of selenoprotein mRNAs. All mammalian selenoproteins identified so far contain a selenocysteine residue encoded by the stop codon UGA. Incorporation of selenocysteine into the growing polypeptide depends on a conserved stem-loop structure within the mRNA formed by the selenocysteine insertion sequence (SECIS), which is necessary for decoding UGA as selenocysteine rather than as a stop signal <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>.</p>
					<p>Genomic regions corresponding to the UTRs of mRNAs may contain introns, which leads to the formation of alternative UTRs. Introns are more frequently found in 5' UTRs, although 3' UTRs are generally much longer than 5' UTRs. Alternative UTRs can be formed by the use of different transcription start sites, different donor/acceptor splice sites, and different polyadenylation sites. These have been shown to vary with the tissue and the stage of development, and can significantly affect patterns of gene expression <abbrgrp><abbr bid="B28">28</abbr><abbr bid="B41">41</abbr></abbrgrp>.</p>
				</sec>
				<sec>
					<st>
						<p>Introns</p>
					</st>
					<p>The origin of eukaryotic introns is the subject of much debate. One hypothesis argues that modern nuclear introns are evolutionary descendants of bacterial self-catalytic introns that penetrated into the eukaryotic lineage and gained biological function in eukaryotes in the process of co-evolution with their hosts through their involvement in the splicing of primary RNA transcripts. An alternative notion is that the vast majority of introns arose within multicellular eukaryotes and were randomly inserted into eukaryotic genes (reviewed in <abbrgrp><abbr bid="B42">42</abbr><abbr bid="B43">43</abbr></abbrgrp>). Introns, which are few in unicellular eukaryotes, are greatly increased in numbers and size within the genomes of higher eukaryotes (Table <tblr tid="T1">1</tblr>). Nematodes contain more DNA in introns than in exons, while in mammalian genomes introns comprise about 95% of the sequence within protein-coding genes <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr></abbrgrp>. Interspecies sequence conservation studies have demonstrated that introns are generally high in sequence complexity, although they are less conserved than protein-coding sequences; introns contain blocks of conserved sequences and a significant number of selectively constrained nucleotides that remain invariant as a result of stabilizing selection. Genomic sequencing of different taxa has allowed large-scale analysis of homologous intron sequences between related species, such as between <it>Caenorhabditis </it>species or <it>Drosophila </it>species, or between human and mouse or rat, and human and whale or seal <abbrgrp><abbr bid="B44">44</abbr><abbr bid="B45">45</abbr><abbr bid="B46">46</abbr><abbr bid="B47">47</abbr><abbr bid="B48">48</abbr><abbr bid="B49">49</abbr><abbr bid="B50">50</abbr><abbr bid="B51">51</abbr></abbrgrp>. 
Using different alignment methods, these studies estimate that the level of selective constraint in introns is between 5% and 28%, as compared to around 60-70% in exons.</p>
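					<p>For orientation, one common way to express selective constraint is as the proportional reduction in divergence relative to a neutrally evolving reference. The sketch below assumes that definition; it is not necessarily the procedure used in the cited studies, and the divergence values are hypothetical.</p>

```python
def selective_constraint(observed_divergence, neutral_divergence):
    """Estimate constraint as 1 - observed / neutral divergence.

    Both arguments are substitutions per site between the same pair of
    species; the neutral value comes from a reference class of sites that
    is assumed to evolve without selection.
    """
    if neutral_divergence == 0:
        raise ValueError("neutral divergence must be positive")
    return 1.0 - observed_divergence / neutral_divergence


# Hypothetical values: 0.40 substitutions per site in a region of interest
# against 0.50 substitutions per site in the neutral reference.
toy_constraint = selective_constraint(0.40, 0.50)
print(f"{toy_constraint:.0%}")
```

					<p>Under this definition, a value near zero means the region diverges about as fast as the neutral reference, while a value near one means almost no substitutions are tolerated.</p>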
					<p>One established biological role for introns is their involvement in nucleosome formation and chromatin organization. Introns have higher potential for nucleosome formation than exons or Alu repeats <abbrgrp><abbr bid="B52">52</abbr></abbrgrp>. Other functional elements identified in mammalian introns are scaffold/matrix-attachment regions (S/MARs), which are thought to anchor chromatin loops to the nuclear matrix and to chromosome scaffolds <abbrgrp><abbr bid="B53">53</abbr><abbr bid="B54">54</abbr></abbrgrp>. These elements account for only a small proportion of constrained nucleotides in introns, however.</p>
					<p>Alternative splicing is an important source of proteome complexity in higher eukaryotes; it amplifies the number of proteins encoded by a single gene by generating isoforms differing in amino-acid sequence. Nevertheless, the dominance of intronic sequences in the protein-coding genes of higher organisms cannot be fully explained by their role in alternative splicing. Although the vast majority of human and mouse protein-coding genes have introns, only about 40% of them show evidence of alternative splicing <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B55">55</abbr></abbrgrp>. As a rule, internal introns within protein-coding regions are not involved in alternative splicing, unlike those in UTRs, and splicing signals located at intron-exon boundaries are relatively short. The significant levels of nucleotide conservation within introns suggest that introns may have other important functional roles, probably at the RNA level. It has been suggested that the products of intron degradation generated during splicing of pre-mRNA transcripts serve as endogenous control molecules of an RNA-based gene-regulatory network <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr></abbrgrp>; but to date, no experimental data confirm or disprove this idea.</p>
					<p>In modern eukaryotes, the transcription and processing of mRNA are highly coupled with intron splicing and/or exon recognition. There is an obvious correlation between the number and total length of introns on the one hand and the developmental complexity of organisms on the other, although the reasons for the abundance of intron sequences and their functions in higher organisms are not fully understood. The notion that introns are involved in complex regulation and development in higher eukaryotes is supported by several lines of evidence. For example, there is a negative correlation between the size of introns and the level of transcription of protein-coding genes. Furthermore, introns in highly expressed genes are substantially shorter than those in genes that are expressed at lower levels. This difference is greater in humans, where introns are, on average, 14 times shorter in highly expressed genes than in genes with low expression <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B56">56</abbr></abbrgrp>.</p>
					<p>The intron sequences of mammalian protein-coding genes have also been shown to harbor independent transcriptional units, such as small RNA genes <abbrgrp><abbr bid="B57">57</abbr></abbrgrp> and repetitive elements <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. Repeats constitute about 45% of the human and the mouse genomes (Table <tblr tid="T1">1</tblr>) and can be found in both transcribed (introns and UTRs) and non-transcribed intergenic sequences. It is not obvious whether the proliferation of transposable repetitive elements in mammalian genomes is associated with some biological advantage. There are notable similarities in the genomic distribution of the major repetitive elements, LINEs (long interspersed nucleotide elements) and SINEs (short interspersed nucleotide elements), in the human and mouse genomes. Genome-wide profiling of human gene expression has revealed that SINE elements are mostly associated with highly expressed short-intron genes, while LINE elements are associated with weakly expressed long-intron genes <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. Furthermore, similar repeats accumulate in orthologous locations in the human and mouse genomes <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B58">58</abbr></abbrgrp>.</p>
				</sec>
			</sec>
			<sec>
				<st>
					<p>The expanding world of non-coding RNA genes</p>
				</st>
				<p>Transcripts from non-coding RNA (ncRNA) genes are not translated into proteins and function directly as structural, regulatory or catalytic molecules. It is not clear how many ncRNA genes are present in the mammalian genome. The existing catalog of mammalian genes is strongly biased towards protein-coding genes, because most efforts were made in cloning and sequencing polyadenylated mRNAs, which tend not to be ncRNAs. Analysis of 33,409 full-length mouse cDNAs showed that ncRNA constitutes more than one third of all the identified transcripts <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. Recently Ota <it>et al</it>. <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> reported the sequencing and characterization of 10,897 novel human full-length cDNA clones, and ncRNAs represent about half of these newly identified transcripts. Nevertheless, it is not known how many real RNA genes have been cloned, and how many clones in fact represent transcriptional artifacts. Surprisingly, a large proportion of ncRNA transcripts have introns, and many ncRNAs demonstrate distinct patterns of splicing <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. The presence of introns in ncRNAs adds possibilities for regulation, given that the primary transcript might be functionally inactive, with subsequent cleavage and splicing being required to produce an active RNA molecule. Novel ncRNA genes are difficult to recognize and identify on the basis of sequence, and their discovery still depends largely on experimental approaches. The nature of ncRNA genes, which are often small and multicopy, lacking open reading frames and immune to point mutations, makes them difficult targets for genetic screens. Current estimates of the number of independent transcriptional units (around 70,000) and protein-coding genes (around 30,000) in the mouse transcriptome suggest that ncRNA genes may be highly abundant in the mouse genome <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>.</p>
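				<p>The arithmetic behind the last estimate can be made explicit. The short sketch below uses only the two round figures quoted above and assumes, for illustration, that every transcriptional unit that is not protein-coding corresponds to an ncRNA gene; the variable names are hypothetical.</p>

```python
# Round estimates quoted in the text for the mouse transcriptome.
transcriptional_units = 70_000  # independent transcriptional units
protein_coding_genes = 30_000   # upper estimate of protein-coding genes

# Assumption for this sketch: all remaining units are ncRNA genes.
implied_ncrna_genes = transcriptional_units - protein_coding_genes
implied_ncrna_share = implied_ncrna_genes / transcriptional_units

print(implied_ncrna_genes, f"{implied_ncrna_share:.0%}")
```

				<p>Because some cloned transcripts may be artifacts, as noted above, the implied figure is best read as an upper bound on the number of ncRNA genes.</p>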
				<p>Our understanding of the cellular function of ncRNAs has expanded far beyond the initial notion of their being intermediates and accessories in protein biosynthesis. The size of ncRNA molecules ranges from 20 nucleotides (microRNAs) to thousands of nucleotides (ncRNAs involved in gene silencing) <abbrgrp><abbr bid="B59">59</abbr></abbrgrp>. Furthermore, ncRNAs are involved in many processes, including transcriptional and posttranscriptional regulation, chromosome replication, genomic imprinting, RNA processing, modification and alternative splicing, mRNA stability and translation, and even protein degradation and translocation <abbrgrp><abbr bid="B59">59</abbr><abbr bid="B60">60</abbr><abbr bid="B61">61</abbr><abbr bid="B62">62</abbr></abbrgrp>. Within the genome, ncRNA genes are found in extended stretches of conservation within orthologous regions of related genomes in intergenic and intronic sequences that have elevated GC content. Important noncanonical RNA species include families of translational repressors, such as microRNAs and small temporal RNAs (stRNAs) that inhibit translation of target mRNAs, small nuclear RNAs (snRNAs) that function as components of spliceosomes, and small nucleolar RNAs (snoRNAs) that are involved in the chemical modification of structural RNAs. Another important class of ncRNA molecules comprises those with catalytic activity, such as ribonuclease P. The functional importance of ncRNA genes is emphasized by the recent discoveries that link human genetic disorders with non-protein-coding genes <abbrgrp><abbr bid="B63">63</abbr><abbr bid="B64">64</abbr></abbrgrp>.</p>
				<p>The inhibition and silencing of genes by RNA molecules exploits the highly specific complementarity of nucleic acid interactions. There are two types of naturally occurring regulatory ncRNAs. First, <it>cis</it>-antisense transcripts originate from the same genomic region as the target gene, but have the opposite orientation, and can form long perfect duplexes with their targets; such <it>cis</it>-antisense transcripts may be expressed from imprinted regions of vertebrate chromosomes and play roles in chromatin structure. Second, <it>trans</it>-antisense RNAs are short molecules that are transcribed from loci distinct from their mRNA targets and form imperfect duplexes with complementary regions within their targets; examples of <it>trans</it>-antisense RNAs are microRNAs and small interfering (si) RNAs <abbrgrp><abbr bid="B59">59</abbr><abbr bid="B62">62</abbr><abbr bid="B65">65</abbr></abbrgrp>. It seems that the increased complexity of gene expression and regulation in higher organisms has promoted the increased use (during evolution) of modular systems, whereby substrate recognition is delegated to diverse small RNA molecules that share a common protein catalytic subunit to exert their effects. An example of such a mechanism is the site-specific methylation of structural RNAs (rRNAs, tRNAs and snRNAs), in which numerous different snoRNAs provide specificity for methylation and pseudouridylation of target bases on the structural RNAs by complementarity, while catalytic activity is conferred by a protein methylase or pseudo-U synthetase associated with the snoRNA <abbrgrp><abbr bid="B61">61</abbr></abbrgrp>.</p>
				<p>Another class of sequence elements that contributes to the mammalian transcriptome comprises Alu and other repeats. The observed evolutionary selection against change in Alu repeat sequences in the human genome has led to the hypothesis that they are functionally important. In a few cases, Alu elements have been shown to serve as regulators of transcription of adjacent genes <abbrgrp><abbr bid="B66">66</abbr></abbrgrp> and to take part in nucleosome positioning within chromatin <abbrgrp><abbr bid="B67">67</abbr></abbrgrp>. More recent studies indicate that Alu repeats serve as templates for non-coding RNAs that can be involved in the regulation of gene activity and in posttranscriptional gene silencing, repressing the expression of other genes that contain similar repeats. A strong increase in the level of Alu transcripts in the cell is observed under stress conditions and after viral infection <abbrgrp><abbr bid="B61">61</abbr></abbrgrp>.</p>
			</sec>
		</sec>
		<sec>
			<st>
				<p>Non-coding sequences and the complexity of organisms</p>
			</st>
			<p>The function of non-coding DNA remains poorly studied, and interspecies comparison is often the only way to demonstrate that a conserved DNA sequence, which has evolved slowly as a result of negative selection, is functionally important. In general, non-coding regions are less conserved than the protein-coding parts of genes. Comparative analysis of orthologous non-coding regions in the genomes of higher eukaryotes has revealed a mosaic structure of alternating highly conserved and dissimilar segments. Conserved elements, the so-called phylogenetic footprints, constitute a significant proportion of non-coding DNA. Comparative analysis of the human and mouse genomes has demonstrated that about 5% of the genomic sequence consists of highly conserved segments of 50-100 base pairs (bp); this proportion is much higher than can be explained by protein-coding sequences alone <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. But this analysis of relatively long conserved segments did not take into account numerous shorter and weaker homologous elements of genomic DNA. The average selective constraint within mouse and human intergenic regions (15-19%) does not differ substantially from the number of constrained nucleotides in the introns and intergenic regions of nematodes (18%) <abbrgrp><abbr bid="B44">44</abbr><abbr bid="B68">68</abbr></abbrgrp>. The average number of constrained nucleotides within a mammalian intergenic region is at least 2,000, which is twice as many as in an average protein-coding region. Some of the short conserved sequences in mammalian intergenic regions represent binding sites for known transcription factors and regulatory proteins, while others have no known biological function <abbrgrp><abbr bid="B69">69</abbr></abbrgrp>. Needless to say, comparative interspecies analysis is not helpful in the detection of species-specific functional sequence elements. Some authors estimate that as much as one third of the human genome (about a billion base pairs) could be involved in <it>cis</it>-regulatory functions, such as the regulation of gene expression and the control of chromosomal replication, condensation, pairing and segregation <abbrgrp><abbr bid="B70">70</abbr></abbrgrp>.</p>
			<p>The fraction of protein-coding DNA in the genome decreases with increasing organismal complexity. In bacteria, about 90% of the genome codes for proteins. This number drops off to 68% in yeast, to 23-24% in nematodes and to 1.5-2% in mammals (Table <tblr tid="T1">1</tblr>). Among the different mechanisms for increasing protein diversity (such as the use of multiple transcription start sites, alternative pre-mRNA splicing and polyadenylation, pre-mRNA editing, and posttranslational protein modification) alternative splicing is considered to be the most important source of protein diversity in mammals <abbrgrp><abbr bid="B71">71</abbr></abbrgrp>. But this view was challenged when no significant difference in the level of alternative splicing was found in mammals as compared to other phyla, such as insects and nematodes <abbrgrp><abbr bid="B55">55</abbr></abbrgrp>. Also, only a fraction of alternatively spliced human genes (10-30%) shows evidence of tissue-specific splice forms, mostly within the brain, testis and a few other tissues <abbrgrp><abbr bid="B72">72</abbr></abbrgrp>. A relatively modest increase in the number of protein-coding genes from bacteria to unicellular eukaryotes to mammals does not account for the dramatic rise in the complexity of the organisms. The relatively small number of identified mammalian genes poses a question: what other factors contribute to the complexity of higher organisms?</p>
			<p>We cannot rationally quantify the structural, physiological and behavioral complexity of organisms from different phyla. It is evident, however, that increased organismal complexity correlates less with the number of protein-coding genes than with the length and diversity of non-protein-coding sequences. Generally, the complexity of organisms correlates with increases in the following parameters: first, the transcribed, but nontranslated, part of the genome; second, the length and number of introns in protein-coding genes; third, the number and complexity of <it>cis</it>-control elements and the increased use of complex and multiple promoters for a single gene; fourth, gene numbers, for both protein-coding and ncRNA genes; fifth, the complexity of UTRs and the length of 3' UTRs; and sixth, the ratio and the absolute number of transcription factors per genome <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B23">23</abbr><abbr bid="B70">70</abbr><abbr bid="B73">73</abbr></abbrgrp>. In other words, the structural and physiological complexity of an organism is highly dependent on the complex regulation of gene expression and on the size and diversity of the transcriptome. Why is this so? Single-stranded RNA has some unique properties that make it suitable for regulatory roles. These include its ability to specifically recognize DNA sequences through complementary interactions; its conformational flexibility, which allows quick structural changes in a cooperative manner; and its ability to serve as a scaffold for protein molecules. The widespread use of RNA molecules in cell regulation and in the transient modulation of gene expression is also due to the quick and easy production of RNA (as no protein synthesis is required) and its quick degradation by nucleases.</p>
			<p>As discussed in this article, the non-coding transcribed part of the genome increases dramatically in size with the complexity of organisms, culminating in an estimated 1.2 billion nucleotides in humans. The function of these sequences still poses a challenge, some 30 years after their discovery. With the completion of the first three mammalian genome sequences, and more in view, the era of comparative mammalian genomics is coming to the fore, and efforts are increasingly focusing on genome annotation and the determination of functions for uncharacterized sequences. No genome annotation can be complete, however, without characterization of the non-coding part of the transcriptome. This may become a priority for future large-scale mammalian genome sequencing and annotation projects. We can hope that the scope and the complexity of the mammalian transcriptome will emerge in more detail with the discovery of orthologs of ncRNA genes, transcripts, and conserved functional sequence elements for closely and distantly related mammalian species.</p>
		</sec>
	</bdy>
	<bm>
		<refgrp>
			<bibl id="B1">
				<title>
					<p>On pre-messenger RNA and transcriptons. A review.</p>
				</title>
				<aug>
					<au>
						<snm>Scherrer</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Imaizumi-Scherrer</snm>
						<fnm>MT</fnm>
					</au>
					<au>
						<snm>Reynaud</snm>
						<fnm>CA</fnm>
					</au>
					<au>
						<snm>Therwath</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>Mol Biol Rep</source>
				<pubdate>1979</pubdate>
				<volume>5</volume>
				<fpage>5</fpage>
				<lpage>28</lpage>
				<xrefbib>
					<pubid idtype="pmpid">379595</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B2">
				<title>
					<p>Initial sequencing and analysis of the human genome.</p>
				</title>
				<aug>
					<au>
						<cnm>International Human Genome Sequencing Consortium</cnm>
					</au>
				</aug>
				<source>Nature</source>
				<pubdate>2001</pubdate>
				<volume>409</volume>
				<fpage>860</fpage>
				<lpage>921</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/35057062</pubid>
						<pubid idtype="pmpid" link="fulltext">11237011</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B3">
				<title>
					<p>Initial sequencing and comparative analysis of the mouse genome.</p>
				</title>
				<aug>
					<au>
						<cnm>Mouse Genome Sequencing Consortium</cnm>
					</au>
				</aug>
				<source>Nature</source>
				<pubdate>2002</pubdate>
				<volume>420</volume>
				<fpage>520</fpage>
				<lpage>562</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/nature01262</pubid>
						<pubid idtype="pmpid" link="fulltext">12466850</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B4">
				<title>
					<p>Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs.</p>
				</title>
				<aug>
					<au>
						<cnm>The FANTOM Consortium and the RIKEN Genome Exploration Research Group Phase I &amp; II Team</cnm>
					</au>
				</aug>
				<source>Nature</source>
				<pubdate>2002</pubdate>
				<volume>420</volume>
				<fpage>563</fpage>
				<lpage>573</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/nature01266</pubid>
						<pubid idtype="pmpid" link="fulltext">12466851</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B5">
				<title>
					<p>The human transcriptome map reveals extremes in gene density, intron length, GC content, and repeat pattern for domains of highly and weakly expressed genes.</p>
				</title>
				<aug>
					<au>
						<snm>Versteeg</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>van Schaik</snm>
						<fnm>BDC</fnm>
					</au>
					<au>
						<snm>van Batenburg</snm>
						<fnm>MF</fnm>
					</au>
					<au>
						<snm>Roos</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Monajemi</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Caron</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Bussemaker</snm>
						<fnm>HJ</fnm>
					</au>
					<au>
						<snm>van Kampen</snm>
						<fnm>AHC</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>2003</pubdate>
				<volume>13</volume>
				<fpage>1998</fpage>
				<lpage>2004</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1101/gr.1649303</pubid>
						<pubid idtype="pmpid" link="fulltext">12915492</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B6">
				<title>
					<p>Complete sequencing and characterization of 21,234 full-length human cDNAs.</p>
				</title>
				<aug>
					<au>
						<snm>Ota</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Nishikawa</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Otsuki</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Sugiyama</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Irie</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Wakamatsu</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Hayashi</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Sato</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Nagai</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Kimura</snm>
						<fnm>K</fnm>
					</au>
					<etal/>
				</aug>
				<source>Nat Genet</source>
				<pubdate>2004</pubdate>
				<volume>36</volume>
				<fpage>40</fpage>
				<lpage>45</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/ng1285</pubid>
						<pubid idtype="pmpid" link="fulltext">14702039</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B7">
				<title>
					<p>Rat Genome Project</p>
				</title>
				<url>http://www.hgsc.bcm.tmc.edu/projects/rat</url>
			</bibl>
			<bibl id="B8">
				<title>
					<p>An attempt to determine the maximal expression of the rat genome.</p>
				</title>
				<aug>
					<au>
						<snm>Evtushenko</snm>
						<fnm>VI</fnm>
					</au>
					<au>
						<snm>Hanson</snm>
						<fnm>KP</fnm>
					</au>
					<au>
						<snm>Barabitskaya</snm>
						<fnm>OV</fnm>
					</au>
					<au>
						<snm>Emelyanov</snm>
						<fnm>AV</fnm>
					</au>
					<au>
						<snm>Reshetnikov</snm>
						<fnm>VL</fnm>
					</au>
					<au>
						<snm>Kozlov</snm>
						<fnm>AP</fnm>
					</au>
				</aug>
				<source>Mol Biol (Mosk)</source>
				<pubdate>1989</pubdate>
				<volume>23</volume>
				<fpage>663</fpage>
				<lpage>675</lpage>
				<xrefbib>
					<pubid idtype="pmpid">2770736</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B9">
				<title>
					<p>Complexity and characterization of polyadenylated RNA in the mouse brain.</p>
				</title>
				<aug>
					<au>
						<snm>Bantle</snm>
						<fnm>JA</fnm>
					</au>
					<au>
						<snm>Hahn</snm>
						<fnm>WE</fnm>
					</au>
				</aug>
				<source>Cell</source>
				<pubdate>1976</pubdate>
				<volume>8</volume>
				<fpage>139</fpage>
				<lpage>150</lpage>
				<xrefbib>
					<pubid idtype="pmpid">986249</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B10">
				<title>
					<p>One strand equivalent of the <it>Escherichia coli</it> genome is transcribed: complexity and abundance classes of mRNA.</p>
				</title>
				<aug>
					<au>
						<snm>Hahn</snm>
						<fnm>WE</fnm>
					</au>
					<au>
						<snm>Pettijohn</snm>
						<fnm>DE</fnm>
					</au>
					<au>
						<snm>Van Ness</snm>
						<fnm>J</fnm>
					</au>
				</aug>
				<source>Science</source>
				<pubdate>1977</pubdate>
				<volume>197</volume>
				<fpage>582</fpage>
				<lpage>585</lpage>
				<xrefbib>
					<pubid idtype="pmpid">327551</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B11">
				<title>
					<p>Number and distribution of polyadenylated RNA sequences in yeast.</p>
				</title>
				<aug>
					<au>
						<snm>Hereford</snm>
						<fnm>LM</fnm>
					</au>
					<au>
						<snm>Rosbash</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>Cell</source>
				<pubdate>1977</pubdate>
				<volume>10</volume>
				<fpage>453</fpage>
				<lpage>462</lpage>
				<xrefbib>
					<pubid idtype="pmpid">321129</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B12">
				<title>
					<p>The complete genome sequence of <it>Escherichia coli</it> K-12.</p>
				</title>
				<aug>
					<au>
						<snm>Blattner</snm>
						<fnm>FR</fnm>
					</au>
					<au>
						<snm>Plunkett</snm>
						<fnm>G</fnm>
						<suf>III</suf>
					</au>
					<au>
						<snm>Bloch</snm>
						<fnm>CA</fnm>
					</au>
					<au>
						<snm>Perna</snm>
						<fnm>NT</fnm>
					</au>
					<au>
						<snm>Burland</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Riley</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Collado-Vides</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Glasner</snm>
						<fnm>JD</fnm>
					</au>
					<au>
						<snm>Rode</snm>
						<fnm>CK</fnm>
					</au>
					<au>
						<snm>Mayhew</snm>
						<fnm>GF</fnm>
					</au>
					<etal/>
				</aug>
				<source>Science</source>
				<pubdate>1997</pubdate>
				<volume>277</volume>
				<fpage>1453</fpage>
				<lpage>1474</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.277.5331.1453</pubid>
						<pubid idtype="pmpid" link="fulltext">9278503</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B13">
				<title>
					<p>The yeast genome project: what did we learn?</p>
				</title>
				<aug>
					<au>
						<snm>Dujon</snm>
						<fnm>B</fnm>
					</au>
				</aug>
				<source>Trends Genet</source>
				<pubdate>1996</pubdate>
				<volume>12</volume>
				<fpage>263</fpage>
				<lpage>270</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/0168-9525(96)10027-5</pubid>
						<pubid idtype="pmpid" link="fulltext">8763498</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B14">
				<title>
					<p>The DNA sequence and analysis of human chromosome 6.</p>
				</title>
				<aug>
					<au>
						<snm>Mungall</snm>
						<fnm>AJ</fnm>
					</au>
					<au>
						<snm>Palmer</snm>
						<fnm>SA</fnm>
					</au>
					<au>
						<snm>Sims</snm>
						<fnm>SK</fnm>
					</au>
					<au>
						<snm>Edwards</snm>
						<fnm>CA</fnm>
					</au>
					<au>
						<snm>Ashurst</snm>
						<fnm>JL</fnm>
					</au>
					<au>
						<snm>Wilming</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Jones</snm>
						<fnm>MC</fnm>
					</au>
					<au>
						<snm>Horton</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Hunt</snm>
						<fnm>SE</fnm>
					</au>
					<au>
						<snm>Scott</snm>
						<fnm>CE</fnm>
					</au>
					<etal/>
				</aug>
				<source>Nature</source>
				<pubdate>2003</pubdate>
				<volume>425</volume>
				<fpage>805</fpage>
				<lpage>811</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/nature02055</pubid>
						<pubid idtype="pmpid" link="fulltext">14574404</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B15">
				<title>
					<p>Large-scale transcriptional activity in chromosomes 21 and 22.</p>
				</title>
				<aug>
					<au>
						<snm>Kapranov</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Cawley</snm>
						<fnm>SE</fnm>
					</au>
					<au>
						<snm>Drenkow</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Bekiranov</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Strausberg</snm>
						<fnm>RL</fnm>
					</au>
					<au>
						<snm>Fodor</snm>
						<fnm>SP</fnm>
					</au>
					<au>
						<snm>Gingeras</snm>
						<fnm>TR</fnm>
					</au>
				</aug>
				<source>Science</source>
				<pubdate>2002</pubdate>
				<volume>296</volume>
				<fpage>916</fpage>
				<lpage>919</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.1068597</pubid>
						<pubid idtype="pmpid" link="fulltext">11988577</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B16">
				<title>
					<p>Sequence organization in animal DNA and a speculation on hnRNA as a coordinate regulatory transcript.</p>
				</title>
				<aug>
					<au>
						<snm>Davidson</snm>
						<fnm>EH</fnm>
					</au>
					<au>
						<snm>Klein</snm>
						<fnm>WH</fnm>
					</au>
					<au>
						<snm>Britten</snm>
						<fnm>RJ</fnm>
					</au>
				</aug>
				<source>Dev Biol</source>
				<pubdate>1977</pubdate>
				<volume>55</volume>
				<fpage>69</fpage>
				<lpage>84</lpage>
				<xrefbib>
					<pubid idtype="pmpid">832773</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B17">
				<title>
					<p>The evolution of controlled multitasked gene networks: the role of introns and other non-coding RNAs in the development of complex organisms.</p>
				</title>
				<aug>
					<au>
						<snm>Mattick</snm>
						<fnm>JS</fnm>
					</au>
					<au>
						<snm>Gagen</snm>
						<fnm>MJ</fnm>
					</au>
				</aug>
				<source>Mol Biol Evol</source>
				<pubdate>2001</pubdate>
				<volume>18</volume>
				<fpage>1611</fpage>
				<lpage>1630</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">11504843</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B18">
				<title>
					<p>How many genes in the human genome?</p>
				</title>
				<aug>
					<au>
						<snm>Fields</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Adams</snm>
						<fnm>MD</fnm>
					</au>
					<au>
						<snm>White</snm>
						<fnm>O</fnm>
					</au>
					<au>
						<snm>Venter</snm>
						<fnm>JC</fnm>
					</au>
				</aug>
				<source>Nat Genet</source>
				<pubdate>1994</pubdate>
				<volume>7</volume>
				<fpage>345</fpage>
				<lpage>346</lpage>
				<xrefbib>
					<pubid idtype="pmpid">7920649</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B19">
				<title>
					<p>Analysis of expressed sequence tags indicates 35,000 human genes.</p>
				</title>
				<aug>
					<au>
						<snm>Ewing</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Green</snm>
						<fnm>P</fnm>
					</au>
				</aug>
				<source>Nat Genet</source>
				<pubdate>2000</pubdate>
				<volume>25</volume>
				<fpage>232</fpage>
				<lpage>234</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/76115</pubid>
						<pubid idtype="pmpid" link="fulltext">10835644</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B20">
				<title>
					<p>Gene index analysis of the human genome estimates approximately 120,000 genes.</p>
				</title>
				<aug>
					<au>
						<snm>Liang</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Holt</snm>
						<fnm>I</fnm>
					</au>
					<au>
						<snm>Pertea</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Karamysheva</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Salzberg</snm>
						<fnm>SL</fnm>
					</au>
					<au>
						<snm>Quackenbush</snm>
						<fnm>J</fnm>
					</au>
				</aug>
				<source>Nat Genet</source>
				<pubdate>2000</pubdate>
				<volume>25</volume>
				<fpage>239</fpage>
				<lpage>240</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/76126</pubid>
						<pubid idtype="pmpid" link="fulltext">10835646</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B21">
				<title>
					<p>Assembly, annotation, and integration of UNIGENE clusters into the human genome.</p>
				</title>
				<aug>
					<au>
						<snm>Zhuo</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Zhao</snm>
						<fnm>WD</fnm>
					</au>
					<au>
						<snm>Wright</snm>
						<fnm>FA</fnm>
					</au>
					<au>
						<snm>Yang</snm>
						<fnm>HY</fnm>
					</au>
					<au>
						<snm>Wang</snm>
						<fnm>JP</fnm>
					</au>
					<au>
						<snm>Sears</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Baer</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Kwon</snm>
						<fnm>DH</fnm>
					</au>
					<au>
						<snm>Gordon</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Gibbs</snm>
						<fnm>S</fnm>
					</au>
					<etal/>
				</aug>
				<source>Genome Res</source>
				<pubdate>2001</pubdate>
				<volume>11</volume>
				<fpage>904</fpage>
				<lpage>918</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1101/gr.GR-1645R</pubid>
						<pubid idtype="pmpid" link="fulltext">11337484</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B22">
				<title>
					<p>UniGene</p>
				</title>
				<url>http://www.ncbi.nlm.nih.gov/UniGene</url>
			</bibl>
			<bibl id="B23">
				<title>
					<p>Structural and functional features of eukaryotic mRNA untranslated regions.</p>
				</title>
				<aug>
					<au>
						<snm>Pesole</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Mignone</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Gissi</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Grillo</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Licciulli</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Liuni</snm>
						<fnm>S</fnm>
					</au>
				</aug>
				<source>Gene</source>
				<pubdate>2001</pubdate>
				<volume>276</volume>
				<fpage>73</fpage>
				<lpage>81</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0378-1119(01)00674-6</pubid>
						<pubid idtype="pmpid" link="fulltext">11591473</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B24">
				<title>
					<p>Evolutionary dynamics of mammalian mRNA untranslated regions by comparative analysis of orthologous human, artiodactyl and rodent gene pairs.</p>
				</title>
				<aug>
					<au>
						<snm>Larizza</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Makalowski</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Pesole</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Saccone</snm>
						<fnm>C</fnm>
					</au>
				</aug>
				<source>Comput Chem</source>
				<pubdate>2002</pubdate>
				<volume>26</volume>
				<fpage>479</fpage>
				<lpage>490</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0097-8485(02)00009-8</pubid>
						<pubid idtype="pmpid">12144177</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B25">
				<title>
					<p>Strong conservation of noncoding sequences during vertebrate evolution: potential involvement in post-transcriptional regulation of gene expression.</p>
				</title>
				<aug>
					<au>
						<snm>Duret</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Dorkeld</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Gautier</snm>
						<fnm>C</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>1993</pubdate>
				<volume>21</volume>
				<fpage>2315</fpage>
				<lpage>2322</lpage>
				<xrefbib>
					<pubid idtype="pmpid">8506129</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B26">
				<title>
					<p>Searching for regulatory elements in human noncoding sequences.</p>
				</title>
				<aug>
					<au>
						<snm>Duret</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Bucher</snm>
						<fnm>P</fnm>
					</au>
				</aug>
				<source>Curr Opin Struct Biol</source>
				<pubdate>1997</pubdate>
				<volume>7</volume>
				<fpage>399</fpage>
				<lpage>406</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0959-440X(97)80058-9</pubid>
						<pubid idtype="pmpid">9204283</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B27">
				<title>
					<p>Highly conserved RNA sequences that are sensors of environmental stress.</p>
				</title>
				<aug>
					<au>
						<snm>Spicher</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Guicherit</snm>
						<fnm>OM</fnm>
					</au>
					<au>
						<snm>Duret</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Aslanian</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Sanjines</snm>
						<fnm>EM</fnm>
					</au>
					<au>
						<snm>Denko</snm>
						<fnm>NC</fnm>
					</au>
					<au>
						<snm>Giaccia</snm>
						<fnm>AJ</fnm>
					</au>
					<au>
						<snm>Blau</snm>
						<fnm>HM</fnm>
					</au>
				</aug>
				<source>Mol Cell Biol</source>
				<pubdate>1998</pubdate>
				<volume>18</volume>
				<fpage>7371</fpage>
				<lpage>7382</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">9819424</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B28">
				<title>
					<p>Untranslated regions of mRNAs.</p>
				</title>
				<aug>
					<au>
						<snm>Mignone</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Gissi</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Liuni</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Pesole</snm>
						<fnm>G</fnm>
					</au>
				</aug>
				<source>Genome Biol</source>
				<pubdate>2002</pubdate>
				<volume>3</volume>
				<fpage>reviews0004</fpage>
				<lpage>0004.10</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1186/gb-2002-3-3-reviews0004</pubid>
						<pubid idtype="pmpid" link="fulltext">11897027</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B29">
				<title>
					<p>Regulatory functions of 3' UTRs.</p>
				</title>
				<aug>
					<au>
						<snm>Grzybowska</snm>
						<fnm>EA</fnm>
					</au>
					<au>
						<snm>Wilczynska</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Siedlecki</snm>
						<fnm>JA</fnm>
					</au>
				</aug>
				<source>Biochem Biophys Res Commun</source>
				<pubdate>2001</pubdate>
				<volume>288</volume>
				<fpage>291</fpage>
				<lpage>295</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1006/bbrc.2001.5738</pubid>
						<pubid idtype="pmpid" link="fulltext">11606041</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B30">
				<title>
					<p>RNA localization in development.</p>
				</title>
				<aug>
					<au>
						<snm>Bashirullah</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Cooperstock</snm>
						<fnm>RL</fnm>
					</au>
					<au>
						<snm>Lipshitz</snm>
						<fnm>HD</fnm>
					</au>
				</aug>
				<source>Annu Rev Biochem</source>
				<pubdate>1998</pubdate>
				<volume>67</volume>
				<fpage>335</fpage>
				<lpage>394</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1146/annurev.biochem.67.1.335</pubid>
						<pubid idtype="pmpid" link="fulltext">9759492</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B31">
				<title>
					<p>Making (anti)sense of non-coding sequence conservation.</p>
				</title>
				<aug>
					<au>
						<snm>Lipman</snm>
						<fnm>DJ</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>1997</pubdate>
				<volume>25</volume>
				<fpage>3580</fpage>
				<lpage>3583</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/nar/25.18.3580</pubid>
						<pubid idtype="pmpid" link="fulltext">9278476</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B32">
				<title>
					<p>The context organization of functional regions in yeast genes with high-level expression.</p>
				</title>
				<aug>
					<au>
						<snm>Kochetov</snm>
						<fnm>AV</fnm>
					</au>
					<au>
						<snm>Sarai</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Vorob'ev</snm>
						<fnm>DG</fnm>
					</au>
					<au>
						<snm>Kolchanov</snm>
						<fnm>NA</fnm>
					</au>
				</aug>
				<source>Mol Biol (Mosk)</source>
				<pubdate>2002</pubdate>
				<volume>36</volume>
				<fpage>1026</fpage>
				<lpage>1034</lpage>
				<xrefbib>
					<pubid idtype="pmpid">12500541</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B33">
				<title>
					<p>Patterns in interspecies similarity correlate with nucleotide composition in mammalian 3' UTRs.</p>
				</title>
				<aug>
					<au>
						<snm>Shabalina</snm>
						<fnm>SA</fnm>
					</au>
					<au>
						<snm>Ogurtsov</snm>
						<fnm>AY</fnm>
					</au>
					<au>
						<snm>Lipman</snm>
						<fnm>DJ</fnm>
					</au>
					<au>
						<snm>Kondrashov</snm>
						<fnm>AS</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2003</pubdate>
				<volume>31</volume>
				<fpage>5433</fpage>
				<lpage>5439</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/nar/gkg751</pubid>
						<pubid idtype="pmpid" link="fulltext">12954780</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B34">
				<title>
					<p>The scanning model for translation: an update.</p>
				</title>
				<aug>
					<au>
						<snm>Kozak</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>J Cell Biol</source>
				<pubdate>1989</pubdate>
				<volume>108</volume>
				<fpage>229</fpage>
				<lpage>241</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">2645293</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B35">
				<title>
					<p>Molecular mechanisms of translation initiation in eukaryotes.</p>
				</title>
				<aug>
					<au>
						<snm>Pestova</snm>
						<fnm>TV</fnm>
					</au>
					<au>
						<snm>Kolupaeva</snm>
						<fnm>VG</fnm>
					</au>
					<au>
						<snm>Lomakin</snm>
						<fnm>IB</fnm>
					</au>
					<au>
						<snm>Pilipenko</snm>
						<fnm>EV</fnm>
					</au>
					<au>
						<snm>Shatsky</snm>
						<fnm>IN</fnm>
					</au>
					<au>
						<snm>Agol</snm>
						<fnm>VI</fnm>
					</au>
					<au>
						<snm>Hellen</snm>
						<fnm>CU</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>2001</pubdate>
				<volume>98</volume>
				<fpage>7029</fpage>
				<lpage>7036</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1073/pnas.111145798</pubid>
						<pubid idtype="pmpid" link="fulltext">11416183</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B36">
				<title>
					<p>Presence of ATG triplets in 5' untranslated regions of eukaryotic cDNAs correlates with a 'weak' context of the start codon.</p>
				</title>
				<aug>
					<au>
						<snm>Rogozin</snm>
						<fnm>IB</fnm>
					</au>
					<au>
						<snm>Kochetov</snm>
						<fnm>AV</fnm>
					</au>
					<au>
						<snm>Kondrashov</snm>
						<fnm>FA</fnm>
					</au>
					<au>
						<snm>Koonin</snm>
						<fnm>EV</fnm>
					</au>
					<au>
						<snm>Milanesi</snm>
						<fnm>L</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2001</pubdate>
				<volume>17</volume>
				<fpage>890</fpage>
				<lpage>900</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/17.10.890</pubid>
						<pubid idtype="pmpid" link="fulltext">11673233</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B37">
				<title>
					<p>Control of translation initiation in animals.</p>
				</title>
				<aug>
					<au>
						<snm>Gray</snm>
						<fnm>NK</fnm>
					</au>
					<au>
						<snm>Wickens</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>Annu Rev Cell Dev Biol</source>
				<pubdate>1998</pubdate>
				<volume>14</volume>
				<fpage>399</fpage>
				<lpage>458</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1146/annurev.cellbio.14.1.399</pubid>
						<pubid idtype="pmpid" link="fulltext">9891789</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B38">
				<title>
					<p>The nonamer UUAUUUAUU is the key AU-rich sequence motif that mediates mRNA degradation.</p>
				</title>
				<aug>
					<au>
						<snm>Zubiaga</snm>
						<fnm>AM</fnm>
					</au>
					<au>
						<snm>Belasco</snm>
						<fnm>JG</fnm>
					</au>
					<au>
						<snm>Greenberg</snm>
						<fnm>ME</fnm>
					</au>
				</aug>
				<source>Mol Cell Biol</source>
				<pubdate>1995</pubdate>
				<volume>15</volume>
				<fpage>2219</fpage>
				<lpage>2230</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">7891716</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B39">
				<title>
					<p>Endless possibilities: translation termination and stop codon recognition.</p>
				</title>
				<aug>
					<au>
						<snm>Bertram</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Innes</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Minella</snm>
						<fnm>O</fnm>
					</au>
					<au>
						<snm>Richardson</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Stansfield</snm>
						<fnm>I</fnm>
					</au>
				</aug>
				<source>Microbiology</source>
				<pubdate>2001</pubdate>
				<volume>147</volume>
				<fpage>255</fpage>
				<lpage>269</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">11158343</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B40">
				<title>
					<p>Mammalian selenoprotein gene signature: identification and functional analysis of selenoprotein genes using bioinformatics methods.</p>
				</title>
				<aug>
					<au>
						<snm>Kryukov</snm>
						<fnm>GV</fnm>
					</au>
					<au>
						<snm>Gladyshev</snm>
						<fnm>VN</fnm>
					</au>
				</aug>
				<source>Methods Enzymol</source>
				<pubdate>2002</pubdate>
				<volume>347</volume>
				<fpage>84</fpage>
				<lpage>100</lpage>
				<xrefbib>
					<pubid idtype="pmpid">11898441</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B41">
				<title>
					<p>The power of the 3' UTR: translational control and development.</p>
				</title>
				<aug>
					<au>
						<snm>Kuersten</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Goodwin</snm>
						<fnm>EB</fnm>
					</au>
				</aug>
				<source>Nat Rev Genet</source>
				<pubdate>2003</pubdate>
				<volume>4</volume>
				<fpage>626</fpage>
				<lpage>637</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/nrg1125</pubid>
						<pubid idtype="pmpid" link="fulltext">12897774</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B42">
				<title>
					<p>The evolution of spliceosomal introns.</p>
				</title>
				<aug>
					<au>
						<snm>Lynch</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Richardson</snm>
						<fnm>AO</fnm>
					</au>
				</aug>
				<source>Curr Opin Genet Dev</source>
				<pubdate>2002</pubdate>
				<volume>12</volume>
				<fpage>701</fpage>
				<lpage>710</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0959-437X(02)00360-X</pubid>
						<pubid idtype="pmpid" link="fulltext">12433585</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B43">
				<title>
					<p>Messenger RNA surveillance and the evolutionary proliferation of introns.</p>
				</title>
				<aug>
					<au>
						<snm>Lynch</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Kewalramani</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>Mol Biol Evol</source>
				<pubdate>2003</pubdate>
				<volume>20</volume>
				<fpage>563</fpage>
				<lpage>571</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/molbev/msg068</pubid>
						<pubid idtype="pmpid" link="fulltext">12654936</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B44">
				<title>
					<p>Pattern of selective constraint in <it>C. elegans </it>and <it>C. briggsae </it>genomes.</p>
				</title>
				<aug>
					<au>
						<snm>Shabalina</snm>
						<fnm>SA</fnm>
					</au>
					<au>
						<snm>Kondrashov</snm>
						<fnm>AS</fnm>
					</au>
				</aug>
				<source>Genet Res</source>
				<pubdate>1999</pubdate>
				<volume>74</volume>
				<fpage>23</fpage>
				<lpage>30</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1017/S0016672399003821</pubid>
						<pubid idtype="pmpid">10505405</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B45">
				<title>
					<p>Analysis of conserved noncoding DNA in <it>Drosophila </it>reveals similar constraints in intergenic and intronic sequences.</p>
				</title>
				<aug>
					<au>
						<snm>Bergman</snm>
						<fnm>CM</fnm>
					</au>
					<au>
						<snm>Kreitman</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>2001</pubdate>
				<volume>11</volume>
				<fpage>1335</fpage>
				<lpage>1345</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1101/gr.178701</pubid>
						<pubid idtype="pmpid" link="fulltext">11483574</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B46">
				<title>
					<p>Human-mouse genome comparisons to locate regulatory sites.</p>
				</title>
				<aug>
					<au>
						<snm>Wasserman</snm>
						<fnm>WW</fnm>
					</au>
					<au>
						<snm>Palumbo</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Thompson</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Fickett</snm>
						<fnm>JW</fnm>
					</au>
					<au>
						<snm>Lawrence</snm>
						<fnm>CE</fnm>
					</au>
				</aug>
				<source>Nat Genet</source>
				<pubdate>2000</pubdate>
				<volume>26</volume>
				<fpage>225</fpage>
				<lpage>228</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/79965</pubid>
						<pubid idtype="pmpid" link="fulltext">11017083</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B47">
				<title>
					<p>Comparative analysis of noncoding regions of 77 orthologous mouse and human gene pairs.</p>
				</title>
				<aug>
					<au>
						<snm>Jareborg</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Birney</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Durbin</snm>
						<fnm>R</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>1999</pubdate>
				<volume>9</volume>
				<fpage>815</fpage>
				<lpage>824</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1101/gr.9.9.815</pubid>
						<pubid idtype="pmpid" link="fulltext">10508839</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B48">
				<title>
					<p>Enrichment of regulatory signals in conserved non-coding genomic sequences.</p>
				</title>
				<aug>
					<au>
						<snm>Levy</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Hannenhalli</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Workman</snm>
						<fnm>C</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2001</pubdate>
				<volume>17</volume>
				<fpage>871</fpage>
				<lpage>877</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/17.10.871</pubid>
						<pubid idtype="pmpid" link="fulltext">11673231</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B49">
				<title>
					<p>Numerous potentially functional but non-genic conserved sequences on human chromosome 21.</p>
				</title>
				<aug>
					<au>
						<snm>Dermitzakis</snm>
						<fnm>ET</fnm>
					</au>
					<au>
						<snm>Reymond</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Lyle</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Scamuffa</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Ucla</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Deutsch</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Stevenson</snm>
						<fnm>BJ</fnm>
					</au>
					<au>
						<snm>Flegel</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Bucher</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Jongeneel</snm>
						<fnm>CV</fnm>
					</au>
					<au>
						<snm>Antonarakis</snm>
						<fnm>SE</fnm>
					</au>
				</aug>
				<source>Nature</source>
				<pubdate>2002</pubdate>
				<volume>420</volume>
				<fpage>578</fpage>
				<lpage>582</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/nature01251</pubid>
						<pubid idtype="pmpid" link="fulltext">12466853</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B50">
				<title>
					<p>High intron sequence conservation across three mammalian orders suggests functional constraints.</p>
				</title>
				<aug>
					<au>
						<snm>Hare</snm>
						<fnm>MP</fnm>
					</au>
					<au>
						<snm>Palumbi</snm>
						<fnm>SR</fnm>
					</au>
				</aug>
				<source>Mol Biol Evol</source>
				<pubdate>2003</pubdate>
				<volume>20</volume>
				<fpage>969</fpage>
				<lpage>978</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/molbev/msg111</pubid>
						<pubid idtype="pmpid" link="fulltext">12716984</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B51">
				<title>
					<p>Characterization of evolutionary rates and constraints in three mammalian genomes.</p>
				</title>
				<aug>
					<au>
						<snm>Cooper</snm>
						<fnm>GM</fnm>
					</au>
					<au>
						<snm>Brudno</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Stone</snm>
						<fnm>EA</fnm>
					</au>
					<au>
						<snm>Dubchak</snm>
						<fnm>I</fnm>
					</au>
					<au>
						<snm>Batzoglou</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Sidow</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>2004</pubdate>
				<inpress/>
			</bibl>
			<bibl id="B52">
				<title>
					<p>Nucleosome formation potential of exons, introns and Alu repeats.</p>
				</title>
				<aug>
					<au>
						<snm>Levitsky</snm>
						<fnm>VG</fnm>
					</au>
					<au>
						<snm>Podkolodnaya</snm>
						<fnm>OA</fnm>
					</au>
					<au>
						<snm>Kolchanov</snm>
						<fnm>NA</fnm>
					</au>
					<au>
						<snm>Podkolodny</snm>
						<fnm>NL</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2001</pubdate>
				<volume>17</volume>
				<fpage>1062</fpage>
				<lpage>1064</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/17.11.1062</pubid>
						<pubid idtype="pmpid" link="fulltext">11724736</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B53">
				<title>
					<p>Identification and mapping of nuclear matrix attachment regions in a one megabase locus of human chromosome 19q13.12: long-range correlation of S/MARs and gene positions.</p>
				</title>
				<aug>
					<au>
						<snm>Chernov</snm>
						<fnm>IP</fnm>
					</au>
					<au>
						<snm>Akopov</snm>
						<fnm>SB</fnm>
					</au>
					<au>
						<snm>Nikolaev</snm>
						<fnm>LG</fnm>
					</au>
					<au>
						<snm>Sverdlov</snm>
						<fnm>ED</fnm>
					</au>
				</aug>
				<source>J Cell Biochem</source>
				<pubdate>2002</pubdate>
				<volume>84</volume>
				<fpage>590</fpage>
				<lpage>600</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1002/jcb.10043.abs</pubid>
						<pubid idtype="pmpid" link="fulltext">11813264</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B54">
				<title>
					<p>A significant fraction of conserved noncoding DNA in human and mouse consists of predicted matrix attachment regions.</p>
				</title>
				<aug>
					<au>
						<snm>Glazko</snm>
						<fnm>GV</fnm>
					</au>
					<au>
						<snm>Koonin</snm>
						<fnm>EV</fnm>
					</au>
					<au>
						<snm>Rogozin</snm>
						<fnm>IB</fnm>
					</au>
					<au>
						<snm>Shabalina</snm>
						<fnm>SA</fnm>
					</au>
				</aug>
				<source>Trends Genet</source>
				<pubdate>2003</pubdate>
				<volume>19</volume>
				<fpage>119</fpage>
				<lpage>124</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0168-9525(03)00016-7</pubid>
						<pubid idtype="pmpid" link="fulltext">12615002</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B55">
				<title>
					<p>Alternative splicing and genome complexity.</p>
				</title>
				<aug>
					<au>
						<snm>Brett</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Pospisil</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Valcarcel</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Reich</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Bork</snm>
						<fnm>P</fnm>
					</au>
				</aug>
				<source>Nat Genet</source>
				<pubdate>2002</pubdate>
				<volume>30</volume>
				<fpage>29</fpage>
				<lpage>30</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/ng803</pubid>
						<pubid idtype="pmpid" link="fulltext">11743582</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B56">
				<title>
					<p>Selection for short introns in highly expressed genes.</p>
				</title>
				<aug>
					<au>
						<snm>Castillo-Davis</snm>
						<fnm>CI</fnm>
					</au>
					<au>
						<snm>Mekhedov</snm>
						<fnm>SL</fnm>
					</au>
					<au>
						<snm>Hartl</snm>
						<fnm>DL</fnm>
					</au>
					<au>
						<snm>Koonin</snm>
						<fnm>EV</fnm>
					</au>
					<au>
						<snm>Kondrashov</snm>
						<fnm>FA</fnm>
					</au>
				</aug>
				<source>Nat Genet</source>
				<pubdate>2002</pubdate>
				<volume>31</volume>
				<fpage>415</fpage>
				<lpage>418</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">12134150</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B57">
				<title>
					<p>Identification of 13 human modification guide RNAs.</p>
				</title>
				<aug>
					<au>
						<snm>Vitali</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Royo</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Seitz</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Bachellerie</snm>
						<fnm>JP</fnm>
					</au>
					<au>
						<snm>Huttenhofer</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Cavaille</snm>
						<fnm>J</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2003</pubdate>
				<volume>31</volume>
				<fpage>6543</fpage>
				<lpage>6551</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/nar/gkg849</pubid>
						<pubid idtype="pmpid" link="fulltext">14602913</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B58">
				<title>
					<p>Conserved fragments of transposable elements in intergenic regions: evidence for widespread recruitment of MIR- and L2-derived sequences within the mouse and human genomes.</p>
				</title>
				<aug>
					<au>
						<snm>Silva</snm>
						<fnm>JC</fnm>
					</au>
					<au>
						<snm>Shabalina</snm>
						<fnm>SA</fnm>
					</au>
					<au>
						<snm>Harris</snm>
						<fnm>DG</fnm>
					</au>
					<au>
						<snm>Spouge</snm>
						<fnm>JL</fnm>
					</au>
					<au>
						<snm>Kondrashov</snm>
						<fnm>AS</fnm>
					</au>
				</aug>
				<source>Genet Res</source>
				<pubdate>2003</pubdate>
				<volume>82</volume>
				<fpage>1</fpage>
				<lpage>18</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1017/S0016672303006268</pubid>
						<pubid idtype="pmpid">14621267</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B59">
				<title>
					<p>RNAi: nature abhors a double strand.</p>
				</title>
				<aug>
					<au>
						<snm>Hutvagner</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Zamore</snm>
						<fnm>PD</fnm>
					</au>
				</aug>
				<source>Curr Opin Genet Dev</source>
				<pubdate>2002</pubdate>
				<volume>12</volume>
				<fpage>225</fpage>
				<lpage>232</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0959-437X(02)00290-3</pubid>
						<pubid idtype="pmpid" link="fulltext">11893497</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B60">
				<title>
					<p>Noncoding RNA genes.</p>
				</title>
				<aug>
					<au>
						<snm>Eddy</snm>
						<fnm>SR</fnm>
					</au>
				</aug>
				<source>Curr Opin Genet Dev</source>
				<pubdate>1999</pubdate>
				<volume>9</volume>
				<fpage>695</fpage>
				<lpage>699</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0959-437X(99)00022-2</pubid>
						<pubid idtype="pmpid" link="fulltext">10607607</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B61">
				<title>
					<p>Non-coding RNA genes and the modern RNA world.</p>
				</title>
				<aug>
					<au>
						<snm>Eddy</snm>
						<fnm>SR</fnm>
					</au>
				</aug>
				<source>Nat Rev Genet</source>
				<pubdate>2001</pubdate>
				<volume>2</volume>
				<fpage>919</fpage>
				<lpage>929</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/35103511</pubid>
						<pubid idtype="pmpid" link="fulltext">11733745</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B62">
				<title>
					<p>Antisense-RNA regulation and RNA interference.</p>
				</title>
				<aug>
					<au>
						<snm>Brantl</snm>
						<fnm>S</fnm>
					</au>
				</aug>
				<source>Biochim Biophys Acta</source>
				<pubdate>2002</pubdate>
				<volume>1575</volume>
				<fpage>15</fpage>
				<lpage>25</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">12020814</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B63">
				<title>
					<p>Mutations in the RNA component of RNase MRP cause a pleiotropic human disease, cartilage-hair hypoplasia.</p>
				</title>
				<aug>
					<au>
						<snm>Ridanpaa</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>van Eenennaam</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Pelin</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Chadwick</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Johnson</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Yuan</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>van Venrooij</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Pruijn</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Salmela</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Rockas</snm>
						<fnm>S</fnm>
					</au>
					<etal/>
				</aug>
				<source>Cell</source>
				<pubdate>2001</pubdate>
				<volume>104</volume>
				<fpage>195</fpage>
				<lpage>203</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">11207361</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B64">
				<title>
					<p>The RNA component of telomerase is mutated in autosomal dominant dyskeratosis congenita.</p>
				</title>
				<aug>
					<au>
						<snm>Vulliamy</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Marrone</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Goldman</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Dearlove</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Bessler</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Mason</snm>
						<fnm>PJ</fnm>
					</au>
					<au>
						<snm>Dokal</snm>
						<fnm>I</fnm>
					</au>
				</aug>
				<source>Nature</source>
				<pubdate>2001</pubdate>
				<volume>413</volume>
				<fpage>432</fpage>
				<lpage>435</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/35096585</pubid>
						<pubid idtype="pmpid" link="fulltext">11574891</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B65">
				<title>
					<p>Noncoding RNA transcripts.</p>
				</title>
				<aug>
					<au>
						<snm>Szymanski</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Barciszewska</snm>
						<fnm>MZ</fnm>
					</au>
					<au>
						<snm>Zywicki</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Barciszewski</snm>
						<fnm>J</fnm>
					</au>
				</aug>
				<source>J Appl Genet</source>
				<pubdate>2003</pubdate>
				<volume>44</volume>
				<fpage>1</fpage>
				<lpage>19</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">12590177</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B66">
				<title>
					<p>Evolutionary selection against change in many Alu repeat sequences interspersed through primate genomes.</p>
				</title>
				<aug>
					<au>
						<snm>Britten</snm>
						<fnm>RJ</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>1994</pubdate>
				<volume>91</volume>
				<fpage>5992</fpage>
				<lpage>5996</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">8016103</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B67">
				<title>
					<p>Nucleosome positioning by human Alu elements in chromatin.</p>
				</title>
				<aug>
					<au>
						<snm>Englander</snm>
						<fnm>EW</fnm>
					</au>
					<au>
						<snm>Howard</snm>
						<fnm>BH</fnm>
					</au>
				</aug>
				<source>J Biol Chem</source>
				<pubdate>1995</pubdate>
				<volume>270</volume>
				<fpage>10091</fpage>
				<lpage>10096</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1074/jbc.270.17.10091</pubid>
						<pubid idtype="pmpid" link="fulltext">7730313</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B68">
				<title>
					<p>Selective constraint in intergenic regions of human and mouse genomes.</p>
				</title>
				<aug>
					<au>
						<snm>Shabalina</snm>
						<fnm>SA</fnm>
					</au>
					<au>
						<snm>Ogurtsov</snm>
						<fnm>AY</fnm>
					</au>
					<au>
						<snm>Kondrashov</snm>
						<fnm>VA</fnm>
					</au>
					<au>
						<snm>Kondrashov</snm>
						<fnm>AS</fnm>
					</au>
				</aug>
				<source>Trends Genet</source>
				<pubdate>2001</pubdate>
				<volume>17</volume>
				<fpage>373</fpage>
				<lpage>376</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0168-9525(01)02344-7</pubid>
						<pubid idtype="pmpid" link="fulltext">11418197</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B69">
				<title>
					<p>Classification of common conserved sequences in mammalian intergenic regions.</p>
				</title>
				<aug>
					<au>
						<snm>Kondrashov</snm>
						<fnm>AS</fnm>
					</au>
					<au>
						<snm>Shabalina</snm>
						<fnm>SA</fnm>
					</au>
				</aug>
				<source>Hum Mol Genet</source>
				<pubdate>2002</pubdate>
				<volume>11</volume>
				<fpage>669</fpage>
				<lpage>674</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/hmg/11.6.669</pubid>
						<pubid idtype="pmpid" link="fulltext">11912182</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B70">
				<title>
					<p>Transcription regulation and animal diversity.</p>
				</title>
				<aug>
					<au>
						<snm>Levine</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Tjian</snm>
						<fnm>R</fnm>
					</au>
				</aug>
				<source>Nature</source>
				<pubdate>2003</pubdate>
				<volume>424</volume>
				<fpage>147</fpage>
				<lpage>151</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/nature01763</pubid>
						<pubid idtype="pmpid" link="fulltext">12853946</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B71">
				<title>
					<p>Alternative pre-mRNA splicing and proteome expansion in metazoans.</p>
				</title>
				<aug>
					<au>
						<snm>Maniatis</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Tasic</snm>
						<fnm>B</fnm>
					</au>
				</aug>
				<source>Nature</source>
				<pubdate>2002</pubdate>
				<volume>418</volume>
				<fpage>236</fpage>
				<lpage>243</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/418236a</pubid>
						<pubid idtype="pmpid" link="fulltext">12110900</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B72">
				<title>
					<p>Genome-wide detection of tissue-specific alternative splicing in the human transcriptome.</p>
				</title>
				<aug>
					<au>
						<snm>Xu</snm>
						<fnm>Q</fnm>
					</au>
					<au>
						<snm>Modrek</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Lee</snm>
						<fnm>C</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2002</pubdate>
				<volume>30</volume>
				<fpage>3754</fpage>
				<lpage>3766</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/nar/gkf492</pubid>
						<pubid idtype="pmpid" link="fulltext">12202761</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B73">
				<title>
					<p>The genome sequence of <it>Caenorhabditis briggsae</it>: A platform for comparative genetics.</p>
				</title>
				<aug>
					<au>
						<snm>Stein</snm>
						<fnm>LD</fnm>
					</au>
					<au>
						<snm>Bao</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Blasiar</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Blumenthal</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Brent</snm>
						<fnm>MR</fnm>
					</au>
					<au>
						<snm>Chen</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Chinwala</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Clarke</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Clee</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Coghlan</snm>
						<fnm>A</fnm>
					</au>
					<etal/>
				</aug>
				<source>PLoS Biol</source>
				<pubdate>2003</pubdate>
				<volume>1</volume>
				<fpage>E45</fpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1371/journal.pbio.0000045</pubid>
						<pubid idtype="pmpid" link="fulltext">14624247</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B74">
				<title>
					<p>Transcriptome analysis of <it>Escherichia coli </it>using high-density oligonucleotide probe arrays.</p>
				</title>
				<aug>
					<au>
						<snm>Tjaden</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Saxena</snm>
						<fnm>RM</fnm>
					</au>
					<au>
						<snm>Stolyar</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Haynor</snm>
						<fnm>DR</fnm>
					</au>
					<au>
						<snm>Kolker</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Rosenow</snm>
						<fnm>C</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2002</pubdate>
				<volume>30</volume>
				<fpage>3732</fpage>
				<lpage>3738</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/nar/gkf505</pubid>
						<pubid idtype="pmpid" link="fulltext">12202758</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
		</refgrp>
	</bm>
</art>
