<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
	<ui>gb-2004-5-4-r29</ui>
	<ji>GBJ</ji>
	<fm>
		<dochead>Method</dochead>
		<bibl>
			<title>
				<p>Enriching for direct regulatory targets in perturbed gene-expression profiles</p>
			</title>
			<aug>
				<au id="A1">
					<snm>Tringe</snm>
					<mi>G</mi>
					<fnm>Susannah</fnm>
					<insr iid="I1"/>
					<insr iid="I3"/>
				</au>
				<au id="A2">
					<snm>Wagner</snm>
					<fnm>Andreas</fnm>
					<insr iid="I2"/>
				</au>
				<au id="A3" ca="yes">
					<snm>Ruby</snm>
					<mi>W</mi>
					<fnm>Stephanie</fnm>
					<insr iid="I1"/>
					<email>sruby@unm.edu</email>
				</au>
			</aug>
			<insg>
				<ins id="I1">
					<p>Department of Molecular Genetics and Microbiology, University of New Mexico Health Sciences Center, Albuquerque, NM 87131, USA</p>
				</ins>
				<ins id="I2">
					<p>Department of Biology, University of New Mexico, Albuquerque, NM 87131, USA</p>
				</ins>
				<ins id="I3">
					<p>Current address: DOE Joint Genome Institute, 2800 Mitchell Drive, Bldg 400, Walnut Creek, CA 94596, USA</p>
				</ins>
			</insg>
			<source>Genome Biology</source>
			<issn>1465-6906</issn>
			<pubdate>2004</pubdate>
			<volume>5</volume>
			<issue>4</issue>
			<fpage>R29</fpage>
			<url>http://genomebiology.com/2004/5/4/R29</url>
			<xrefbib>
				<pubid idtype="pmpid">15059262</pubid>
			</xrefbib>
		</bibl>
		<history>
			<rec>
				<date>
					<day>27</day>
					<month>11</month>
					<year>2003</year>
				</date>
			</rec>
			<revrec>
				<date>
					<day>29</day>
					<month>1</month>
					<year>2004</year>
				</date>
			</revrec>
			<acc>
				<date>
					<day>12</day>
					<month>2</month>
					<year>2004</year>
				</date>
			</acc>
			<pub>
				<date>
					<day>30</day>
					<month>3</month>
					<year>2004</year>
				</date>
			</pub>
		</history>
		<cpyrt>
			<year>2004</year>
			<collab>Tringe et al.; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.</collab>
		</cpyrt>
		<shorttitle>
			<p>Enriching for direct regulatory targets in perturbed gene-expression profiles</p>
		</shorttitle>
		<shortabs>
			<p>This study presents an algorithm to infer direct regulatory relationships using gene expression profiles from cells in which individual genes are deleted or overexpressed.</p>
		</shortabs>
		<abs>
			<sec>
				<st>
					<p>Abstract</p>
				</st>
				<p>Here we build on a previously proposed algorithm to infer direct regulatory relationships using gene-expression profiles from cells in which individual genes are deleted or overexpressed. The updated algorithm can process networks containing feedback loops, incorporate positive and negative regulatory relationships during network reconstruction, and utilize data from double mutants to resolve ambiguous regulatory relationships. When applied to experimental data the reconstruction procedure preferentially retains direct transcription factor-target relationships.</p>
			</sec>
		</abs>
	</fm>
	<meta>
		<classifications>
			<classification type="BMC" subtype="man_spc_id" id="30010013">Methods</classification>
			<classification type="BMC" subtype="man_spc_id" id="30010002">Bioinformatics</classification>
			<classification type="BMC" subtype="man_spc_id" id="30010010">Genome studies</classification>
			<classification type="BMC" subtype="man_spc_id" id="30010016">Molecular biology</classification>
		</classifications>
	</meta>
	<bdy>
		<sec>
			<st>
				<p>Background</p>
			</st>
			<p>Gene-expression studies, using cDNA or oligonucleotide arrays, hold promise for elucidating the structure of genetic regulatory networks. A wealth of computational techniques have been proposed for extracting regulatory relationships from these data, many of which rely on correlated expression patterns to identify temporally co-regulated genes (reviewed in <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr></abbrgrp>). While these methods often detect important patterns, they cannot definitively identify the targets of transcriptional regulators.</p>
			<p>Another approach to identifying regulatory targets involves perturbing gene activity by deleting or overexpressing a transcription factor, and analyzing the effects on the gene-expression profile. However, transcripts affected in such experiments include those of both direct and indirect targets of the perturbed gene, and in some cases the latter may dominate. Various methods have been used to identify the direct targets among the affected genes, including promoter sequence examination and/or genome-wide location analysis <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>. In an earlier article, one of us proposed pooling data from a complete set of single-mutant gene-expression profiles to reconstruct a tentative network, then enriching for direct targets by paring the network down to the simplest acyclic directed graph (digraph) consistent with the available data <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>.</p>
			<p>Acyclic networks, by definition, lack feedback pathways through which genes can regulate their own activity. As feedback pathways are known to exist in regulatory networks, the previously proposed algorithm also included a procedure by which it could be applied to any network, even one with cycles. This procedure transforms the network into an equivalent acyclic digraph, called a condensation, before reconstruction. The algorithm thus bypasses the cyclic components and reconstructs the acyclic portion of the network. The structure of the feedback pathways themselves, however, cannot be determined from steady-state single-mutant data <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>.</p>
			<p>To improve the ability of our algorithm to reconstruct all types of regulatory pathway, we drew from the traditional genetic approach of epistasis analysis. 'Epistasis' describes a phenomenon in which an allele of one gene can influence the phenotypic expression of an allele of another gene <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. For example, an altered allele of a downstream gene in a biological pathway may block the effects of a mutation further upstream, thereby changing the outcome of a biological process. Such epistatic relationships can therefore be used to determine the order of gene function, or to ascertain that two gene products act in parallel, independent pathways <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp>. In epistasis analysis, genes involved in the process of interest are systematically perturbed; phenotypes of double mutants, with two genes perturbed, are compared to those of single mutants with only one perturbed gene. If the phenotype of a double mutant differs from the phenotypes of both related single mutants, the two genes are presumed to act independently of each other. However, if the double mutant resembles one or the other single mutant, the genes are likely to participate in an ordered pathway and the gene whose mutant phenotype dominates is placed downstream of the other. This type of analysis has proved highly informative in the study of genetic, metabolic and signaling networks, suggesting that the inclusion of double-mutant data in genetic network analysis could greatly improve the accuracy of the network reconstruction.</p>
			<p>Here we extend the capabilities of a genetic network reconstruction algorithm <abbrgrp><abbr bid="B5">5</abbr></abbrgrp> to improve its performance and broaden its applicability. First, we implement a preprocessing step to accommodate feedback loops. Second, we modify the algorithm to consider positive and negative regulatory relationships when generating the reconstruction. Third, we utilize data from double mutants to resolve cyclical structures and to identify nontranscriptional or redundant regulatory relationships. The performance of these modified versions of the algorithm is then tested in multiple ways. We use synthetic networks to assess the ability of the cycle-accommodating algorithm to tolerate incomplete or noisy data, and to examine the potential improvement achieved by the incorporation of double-mutant data. Finally, we test the improved algorithm on published expression data from the budding yeast <it>Saccharomyces cerevisiae </it>and compare our results with transcription factor binding profiles.</p>
		</sec>
		<sec>
			<st>
				<p>Results</p>
			</st>
			<sec>
				<st>
					<p>Graph theoretical framework</p>
				</st>
				<p>In this work we represent the genetic regulatory network as a directed graph or digraph, <it>G</it>, and all discussion of graphs here refers to digraphs. A digraph consists of nodes, which in this case correspond to genes, and directed edges, which in our model point from regulator to target. A graph can be represented by a diagram (Figure <figr fid="F1">1a</figr>) of nodes and edges. An alternative representation that fully defines the graph is the adjacency list, <it>Adj</it>(<it>G</it>), in which each node is listed along with the nodes to which it is connected by a directed edge (Figure <figr fid="F1">1b</figr>). In the context of a genetic regulatory network, the adjacency list of a gene includes all the genes it directly influences: for example, the genes whose promoters are bound by a transcription factor. The accessibility list, <it>Acc</it>(<it>G</it>), of the digraph lists each node along with all nodes that can be reached along a directed path of any length from that node (Figure <figr fid="F1">1c</figr>). For a genetic regulatory network, the accessibility list includes all genes whose transcription can be influenced by a gene, directly or indirectly. (For a more thorough discussion of digraphs, see <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>.)</p>
				<fig id="F1">
					<title>
						<p>Figure 1</p>
					</title>
					<caption>
						<p>Graphical representation of genetic regulatory networks</p>
					</caption>
					<text>
						<p>Graphical representation of genetic regulatory networks. <b>(a) </b>A sample regulatory network; <b>(b) </b>its adjacency list; <b>(c) </b>its accessibility list; and <b>(d) </b>its condensation. <b>(e) </b>The reconstruction of this network, mapped onto the original nodes. Circles represent nodes, or genes, and arrows represent edges.</p>
					</text>
					<graphic file="gb-2004-5-4-r29-1"/>
				</fig>
				<p>The genes whose transcript levels change when a gene is deleted or perturbed constitute an accessibility list for that gene. Reconstructing a genetic regulatory network from gene-expression data, therefore, is equivalent to determining an adjacency list based on an accessibility list <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. Because an accessibility list does not define a unique graph, the algorithm seeks the minimum equivalent (or most parsimonious) graph, in which the number of edges is minimized. A unique most parsimonious graph, which provides a core set of edges that are present in all graphs sharing the accessibility list, exists by definition for an acyclic graph <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B10">10</abbr></abbrgrp>. The algorithm obtains the simplest network that can explain the observations in two steps: it first connects each perturbed gene to all genes affected by its perturbation, and then prunes away edges, called shortcuts, that connect one node with another node already accessible via a longer directed path.</p>
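				<p>The pruning of shortcuts described above can be sketched in a few lines of Python. This is a minimal illustration rather than the published implementation: it assumes an acyclic network with complete, error-free accessibility lists, and the function name <it>prune_shortcuts</it> and the three-gene toy network are hypothetical.</p>

```python
def prune_shortcuts(acc):
    """Return adjacency lists for the most parsimonious acyclic graph.

    acc maps each perturbed gene to the set of genes whose transcript
    levels change when that gene is perturbed (its accessibility list).
    Assumes the network is acyclic and the lists are complete.
    """
    adj = {}
    for gene, targets in acc.items():
        # Any gene reachable through one of this gene's targets is
        # already explained by a longer path, so the edge is a shortcut.
        reachable_via_targets = set()
        for target in targets:
            reachable_via_targets |= acc.get(target, set())
        adj[gene] = set(targets) - reachable_via_targets
    return adj


# Hypothetical toy network: g1 -> g2 -> g3, so g1 also reaches g3 indirectly.
toy_acc = {"g1": {"g2", "g3"}, "g2": {"g3"}, "g3": set()}
toy_adj = prune_shortcuts(toy_acc)
# toy_adj keeps g1 -> g2 and g2 -> g3 and drops the shortcut g1 -> g3.
```

				<p>Because the accessibility lists of an acyclic graph are transitively closed, checking paths of length two is sufficient in this sketch; incomplete or noisy lists would require a full path search.</p>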
				<p>Cycles present a special problem for the reconstruction algorithm in that a graph with cycles does not possess a unique minimum equivalent graph. A cycle is a closed path in a digraph that begins and ends on the same node and crosses at least one other node (for example, Figure <figr fid="F1">1a</figr>, nodes 7, 8, and 9). It is impossible to reconstruct the edges in a cycle on the basis of single-mutant data: all genes in a cycle have identical accessibility lists, so they are effectively equivalent. Such a group of nodes, in which each node can be reached from every other node, is called a strong component. A multinode strong component may contain one or many cycles, whereas each node not contained within a cycle is a strong component unto itself. Every graph has an equivalent acyclic graph <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>, or condensation <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>, in which each strong component is represented by a single node (Figure <figr fid="F1">1d</figr>). By mapping the tentative network onto this acyclic equivalent, the algorithm circumvents the problem of cycles <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. This mapping is achieved by examining each perturbed gene and scanning its accessibility list for any reciprocally regulating genes, which are then assigned to the same component.</p>
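				<p>The assignment of reciprocally regulating genes to the same strong component can be illustrated as follows. The sketch assumes complete accessibility lists without self-loops, so that mutual accessibility is transitive; the function name and the example data are hypothetical and are not taken from the figures.</p>

```python
def strong_components(acc):
    """Group genes into strong components by reciprocal accessibility.

    acc maps each perturbed gene to the set of genes it affects.
    Two genes share a component when each appears in the other's list.
    """
    component_of = {}
    components = []
    for gene in acc:
        if gene in component_of:
            continue
        # Genes affected by this gene that also affect it in return.
        partners = {other for other in acc[gene] if gene in acc.get(other, set())}
        members = {gene} | partners
        for member in members:
            component_of[member] = len(components)
        components.append(members)
    return components


# Hypothetical example: genes 7, 8 and 9 form a cycle that regulates gene 10.
cycle_acc = {7: {8, 9, 10}, 8: {7, 9, 10}, 9: {7, 8, 10}, 10: set()}
cycle_components = strong_components(cycle_acc)
# Genes 7, 8 and 9 end up in one component; gene 10 is a component by itself.
```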
			</sec>
			<sec>
				<st>
					<p>Extensions of the algorithm</p>
				</st>
				<p>While the previous paper <abbrgrp><abbr bid="B5">5</abbr></abbrgrp> presented a basic procedure for reconstructing a network, several factors limit its applicability to gene-expression data. Here we address these shortcomings with a number of extensions. These modifications accommodate cycles in such a way that the error tolerance of the algorithm can be assessed, they distinguish between positive and negative regulation, and they incorporate information from double-mutant gene-expression profiles into the final reconstruction.</p>
				<sec>
					<st>
						<p>Accommodating cycles</p>
					</st>
					<p>In the previous paper, the error tolerance of the algorithm when reconstructing networks containing cycles was not examined. To compare graphs with different numbers of strong components, we devised a method of generating a reconstruction in which each node again represents one gene. In the reconstruction, any mutually regulating pairs of genes in the network are mapped onto the same strong component <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. We have added a step to expand each strong component into its constituent genes by adding direct connections from each node in the component to all other nodes in the component, and between each node in the component and all nodes adjacent to the component (Figure <figr fid="F1">1e</figr>). This maps the reconstruction back onto the original set of nodes and allows it to be compared, edge by edge, to the original network. We chose to treat the components in this way because alternative approaches result in the undesirable situation of a single network having multiple possible reconstructions <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp>. While our method can result in a number of extra edges in the reconstruction that are not part of the real network (false-positive edges), it does result in a unique reconstruction that minimizes the number of correct relationships missed (false-negative edges).</p>
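					<p>The expansion step can be sketched as below, assuming that the condensation is available as a list of components plus adjacency lists between component indices. These data structures and the function name <it>expand_components</it> are illustrative assumptions, not the representation used in the original software.</p>

```python
def expand_components(components, condensed_adj):
    """Map a reconstructed condensation back onto individual genes.

    components: list of sets of genes, one set per strong component.
    condensed_adj: component index -> set of target component indices.
    Every gene is connected to all other genes in its own component and
    to all genes of every component adjacent to its component.
    """
    adj = {gene: set() for members in components for gene in members}
    for index, members in enumerate(components):
        for gene in members:
            # Fully connect the genes inside the component.
            adj[gene] |= members - {gene}
            # Connect the gene to every gene of each adjacent component.
            for target_index in condensed_adj.get(index, ()):
                adj[gene] |= components[target_index]
    return adj


# Hypothetical condensation: gene 1 -> component {7, 8, 9} -> gene 10.
example_components = [{1}, {7, 8, 9}, {10}]
example_condensed = {0: {1}, 1: {2}}
expanded = expand_components(example_components, example_condensed)
```

					<p>In this toy case the expansion yields three edges out of gene 1 and three edges out of each cycle member, which shows how false-positive edges accumulate around multigene components.</p>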
				</sec>
				<sec>
					<st>
						<p>Positive and negative regulation</p>
					</st>
					<p>The original algorithm represents the regulatory network as a simple directed graph, in which the edges have neither magnitude nor sign. However, real genetic regulatory relationships can be either activating or repressing, and can vary in strength; failure to take this information into account could result in erroneous reconstruction. Although the strength of an interaction is difficult to determine from microarray data, it is simple to assess whether a regulatory influence is activating or repressing. Moreover, it is straightforward to incorporate this information into the reconstruction algorithm.</p>
					<p>Mutant gene-expression data can be represented by an <it>M </it>&#215; <it>N </it>accessibility matrix <it>P</it>(<it>G</it>), where <it>M </it>is the number of genes perturbed and <it>N </it>is the number of genes in the network. Each matrix element <it>p</it><sub><it>ij </it></sub>= 1 if there is an edge from node <it>i </it>to node <it>j</it>, and <it>p</it><sub><it>ij </it></sub>= 0 if no edge is present <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B9">9</abbr></abbrgrp>. We modified this matrix such that <it>p</it><sub><it>ij </it></sub>= +1 if there is a positive regulatory relationship, and <it>p</it><sub><it>ij </it></sub>= -1 if there is a negative regulatory relationship. Thus, if the transcript level of gene <it>j </it>goes up when gene <it>i </it>is deleted, then gene <it>i </it>negatively regulates gene <it>j </it>and the matrix element <it>p</it><sub><it>ij </it></sub>= -1. Inspection reveals that the sign of any indirect regulatory pathway equals the product of the signs of its intermediate edges, so the extended algorithm prunes a shortcut, by converting the matrix element to zero, only if the sign of the shortcut equals this product (Figure <figr fid="F2">2a</figr>, lines 15-19). For example, if the two intermediate edges both have a positive sign, the original algorithm will remove the shortcut regardless of its sign (Figure <figr fid="F2">2b</figr>), but the extended algorithm will prune the shortcut only if it is also positive (Figure <figr fid="F2">2c</figr>). Furthermore, an edge will not be pruned if the mediating node is a multigene strong component that contains some edges with negative sign, because edges to and from these components have ambiguous values.</p>
					<fig id="F2">
						<title>
							<p>Figure 2</p>
						</title>
						<caption>
							<p>Edge-removal criteria</p>
						</caption>
						<text>
							<p>Edge-removal criteria. <b>(a) </b>Pseudocode of the algorithm including positive and negative regulation. <it>Acc</it>(<it>i</it>) and <it>Adj</it>(<it>i</it>) indicate the accessibility and adjacency lists for gene <it>i</it>, respectively, and <it>Acc</it>(<it>i,j</it>) indicates the value (+1 or -1) of the edge from <it>i </it>to <it>j</it>. <b>(b) </b>The original algorithm will pare away any edge connecting two nodes that already have a pathway between them. <b>(c) </b>The algorithm taking positive and negative regulation into account will pare away an edge only if its sign is equal to the product of the signs of the remaining edges in the pathway.</p>
						</text>
						<graphic file="gb-2004-5-4-r29-2"/>
					</fig>
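					<p>The sign rule for pruning can be expressed compactly in Python. The sketch below is not a transcription of the pseudocode in Figure 2a: it assumes an acyclic network in which every node is a single gene, it omits the special handling of multigene strong components, and the function name and toy data are hypothetical.</p>

```python
def prune_signed_shortcuts(acc):
    """Prune shortcuts only when their sign matches the indirect path.

    acc[i][k] is +1 or -1: the sign of the effect of gene i on gene k,
    as observed when gene i is perturbed.
    """
    adj = {}
    for regulator, targets in acc.items():
        kept = dict(targets)
        for mediator, sign_to_mediator in targets.items():
            for target, sign_from_mediator in acc.get(mediator, {}).items():
                expected = sign_to_mediator * sign_from_mediator
                # Remove regulator -> target only if its observed sign
                # equals the product of signs along the indirect path.
                if target in kept and targets[target] == expected:
                    del kept[target]
        adj[regulator] = kept
    return adj


# Hypothetical data: a activates b and b activates c.
consistent = {"a": {"b": 1, "c": 1}, "b": {"c": 1}, "c": {}}
conflicting = {"a": {"b": 1, "c": -1}, "b": {"c": 1}, "c": {}}
pruned_consistent = prune_signed_shortcuts(consistent)    # a -> c is removed
pruned_conflicting = prune_signed_shortcuts(conflicting)  # a -| c is retained
```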
				</sec>
				<sec>
					<st>
						<p>Double-mutant data</p>
					</st>
					<p>Reconstructions generated by the algorithm using data from single mutants may contain a number of unresolved strong components. Double-mutant data, from strains in which two genes have been perturbed, should allow the reconstruction of edges within these strong components. We therefore developed an algorithm (Figure <figr fid="F3">3a</figr>) that uses double-mutant data to refine the reconstruction generated with single-mutant data.</p>
					<fig id="F3">
						<title>
							<p>Figure 3</p>
						</title>
						<caption>
							<p>Refining network structure with double-mutant data</p>
						</caption>
						<text>
							<p>Refining network structure with double-mutant data. <b>(a) </b>Pseudocode of the extension utilizing double-mutant data. <it>Acc</it><sub>-<it>j</it></sub>(<it>i</it>) indicates the accessibility list of gene <it>i </it>in the absence of gene <it>j</it>. The indices <it>i</it>, <it>j</it>, <it>k</it>, and <it>l </it>are arbitrary indices for genes in the network. <b>(b) </b>An example of a three-gene cycle (top), its single-mutant accessibility lists (bottom left) and a reconstruction based on that data (bottom right). <b>(c) </b>The double-mutant accessibility lists for the cycle in (b) and the reconstruction process. For each set of double-mutant data (left), edges revealed to be indirect are removed from the reconstruction (right). The notation [1] 2 &gt; indicates the accessibility list of gene 2 in a strain in which gene 1 is already perturbed. <b>(d) </b>A network in which genes 1 and 2 redundantly regulate gene 3 (right), and single-mutant and double-mutant accessibility lists for the network (left). <b>(e) </b>A network in which gene 1 regulates the activity of gene 3 indirectly by modifying the activity of a direct regulator, gene 2 (right); single- and double-mutant accessibility lists (left).</p>
						</text>
						<graphic file="gb-2004-5-4-r29-3"/>
					</fig>
					<p>New accessibility lists for genes in a double mutant are generated by comparing the gene-expression profile of the double mutant to that of each single mutant. For example, in a simple three-gene cycle (Figure <figr fid="F3">3b</figr>), comparing the expression profile of a double mutant in which two genes have been deleted to that of a single mutant can reveal indirect relationships (Figure <figr fid="F3">3c</figr>). To incorporate this information, a reconstruction is first generated based on the single-mutant data alone (Figure <figr fid="F3">3b</figr>, bottom right), in which strong components are fully connected as described earlier. Information from the double mutants is then used to remove connections that are not supported by the data. If, in the reconstruction, a gene <it>k </it>is a member of the adjacency list of gene <it>i</it>, <it>Adj</it>(<it>i</it>), but not in the accessibility list of gene <it>i </it>in the presence of a mutation in gene <it>j</it>, <it>Acc</it><sub>-<it>j</it></sub>(<it>i</it>), then the connection from gene <it>i </it>to gene <it>k </it>is probably indirect. It is removed from the reconstruction as long as <it>k </it>is a member of <it>Acc</it>(<it>j</it>), meaning that gene <it>j </it>could be mediating the interaction (Figure <figr fid="F3">3a</figr>, lines 5-7). In this manner, data from each of the double mutants are used successively to refine the reconstruction (Figure <figr fid="F3">3c</figr>). To fully resolve the structure of cycles that are subcomponents of a larger graph, double-mutant data for all pairs of genes in, or immediately adjacent to, each multinode strong component are needed. This procedure is successful as long as there are not multiple cycles within the component. If there is more than one redundant, but indirect, pathway from one gene to another, the two genes will appear to be directly connected in this analysis.</p>
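					<p>The edge-removal rule can be illustrated on a three-gene cycle. The following sketch assumes accessibility lists without self-loops and error-free data; the data layout (a dictionary keyed by the pair of the already perturbed gene and the newly perturbed gene), the function name and the example lists are assumptions made for illustration and are not copied from Figure 3.</p>

```python
def refine_with_double_mutants(adj, acc, acc_double):
    """Remove edges that double-mutant data reveal to be indirect.

    adj: reconstruction from single mutants, gene -> set of direct targets.
    acc: single-mutant accessibility lists, gene -> set of affected genes.
    acc_double: (j, i) -> genes affected by perturbing gene i in a strain
        in which gene j is already perturbed.
    """
    refined = {gene: set(targets) for gene, targets in adj.items()}
    for (j, i), affected in acc_double.items():
        for k in list(refined.get(i, ())):
            if k == j:
                # The effect of i on j cannot be observed when j is perturbed.
                continue
            # The edge i -> k is indirect if it vanishes without j and
            # j could be mediating it (k is accessible from j).
            if k not in affected and k in acc.get(j, set()):
                refined[i].discard(k)
    return refined


# Hypothetical cycle 1 -> 2 -> 3 -> 1: every gene reaches the other two,
# so the single-mutant reconstruction is fully connected.
single = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
double = {(1, 2): {3}, (2, 1): set(), (3, 2): set(),
          (1, 3): set(), (3, 1): {2}, (2, 3): {1}}
resolved = refine_with_double_mutants(single, single, double)
# resolved contains only the three edges of the cycle.
```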
					<p>Double-mutant data can similarly be used to identify redundant or nontranscriptional regulatory relationships <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>, and we have extended the algorithm to reconstruct these types of relationships (Figure <figr fid="F3">3a,d,e</figr>). If two genes, <it>i </it>and <it>j</it>, have redundant or overlapping regulatory effects on a third gene, <it>k</it>, the transcript level of <it>k </it>may be unchanged in each of the single mutants but altered in the double mutant. This type of relationship can be inferred when <it>Acc</it><sub>-<it>j</it></sub>(<it>i</it>) contains members that are absent from the single-mutant accessibility list <it>Acc</it>(<it>i</it>) (Figure <figr fid="F3">3d</figr>). In such a case, the algorithm adds connections from both gene <it>i </it>and gene <it>j </it>to gene <it>k </it>(Figure <figr fid="F3">3a</figr>, lines 12-15). This could represent a case, for example, where either of two transcription factors can bind the same site in the promoter and activate transcription. While a reconstruction based on the single-mutant data would be incomplete, the refinement utilizing the double-mutant data adds the appropriate edges (Figure <figr fid="F3">3d</figr>).</p>
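					<p>A corresponding sketch for redundant regulation is given below. It follows the rule stated above, adding edges from both regulators whenever a gene responds only in the double mutant, and it uses the same assumed data layout as before (a dictionary keyed by the already perturbed gene and the newly perturbed gene). The function name and example are hypothetical.</p>

```python
def add_redundant_edges(adj, acc, acc_double):
    """Add edges for targets that respond only in a double mutant.

    adj: current reconstruction, gene -> set of direct targets.
    acc: single-mutant accessibility lists, gene -> set of affected genes.
    acc_double: (j, i) -> genes affected by perturbing gene i in a strain
        in which gene j is already perturbed.
    """
    refined = {gene: set(targets) for gene, targets in adj.items()}
    for (j, i), affected in acc_double.items():
        # Genes affected by i only when j is also perturbed.
        for k in affected - acc.get(i, set()):
            refined.setdefault(i, set()).add(k)
            refined.setdefault(j, set()).add(k)
    return refined


# Hypothetical case: genes 1 and 2 redundantly regulate gene 3, so neither
# single mutant changes the transcript level of gene 3.
no_effects = {1: set(), 2: set(), 3: set()}
double_effects = {(1, 2): {3}, (2, 1): {3}}
with_redundancy = add_redundant_edges(no_effects, no_effects, double_effects)
```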
					<p>Proteins other than transcription factors often indirectly influence gene expression. For example, a signaling kinase could phosphorylate and activate a transcription factor, initiating expression of its target genes. This type of relationship is indistinguishable from direct regulation on the basis of single-mutant data alone. Double-mutant data, however, can reveal that the action of the kinase is dependent on the presence of the transcription factor; this phenomenon is the foundation of epistasis analysis. If gene <it>i </it>influences gene <it>k </it>in a wild-type background, but not when gene <it>j </it>is deleted, this suggests that the effect of gene <it>i </it>on gene <it>k </it>is not direct (Figure <figr fid="F3">3e</figr>), and that it is in fact mediated by gene <it>j</it>. The extended algorithm will therefore remove the connection from gene <it>i </it>to gene <it>k </it>as long as <it>k </it>is also accessible from <it>j </it>(Figure <figr fid="F3">3a</figr>, lines 5-7). These observations also imply that gene <it>i </it>must somehow affect the activity of gene <it>j</it>; therefore, if gene <it>j </it>is not already accessible from gene <it>i </it>in the reconstruction, an edge is added from gene <it>i </it>to gene <it>j </it>(Figure <figr fid="F3">3a</figr>, lines 8-9; Figure <figr fid="F3">3e</figr>).</p>
				</sec>
			</sec>
			<sec>
				<st>
					<p>Prevalence of cycles in genetic regulatory networks</p>
				</st>
				<p>We anticipated that the ability to accurately reconstruct, or at least accommodate, cycles would be critical to the success of a reconstruction algorithm. Cycles may be positively selected for in biological networks because feedback loops can potentially provide stability and/or amplification to biological systems <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>. Indeed, in both yeast and <it>Escherichia coli</it>, autoregulatory pathways in which transcription factors regulate their own expression are common <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr></abbrgrp>. Multigene feedback circuits, on the other hand, are rare in <it>E. coli </it><abbrgrp><abbr bid="B14">14</abbr></abbrgrp> but possibly more common in <it>S. cerevisiae </it><abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. Such multigene cycles are of concern in reconstruction because they lead to ambiguities in network structure (see Figure <figr fid="F1">1</figr>). Single-gene 'self-loops,' by contrast, are not readily evident in gene-expression data, as manipulation of a gene's activity will, in most cases, affect the level of its RNA message even in the absence of autoregulation. We exclude self-loops from our network models, as have others <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>, to avoid complications they can cause in the reconstruction process.</p>
				<p>To estimate the prevalence of cycles in real biological networks we examined the yeast dataset of Hughes <it>et al. </it><abbrgrp><abbr bid="B17">17</abbr></abbrgrp> for evidence of feedback loops. We generated accessibility lists for each of the perturbed genes, using the authors' criteria for statistically significant changes in gene expression. Any mutually regulating genes (where perturbing <it>i </it>affects <it>j </it>and perturbing <it>j </it>affects <it>i</it>) were assigned to the same strong component. Among the 260 single-gene perturbations examined, four multigene components, containing a total of 11 genes, are apparent. Two of these involve genes with closely related functions and known feedback regulation (Figure <figr fid="F4">4</figr>): one contains the genes <it>ERG2</it>, <it>ERG3</it>, <it>ERG11</it>, <it>ERG28 </it>and <it>TUP1</it>, whereas the other contains <it>ADE2 </it>and <it>HPT1</it>. All four <it>ERG </it>genes in the first group are involved in ergosterol biosynthesis and their transcription is coordinately controlled by a negative feedback pathway <abbrgrp><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr></abbrgrp>. This gene set is believed to be controlled in part by the transcriptional repressors Rox1 and Mot3, which act by recruiting the co-repressor complex Ssn6/Tup1 <abbrgrp><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr></abbrgrp>; this may explain the participation of <it>TUP1 </it>in the component. <it>ADE2 </it>and <it>HPT1 </it>are both involved in purine biosynthesis, and deletion of <it>HPT1 </it>is known to activate <it>ADE2 </it>transcription via a feedback pathway dependent on an adenine metabolite <abbrgrp><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr></abbrgrp>.</p>
				<fig id="F4">
					<title>
						<p>Figure 4</p>
					</title>
					<caption>
						<p>Feedback loops in the yeast network: one controlling ergosterol biosynthesis (left) and another controlling purine biosynthesis (right)</p>
					</caption>
					<text>
						<p>Feedback loops in the yeast network: one controlling ergosterol biosynthesis (left) and another controlling purine biosynthesis (right). Pointed arrows indicate positive regulation while blunt-ended arrows indicate negative regulation; only statistically significant relationships are shown.</p>
					</text>
					<graphic file="gb-2004-5-4-r29-4"/>
				</fig>
				<p>The other two multigene components, by contrast, are likely to be artifacts. One contains two open reading frames (ORFs), <it>YML034W </it>and <it>YML033W</it>, which have since been shown to be exons of a single gene, <it>SRC1 </it><abbrgrp><abbr bid="B26">26</abbr></abbrgrp>. The other contains two adjacent ORFs, <it>RTS1 </it>(<it>YOR014W</it>) and <it>YOR015W</it>, whose effects on each other are likely to be mediated in <it>cis </it>rather than in <it>trans</it>. We conclude that among these 260 genes, there are at least 7 genes (2.7%) that participate in cycles, demonstrating the importance of considering this type of structure when reconstructing networks. A precise determination of the prevalence of cycles will await more extensive experimental data.</p>
			</sec>
			<sec>
				<st>
					<p>Error tolerance of the algorithm</p>
				</st>
				<p>To evaluate the usefulness of the algorithm in reconstructing regulatory networks, we would like to know how errors in the data affect the final reconstruction. Previous work suggested that the accuracy of a reconstruction scaled more or less linearly with the amount of data available; however, this work was based on simulations with acyclic networks only <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. To determine the impact of cycles, we assessed the accuracy of the reconstruction under three conditions: when data for only a subset of genes is available; when some accessibility data is missing; and when some of the accessibility data is erroneous. All of these problems are likely to present themselves to some extent in real biological data.</p>
				<p>The ability of the algorithm to tolerate inaccuracies in the data was assessed with random 500-gene networks with varying numbers of edges (see Appendix 1 in Additional data file 1, and Materials and methods for information on the generation of these networks). For each network, accessibility lists were generated, and then altered to simulate experimental error. An initial reconstruction, based on the unaltered data, was used as the standard to which all other reconstructions were compared.</p>
				<p>Real experimental data may not include perturbations of all genes, and not all expression changes may be detected. We investigated the impact of each type of missing data on the accuracy of reconstructions. When perturbation data is absent for up to 50% of the genes in a network, the fraction of correct edges declines roughly linearly with the number of unperturbed genes, with a slope slightly steeper than -1 (Figure <figr fid="F5">5a</figr>). Performance is somewhat more compromised by random loss of accessibility information (Figure <figr fid="F5">5b</figr>), but an informative reconstruction can still be generated from incomplete data. For example, when 10% of the accessibilities are removed, roughly 70% of the reconstructed edges are correct (Figure <figr fid="F5">5b</figr>). The impact of missing data is relatively unaffected by the density of edges in the network, at least within the tested range (1.0-1.2 edges per node).</p>
				<fig id="F5">
					<title>
						<p>Figure 5</p>
					</title>
					<caption>
						<p>Sensitivity of the algorithm to incomplete or noisy data</p>
					</caption>
					<text>
						<p>Sensitivity of the algorithm to incomplete or noisy data. On the <it>y</it>-axis of each graph is the fraction of correct edges, (<it>E - fn</it>)/(<it>E + fp</it>), where <it>E </it>is the number of edges in the correct graph, and <it>fn </it>and <it>fp </it>are the numbers of false-negative and false-positive edges in the reconstruction. On the <it>x</it>-axis is <b>(a) </b>the fraction of genes for which no accessibility information is available, <b>(b) </b>the fraction of false-negative accessibilities, or <b>(c) </b>the fraction of false-positive accessibilities. All data is for synthetic 500-gene networks with 500, 550, or 600 edges and edge distribution as described in Materials and methods. Each data point represents the average &#177; standard deviation of 10 repetitions for each of six independent networks.</p>
					</text>
					<graphic file="gb-2004-5-4-r29-5"/>
				</fig>
				<p>To assess the effect of noise in the data, 'regulator' and 'target' genes were chosen at random, checked to make sure that no path already existed between the two, and added to the accessibility lists. Such erroneous data severely reduces the accuracy of the reconstruction (Figure <figr fid="F5">5c</figr>). For example, 10% false-positive data results in a reconstruction with only 50% or 20% correct edges for networks with 500 or 600 edges, respectively. This poor outcome results from the addition of incorrect edges to the reconstruction, some of which create circular pathways by erroneously connecting a gene to one of its upstream regulators. These 'pseudocycles' interfere with the search phase of the algorithm (Figure <figr fid="F2">2a</figr>, lines 4-14) and often result in reconstructions of poor quality. The relative sensitivity of the reconstruction algorithm to false-positive data as opposed to false-negative suggests that statistically conservative criteria should be used to define accessibility.</p>
			</sec>
			<sec>
				<st>
					<p>Improved resolution with double-mutant data</p>
				</st>
				<p>As described above, our extension to the algorithm uses double-mutant data to resolve local structures of multigene strong components. We tested the ability of this extension to improve on the reconstruction after a network is generated from single-mutant data. Because we wished to examine the ability of the algorithm to reconstruct all the edges in a network, including those within cycles, we compared each reconstruction, edge for edge, to the original synthetic network. For each of four synthetic networks of a given edge density we assessed the accuracy of the reconstruction based on the single-mutant data alone as well as the reconstruction including data for all relevant double mutants of genes within cycles and their neighbors (Figure <figr fid="F6">6a</figr>). As expected, the accuracy of the reconstruction based on single-mutant data alone declines as edge density increases, as a result of an increased number of genes in unresolved cycles. Incorporating data from double mutants substantially improves the quality of reconstructions for more highly connected networks. For example, the edges in reconstructions based on single-mutant data are only 62% correct for networks with 1.1 edges per node while they are 93% correct when double-mutant data is incorporated (Figure <figr fid="F6">6a</figr>). The improved reconstructions are not 100% correct because the synthetic random networks contain some redundant pathways that are pared away in the reconstruction process, which seeks the most parsimonious graph.</p>
				<fig id="F6">
					<title>
						<p>Figure 6</p>
					</title>
					<caption>
						<p>Quality of reconstruction using double-mutant data</p>
					</caption>
					<text>
						<p>Quality of reconstruction using double-mutant data. <b>(a) </b>Fraction of correct edges in reconstruction (including cycles) versus edge density using single-mutant data only (original) or including double-mutant data (refined). <b>(b) </b>Number of double mutants needed for accurate reconstruction of networks with different numbers of edges, as compared to the number of genes in the network, <it>N</it>. <b>(c) </b>Improvement of reconstruction as double-mutant data is added for a network with 1.2 edges per node. All synthetic networks had 500 genes and the indicated number of edges. Data in (c) is for one of the four independent networks analyzed in (a) and (b), where each data point represents the average &#177; standard deviation for four independent networks.</p>
					</text>
					<graphic file="gb-2004-5-4-r29-6"/>
				</fig>
				<p>Of particular note is the number of double mutants needed for this gain in accuracy (Figure <figr fid="F6">6b</figr>). For networks of 1.1-1.2 edges per node, the number of double mutants necessary is in the vicinity of 100-200 (20-40% of the 500 genes in the network). Even for networks of 1.3-1.4 edges per node, the number of double-mutant profiles needed is on the order of <it>N</it>, the number of genes in the network, rather than on the order of <it>N</it><sup>2</sup>, as the total number of possible double mutants is (Figure <figr fid="F6">6b</figr>).</p>
				<p>We also examined the progressive increase in accuracy as double-mutant data is incorporated into the reconstruction of an individual network (Figure <figr fid="F6">6c</figr>). For this analysis, individual pairs of genes contained in or adjacent to multigene components were chosen at random, their double-mutant gene-expression profiles analyzed, and the resulting reconstruction compared to the true network before repeating the process with another pair of genes. The reconstructions consistently improve as double-mutant data is added (Figure <figr fid="F6">6c</figr>). Thus, any available double-mutant data can be used to aid the reconstruction even if not all the relevant double mutants have been generated.</p>
			</sec>
			<sec>
				<st>
					<p>Partial reconstructions from microarray data</p>
				</st>
				<p>To test the performance of the algorithm on real data and to assess the impact of our extensions, we applied the reconstruction algorithm to the above-mentioned set of gene-expression profiles <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. Using the authors' error model, we identified 16,133 potential targets of 260 perturbed genes. When this accessibility information is fed into the reconstruction algorithm, four multigene strong components (Figure <figr fid="F4">4c</figr>) are identified as described and collapsed into single nodes in a condensation graph. The basic algorithm then identifies 5,770 connections that are potentially indirect and prunes them from the network. However, when positive and negative regulation are taken into account, only 1,779 of these turn out to be consistent with indirect regulation by the mediating edges. This suggests that antagonistic feed-forward pathways ("incoherent feedforward loops" in the terminology of Shen-Orr <it>et al</it>. <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>) (Figure <figr fid="F2">2b</figr>, bottom) are common in real biological regulatory networks. Interestingly, 65% of the regulatory relationships identified in these data are apparently repressive and most of the incoherent pathways consist of three negative regulatory connections.</p>
				<p>These results imply that it is important to take positive and negative regulatory relationships into account, as the algorithm may otherwise prune genuine direct connections from the network. But are the 1,779 connections pruned by the improved algorithm really indirect? To test this, we took advantage of a dataset containing promoter-binding information for the majority of transcription factors in yeast <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. If a gene's promoter is bound by a transcription factor, and its transcript abundance changes upon deletion of that factor, we consider it to be a direct target of that transcription factor <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B28">28</abbr></abbrgrp>. There are 17 transcription factors for which both single-mutant gene-expression profiles and chromatin-binding patterns are available in the datasets examined. Of these, one (<it>RGT1</it>) does not affect the transcription of any genes, and two (<it>YAP3 </it>and <it>YAP7</it>) do not bind any promoters, according to the respective authors' criteria for statistical significance, and were therefore not examined. For the remaining 14 transcription factors, anywhere from 0% (<it>RTG1</it>) to 100% (<it>MBP1</it>) of the 'expression targets' are direct as assessed by chromatin binding. Overall, of the 1,016 transcripts whose levels are affected by deletion of one of these genes, 81 (7.97%) are genuinely bound by the relevant transcription factor (Table <tblr tid="T1">1</tblr>, and Appendix 2 in Additional data file 1). After application of the algorithm, 835 of these connections are retained in the reconstructed network, including 78 (9.34%) that are directly bound. Of the 181 eliminated targets, only 3 (1.66%, <it>p &lt; 0.001</it>) are true positives according to the binding data (Table <tblr tid="T1">1</tblr>). 
Thus using only a small dataset of expression profiles from single mutants, the algorithm prunes indirect edges more frequently than direct edges, resulting in enrichment for direct connections in the reconstruction. This enrichment, considered in the context of our simulation results (Figure <figr fid="F5">5a</figr>), suggests that when provided with more data the algorithm will successfully eliminate many indirect connections.</p>
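				<p>The percentages quoted above follow directly from the counts in Table 1. The short sketch below reproduces that arithmetic; the variable and function names are illustrative only and are not part of the published method.</p>

```python
# Counts reported in the text and in Table 1.
total_targets, total_bound = 1016, 81      # all expression targets
kept_targets, kept_bound = 835, 78         # edges retained after pruning

# Eliminated edges are the difference between the two rows above.
pruned_targets = total_targets - kept_targets
pruned_bound = total_bound - kept_bound


def percent_bound(bound, total):
    """Percentage of targets whose promoter is bound by the factor."""
    return 100.0 * bound / total


before = percent_bound(total_bound, total_targets)
after = percent_bound(kept_bound, kept_targets)
pruned = percent_bound(pruned_bound, pruned_targets)

print(f"before pruning: {before:.2f}%")
print(f"after pruning:  {after:.2f}%")
print(f"pruned edges:   {pruned:.2f}%")
```

				<p>The retained edges are bound more often than the full set of expression targets, whereas the pruned edges are bound far less often, which is the enrichment described in the text.</p>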
				<tbl id="T1" hint_layout="single">
					<title>
						<p>Table 1</p>
					</title>
					<caption>
						<p>Fraction of direct targets before and after reconstruction</p>
					</caption>
					<tblbdy cols="4">
						<r>
							<c>
								<p/>
							</c>
							<c ca="center">
								<p>Total targets*</p>
							</c>
							<c ca="center">
								<p>Bound targets<sup>&#8224;</sup></p>
							</c>
							<c ca="center">
								<p>Percent bound<sup>&#8225;</sup></p>
							</c>
						</r>
						<r>
							<c cspan="4">
								<hr/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Expression data</p>
							</c>
							<c ca="center">
								<p>1,016</p>
							</c>
							<c ca="center">
								<p>81</p>
							</c>
							<c ca="center">
								<p>7.97</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Reconstruction</p>
							</c>
							<c ca="center">
								<p>835</p>
							</c>
							<c ca="center">
								<p>78</p>
							</c>
							<c ca="center">
								<p>9.34</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Eliminated targets</p>
							</c>
							<c ca="center">
								<p>181</p>
							</c>
							<c ca="center">
								<p>3</p>
							</c>
							<c ca="center">
								<p>1.66</p>
							</c>
						</r>
					</tblbdy>
					<tblfn>
						<p>*Total number of transcribed genes affected by transcription factor deletions. <sup>&#8224;</sup>Number of gene targets bound by relevant transcription factor. <sup>&#8225;</sup>Percentage of gene targets bound by relevant transcription factor.</p>
					</tblfn>
				</tbl>
				<p>The three 'true positives' incorrectly pruned, all putative Swi4 targets, are <it>BAT2</it>, <it>SNA2 </it>and <it>EXG1</it>. Interestingly, both <it>BAT2 </it>and <it>SNA2 </it>levels increase upon <it>SWI4 </it>deletion, suggesting negative regulation. By contrast, 18 of the other 19 'true targets' of Swi4 are reduced in a <it>swi4&#916; </it>mutant, consistent with Swi4's known behavior as a transcriptional activator <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. Thus, even these two may in fact be indirect expression targets, or exhibit atypical regulation by Swi4.</p>
				<p>We also used the gene-expression dataset to test the double-mutant extension, as it includes transcription profiles for several double-mutant yeast strains <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. To use this data for epistasis analysis, one must compare each double-mutant profile to the corresponding single-mutant expression profiles so that the effect of each individual perturbation can be discerned. The necessary single-mutant profiles are available for two of the double mutants: <it>dig1&#916;dig2&#916; </it>and <it>isw1&#916;isw2&#916;</it>, both of which bear deletions in pairs of genes believed to have redundant functions. <it>DIG1 </it>and <it>DIG2</it>, also known as <it>RST1 </it>and <it>RST2</it>, encode proteins that inhibit the Ste12 transcription factor <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. <it>ISW1 </it>and <it>ISW2 </it>encode ATP-dependent chromatin-remodeling enzymes with partially overlapping functions <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. Both of these pairs of genes, therefore, could be expected to exhibit parallel, redundant regulation of some transcripts.</p>
				<p>We generated new accessibility lists from these mutant expression profiles and used this information to refine the reconstruction; specifically, edges were added from each of the mutant genes to any new genes on the double-mutant accessibility lists (Figure <figr fid="F3">3a</figr>, lines 12-15). This process added numerous genes to the adjacency lists of <it>DIG1</it>, <it>DIG2</it>, <it>ISW1 </it>and <it>ISW2</it>. Genome-wide binding data is available for just one of these genes, <it>DIG1 </it><abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. Of the 228 'new' <it>DIG1 </it>targets, 18 (7.9%) are directly bound by the Dig1 protein. These relationships were absent in the original reconstruction from single-mutant data, demonstrating that double-mutant data can be of use in identifying regulatory relationships. However, considerably more double-mutant data would be required to improve the overall quality of the reconstruction appreciably. Genome-wide binding profiles are not available for Dig2, Isw1 or Isw2. Nevertheless, for both <it>DIG1 </it>and <it>DIG2</it>, the new targets included several known targets of the Ste12 protein such as <it>ERG24</it>, <it>PCL2</it>, and <it>STE12 </it><abbrgrp><abbr bid="B4">4</abbr></abbrgrp>, consistent with their role as regulators of Ste12 activity. Furthermore, at least one gene known to be directly regulated by both Isw1 and Isw2, <it>FIG1 </it><abbrgrp><abbr bid="B31">31</abbr></abbrgrp>, is recognized as an Isw2 protein target only in the refined reconstruction.</p>
			</sec>
		</sec>
		<sec>
			<st>
				<p>Discussion</p>
			</st>
			<p>In this study we have extended a previously described algorithm <abbrgrp><abbr bid="B5">5</abbr></abbrgrp> that uses information on the system-wide transcriptional effects of single-gene perturbations to reconstruct a tentative regulatory network. Several extensions presented here address weaknesses in the original algorithm and make it more effective on real biological data. One concern was that the effect of cycles in the network on the accuracy of the reconstruction had not previously been assessed. We have now devised a method to compare reconstructions with different numbers of strong components and demonstrate that our method can generate a network enriched for direct regulatory relationships even when there are feedback loops in the network. The cycle-accommodating extension does not substantially change the impact of missing data observed previously <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. However, it is sensitive to false-positive data, implying that statistically conservative criteria should be used for identifying potential regulatory targets. Another shortcoming of the original algorithm was that it did not distinguish between positive and negative regulation, an important feature of biological networks that could influence the reconstruction process. By applying to published data a modified algorithm that considers both activation and repression, we show here that this distinction is critical to proper characterization of network structure. Finally, the original algorithm had no mechanism by which to use double-mutant data, which has historically proven to be of great value in pathway interpretation. The ability to incorporate data from double mutants, in the manner of traditional epistasis analysis, considerably increases the potential power of the algorithm. 
We have focused here on transcriptional regulation, but our approach could easily be extended to other types of data, such as protein levels or posttranslational modifications, as they become available.</p>
			<p>A great deal of effort has been applied to the task of extracting regulatory relationships from microarray data <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr></abbrgrp>. A smaller, but still substantial, set of techniques addresses the specific question of deriving genetic regulatory networks from perturbed gene-expression profiles <abbrgrp><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr><abbr bid="B36">36</abbr></abbrgrp>. Early efforts focused on Boolean network models <abbrgrp><abbr bid="B32">32</abbr><abbr bid="B34">34</abbr></abbrgrp> and made a number of interesting observations about the amount and types of data necessary to characterize a network. Evidence suggests, however, that significant information is lost in Boolean models by the discretization of data into arbitrary 'on' and 'off' categories, and that simple logical gates are unlikely to encompass combinatorial regulatory interactions <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B33">33</abbr></abbrgrp>. A graph-theoretic approach, as used in our study, has the advantage of accepting continuous-valued data and placing no constraints on the number of inputs or their interactions with each other <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>. Some groups have also taken a probabilistic approach to the problem using Bayesian networks <abbrgrp><abbr bid="B35">35</abbr><abbr bid="B36">36</abbr></abbrgrp>. However, despite making biologically relevant predictions, this approach has not thus far been successful in identifying target genes of transcription factors <abbrgrp><abbr bid="B35">35</abbr><abbr bid="B36">36</abbr></abbrgrp> and is very sensitive to model parameters <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>.</p>
			<p>The starting point for our analysis is a parsimony approach analogous to that used in phylogeny. A large number of network structures could potentially explain the experimental observations, so the algorithm seeks the simplest network consistent with available data. Similar approaches have been proposed by others <abbrgrp><abbr bid="B11">11</abbr><abbr bid="B33">33</abbr></abbrgrp>, who also represent gene-perturbation data in the form of an accessibility or interaction matrix. The graph implied by this matrix is then reduced to a most parsimonious graph by pruning either all redundant pathways in the graph <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B11">11</abbr></abbrgrp> or only those consistent with activation and repression data (<abbrgrp><abbr bid="B33">33</abbr></abbrgrp> and this study). Our algorithm differs from that of Kyoda and colleagues <abbrgrp><abbr bid="B33">33</abbr></abbrgrp> in that it cannot be directly applied to networks with cycles. However, the method proposed here for processing these networks avoids the problem of arbitrary gene order within cycles <abbrgrp><abbr bid="B33">33</abbr></abbrgrp> and the higher computational cost of the alternative algorithm (<it>O</it>(<it>n<sup>3</sup></it>)) <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>. When we carried out the same microarray and binding-data analysis described herein, using the Kyoda <it>et al</it>. algorithm, the results were very similar: 168 connections were pruned, of which three (the same three as in our analysis) were direct. Essentially equivalent results (183 connections pruned, three direct) could be achieved with a minor improvement that allowed pruned edges to mediate accessibility (data not shown). Two additional indirect edges, contained within cycles, were pruned in this analysis; however, the pruning process took approximately 418 central-processing-unit (CPU) seconds as opposed to around 1 CPU second for our algorithm. 
While neither processing time was prohibitive in this case, for larger datasets the computation cost could become a serious concern.</p>
			<p>The AIGNET (Algorithms for Inference of Genetic Networks) method <abbrgrp><abbr bid="B11">11</abbr></abbrgrp> groups nodes into 'equivalence sets,' identical to strong components, before pruning indirect connections from the network. Relationships within these equivalence sets are then resolved and the network structure fine-tuned using time-course data and a dynamic S-system model <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. How this method fares in practice, and how it compares to the double-mutant approach described here, has yet to be tested. It may well prove complementary to the analysis of steady-state double-mutant profiles, particularly in cases where a double mutation is lethal. It is clear, however, that the straightforward Boolean approach suggested by Maki <it>et al. </it><abbrgrp><abbr bid="B11">11</abbr></abbrgrp> for the creation of the 'skeleton network' will erroneously sever direct connections as a consequence of failing to consider positive and negative regulation. In our analysis, 11 additional direct connections (14 total, as compared to 3; Table <tblr tid="T1">1</tblr>) were incorrectly pruned when repression and activation were not treated separately.</p>
			<p>A unique aspect of our algorithm is the manner in which double-mutant data is used. Other proposed methods have used double-mutant data, but as a means of fully exploring the logical states of a Boolean network <abbrgrp><abbr bid="B34">34</abbr></abbrgrp> or expanding the range of perturbations to the system <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>. We have used double-mutant data here both to resolve the order of genes within feedback loops (cycles), and to detect nontranscriptional or redundant regulatory pathways. To our knowledge, no previous computational work has addressed the question of resolving feedback loop structure with double-mutant data. Classical genetic pathway analysis, however, has been automated in the program GenePath <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>, which constructs acyclic genetic regulatory networks governing specific biological processes on the basis of phenotypic data from single and double mutants. The basic logic underlying the analysis performed by GenePath <abbrgrp><abbr bid="B38">38</abbr></abbrgrp> is very similar to that described here.</p>
			<p>The power of double-mutant analysis is evident in its extensive application in classical genetics <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp>. These applications include the study of signaling pathways involved in transcriptional regulation, for example the repression of <it>SUC2 </it>by glucose <abbrgrp><abbr bid="B39">39</abbr><abbr bid="B40">40</abbr></abbrgrp>. As the number of experiments required for this method scales with the square of the number of genes under investigation, epistasis analysis cannot be easily applied on a genome-wide level. However, our algorithm generates a tentative reconstruction based on single-mutant data that can then be refined with targeted double-mutant data, bringing the number of experiments into a manageable range. The type of data required by our algorithm is rapidly becoming available through gene-deletion projects and new methods, such as RNA interference, to perturb the activity of selected genes <abbrgrp><abbr bid="B41">41</abbr><abbr bid="B42">42</abbr><abbr bid="B43">43</abbr></abbrgrp>. For the budding yeast <it>S. cerevisiae</it>, a complete library of single-gene deletion mutants is available and double mutants have been created by automated high-throughput mating of these strains <abbrgrp><abbr bid="B44">44</abbr><abbr bid="B45">45</abbr><abbr bid="B46">46</abbr></abbrgrp>.</p>
		</sec>
		<sec>
			<st>
				<p>Conclusions</p>
			</st>
			<p>Application of our reconstruction algorithm to published gene-expression data from a number of <it>S. cerevisiae </it>mutants led to several interesting observations. The first is that 'incoherent' feedforward pathways, in which the edges are arranged such that they act antagonistically, seem to be common in the yeast regulatory network. This is in contrast to <it>E. coli</it>, where most (34 of 40) known feedforward motifs in transcriptional regulation are coherent <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. In <it>S. cerevisiae </it>an overrepresentation of the feedforward motif, in which one gene regulates another through both direct and indirect pathways, has recently been observed in regulatory pathways documented in the literature <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>; consistent with our observations, a relatively large proportion of these (21 of 47) are incoherent. An overrepresentation of feedforward pathways has also been reported in yeast chromatin-binding data <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>; however, whether these parallel pathways act antagonistically or synergistically is unknown because the data does not indicate whether regulation is positive or negative. Our findings suggest that synergy is not necessarily the rule in the yeast network, though most of the incoherent pathways we observed are likely to be indirect at the transcriptional level.</p>
			<p>The second important observation is that the parsimony approach embodied by this algorithm can successfully identify and prune indirect connections from a regulatory network when provided with single-mutant data alone. Although only approximately 4% of the genes in <it>S. cerevisiae </it>were perturbed in the dataset examined <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>, a substantial number of possible connections could be pruned from the network while maintaining all regulatory relationships observed. Comparison with DNA-binding data for a small set of transcription factors suggested that these pruned connections are much more likely to be indirect than the edges retained in the reconstruction.</p>
			<p>Finally, analysis of experimental yeast double-mutant data revealed that expression data for strains in which two genes with potentially redundant functions are both deleted can aid in the identification of true target genes. Thus, utilization of double-mutant data is likely to improve the reconstruction generated by the algorithm. Furthermore, analysis of synthetic networks suggests that data from a relatively small number of double mutants, on the order of the size of the network, <it>N</it>, could substantially improve the quality of the reconstruction.</p>
			<p>In sum, we have shown that the assumption of parsimony is a reasonable one in the context of regulatory networks and is supported by available data. We have developed a method to automate this approach that can utilize single-mutant data to generate a tentative reconstruction and double-mutant data to improve its accuracy. We anticipate that application of this algorithm will greatly simplify interpretation of experimental gene perturbation data as more mutant gene-expression profiles become available for a number of organisms.</p>
		</sec>
		<sec>
			<st>
				<p>Materials and methods</p>
			</st>
			<sec>
				<st>
					<p>Synthetic networks</p>
				</st>
				<p>We generated synthetic networks (Figures <figr fid="F4">4</figr>, <figr fid="F5">5</figr>, <figr fid="F6">6</figr>) using an adaptation of methods described in <abbrgrp><abbr bid="B48">48</abbr></abbrgrp>. Outgoing edges were distributed according to a power law with an exponential cutoff such that the probability of a node having <it>k_out </it>edges, <it>p</it>(<it>k_out</it>), is proportional to (<it>k_out</it>)<sup>-<it>&#964;</it></sup><it>e</it><sup>-<it>k_out/&#954;</it></sup>, where the constants <it>&#964; </it>and <it>&#954; </it>equal 0.7 and 1,000 respectively. Incoming edges were assigned according to an exponential distribution, where <it>p</it>(<it>k_in</it>) ~ <it>e</it><sup>-<it>&#946;k_in </it></sup>and the constant <it>&#946; </it>= 0.5. This model was based on previous analyses of regulatory networks in <it>E. coli </it>and <it>S. cerevisiae </it>(see Appendix 1 in Additional data file 1).</p>
				<p>To create the outgoing edge distribution, each gene was assigned outgoing edge 'stubs' by the following procedure <abbrgrp><abbr bid="B48">48</abbr></abbrgrp>. First, an edge number <it>k_out </it>with a distribution <it>e</it><sup>-<it>k_out/&#954; </it></sup>was generated with the transformation <it>k_out = 1 + int</it>(<it>-&#954; ln</it>(<it>1 - r</it>)) where <it>r </it>is a random real number uniformly distributed in the range <it>0 </it>&#8804; <it> r &lt; 1 </it>and <it>int</it>(<it>x</it>) indicates the largest integer not greater than <it>x</it>. This number <it>k_out </it>was then accepted with probability (<it>k_out</it>)<sup>-<it>&#964; </it></sup>as long as <it>k_out </it>was less than the total number of nodes in the network; if <it>k_out </it>was not accepted, the process was repeated. Each gene was then similarly assigned an incoming edge number generated with the transformation <it>k_in = 1 + int</it>(<it>-</it>(<it>1/&#946;</it>)<it>ln</it>(<it>1 - r</it>)) where <it>r </it>is another random real number uniformly distributed in the range <it>0 </it>&#8804; <it> r &lt; 1</it>. This number was accepted as long as <it>k_in </it>was less than one-quarter of the total number of nodes in the network. These outgoing and incoming stubs were then joined to form edges until the desired number of edges was reached. Edge distributions of graphs generated in this manner were inspected visually to confirm that they displayed the desired behavior.</p>
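				<p>The stub-assignment procedure can be sketched roughly as follows. This is a minimal illustration rather than the code used in the study: the function names are hypothetical, and the details of how stubs are paired (random shuffling, and skipping self-edges and duplicate edges) are assumptions, since the text does not specify them.</p>

```python
import math
import random


def sample_out_degree(n_nodes, tau=0.7, kappa=1000.0, rng=random):
    """Draw k_out from a power law with exponential cutoff by rejection."""
    while True:
        r = rng.random()                                 # uniform on [0, 1)
        k = 1 + math.floor(-kappa * math.log(1.0 - r))   # exponential proposal
        # Accept with probability k**(-tau), and only if k is below the node count.
        if n_nodes > k and k ** (-tau) > rng.random():
            return k


def sample_in_degree(n_nodes, beta=0.5, rng=random):
    """Draw k_in from an exponential distribution, kept below n_nodes / 4."""
    while True:
        r = rng.random()
        k = 1 + math.floor(-(1.0 / beta) * math.log(1.0 - r))
        if n_nodes / 4 > k:
            return k


def build_network(n_nodes=500, n_edges=550, seed=0):
    """Join shuffled outgoing and incoming stubs until n_edges edges exist."""
    rng = random.Random(seed)
    out_stubs, in_stubs = [], []
    for gene in range(n_nodes):
        out_stubs += [gene] * sample_out_degree(n_nodes, rng=rng)
        in_stubs += [gene] * sample_in_degree(n_nodes, rng=rng)
    rng.shuffle(out_stubs)
    rng.shuffle(in_stubs)
    edges = set()                     # a set silently drops duplicate edges
    for src, dst in zip(out_stubs, in_stubs):
        if len(edges) == n_edges:
            break
        if src != dst:                # assumption: no self-regulation edges
            edges.add((src, dst))
    return edges


network = build_network(n_nodes=500, n_edges=550, seed=0)
print(len(network), "edges generated")
```

				<p>Because far more outgoing stubs are generated than are consumed, only a random subset of them ends up in the final graph, which is one reason the resulting degree distributions are worth checking by inspection.</p>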
				<p>Accessibility lists for these graphs were generated by a depth-first search <abbrgrp><abbr bid="B49">49</abbr></abbrgrp>. Double-mutant accessibility lists were created by eliminating all incoming or outgoing edges for one gene, then generating accessibility lists for all remaining genes in the network.</p>
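				<p>As an illustration of this step, the sketch below computes accessibility lists with an iterative depth-first search and simulates a deleted gene by removing its edges. The function names and the toy graph are hypothetical, and removing both the incoming and the outgoing edges of the deleted gene is an assumption about how the procedure described above was implemented.</p>

```python
def accessibility_lists(adjacency):
    """Map each gene to the set of genes reachable from it by a directed path."""
    access = {}
    for start in adjacency:
        seen = set()
        stack = list(adjacency[start])
        while stack:                      # iterative depth-first search
            gene = stack.pop()
            if gene in seen:
                continue
            seen.add(gene)
            stack.extend(adjacency.get(gene, ()))
        access[start] = seen
    return access


def remove_gene(adjacency, gene):
    """Return a copy of the graph without `gene` or any edge touching it."""
    return {g: [t for t in targets if t != gene]
            for g, targets in adjacency.items() if g != gene}


# Toy regulatory chain D -> A -> B -> C.
toy = {"A": ["B"], "B": ["C"], "C": [], "D": ["A"]}
single = accessibility_lists(toy)
double = accessibility_lists(remove_gene(toy, "B"))

for gene in sorted(toy):
    print(gene, sorted(single[gene]), sorted(double.get(gene, ())))
```

				<p>In the toy chain, removing B leaves C inaccessible from A and D, which is the kind of difference between single- and double-mutant accessibility lists that the refinement step exploits. Note that in this sketch a gene lying on a cycle appears in its own accessibility list.</p>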
			</sec>
			<sec>
				<st>
					<p>Error analysis</p>
				</st>
				<p>To examine the effect of not having all gene perturbations available, we deleted all accessibilities for nodes chosen at random, without replacement, in 500-gene synthetic networks. For each network, the number of nodes treated in this way ranged from 2.5% (12) to 50% (250) of the total, in 2.5% increments. We then used this limited data as input to the reconstruction algorithm, and the resulting network was compared to the initial reconstruction. Similarly, to examine the effect of incomplete data (false-negative accessibilities), we calculated the total number of accessibilities in the network, and deleted a number of accessibilities ranging from 2.5% to 50% of the total number before reconstruction. To simulate false-positive accessibilities, we added up to 50% more accessibilities to the data. In all of these cases we repeated the process 10 times for each increment. We carried out the entire analysis of the effects of incomplete, false-negative and false-positive data on the same six independent networks of each edge density.</p>
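				<p>A minimal sketch of how false-negative and false-positive accessibilities might be simulated is given below. The function names are hypothetical; the false-positive step checks only that the target is not already accessible from the regulator, which is an assumption about the "no existing path" check, and it presumes that enough unconnected gene pairs remain to be added.</p>

```python
import random


def drop_accessibilities(access, fraction, rng):
    """Delete a random fraction of (regulator, target) pairs (false negatives)."""
    pairs = [(g, t) for g, targets in access.items() for t in targets]
    doomed = set(rng.sample(pairs, int(fraction * len(pairs))))
    return {g: {t for t in targets if (g, t) not in doomed}
            for g, targets in access.items()}


def add_accessibilities(access, fraction, rng):
    """Add random pairs that are not yet connected (false positives)."""
    genes = list(access)
    noisy = {g: set(targets) for g, targets in access.items()}
    wanted = int(fraction * sum(len(t) for t in access.values()))
    added = 0
    while wanted > added:
        regulator, target = rng.sample(genes, 2)   # two distinct genes
        if target not in noisy[regulator]:         # skip existing accessibilities
            noisy[regulator].add(target)
            added += 1
    return noisy


# Toy accessibility lists for the chain 3 -> 0 -> 1 -> 2.
toy_access = {0: {1, 2}, 1: {2}, 2: set(), 3: {0, 1, 2}}
rng = random.Random(0)
fewer = drop_accessibilities(toy_access, 0.5, rng)
more = add_accessibilities(toy_access, 0.5, rng)
print(sum(len(t) for t in fewer.values()), "pairs after deletion")
print(sum(len(t) for t in more.values()), "pairs after addition")
```

				<p>Each altered set of lists would then be passed to the reconstruction algorithm and the result compared to the reconstruction from the unaltered data.</p>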
				<p>To assess the accuracy of reconstructions from incomplete or noisy data, we compared each pair of vertices in the network to the 'correct' graph with regard to the presence of an edge. We then tallied the number of edges missing in the reconstruction (false negative, <it>fn</it>) or erroneously present in the reconstruction (false positive, <it>fp</it>) and calculated the 'fraction of correct edges' (<it>E - fn</it>)<it>/</it>(<it>E + fp</it>), where <it>E </it>is the number of edges in the correct graph. In Figure <figr fid="F5">5</figr>, the standard of comparison is the reconstruction generated from the correct accessibility list, while in Figure <figr fid="F6">6</figr>, the standard of comparison is the original graph.</p>
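				<p>The accuracy measure defined above can be written as a short function. The sketch below assumes that edges are represented as (regulator, target) pairs and that the reference graph has at least one edge; the function name and the toy edge sets are illustrative only.</p>

```python
def fraction_correct_edges(true_edges, reconstructed_edges):
    """Compute (E - fn) / (E + fp) for two collections of directed edges."""
    true_edges = set(true_edges)
    reconstructed_edges = set(reconstructed_edges)
    e = len(true_edges)                            # edges in the correct graph
    fn = len(true_edges - reconstructed_edges)     # missing from reconstruction
    fp = len(reconstructed_edges - true_edges)     # erroneously present
    return (e - fn) / (e + fp)


reference = {("A", "B"), ("B", "C"), ("C", "D"), ("A", "D")}
reconstruction = {("A", "B"), ("B", "C"), ("A", "C")}

# Two of four edges are recovered and one spurious edge is present.
score = fraction_correct_edges(reference, reconstruction)
print(score)   # prints 0.4, i.e. (4 - 2) / (4 + 1)
```

				<p>The measure equals 1 only for a perfect reconstruction, and it penalizes false-positive edges through the denominator as well as false-negative edges through the numerator.</p>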
				<p>The addition or deletion of accessibilities during the error analysis resulted at times in the creation of 'pseudocycles,' circular pathways in the network that are not identified as cycles in the condensation step. Such pathways are also encountered in experimental data (data not shown) and the algorithm can get caught in these loops during the search phase (Figure <figr fid="F2">2a</figr>, lines 4-14). A simple modification that keeps track of the number of times the algorithm has 'stopped' at a given node during the search prevents this from happening. If the same node is encountered more than twice, an alternative search path is chosen.</p>
			</sec>
			<sec>
				<st>
					<p>Yeast datasets</p>
				</st>
				<p>The dataset examined contains 300 <it>S. cerevisiae </it>expression profiles, including 276 deletion mutants, 11 tetracycline-regulatable alleles of essential genes, and 13 drug treatments <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. Of the deletion mutants, 7 are double mutants, and 20 are aneuploid. We excluded data from double-mutant and aneuploid strains but included data from the tetracycline-regulated alleles when initially generating accessibility lists, for a total of 260 expression profiles. Accessibility was defined solely by <it>p </it>value according to the error model of Hughes <it>et al</it>.: if the transcript level of a gene changed with <it>p </it>&lt; 0.01, it was considered accessible from the gene mutated or perturbed in the experiment. We considered the regulation positive if the level went down and negative if it went up, as all of these experiments involved gene inactivation and not overexpression. Accessibility lists for double mutants <it>dig1&#916;dig2&#916; </it>and <it>isw1&#916;isw2&#916; </it>were based on two criteria: a <it>p </it>value less than 0.01 (relative to wild type) and a fold expression change of at least 1.8 between the single mutant and double mutant.</p>
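				<p>The single-mutant accessibility rule can be sketched as follows. The input layout (one log expression ratio and one <it>p </it>value per gene, mutant relative to wild type), the function name and the example numbers are assumptions made for illustration; only the <it>p </it>&lt; 0.01 threshold and the sign convention come from the text above.</p>

```python
def accessibility_from_profile(profile, p_threshold=0.01):
    """Return {gene: sign of regulation} for genes accessible in one experiment.

    `profile` maps each gene to (log_ratio, p_value), where log_ratio compares
    the inactivated strain with wild type. Because the perturbation removes
    gene function, a drop in transcript level is read as positive regulation
    and a rise as negative regulation.
    """
    accessible = {}
    for gene, (log_ratio, p_value) in profile.items():
        if p_threshold > p_value:
            accessible[gene] = "positive" if 0 > log_ratio else "negative"
    return accessible


# Invented values for three hypothetical genes.
profile = {
    "gene1": (-1.2, 0.001),  # significant decrease
    "gene2": (0.8, 0.005),   # significant increase
    "gene3": (-0.1, 0.2),    # not significant, so not accessible
}
print(accessibility_from_profile(profile))
```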
				<p>To test the ability of the algorithm to distinguish direct from indirect regulation, we utilized the chromatin binding data of Lee <it>et al. </it><abbrgrp><abbr bid="B16">16</abbr></abbrgrp>, in which the promoter-binding profiles of 106 transcription factors in <it>S. cerevisiae </it>were determined by genome-wide location analysis. We used the same statistical significance threshold, <it>p </it>&lt; 0.001, chosen by the authors to identify true binding targets. There are 17 transcription factors in this dataset with corresponding deletion-mutant expression profiles <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>: <it>ARG80</it>, <it>CIN5</it>, <it>DIG1</it>, <it>GCN4</it>, <it>GLN3</it>, <it>HIR2</it>, <it>MAC1</it>, <it>MBP1</it>, <it>RGT1</it>, <it>RTG1</it>, <it>STE12</it>, <it>SWI4</it>, <it>SWI5</it>, <it>SWI6</it>, <it>YAP1</it>, <it>YAP3</it>, and <it>YAP7 </it><abbrgrp><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr></abbrgrp>. For each of these transcription factors there is a number of genes, <it>E</it>, whose expression is significantly affected by deletion of the factor and which we refer to as expression targets. There is also a number of genes, <it>B</it>, whose promoters are bound by the transcription factor and which we refer to as binding targets. We consider the intersection of these two sets to represent the <it>T </it>'true targets' (listed in Appendix 2 in Additional data file 1). The fractional overlap between the two sets (<it>f</it> = &#931;<it>T/</it>(&#931;<it>E</it> + &#931;<it>B</it>)) was tested within the range of <it>p </it>&lt; 0.001 to <it>p </it>&lt; 0.01 as defined by the error models of the respective authors for each dataset. It was maximized when expression targets were defined with a <it>p </it>&lt; 0.01 threshold <abbrgrp><abbr bid="B17">17</abbr></abbrgrp> and binding targets were defined with a <it>p </it>&lt; 0.001 threshold <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>, in agreement with the authors' own choices. 
Additional criteria such as requiring a certain magnitude of expression change or magnitude of binding enrichment either did not improve or only minimally improved the overlap. A &#967;<sup>2 </sup>test was used to assess the statistical significance of the results.</p>
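				<p>For concreteness, the fractional overlap <it>f </it>defined above can be computed as in the following sketch. The function name, the dictionary-of-sets inputs and the placeholder factor and gene names are assumptions for this example and do not reproduce the actual target lists.</p>

```python
def fractional_overlap(expression_targets, binding_targets):
    """Return f = sum(T) / (sum(E) + sum(B)) summed over transcription factors.

    Both arguments map a transcription factor to a set of target genes.
    T counts the genes that are both expression and binding targets of
    the same factor.
    """
    total_t = total_e = total_b = 0
    for factor, expressed in expression_targets.items():
        bound = binding_targets.get(factor, set())
        total_e += len(expressed)
        total_b += len(bound)
        total_t += len(expressed.intersection(bound))
    if total_e + total_b == 0:
        raise ValueError("no targets supplied; the overlap is undefined")
    return total_t / (total_e + total_b)


# Placeholder factors and genes, invented for the example.
expression = {"TF1": {"g1", "g2", "g3"}, "TF2": {"g4", "g5"}}
binding = {"TF1": {"g2", "g3", "g6"}, "TF2": {"g5", "g7", "g8"}}
print(fractional_overlap(expression, binding))  # 3 / (5 + 6)
```

				<p>Evaluating such a function over a grid of expression and binding thresholds is one way to locate the combination that maximizes the overlap.</p>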
			</sec>
		</sec>
		<sec>
			<st>
				<p>Additional data files</p>
			</st>
			<p>The following additional data are available with the online version of this paper: a PDF file (Additional data file <supplr sid="s1">1</supplr>) containing four appendices. Appendix 1 is a detailed discussion of the parameters of the synthetic networks used in simulations in the paper and the rationale behind their choice; Appendix 2 lists the transcription factors and targets identified as described in the main text; Appendix 3 gives the Perl code for the algorithm described in the paper; and Appendix 4 is a sample text input file for the algorithm.</p>
			<suppl id="s1">
				<title>
					<p>Additional data file 1</p>
				</title>
				<caption>
					<p>A PDF file containing four appendices including a detailed discussion of the parameters of the synthetic networks used in simulations, the transcription factors and targets identified as described in the main text, the Perl code for the algorithm described in the paper and a sample text input file for the algorithm</p>
				</caption>
				<text>
					<p>A PDF file containing four appendices including a detailed discussion of the parameters of the synthetic networks used in simulations, the transcription factors and targets identified as described in the main text, the Perl code for the algorithm described in the paper and a sample text input file for the algorithm</p>
				</text>
				<file name="gb-2004-5-4-r29-s1.pdf">
					<p>Click here for additional data file</p>
				</file>
			</suppl>
		</sec>
	</bdy>
	<bm>
		<ack>
			<sec>
				<st>
					<p>Acknowledgements</p>
				</st>
				<p>We thank E. Bedrick for assistance with statistical analysis, J. Blanchard, M. Fuller and M. Gilchrist for critical reading of the manuscript, and P. Renaud for hosting S.L.G. in his lab during part of this project. This work was supported by the W. M. Keck Foundation.</p>
			</sec>
		</ack>
		<refgrp>
			<bibl id="B1">
				<title>
					<p>Genetic network inference: from co-expression clustering to reverse engineering.</p>
				</title>
				<aug>
					<au>
						<snm>D'Haeseleer</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Liang</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Somogyi</snm>
						<fnm>R</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2000</pubdate>
				<volume>16</volume>
				<fpage>707</fpage>
				<lpage>726</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/16.8.707</pubid>
						<pubid idtype="pmpid" link="fulltext">11099257</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B2">
				<title>
					<p>Modeling and simulation of genetic regulatory systems: a literature review.</p>
				</title>
				<aug>
					<au>
						<snm>de Jong</snm>
						<fnm>H</fnm>
					</au>
				</aug>
				<source>J Comput Biol</source>
				<pubdate>2002</pubdate>
				<volume>9</volume>
				<fpage>67</fpage>
				<lpage>103</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1089/10665270252833208</pubid>
						<pubid idtype="pmpid" link="fulltext">11911796</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B3">
				<title>
					<p>Genome-wide characterization of the Zap1p zinc-responsive regulon in yeast.</p>
				</title>
				<aug>
					<au>
						<snm>Lyons</snm>
						<fnm>TJ</fnm>
					</au>
					<au>
						<snm>Gasch</snm>
						<fnm>AP</fnm>
					</au>
					<au>
						<snm>Gaither</snm>
						<fnm>LA</fnm>
					</au>
					<au>
						<snm>Botstein</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Brown</snm>
						<fnm>PO</fnm>
					</au>
					<au>
						<snm>Eide</snm>
						<fnm>DJ</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>2000</pubdate>
				<volume>97</volume>
				<fpage>7957</fpage>
				<lpage>7962</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1073/pnas.97.14.7957</pubid>
						<pubid idtype="pmpid" link="fulltext">10884426</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B4">
				<title>
					<p>Genome-wide location and function of DNA binding proteins.</p>
				</title>
				<aug>
					<au>
						<snm>Ren</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Robert</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Wyrick</snm>
						<fnm>JJ</fnm>
					</au>
					<au>
						<snm>Aparicio</snm>
						<fnm>O</fnm>
					</au>
					<au>
						<snm>Jennings</snm>
						<fnm>EG</fnm>
					</au>
					<au>
						<snm>Simon</snm>
						<fnm>I</fnm>
					</au>
					<au>
						<snm>Zeitlinger</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Schreiber</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Hannett</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Kanin</snm>
						<fnm>E</fnm>
					</au>
					<etal/>
				</aug>
				<source>Science</source>
				<pubdate>2000</pubdate>
				<volume>290</volume>
				<fpage>2306</fpage>
				<lpage>2309</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.290.5500.2306</pubid>
						<pubid idtype="pmpid" link="fulltext">11125145</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B5">
				<title>
					<p>How to reconstruct a large genetic network from n gene perturbations in fewer than n<sup>2</sup> easy steps.</p>
				</title>
				<aug>
					<au>
						<snm>Wagner</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2001</pubdate>
				<volume>17</volume>
				<fpage>1183</fpage>
				<lpage>1197</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/17.12.1183</pubid>
						<pubid idtype="pmpid" link="fulltext">11751227</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B6">
				<aug>
					<au>
						<snm>Griffiths</snm>
						<fnm>AJF</fnm>
					</au>
					<au>
						<snm>Gelbart</snm>
						<fnm>WM</fnm>
					</au>
					<au>
						<snm>Miller</snm>
						<fnm>JH</fnm>
					</au>
					<au>
						<snm>Lewontin</snm>
						<fnm>RC</fnm>
					</au>
				</aug>
				<source>Modern Genetic Analysis</source>
				<publisher>New York: WH Freeman</publisher>
				<pubdate>1999</pubdate>
			</bibl>
			<bibl id="B7">
				<title>
					<p>Ordering gene function: the interpretation of epistasis in regulatory hierarchies.</p>
				</title>
				<aug>
					<au>
						<snm>Avery</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Wasserman</snm>
						<fnm>S</fnm>
					</au>
				</aug>
				<source>Trends Genet</source>
				<pubdate>1992</pubdate>
				<volume>8</volume>
				<fpage>312</fpage>
				<lpage>316</lpage>
				<xrefbib>
					<pubid idtype="pmpid">1365397</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B8">
				<title>
					<p>A genetic method for determining the order of events in a biological pathway.</p>
				</title>
				<aug>
					<au>
						<snm>Jarvik</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Botstein</snm>
						<fnm>D</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>1973</pubdate>
				<volume>70</volume>
				<fpage>2046</fpage>
				<lpage>2050</lpage>
				<xrefbib>
					<pubid idtype="pmpid">4579012</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B9">
				<aug>
					<au>
						<snm>Harary</snm>
						<fnm>F</fnm>
					</au>
				</aug>
				<source>Graph Theory</source>
				<publisher>Reading, MA: Perseus Books</publisher>
				<pubdate>1969</pubdate>
			</bibl>
			<bibl id="B10">
				<title>
					<p>The transitive reduction of a directed graph.</p>
				</title>
				<aug>
					<au>
						<snm>Aho</snm>
						<fnm>AV</fnm>
					</au>
					<au>
						<snm>Garey</snm>
						<fnm>MR</fnm>
					</au>
					<au>
						<snm>Ullman</snm>
						<fnm>JD</fnm>
					</au>
				</aug>
				<source>SIAM J Comput</source>
				<pubdate>1972</pubdate>
				<volume>1</volume>
				<fpage>131</fpage>
				<lpage>137</lpage>
			</bibl>
			<bibl id="B11">
				<title>
					<p>Development of a system for the inference of large scale genetic networks.</p>
				</title>
				<aug>
					<au>
						<snm>Maki</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Tominaga</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Okamoto</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Watanabe</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Eguchi</snm>
						<fnm>Y</fnm>
					</au>
				</aug>
				<source>Pac Symp Biocomput</source>
				<pubdate>2001</pubdate>
				<fpage>446</fpage>
				<lpage>458</lpage>
				<xrefbib>
					<pubid idtype="pmpid">11262963</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B12">
				<title>
					<p>Engineering stability in gene networks by autoregulation.</p>
				</title>
				<aug>
					<au>
						<snm>Becskei</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Serrano</snm>
						<fnm>L</fnm>
					</au>
				</aug>
				<source>Nature</source>
				<pubdate>2000</pubdate>
				<volume>405</volume>
				<fpage>590</fpage>
				<lpage>593</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/35014651</pubid>
						<pubid idtype="pmpid" link="fulltext">10850721</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B13">
				<title>
					<p>Negative autoregulation speeds the response times of transcription networks.</p>
				</title>
				<aug>
					<au>
						<snm>Rosenfeld</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Elowitz</snm>
						<fnm>MB</fnm>
					</au>
					<au>
						<snm>Alon</snm>
						<fnm>U</fnm>
					</au>
				</aug>
				<source>J Mol Biol</source>
				<pubdate>2002</pubdate>
				<volume>323</volume>
				<fpage>785</fpage>
				<lpage>793</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0022-2836(02)00994-4</pubid>
						<pubid idtype="pmpid" link="fulltext">12417193</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B14">
				<title>
					<p>From specific gene regulation to genomic networks: a global analysis of transcriptional regulation in <it>Escherichia coli</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Thieffry</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Huerta</snm>
						<fnm>AM</fnm>
					</au>
					<au>
						<snm>Perez-Rueda</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Collado-Vides</snm>
						<fnm>J</fnm>
					</au>
				</aug>
				<source>BioEssays</source>
				<pubdate>1998</pubdate>
				<volume>20</volume>
				<fpage>433</fpage>
				<lpage>440</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1002/(SICI)1521-1878(199805)20:5&lt;433::AID-BIES10&gt;3.0.CO;2-2</pubid>
						<pubid idtype="pmpid" link="fulltext">9670816</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B15">
				<title>
					<p>Topological and causal structure of the yeast transcriptional regulatory network.</p>
				</title>
				<aug>
					<au>
						<snm>Guelzim</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Bottani</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Bourgine</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Kepes</snm>
						<fnm>F</fnm>
					</au>
				</aug>
				<source>Nat Genet</source>
				<pubdate>2002</pubdate>
				<volume>31</volume>
				<fpage>60</fpage>
				<lpage>63</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/ng873</pubid>
						<pubid idtype="pmpid" link="fulltext">11967534</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B16">
				<title>
					<p>Transcriptional regulatory networks in <it>Saccharomyces cerevisiae</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Lee</snm>
						<fnm>TI</fnm>
					</au>
					<au>
						<snm>Rinaldi</snm>
						<fnm>NJ</fnm>
					</au>
					<au>
						<snm>Robert</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Odom</snm>
						<fnm>DT</fnm>
					</au>
					<au>
						<snm>Bar-Joseph</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Gerber</snm>
						<fnm>GK</fnm>
					</au>
					<au>
						<snm>Hannett</snm>
						<fnm>NM</fnm>
					</au>
					<au>
						<snm>Harbison</snm>
						<fnm>CT</fnm>
					</au>
					<au>
						<snm>Thompson</snm>
						<fnm>CM</fnm>
					</au>
					<au>
						<snm>Simon</snm>
						<fnm>I</fnm>
					</au>
					<etal/>
				</aug>
				<source>Science</source>
				<pubdate>2002</pubdate>
				<volume>298</volume>
				<fpage>799</fpage>
				<lpage>804</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.1075090</pubid>
						<pubid idtype="pmpid" link="fulltext">12399584</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B17">
				<title>
					<p>Functional discovery via a compendium of expression profiles.</p>
				</title>
				<aug>
					<au>
						<snm>Hughes</snm>
						<fnm>TR</fnm>
					</au>
					<au>
						<snm>Marton</snm>
						<fnm>MJ</fnm>
					</au>
					<au>
						<snm>Jones</snm>
						<fnm>AR</fnm>
					</au>
					<au>
						<snm>Roberts</snm>
						<fnm>CJ</fnm>
					</au>
					<au>
						<snm>Stoughton</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Armour</snm>
						<fnm>CD</fnm>
					</au>
					<au>
						<snm>Bennett</snm>
						<fnm>HA</fnm>
					</au>
					<au>
						<snm>Coffey</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Dai</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>He</snm>
						<fnm>YD</fnm>
					</au>
					<etal/>
				</aug>
				<source>Cell</source>
				<pubdate>2000</pubdate>
				<volume>102</volume>
				<fpage>109</fpage>
				<lpage>126</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">10929718</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B18">
				<title>
					<p>Positive and negative regulation of a sterol biosynthetic gene (ERG3) in the post-squalene portion of the yeast ergosterol pathway.</p>
				</title>
				<aug>
					<au>
						<snm>Arthington-Skaggs</snm>
						<fnm>BA</fnm>
					</au>
					<au>
						<snm>Crowell</snm>
						<fnm>DN</fnm>
					</au>
					<au>
						<snm>Yang</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Sturley</snm>
						<fnm>SL</fnm>
					</au>
					<au>
						<snm>Bard</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>FEBS Lett</source>
				<pubdate>1996</pubdate>
				<volume>392</volume>
				<fpage>161</fpage>
				<lpage>165</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/0014-5793(96)00807-1</pubid>
						<pubid idtype="pmpid" link="fulltext">8772195</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B19">
				<title>
					<p>Transcriptional regulation by ergosterol in the yeast <it>Saccharomyces cerevisiae</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Smith</snm>
						<fnm>SJ</fnm>
					</au>
					<au>
						<snm>Crowley</snm>
						<fnm>JH</fnm>
					</au>
					<au>
						<snm>Parks</snm>
						<fnm>LW</fnm>
					</au>
				</aug>
				<source>Mol Cell Biol</source>
				<pubdate>1996</pubdate>
				<volume>16</volume>
				<fpage>5427</fpage>
				<lpage>5432</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">8816455</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B20">
				<title>
					<p>Upc2p and Ecm22p, dual regulators of sterol biosynthesis in <it>Saccharomyces cerevisiae</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Vik</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Rine</snm>
						<fnm>J</fnm>
					</au>
				</aug>
				<source>Mol Cell Biol</source>
				<pubdate>2001</pubdate>
				<volume>21</volume>
				<fpage>6395</fpage>
				<lpage>6405</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1128/MCB.21.19.6395-6405.2001</pubid>
						<pubid idtype="pmpid" link="fulltext">11533229</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B21">
				<title>
					<p>The anatomy of a hypoxic operator in <it>Saccharomyces cerevisiae</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Deckert</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Torres</snm>
						<fnm>AM</fnm>
					</au>
					<au>
						<snm>Hwang</snm>
						<fnm>SM</fnm>
					</au>
					<au>
						<snm>Kastaniotis</snm>
						<fnm>AJ</fnm>
					</au>
					<au>
						<snm>Zitomer</snm>
						<fnm>RS</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>1998</pubdate>
				<volume>150</volume>
				<fpage>1429</fpage>
				<lpage>1441</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">9832521</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B22">
				<title>
					<p>Mot3 is a transcriptional repressor of ergosterol biosynthetic genes and is required for normal vacuolar function in <it>Saccharomyces cerevisiae</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Hongay</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Jia</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Bard</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Winston</snm>
						<fnm>F</fnm>
					</au>
				</aug>
				<source>EMBO J</source>
				<pubdate>2002</pubdate>
				<volume>21</volume>
				<fpage>4114</fpage>
				<lpage>4124</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/emboj/cdf415</pubid>
						<pubid idtype="pmpid" link="fulltext">12145211</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B23">
				<title>
					<p>ROX1 and ERG regulation in <it>Saccharomyces cerevisiae</it>: implications for antifungal susceptibility.</p>
				</title>
				<aug>
					<au>
						<snm>Henry</snm>
						<fnm>KW</fnm>
					</au>
					<au>
						<snm>Nickels</snm>
						<fnm>JT</fnm>
					</au>
					<au>
						<snm>Edlind</snm>
						<fnm>TD</fnm>
					</au>
				</aug>
				<source>Eukaryot Cell</source>
				<pubdate>2002</pubdate>
				<volume>1</volume>
				<fpage>1041</fpage>
				<lpage>1044</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1128/EC.1.6.1041-1044.2002</pubid>
						<pubid idtype="pmpid" link="fulltext">12477804</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B24">
				<title>
					<p>The isolation and characterization of <it>Saccharomyces cerevisiae </it>mutants that constitutively express purine biosynthetic genes.</p>
				</title>
				<aug>
					<au>
						<snm>Guetsova</snm>
						<fnm>ML</fnm>
					</au>
					<au>
						<snm>Lecoq</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Daignan-Fornier</snm>
						<fnm>B</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>1997</pubdate>
				<volume>147</volume>
				<fpage>383</fpage>
				<lpage>397</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">9335580</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B25">
				<title>
					<p>Yeast AMP pathway genes respond to adenine through regulated synthesis of a metabolic intermediate.</p>
				</title>
				<aug>
					<au>
						<snm>Rebora</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Desmoucelles</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Borne</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Pinson</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Daignan-Fornier</snm>
						<fnm>B</fnm>
					</au>
				</aug>
				<source>Mol Cell Biol</source>
				<pubdate>2001</pubdate>
				<volume>21</volume>
				<fpage>7901</fpage>
				<lpage>7912</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1128/MCB.21.23.7901-7912.2001</pubid>
						<pubid idtype="pmpid" link="fulltext">11689683</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B26">
				<title>
					<p><it>SRC1</it>: an intron-containing yeast gene involved in sister chromatid segregation.</p>
				</title>
				<aug>
					<au>
						<snm>Rodriguez-Navarro</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Igual</snm>
						<fnm>JC</fnm>
					</au>
					<au>
						<snm>Perez-Ortin</snm>
						<fnm>JE</fnm>
					</au>
				</aug>
				<source>Yeast</source>
				<pubdate>2002</pubdate>
				<volume>19</volume>
				<fpage>43</fpage>
				<lpage>54</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1002/yea.803</pubid>
						<pubid idtype="pmpid" link="fulltext">11754482</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B27">
				<title>
					<p>Network motifs in the transcriptional regulation network of <it>Escherichia coli</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Shen-Orr</snm>
						<fnm>SS</fnm>
					</au>
					<au>
						<snm>Milo</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Mangan</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Alon</snm>
						<fnm>U</fnm>
					</au>
				</aug>
				<source>Nat Genet</source>
				<pubdate>2002</pubdate>
				<volume>31</volume>
				<fpage>64</fpage>
				<lpage>68</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/ng881</pubid>
						<pubid idtype="pmpid" link="fulltext">11967538</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B28">
				<title>
					<p>Serial regulation of transcriptional regulators in the yeast cell cycle.</p>
				</title>
				<aug>
					<au>
						<snm>Simon</snm>
						<fnm>I</fnm>
					</au>
					<au>
						<snm>Barnett</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Hannett</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Harbison</snm>
						<fnm>CT</fnm>
					</au>
					<au>
						<snm>Rinaldi</snm>
						<fnm>NJ</fnm>
					</au>
					<au>
						<snm>Volkert</snm>
						<fnm>TL</fnm>
					</au>
					<au>
						<snm>Wyrick</snm>
						<fnm>JJ</fnm>
					</au>
					<au>
						<snm>Zeitlinger</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Gifford</snm>
						<fnm>DK</fnm>
					</au>
					<au>
						<snm>Jaakkola</snm>
						<fnm>TS</fnm>
					</au>
					<etal/>
				</aug>
				<source>Cell</source>
				<pubdate>2001</pubdate>
				<volume>106</volume>
				<fpage>697</fpage>
				<lpage>708</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">11572776</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B29">
				<title>
					<p>What does 'chromatin remodeling' mean?</p>
				</title>
				<aug>
					<au>
						<snm>Aalfs</snm>
						<fnm>JD</fnm>
					</au>
					<au>
						<snm>Kingston</snm>
						<fnm>RE</fnm>
					</au>
				</aug>
				<source>Trends Biochem Sci</source>
				<pubdate>2000</pubdate>
				<volume>25</volume>
				<fpage>548</fpage>
				<lpage>555</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0968-0004(00)01689-3</pubid>
						<pubid idtype="pmpid" link="fulltext">11084367</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B30">
				<title>
					<p>Two regulators of Ste12p inhibit pheromone-responsive transcription by separate mechanisms.</p>
				</title>
				<aug>
					<au>
						<snm>Olson</snm>
						<fnm>KA</fnm>
					</au>
					<au>
						<snm>Nelson</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Tai</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Hung</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Yong</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Astell</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Sadowski</snm>
						<fnm>I</fnm>
					</au>
				</aug>
				<source>Mol Cell Biol</source>
				<pubdate>2000</pubdate>
				<volume>20</volume>
				<fpage>4199</fpage>
				<lpage>4209</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1128/MCB.20.12.4199-4209.2000</pubid>
						<pubid idtype="pmpid" link="fulltext">10825185</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B31">
				<title>
					<p><it>In vivo </it>chromatin remodeling by yeast ISWI homologs Isw1p and Isw2p.</p>
				</title>
				<aug>
					<au>
						<snm>Kent</snm>
						<fnm>NA</fnm>
					</au>
					<au>
						<snm>Karabetsou</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Politis</snm>
						<fnm>PK</fnm>
					</au>
					<au>
						<snm>Mellor</snm>
						<fnm>J</fnm>
					</au>
				</aug>
				<source>Genes Dev</source>
				<pubdate>2001</pubdate>
				<volume>15</volume>
				<fpage>619</fpage>
				<lpage>626</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1101/gad.190301</pubid>
						<pubid idtype="pmpid" link="fulltext">11238381</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B32">
				<title>
					<p>A system for identifying genetic networks from gene expression patterns produced by gene disruptions and overexpressions.</p>
				</title>
				<aug>
					<au>
						<snm>Akutsu</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Kuhara</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Maruyama</snm>
						<fnm>O</fnm>
					</au>
					<au>
						<snm>Miyano</snm>
						<fnm>S</fnm>
					</au>
				</aug>
				<source>Genome Inform Ser Workshop Genome Inform</source>
				<pubdate>1998</pubdate>
				<volume>9</volume>
				<fpage>151</fpage>
				<lpage>160</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">11072331</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B33">
				<title>
					<p>A gene network inference method from continuous-value gene expression data of wild-type and mutants.</p>
				</title>
				<aug>
					<au>
						<snm>Kyoda</snm>
						<fnm>KM</fnm>
					</au>
					<au>
						<snm>Morohashi</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Onami</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Kitano</snm>
						<fnm>H</fnm>
					</au>
				</aug>
				<source>Genome Inform Ser Workshop Genome Inform</source>
				<pubdate>2000</pubdate>
				<volume>11</volume>
				<fpage>196</fpage>
				<lpage>204</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">11700600</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B34">
				<title>
					<p>Discovery of regulatory interactions through perturbation: inference and experimental design.</p>
				</title>
				<aug>
					<au>
						<snm>Ideker</snm>
						<fnm>TE</fnm>
					</au>
					<au>
						<snm>Thorsson</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Karp</snm>
						<fnm>RM</fnm>
					</au>
				</aug>
				<source>Pac Symp Biocomput</source>
				<pubdate>2000</pubdate>
				<volume>5</volume>
				<fpage>305</fpage>
				<lpage>316</lpage>
			</bibl>
			<bibl id="B35">
				<title>
					<p>Inferring subnetworks from perturbed expression profiles.</p>
				</title>
				<aug>
					<au>
						<snm>Pe'er</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Regev</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Elidan</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Friedman</snm>
						<fnm>N</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2001</pubdate>
				<volume>17</volume>
				<fpage>S215</fpage>
				<lpage>S224</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">11473012</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B36">
				<title>
					<p>Discovery of causal relationships in a gene-regulation pathway from a mixture of experimental and observational DNA microarray data.</p>
				</title>
				<aug>
					<au>
						<snm>Yoo</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Thorsson</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Cooper</snm>
						<fnm>GF</fnm>
					</au>
				</aug>
				<source>Pac Symp Biocomput</source>
				<pubdate>2002</pubdate>
				<fpage>498</fpage>
				<lpage>509</lpage>
				<xrefbib>
					<pubid idtype="pmpid">11928502</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B37">
				<title>
					<p>Using Bayesian networks to analyze expression data.</p>
				</title>
				<aug>
					<au>
						<snm>Friedman</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Linial</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Nachman</snm>
						<fnm>I</fnm>
					</au>
					<au>
						<snm>Pe'er</snm>
						<fnm>D</fnm>
					</au>
				</aug>
				<source>J Comput Biol</source>
				<pubdate>2000</pubdate>
				<volume>7</volume>
				<fpage>601</fpage>
				<lpage>620</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1089/106652700750050961</pubid>
						<pubid idtype="pmpid">11108481</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B38">
				<title>
					<p>GenePath: a system for automated construction of genetic networks from mutant data.</p>
				</title>
				<aug>
					<au>
						<snm>Zupan</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Demsar</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Bratko</snm>
						<fnm>I</fnm>
					</au>
					<au>
						<snm>Juvan</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Halter</snm>
						<fnm>JA</fnm>
					</au>
					<au>
						<snm>Kuspa</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Shaulsky</snm>
						<fnm>G</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2003</pubdate>
				<volume>19</volume>
				<fpage>383</fpage>
				<lpage>389</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/btf871</pubid>
						<pubid idtype="pmpid" link="fulltext">12584124</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B39">
				<title>
					<p>Mutations causing constitutive invertase synthesis in yeast: genetic interactions with <it>snf </it>mutations.</p>
				</title>
				<aug>
					<au>
						<snm>Neigeborn</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Carlson</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>1987</pubdate>
				<volume>115</volume>
				<fpage>247</fpage>
				<lpage>253</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">3549450</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B40">
				<title>
					<p>Glucose repression in the yeast <it>Saccharomyces cerevisiae</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Trumbly</snm>
						<fnm>RJ</fnm>
					</au>
				</aug>
				<source>Mol Microbiol</source>
				<pubdate>1992</pubdate>
				<volume>6</volume>
				<fpage>15</fpage>
				<lpage>21</lpage>
				<xrefbib>
					<pubid idtype="pmpid">1310793</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B41">
				<title>
					<p>Systematic functional analysis of the <it>Caenorhabditis elegans </it>genome using RNAi.</p>
				</title>
				<aug>
					<au>
						<snm>Kamath</snm>
						<fnm>RS</fnm>
					</au>
					<au>
						<snm>Fraser</snm>
						<fnm>AG</fnm>
					</au>
					<au>
						<snm>Dong</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Poulin</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Durbin</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Gotta</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Kanapin</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Le Bot</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Moreno</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Sohrmann</snm>
						<fnm>M</fnm>
					</au>
					<etal/>
				</aug>
				<source>Nature</source>
				<pubdate>2003</pubdate>
				<volume>421</volume>
				<fpage>231</fpage>
				<lpage>237</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/nature01278</pubid>
						<pubid idtype="pmpid" link="fulltext">12529635</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B42">
				<title>
					<p>National Science Foundation-Sponsored Workshop Report: "The 2010 Project" functional genomics and the virtual plant. A blueprint for understanding how plants are built and how to improve them.</p>
				</title>
				<aug>
					<au>
						<snm>Chory</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Ecker</snm>
						<fnm>JR</fnm>
					</au>
					<au>
						<snm>Briggs</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Caboche</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Coruzzi</snm>
						<fnm>GM</fnm>
					</au>
					<au>
						<snm>Cook</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Dangl</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Grant</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Guerinot</snm>
						<fnm>ML</fnm>
					</au>
					<au>
						<snm>Henikoff</snm>
						<fnm>S</fnm>
					</au>
					<etal/>
				</aug>
				<source>Plant Physiol</source>
				<pubdate>2000</pubdate>
				<volume>123</volume>
				<fpage>423</fpage>
				<lpage>426</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1104/pp.123.2.423</pubid>
						<pubid idtype="pmpid" link="fulltext">10859172</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B43">
				<title>
					<p>The Berkeley Drosophila Genome Project gene disruption project: single P-element insertions mutating 25% of vital <it>Drosophila </it>genes.</p>
				</title>
				<aug>
					<au>
						<snm>Spradling</snm>
						<fnm>AC</fnm>
					</au>
					<au>
						<snm>Stern</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Beaton</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Rhem</snm>
						<fnm>EJ</fnm>
					</au>
					<au>
						<snm>Laverty</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Mozden</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Misra</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Rubin</snm>
						<fnm>GM</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>1999</pubdate>
				<volume>153</volume>
				<fpage>135</fpage>
				<lpage>177</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">10471706</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B44">
				<title>
					<p>Functional profiling of the <it>Saccharomyces cerevisiae </it>genome.</p>
				</title>
				<aug>
					<au>
						<snm>Giaever</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Chu</snm>
						<fnm>AM</fnm>
					</au>
					<au>
						<snm>Ni</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Connelly</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Riles</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Veronneau</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Dow</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Lucau-Danila</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Anderson</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Andre</snm>
						<fnm>B</fnm>
					</au>
					<etal/>
				</aug>
				<source>Nature</source>
				<pubdate>2002</pubdate>
				<volume>418</volume>
				<fpage>387</fpage>
				<lpage>391</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/nature00935</pubid>
						<pubid idtype="pmpid" link="fulltext">12140549</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B45">
				<title>
					<p>Functional characterization of the <it>S. cerevisiae </it>genome by gene deletion and parallel analysis.</p>
				</title>
				<aug>
					<au>
						<snm>Winzeler</snm>
						<fnm>EA</fnm>
					</au>
					<au>
						<snm>Shoemaker</snm>
						<fnm>DD</fnm>
					</au>
					<au>
						<snm>Astromoff</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Liang</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Anderson</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Andre</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Bangham</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Benito</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Boeke</snm>
						<fnm>JD</fnm>
					</au>
					<au>
						<snm>Bussey</snm>
						<fnm>H</fnm>
					</au>
					<etal/>
				</aug>
				<source>Science</source>
				<pubdate>1999</pubdate>
				<volume>285</volume>
				<fpage>901</fpage>
				<lpage>906</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.285.5429.901</pubid>
						<pubid idtype="pmpid" link="fulltext">10436161</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B46">
				<title>
					<p>Systematic genetic analysis with ordered arrays of yeast deletion mutants.</p>
				</title>
				<aug>
					<au>
						<snm>Tong</snm>
						<fnm>AH</fnm>
					</au>
					<au>
						<snm>Evangelista</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Parsons</snm>
						<fnm>AB</fnm>
					</au>
					<au>
						<snm>Xu</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Bader</snm>
						<fnm>GD</fnm>
					</au>
					<au>
						<snm>Page</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Robinson</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Raghibizadeh</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Hogue</snm>
						<fnm>CW</fnm>
					</au>
					<au>
						<snm>Bussey</snm>
						<fnm>H</fnm>
					</au>
					<etal/>
				</aug>
				<source>Science</source>
				<pubdate>2001</pubdate>
				<volume>294</volume>
				<fpage>2364</fpage>
				<lpage>2368</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.1065810</pubid>
						<pubid idtype="pmpid" link="fulltext">11743205</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B47">
				<title>
					<p>Structure and function of the feed-forward loop network motif.</p>
				</title>
				<aug>
					<au>
						<snm>Mangan</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Alon</snm>
						<fnm>U</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>2003</pubdate>
				<volume>100</volume>
				<fpage>11980</fpage>
				<lpage>11985</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1073/pnas.2133841100</pubid>
						<pubid idtype="pmpid" link="fulltext">14530388</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B48">
				<title>
					<p>Random graphs with arbitrary degree distributions and their applications.</p>
				</title>
				<aug>
					<au>
						<snm>Newman</snm>
						<fnm>ME</fnm>
					</au>
					<au>
						<snm>Strogatz</snm>
						<fnm>SH</fnm>
					</au>
					<au>
						<snm>Watts</snm>
						<fnm>DJ</fnm>
					</au>
				</aug>
				<source>Phys Rev E Stat Nonlin Soft Matter Phys</source>
				<pubdate>2001</pubdate>
				<volume>64</volume>
				<fpage>026118</fpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1103/PhysRevE.64.026118</pubid>
						<pubid idtype="pmpid" link="fulltext">11497662</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B49">
				<aug>
					<au>
						<snm>West</snm>
						<fnm>DB</fnm>
					</au>
				</aug>
				<source>Introduction to Graph Theory</source>
				<publisher>Upper Saddle River, NJ: Prentice Hall</publisher>
				<pubdate>2001</pubdate>
			</bibl>
		</refgrp>
	</bm>
</art>
