<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>gb-2006-7-8-r77</ui>
   <ji>GBJ</ji>
   <fm>
      <dochead>Method</dochead>
      <bibl>
         <title>
            <p>Statistical methods and software for the analysis of high-throughput reverse genetic assays using flow cytometry readouts</p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>Hahne</snm>
               <fnm>Florian</fnm>
               <insr iid="I1"/>
               <email>f.hahne@dkfz.de</email>
            </au>
            <au id="A2">
               <snm>Arlt</snm>
               <fnm>Dorit</fnm>
               <insr iid="I1"/>
               <email>d.arlt@dkfz.de</email>
            </au>
            <au id="A3">
               <snm>Sauermann</snm>
               <fnm>Mamatha</fnm>
               <insr iid="I1"/>
               <email>m.sauermann@dkfz.de</email>
            </au>
            <au id="A4">
               <snm>Majety</snm>
               <fnm>Meher</fnm>
               <insr iid="I1"/>
               <email>m.majety@dkfz.de</email>
            </au>
            <au id="A5">
               <snm>Poustka</snm>
               <fnm>Annemarie</fnm>
               <insr iid="I1"/>
               <email>a.poustka@dkfz.de</email>
            </au>
            <au id="A6">
               <snm>Wiemann</snm>
               <fnm>Stefan</fnm>
               <insr iid="I1"/>
               <email>s.wiemann@dkfz.de</email>
            </au>
            <au id="A7">
               <snm>Huber</snm>
               <fnm>Wolfgang</fnm>
               <insr iid="I2"/>
               <email>huber@ebi.ac.uk</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Division of Molecular Genome Analysis, German Cancer Research Center, INF 580, 69120 Heidelberg, Germany</p>
            </ins>
            <ins id="I2">
               <p>EMBL - European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK</p>
            </ins>
         </insg>
         <source>Genome Biology</source>
         <issn>1465-6906</issn>
         <pubdate>2006</pubdate>
         <volume>7</volume>
         <issue>8</issue>
         <fpage>R77</fpage>
         <url>http://genomebiology.com/2006/7/8/R77</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">16916453</pubid>
               <pubid idtype="doi">10.1186/gb-2006-7-8-r77</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>18</day>
               <month>5</month>
               <year>2006</year>
            </date>
         </rec>
         <revrec>
            <date>
               <day>7</day>
               <month>7</month>
               <year>2006</year>
            </date>
         </revrec>
         <acc>
            <date>
               <day>17</day>
               <month>8</month>
               <year>2006</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>17</day>
               <month>08</month>
               <year>2006</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2006</year>
         <collab>Hahne et al.; licensee BioMed Central Ltd.</collab>
         <note>This is an open access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <shorttitle>
         <p>Software for high-throughput cytometry assays</p>
      </shorttitle>
      <shortabs>
         <p>A software tool for the analysis of high-throughput cell-based assays is presented.</p>
      </shortabs>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <p>High-throughput cell-based assays with flow cytometric readout provide a powerful technique for identifying components of biologic pathways and their interactors. Interpretation of these large datasets requires effective computational methods. We present a new approach that includes data pre-processing, visualization, quality assessment, and statistical inference. The software is freely available in the Bioconductor package prada. The method permits analysis of large screens to detect the effects of molecular interventions in cellular systems.</p>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="BMC" subtype="man_spc_id" id="30010002">Bioinformatics</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010004">Cell biology</classification>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Cell-based assays permit functional profiling by probing the roles of molecular actors in biologic processes or phenotypes. They perturb the activity or abundance of gene products of interest and measure the resulting effect in a population of cells <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr></abbrgrp>. This can be done in principle for any gene or combination of genes and any biologic process. There is a variety of technologies that rely on the availability of genomic resources such as full-length cDNA libraries <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr></abbrgrp>, small interfering RNA libraries <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr></abbrgrp>, or collections of protein-specific interfering ligands (small chemical compounds) <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>. Loss-of-function assays that investigate the effect of silencing or (partial) removal of a gene product or its activity <abbrgrp><abbr bid="B10">10</abbr></abbrgrp> are distinguished from gain-of-function assays, in which the function of a gene product is analyzed after its abundance or activity is increased <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>.</p>
         <p>Depending on the process of interest, phenotypes can be assessed at various levels of complexity. In the simplest case a phenotype is a yes/no alternative, such as survival versus nonsurvival. More detail can be seen from a quantitative variable such as the activity of a reporter gene measured on a fluorescent plate reader, and even more complex features can involve time series or microscopic images. Although flow cytometry is among the standard methods in immunology, it has not been widely used in high-throughput screening, probably because of the lack of automation in data acquisition as well as in data analysis. However, the technology has evolved significantly in the recent past, and the latest generation of instruments can be equipped with high-throughput screening loaders that permit the measurement of large numbers of samples in reasonable periods of time <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. One major advantage of flow cytometry is its ability to measure multiple parameters for each individual cell of a cell population. Whereas conventional cell-based assays are limited to recording population averages, this approach allows the investigation of biologic variation at the single cell level.</p>
         <p>A broad range of tools is available for analyzing flow cytometry data at a small or intermediate scale <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr></abbrgrp>, but there is a lack of systematic computational approaches to analyze and rationally interpret the large amounts of data produced in high-throughput screens. Here we describe methods and software to fulfill these requirements.</p>
      </sec>
      <sec>
         <st>
            <p>Results and discussion</p>
         </st>
         <p>We demonstrate our methodology on a dataset that was collected in gain-of-function cellular screens probing for mediators of cell growth and division, in particular using assays for DNA replication, apoptosis, and mitogen-activated protein kinase (MAPK) signaling. The experiments were performed in 96-well microtiter plates in which each well contained cells transfected with a different overexpression construct. Along with the phenotype of interest, the amount of overexpression of the respective proteins was recorded via a fluorescent YFP (yellow fluorescent protein) tag. In the following discussion we refer to one microtiter plate as one experiment.</p>
         <p>The flow cytometry data consist of four values for each cell: two morphologic parameters and two fluorescence intensities. The morphologic parameters are forward light scatter (FSC) and sideward light scatter (SSC), and they measure cell size and cell granularity (the amount of light-impermeable structures within the cell). One of the fluorescence channels monitors emission from the YFP tag of the overexpressed protein, whereas the other channel detects the fluorescence of a fluorochrome-coupled antibody. Because many phenotypes are amenable to detection via specific antibodies, this can be considered a general assay design theme that, in principle, is applicable to a wide range of cellular processes.</p>
         <sec>
            <st>
               <p>Data pre-processing and quality</p>
            </st>
            <p>The pre-processing includes import of the result files from the fluorescence-activated cell sorting (FACS) instrument, assembly and cleaning up of the data, removal of systematic biases and drifts (a process often referred to as 'normalization'), and transformation to a format and scale that is suitable for the following analysis steps. Here we do not deal with the technical aspects of data import and management, and refer the interested reader to the documentation of the software package prada for a thorough discussion of these <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>.</p>
            <sec>
               <st>
                  <p>Selection of well measured cells on the basis of morphology</p>
               </st>
               <p>Most experimental cell populations are contaminated by a small amount of debris, cell conjugates, buffer precipitates, and air bubbles. The design of FACS instruments usually does not allow perfect discrimination of these contaminants from single, living cells during data acquisition, and hence they can end up in the raw data. To a certain extent we can discriminate contaminants from living cells using the morphologic properties provided by the FSC and SSC parameters. The joint distribution of FSC and SSC for transformed mammalian cells typically exhibits an elliptical shape, and most contaminants separate clearly from this main population (Figure <figr fid="F1">1a</figr>). The core distribution of healthy cells is approximated by a bivariate normal distribution in the (FSC, SSC) space, allowing the identification of outliers by their low probability density in that distribution. Thus, measured events that lie outside a certain density threshold can be regarded as contamination. We fit the bivariate normal distribution to the data by robust estimation of its center and its 2 &#215; 2 covariance matrix (Figure <figr fid="F1">1b</figr>). This is appropriate if the cell population is homogeneous, the proportion of contaminants is small, and the phenotype of interest is not itself associated with large changes in the FSC or SSC signal. A rough pre-selection using some fixed FSC and SSC threshold values, as provided by most FACS instruments, further increases robustness.</p>
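               <p>The gating step described above can be sketched in a few lines of Python. This is an illustration only, not the implementation in prada: the function names are hypothetical, and the robust estimator is an assumed stand-in (coordinate-wise medians and MAD-based scales define a generous pre-selection, after which mean and covariance are computed from the retained points). Events are kept if their squared Mahalanobis distance from the fitted center does not exceed the square of a user-defined number of standard deviations.</p>
               <p>
```python
import statistics

def robust_center_cov(points, trim=3.0):
    """Estimate center and 2 x 2 covariance of (FSC, SSC) pairs.

    Assumption: medians and MAD-based scales define a generous axis-aligned
    ellipse; mean and covariance are then computed from the points inside it.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = statistics.median(xs), statistics.median(ys)
    # 1.4826 * MAD estimates the standard deviation for normal data
    sx = 1.4826 * statistics.median([abs(x - mx) for x in xs])
    sy = 1.4826 * statistics.median([abs(y - my) for y in ys])
    core = [p for p in points
            if trim ** 2 >= ((p[0] - mx) / sx) ** 2 + ((p[1] - my) / sy) ** 2]
    n = len(core)
    cx = sum(p[0] for p in core) / n
    cy = sum(p[1] for p in core) / n
    sxx = sum((p[0] - cx) ** 2 for p in core) / (n - 1)
    syy = sum((p[1] - cy) ** 2 for p in core) / (n - 1)
    sxy = sum((p[0] - cx) * (p[1] - cy) for p in core) / (n - 1)
    return (cx, cy), (sxx, sxy, syy)

def inside_ellipse(point, center, cov, n_sd=2.0):
    """True if the squared Mahalanobis distance is at most n_sd ** 2."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - center[0], point[1] - center[1]
    # Quadratic form with the inverse of the 2 x 2 covariance matrix
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return n_sd ** 2 >= d2

def gate_cells(points, n_sd=2.0):
    """Return one boolean per event: True for events kept as well measured."""
    center, cov = robust_center_cov(points)
    return [inside_ellipse(p, center, cov, n_sd) for p in points]
```
               </p>
               <p>Calling gate_cells on a list of (FSC, SSC) pairs returns a mask that can be applied to all four measured parameters of each event. The default of two standard deviations mirrors the example cut-off shown in Figure <figr fid="F1">1b</figr>; other values are equally possible.</p>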
               <fig id="F1">
                  <title>
                     <p>Figure 1</p>
                  </title>
                  <caption>
                     <p>Selection of well measured cells</p>
                  </caption>
                  <text>
                     <p>Selection of well measured cells. <b>(a) </b>Scatterplot of FACS data showing typical properties of morphologic parameters. FSC corresponds to cell size and SSC to cell granularity. Several subpopulations can be distinguished: (I) healthy and well measured cells, (II) cell debris, and (III) cell conjugates and air bubbles. <b>(b) </b>Robust fit of a bivariate normal distribution to the data. The ellipse represents a contour of equal probability density in the distribution and is used as a user-defined cut-off boundary (two standard deviations in this example). Points outside the ellipse (marked in red) are considered contaminants and are discarded from further analysis. Scatterplots of perturbation versus phenotype <b>(c) </b>before and <b>(d) </b>after removing contaminants. The proportion of outlier data points is reduced significantly. Here, they correspond to measurements with very small phenotype values (cell debris). FACS, fluorescence-activated cell sorting; FSC, forward light scatter; SSC, sideward light scatter.</p>
                  </text>
                  <graphic file="gb-2006-7-8-r77-1"/>
               </fig>
               <p>To see how this affects the data, Figure <figr fid="F1">1</figr> panels c and d show scatterplots of the two fluorescence channels measuring the perturbation and the phenotype before and after removal of contaminants. We observe a reduction in the proportion of data points with very small fluorescence values in both channels after removing contaminants. This is reasonable because the fluorescence staining is intracellular, and hence cell debris is not expected to emit strong fluorescence. In addition, we have removed some of the data points with very high fluorescence levels, which apparently correspond to cell conjugates.</p>
               <p>For our example data it is possible to determine global, experiment-wide parameters of the core distribution of healthy and well measured cells. However, some experimental settings may also demand adaptive estimates, for example if the cell morphology is expected to change as a result of the perturbation (as is the case for apoptotic cells) or if systematic shifts occur during the course of one experiment.</p>
            </sec>
            <sec>
               <st>
                  <p>Correlation of fluorescence and cell size</p>
               </st>
               <p>Regardless of the presence of fluorochromes, every cell emits light when it is excited by a laser - a phenomenon referred to as autofluorescence. Autofluorescence intensities frequently correlate with cell size, and this effect can give rise to spurious correlations between different fluorescence channels. In our data, the unspecific autofluorescence adds both to the specific fluorescence emitted by the fluorochrome-conjugated antibody measuring the phenotype and to that of the YFP-expressing construct, and it is positively correlated with cell size (Figure <figr fid="F2">2a,b</figr>). This results in an apparent, unspecific increase in the response variable for higher levels of perturbation (Figure <figr fid="F2">2c</figr>). To recover the specific signal we use FSC as a proxy for size, and fit the linear model:</p>
               <fig id="F2">
                  <title>
                     <p>Figure 2</p>
                  </title>
                  <caption>
                     <p>Correlation of fluorescence and cell size</p>
                  </caption>
                  <text>
                     <p>Correlation of fluorescence and cell size. Empiric cumulative distribution functions (ECDF) of fluorescence values for <b>(a) </b>perturbation and <b>(b) </b>phenotype showing their positive correlation with cell size. The fluorescence values were stratified into subsets corresponding to five quantiles (0-20%, 20-40%, 40-60%, 60-80%, and 80-100%) of cell size (forward light scatter), and the ECDF for each stratum was plotted in a different color. With increasing cell size, an increase in fluorescence values is also observed. <b>(c) </b>Regression line fitted to the data showing spurious correlation between the two parameters. In this case, the perturbation is known to cause no phenotype, and hence the correlation is considered to be artifactual. <b>(d) </b>After adjusting for cell size, the two parameters are uncorrelated.</p>
                  </text>
                  <graphic file="gb-2006-7-8-r77-2"/>
               </fig>
               <p><it>x</it><sub><it>total </it></sub>= <it>&#945; </it>+ <it>&#946;</it><it>s </it>+ <it>x</it><sub><it>specific</it></sub>&#160;&#160;&#160;(1)</p>
               <p>where <it>x</it><sub><it>total </it></sub>is the measured fluorescence intensity, <it>s </it>is the cell size as measured by the forward light scatter, <it>&#945; </it>and <it>&#946; </it>are the coefficients of the model, and <it>x</it><sub><it>specific </it></sub>is the specific fluorescence. We compute <it>&#945; </it>and <it>&#946; </it>by a robust linear regression of <it>x</it><sub><it>total </it></sub>on <it>s</it>, and obtain estimates for <it>x</it><sub><it>specific </it></sub>from the residuals (Figure <figr fid="F2">2d</figr>). This is done for each fluorescence channel individually. The artifactual correlation due to autofluorescence is absorbed by <it>&#946;</it>. The parameter <it>&#945; </it>absorbs baseline fluorescence, as discussed below.</p>
            </sec>
            <sec>
               <st>
                  <p>Systematic variation in signal intensities between wells</p>
               </st>
               <p>In our data we often observe variation in the overall signal intensities for different wells on a microtiter plate (Figure <figr fid="F3">3a</figr>), which may be due to various drifts in the equipment, such as changes in laser power or pipetting efficiencies. Although such effects should ideally be avoided, and large variations should prompt reassessment of the experimental setup, small variations are adjusted by the model described by equation 1. In particular, they are fitted by the intercept term <it>&#945;</it>. The biologically relevant information is retained in the residuals. A common baseline of the adjusted values is obtained by adding the mean of <it>&#945; </it>averaged over all wells (Figure <figr fid="F3">3b</figr>).</p>
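               <p>The restoration of a common baseline can be illustrated as follows. The sketch assumes that the per-well intercepts and residuals from equation 1 are already available; the function name and the dictionary layout (well identifier as key) are hypothetical.</p>
               <p>
```python
def common_baseline(well_alpha, well_residuals):
    """Add the plate-wide mean intercept to the residuals of every well.

    well_alpha:     dict mapping well id to the fitted intercept alpha
    well_residuals: dict mapping well id to the list of residuals
    """
    mean_alpha = sum(well_alpha.values()) / len(well_alpha)
    return {well: [r + mean_alpha for r in residuals]
            for well, residuals in well_residuals.items()}
```
               </p>
               <p>Because the same constant is added to every well, differences between wells that remain after this step reflect the residuals only and not the well-specific intercepts.</p>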
               <fig id="F3">
                  <title>
                     <p>Figure 3</p>
                  </title>
                  <caption>
                     <p>Systematic variation in signal intensities</p>
                  </caption>
                  <text>
                     <p>Systematic variation in signal intensities. <b>(a) </b>Box plot of raw fluorescence values measuring the phenotype for a 96-well microtiter plate. Differences in the mean values are identified for individual wells, and several wells are affected by a block effect. <b>(b) </b>Data after normalization.</p>
                  </text>
                  <graphic file="gb-2006-7-8-r77-3"/>
               </fig>
            </sec>
         </sec>
         <sec>
            <st>
               <p>Statistical inference</p>
            </st>
            <p>Flow cytometry provides individual measurements for each cell of a population, and so we would like to use statistical procedures to model the behavior of the whole population and to draw statistically sound conclusions. Choosing the appropriate statistical model is a crucial step in data analysis because we want it to represent as many features of the data as possible without imposing too many assumptions. For different biologic processes different types of responses can be expected, and so we also need different models. In our data we observe two types of response - binary and gradual.</p>
            <p>Many biologic processes can be considered on/off switches in which, after internal or external stimulation above a certain threshold, a distinct cellular event is triggered (Figure <figr fid="F4">4a</figr>). This kind of binary response is typical for apoptosis. One key player of the apoptotic pathway is the enzyme caspase-3, which is activated at the onset of apoptosis in most cell types. Activation is rapid and irreversible, and once the cell receives a signal to undergo apoptosis most or all of its caspase-3 molecules are proteolytically cleaved. This is the point of no return, and all subsequent steps inevitably lead to the death of the cell <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. Thus, caspase-3 activation is essentially a binary measure of the apoptotic state of a cell. Similarly, cell proliferation is regulated in a binary manner, with cells only progressing further in the cell cycle after reception of appropriate signals.</p>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Response types</p>
               </caption>
               <text>
                  <p>Response types. <b>(a) </b>Binary response. Above a certain threshold of perturbation, a discrete phenotype can be observed. <b>(b) </b>Continuous response. The effect size of the phenotype correlates with the amount of perturbation. It is typically measured for mild perturbation levels (x<sub>0</sub>).</p>
               </text>
               <graphic file="gb-2006-7-8-r77-4"/>
            </fig>
            <p>In contrast, many cellular signaling pathways are continuously regulated. The MAPK pathway, which plays a role in cell cycle regulation, is a prominent example. It consists of several kinases, enzymes with the ability to phosphorylate other molecules, in a hierarchical arrangement. By selective phosphorylation and de-phosphorylation reactions a signal can be passed along the hierarchy <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. The activity of this pathway can be continuously regulated both in a positive and in a negative manner. So, in contrast to apoptosis and cell proliferation, in which the response is essentially a yes/no decision, here the response is of a gradual nature (Figure <figr fid="F4">4b</figr>).</p>
            <sec>
               <st>
                  <p>Modeling binary responses</p>
               </st>
               <p>A natural approach to modeling binary responses is to dissect the data into four subtypes: perturbed versus nonperturbed cells, and cells exhibiting the effect of interest versus nonresponding cells (Figure <figr fid="F5">5a</figr>). Thresholds for this separation can be obtained either adaptively, for each well, or more globally, for the whole plate. Because of the potential problems with over-fitting in the adaptive approach, we choose the latter, making use of the premise that the values of the pre-processed data are comparable across the plate. Figure <figr fid="F5">5b</figr> shows thresholds determined from a high percentile (99%) of the data from a negative control.</p>
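               <p>The discretization can be sketched in Python as shown below. The function names are hypothetical, and the percentile is computed with the default interpolation of the standard library, which is an assumption and may differ slightly from the rule used in prada. Quadrants are keyed by two booleans (perturbed, responding) so that the sketch does not depend on the labels used in Figure <figr fid="F5">5a</figr>.</p>
               <p>
```python
import statistics

def percentile_threshold(control_values, pct=99):
    """Return the pct-th percentile of the negative control values."""
    cuts = statistics.quantiles(control_values, n=100)  # 99 cut points
    return cuts[pct - 1]

def quadrant_counts(perturbation, phenotype, thr_pert, thr_pheno):
    """Count cells per quadrant, keyed by (perturbed, responding)."""
    counts = {(False, False): 0, (False, True): 0,
              (True, False): 0, (True, True): 0}
    for x, y in zip(perturbation, phenotype):
        counts[(x > thr_pert, y > thr_pheno)] += 1
    return counts
```
               </p>
               <p>In a plate-wide setting, both thresholds would be computed once from the negative control and then reused for every well.</p>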
               <fig id="F5">
                  <title>
                     <p>Figure 5</p>
                  </title>
                  <caption>
                     <p>Setup of boundaries</p>
                  </caption>
                  <text>
                     <p>Setup of boundaries. <b>(a) </b>Discretization of data showing binary response in four subtypes. <b>(b) </b>Mock control used for setup of boundaries.</p>
                  </text>
                  <graphic file="gb-2006-7-8-r77-5"/>
               </fig>
               <p>An estimator for the odds ratio, a measure of the effect size, is defined by the following equation:</p>
               <p>
                  <m:math name="gb-2006-7-8-R77-i1" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:mi>O</m:mi>
                           <m:mi>R</m:mi>
                           <m:mo>=</m:mo>
                           <m:mfrac>
                              <m:mrow>
                                 <m:mi>p</m:mi>
                                 <m:mi>p</m:mi>
                                 <m:mo>+</m:mo>
                                 <m:mn>1</m:mn>
                              </m:mrow>
                              <m:mrow>
                                 <m:mi>p</m:mi>
                                 <m:mi>n</m:mi>
                                 <m:mo>+</m:mo>
                                 <m:mn>1</m:mn>
                              </m:mrow>
                           </m:mfrac>
                           <m:mo>&#8901;</m:mo>
                           <m:mfrac>
                              <m:mrow>
                                 <m:mi>n</m:mi>
                                 <m:mi>n</m:mi>
                                 <m:mo>+</m:mo>
                                 <m:mn>1</m:mn>
                              </m:mrow>
                              <m:mrow>
                                 <m:mi>n</m:mi>
                                 <m:mi>p</m:mi>
                                 <m:mo>+</m:mo>
                                 <m:mn>1</m:mn>
                              </m:mrow>
                           </m:mfrac>
                           <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                           <m:mrow>
                              <m:mo>(</m:mo>
                              <m:mn>2</m:mn>
                              <m:mo>)</m:mo>
                           </m:mrow>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfeBSjuyZL2yd9gzLbvyNv2Caerbhv2BYDwAHbqedmvETj2BSbqee0evGueE0jxyaibaiKI8=vI8tuQ8FMI8Gi=hEeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqadeqadaaakeaacaWGpbGaamOuaiabg2da9maalaaabaGaamiCaiaadchacqGHRaWkcaaIXaaabaGaamiCaiaad6gacqGHRaWkcaaIXaaaaiabgwSixpaalaaabaGaamOBaiaad6gacqGHRaWkcaaIXaaabaGaamOBaiaadchacqGHRaWkcaaIXaaaaiaaxMaacaWLjaWaaeWaceaacaaIYaaacaGLOaGaayzkaaaaaa@49E7@</m:annotation>
                     </m:semantics>
                  </m:math>
               </p>
               <p>The symbols on the right-hand side of equation 2 are defined in Figure <figr fid="F5">5a</figr>. Pseudo-counts of 1 are added in order to avoid infinite values in the case of empty quadrants <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. It is often convenient to consider the logarithm of the odds ratio, because it is symmetric for upward and downward effects. To test for significance against the null hypothesis of no effect, we use Fisher's exact test <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>.</p>
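               <p>Equation 2 and the accompanying test can be written out as a short Python sketch. The argument names n_pp, n_pn, n_np and n_nn are hypothetical stand-ins for the four quadrant counts; because both the odds ratio and the test are unchanged when the two off-diagonal counts are swapped, the sketch does not depend on which of them is which. The two-sided P value sums the probabilities of all tables that are no more likely than the observed one, which is one common convention and is assumed here.</p>
               <p>
```python
import math

def odds_ratio(n_pp, n_pn, n_np, n_nn):
    """Odds ratio of equation 2, with pseudo-counts of 1 in each quadrant."""
    return (n_pp + 1) / (n_pn + 1) * (n_nn + 1) / (n_np + 1)

def log_odds_ratio(n_pp, n_pn, n_np, n_nn):
    """Natural logarithm of the odds ratio; zero corresponds to no effect."""
    return math.log(odds_ratio(n_pp, n_pn, n_np, n_nn))

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(k):
        # Hypergeometric probability of k in the top-left cell
        return (math.comb(col1, k) * math.comb(n - col1, row1 - k)
                / math.comb(n, row1))

    p_obs = prob(a)
    k_min = max(0, row1 - (n - col1))
    k_max = min(row1, col1)
    # Small relative tolerance guards against floating point ties
    total = sum(prob(k) for k in range(k_min, k_max + 1)
                if p_obs * (1 + 1e-7) >= prob(k))
    return min(1.0, total)
```
               </p>
               <p>The test is applied to the raw counts, whereas the pseudo-counts enter only the effect size estimate.</p>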
               <p>Sample results from a screen aiming to identify activators of the apoptosis pathway are shown in Figure <figr fid="F6">6</figr>. Overexpression of the Fas receptor protein (Figure <figr fid="F6">6b</figr>) leads to strong activation of apoptosis, as indicated by both a large effect size and a significant <it>P </it>value. This is consistent with the cellular role played by the Fas receptor, which mediates apoptosis activation as a consequence of extracellular signaling. Overexpression of the YFP protein (Figure <figr fid="F6">6a</figr>) apparently does not affect apoptosis, indicating that the activation in Figure <figr fid="F6">6b</figr> is not caused by the fluorescence tag alone.</p>
               <fig id="F6">
                  <title>
                     <p>Figure 6</p>
                  </title>
                  <caption>
                     <p>Example results for binary response-type assays from a screen targeting apoptosis regulation</p>
                  </caption>
                  <text>
                     <p>Example results for binary response-type assays from a screen targeting apoptosis regulation. Cell counts for the respective quadrants are indicated on the edges of the plots. <b>(a) </b>Non-affector (YFP), with effect size close to zero and nonsignificant <it>P </it>value. <b>(b) </b>Activator (Fas receptor), with both large effect size and significant <it>P </it>value. OR, odds ratio.</p>
                  </text>
                  <graphic file="gb-2006-7-8-r77-6"/>
               </fig>
            </sec>
            <sec>
               <st>
                  <p>Modeling continuous responses</p>
               </st>
               <p>The gradual nature of these types of responses supports the use of regression analysis. Because the effect may deviate from linearity in the range of perturbations that we observe, we use a robust local regression fit:</p>
               <p><it>y </it>= <it>m</it>(<it>x</it>) + <it>&#949;</it>&#160;&#160;&#160;(3)</p>
               <p>where <it>x </it>is the perturbation signal, <it>y </it>is the response, <it>m </it>is a smooth function (for example, a piece-wise polynomial), and <it>&#949; </it>is a noise term. We obtain an estimate of <it>m </it>from the function locfit.robust in the R package locfit <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. This also calculates</p>
               <p><it>&#948; </it>= <m:math name="gb-2006-7-8-R77-i2" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mover accent="true"><m:mi>m</m:mi><m:mo>^</m:mo></m:mover><m:mrow><m:mo>(</m:mo><m:mrow><m:msub><m:mi>x</m:mi><m:mn>0</m:mn></m:msub></m:mrow><m:mo>)</m:mo></m:mrow></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfeBSjuyZL2yd9gzLbvyNv2Caerbhv2BYDwAHbqedmvETj2BSbqee0evGueE0jxyaibaiKI8=vI8tuQ8FMI8Gi=hEeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqadeqadaaakeaaceWGTbGbaKaadaqadiqaaiaadIhadaWgaaWcbaGaaGimaaqabaaakiaawIcacaGLPaaaaaa@37A7@</m:annotation></m:semantics></m:math>&#160;&#160;&#160;(4)</p>
               <p>which is a robust estimate of the slope of <it>m </it>at the point <it>x</it><sub>0</sub>. <it>x</it><sub>0 </sub>is an assay-wide, user-defined parameter that corresponds to a mild perturbation that does not deviate strongly from the physiologic value. This approach is resistant to nonlinear, biologically artifactual effects caused by perturbations that are too strong, without the need for a sharp cut-off. To obtain a dimensionless measure of <it>effect size</it>, we divide</p>
               <p>
                  <m:math name="gb-2006-7-8-R77-i3" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:mi>z</m:mi>
                           <m:mo>=</m:mo>
                           <m:mfrac>
                              <m:mi>&#948;</m:mi>
                              <m:mrow>
                                 <m:msub>
                                    <m:mi>&#948;</m:mi>
                                    <m:mn>0</m:mn>
                                 </m:msub>
                              </m:mrow>
                           </m:mfrac>
                           <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                           <m:mrow>
                              <m:mo>(</m:mo>
                              <m:mn>5</m:mn>
                              <m:mo>)</m:mo>
                           </m:mrow>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfeBSjuyZL2yd9gzLbvyNv2Caerbhv2BYDwAHbqedmvETj2BSbqee0evGueE0jxyaibaiKI8=vI8tuQ8FMI8Gi=hEeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqadeqadaaakeaacaWG6bGaeyypa0ZaaSaaaeaaiiGacqWF0oazaeaacqWF0oazdaWgaaWcbaGaaGimaaqabaaaaOGaaCzcaiaaxMaadaqadiqaaiaaiwdaaiaawIcacaGLPaaaaaa@3D0C@</m:annotation>
                     </m:semantics>
                  </m:math>
               </p>
               <p>Where <it>&#948;</it><sub>0 </sub>is a scale parameter of the overall, assay-wide distribution of <it>&#948;</it>. We use the median absolute value of all <it>&#948; </it>in the assay. A simple measure of the significance against the null hypothesis of no effect is obtained through dividing the estimate <m:math name="gb-2006-7-8-R77-i2" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mover accent="true"><m:mi>m</m:mi><m:mo>^</m:mo></m:mover><m:mrow><m:mo>(</m:mo><m:mrow><m:msub><m:mi>x</m:mi><m:mn>0</m:mn></m:msub></m:mrow><m:mo>)</m:mo></m:mrow></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfeBSjuyZL2yd9gzLbvyNv2Caerbhv2BYDwAHbqedmvETj2BSbqee0evGueE0jxyaibaiKI8=vI8tuQ8FMI8Gi=hEeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqadeqadaaakeaaceWGTbGbaKaadaqadiqaaiaadIhadaWgaaWcbaGaaGimaaqabaaakiaawIcacaGLPaaaaaa@37A7@</m:annotation></m:semantics></m:math> by its estimated standard deviation, and by assumption of normality a <it>P </it>value is obtained.</p>
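               <p>As a rough illustration of equation 5 and of the normal-approximation test, the following sketch uses plain Python rather than the prada package. The function names, the list-based inputs, and the choice of a two-sided test are assumptions made for this example; they are not taken from the software described here.</p>

```python
import math
from statistics import median


def effect_sizes(deltas):
    """Scale the per-well estimates delta by delta0 (equation 5).

    delta0 is taken as the median absolute value of all delta in the assay.
    """
    delta0 = median(abs(d) for d in deltas)
    if delta0 == 0:
        raise ValueError("median absolute delta is zero; effect size undefined")
    return [d / delta0 for d in deltas]


def two_sided_p(estimate, std_dev):
    """P value for estimate / std_dev under a standard normal null.

    Assumption: a two-sided test, so P = 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    """
    z = abs(estimate) / std_dev
    return math.erfc(z / math.sqrt(2.0))
```

               <p>For example, <it>effect_sizes([-2.0, 1.0, 4.0])</it> divides each value by the median absolute value 2.0, and <it>two_sided_p(1.96, 1.0)</it> is close to 0.05.</p>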
               <p>The plots in Figure <figr fid="F7">7</figr> show the fitted local regression for three examples from a cell-based assay targeting the MAPK pathway. As a result of the overexpression of the phospholipase C <it>&#948;</it>4 (PLCD4) protein, our method detects a significant induction of extracellular signal-regulated kinase (ERK) activation (Figure <figr fid="F7">7a</figr>) - a finding that is consistent with previous reports <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. As expected, overexpression of the dual specificity protein phosphatase (DUSP)10 protein strongly inactivates MAPK signaling (Figure <figr fid="F7">7b</figr>), whereas overexpression of the YFP protein has no effect (Figure <figr fid="F7">7c</figr>).</p>
               <fig id="F7">
                  <title>
                     <p>Figure 7</p>
                  </title>
                  <caption>
                     <p>Example results for continuous responses from a MAPK screen</p>
                  </caption>
                  <text>
                     <p>Example results for continuous responses from a MAPK screen. Effect size <it>z </it>and <it>P </it>value for <b>(a) </b>an activator (PLCD4), <b>(b) </b>a repressor (DUSP10), and <b>(c) </b>a non-affector (YFP) of the MAPK signaling. DUSP, dual specificity protein phosphatase; MAPK, mitogen-activated protein kinase; PLCD4, phospholipase C <it>&#948;</it>4; YFP, yellow fluorescent protein.</p>
                  </text>
                  <graphic file="gb-2006-7-8-r77-7"/>
               </fig>
            </sec>
            <sec>
               <st>
                  <p>Summarizing replicate experiments</p>
               </st>
               <p>The <it>P </it>values obtained from the previous section test the statistical association between the fluorescence signals from the overexpressed YFP-tagged proteins and the reporter-specific antibodies for the cell population in one particular well. It is important to note that this only takes into account the cell-to-cell variability within that well and does not reflect higher levels of experimental and biologic variability. Hence, the results from a single well cannot simply be taken as a measure of biologic significance. To gain confidence in the biologic significance of a result, the next step is to consider measurements over several independently replicated wells.</p>
               <p>The most obvious approach to summarizing data from replicate measurements for the same gene is to combine the effect size estimates and the <it>P </it>values from the individual replicates using tools from statistical meta-analysis <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>. However, because all of the data are available, the more direct and probably more efficient approach is to generalize the previous analysis methods to deal with replicate wells. In particular, for stratified contingency tables in the case of binary responses, we use the stratified <it>&#967;</it><sup>2</sup>-statistic of the Cochran-Mantel-Haenszel test <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. For stratified continuous responses we extend equation 3:</p>
               <p><it>y </it>= <it>y</it><sub><it>i </it></sub>+ <it>m</it>(<it>x </it>- <it>x</it><sub><it>i</it></sub>) + <it>&#949;</it>&#160;&#160;&#160;(6)</p>
               <p>Where <it>i </it>= 1, 2, ... counts over the replicates and <it>x</it><sub><it>i </it></sub>and <it>y</it><sub><it>i </it></sub>are replicate-specific offsets. Again, in both cases we obtain estimates of effect size as well as significance.</p>
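               <p>For the binary case, the Cochran-Mantel-Haenszel statistic can be sketched in a few lines of plain Python. This is a textbook version under stated assumptions, not the prada implementation: each replicate well is assumed to be summarized as a 2 x 2 table ((a, b), (c, d)), the function name and return values are hypothetical, and the optional continuity correction of 0.5 is a choice made for this example. Tables with empty margins are not handled.</p>

```python
import math


def cmh_test(tables, correct=True):
    """Cochran-Mantel-Haenszel test over 2 x 2 tables, one per replicate well.

    Returns the test statistic, its P value (chi-squared, 1 degree of
    freedom) and the Mantel-Haenszel estimate of the common odds ratio.
    """
    diff = 0.0  # sum of observed minus expected counts in cell a
    var = 0.0   # sum of the hypergeometric variances of cell a
    num = 0.0   # numerator of the common odds ratio
    den = 0.0   # denominator of the common odds ratio
    for (a, b), (c, d) in tables:
        n = a + b + c + d
        diff += a - (a + b) * (a + c) / n
        var += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
        num += a * d / n
        den += b * c / n
    shift = 0.5 if correct else 0.0
    stat = max(abs(diff) - shift, 0.0) ** 2 / var
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value, num / den
```

               <p>Pooling the strata in this way uses the cell counts of all replicate wells directly, rather than combining per-well <it>P </it>values after the fact.</p>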
            </sec>
            <sec>
               <st>
                  <p>Interpreting effect size and significance</p>
               </st>
               <p>Because of the large number of tests performed, it is necessary to adjust for multiple testing. Good software for this is available in the R packages qvalue and multtest, and we recommend the reports by Storey <abbrgrp><abbr bid="B28">28</abbr></abbrgrp> and Pollard <abbrgrp><abbr bid="B29">29</abbr></abbrgrp> and their coworkers for methodologic background.</p>
               <p>Even after multiple testing adjustment, one will often encounter situations in which for many of the screened genes the null hypothesis of no effect will be rejected, although the effect sizes (equations 2 and 5) may be quite small for most of them. This can happen because of the large number of cells observed for each gene, and it is a well known phenomenon of statistical testing; when the number of data points becomes large, hypothesis tests will eventually reject any null hypothesis that differs from the truth, even in the most negligible manner <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. Such cases are unlikely to be biologically interesting. Hence, for biologically relevant effectors we require both the effect size estimate to be above a certain threshold and the adjusted <it>P </it>value to be small.</p>
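               <p>The combined criterion can be sketched as follows in plain Python. The Benjamini-Hochberg adjustment is used here only as a simple stand-in for the procedures provided by the qvalue and multtest packages, and the function names and the default thresholds (an absolute effect size of 2.0 and an adjusted <it>P </it>value of 0.05) are hypothetical values chosen for illustration.</p>

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    # Walk from the largest to the smallest P value, keeping a running minimum.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for rank_from_top, i in enumerate(order):
        rank = m - rank_from_top  # 1-based rank in ascending order
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_hits(genes, z, p_values, z_min=2.0, alpha=0.05):
    """Keep genes with both a large effect size and a small adjusted P value."""
    adjusted = bh_adjust(p_values)
    return [g for g, zi, pi in zip(genes, z, adjusted)
            if abs(zi) >= z_min and alpha >= pi]
```

               <p>A gene with a very small adjusted <it>P </it>value but an effect size near zero is therefore not reported, which is the situation described above for wells with very large numbers of cells.</p>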
               <p>Finally, as with any biologic assay, to corroborate conclusively the role of a protein in the cellular process of interest, independent validation experiments must be conducted according to best experimental practice.</p>
            </sec>
         </sec>
         <sec>
            <st>
               <p>Visualization and quality assessment</p>
            </st>
            <p>Visualization methods exploit the most advanced pattern recognition system, the human visual system. However, it can only deal with a limited amount of dimensionality and complexity, and hence it benefits from assistance by computational methods for dimension reduction and feature extraction.</p>
            <p>Here, our main focus is on the use of visualization for quality assessment, which for our kind of data must be done on three different levels: at the level of the individual well, with resolution down to data from individual cells; at the level of a microtiter plate, with resolution down to individual wells; and at the level of the gene of interest, which usually comprises several replicate experiments.</p>
            <sec>
               <st>
                  <p>Visualization at the level of individual wells</p>
               </st>
               <p>A simple but useful way to visualize bivariate data is by means of a scatterplot. However, it is difficult to get a good impression of the distribution of the data when the number of observations is large and the points become too dense (Figure <figr fid="F8">8a</figr>). This is a problem for cytometry data, which often comprise more than 20,000 data points per well. A way to circumvent this limitation (which has already been applied in some of the previous figures) is to plot the density of the data points in each region <abbrgrp><abbr bid="B31">31</abbr></abbrgrp> instead of the individual points (Figure <figr fid="F8">8d</figr>) or, alternatively, to plot each single point using a color coding that represents the density at its position (Figure <figr fid="F8">8c</figr>). We prefer false color coding to the commonly used contour plots (Figure <figr fid="F8">8b</figr>) because we find it more intuitive. By further augmenting false color density plots with outlying points, one can also visualize the data in sparse regions of the plot. We compute densities using a kernel density estimate.</p>
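               <p>The idea of coloring each point by the local density can be sketched with a naive two-dimensional Gaussian kernel density estimate in plain Python. This is an illustration under stated assumptions, not the implementation used for the figures: the function names and the single shared bandwidth are hypothetical, and the quadratic run time makes this version suitable only for small examples (binned estimators are preferable for tens of thousands of cells).</p>

```python
import math


def kde_at_points(xs, ys, bandwidth):
    """Evaluate a 2D Gaussian kernel density estimate at every data point."""
    n = len(xs)
    norm = 1.0 / (n * 2.0 * math.pi * bandwidth ** 2)
    dens = []
    for xi, yi in zip(xs, ys):
        total = 0.0
        for xj, yj in zip(xs, ys):
            d2 = (xi - xj) ** 2 + (yi - yj) ** 2
            total += math.exp(-0.5 * d2 / bandwidth ** 2)
        dens.append(norm * total)
    return dens


def density_ranks(dens):
    """Rescale densities to [0, 1] so they can index a color ramp."""
    low, high = min(dens), max(dens)
    span = high - low
    return [0.0 if span == 0 else (d - low) / span for d in dens]
```

               <p>The rescaled values can then be mapped to a false color palette, with the points of lowest density drawn individually so that sparse regions remain visible.</p>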
               <fig id="F8">
                  <title>
                     <p>Figure 8</p>
                  </title>
                  <caption>
                     <p>Options to create plots with high point densities</p>
                  </caption>
                  <text>
                     <p>Options to create plots with high point densities. <b>(a) </b>Almost no features of the data distribution are visible in the simple scatter plot. <b>(b) </b>The contour plot reveals the bimodality of the data. <b>(c) </b>Coloring of points according to point density and <b>(d) </b>density map with additional points in sparse regions.</p>
                  </text>
                  <graphic file="gb-2006-7-8-r77-8"/>
               </fig>
            </sec>
            <sec>
               <st>
                  <p>Visualization at the level of microtiter plates</p>
               </st>
               <p>Most high-throughput applications in cell biology are carried out on microtiter plates, which come in different formats, usually as a rectangular arrangement of 24, 96, 384, or 1536 wells. Each well may contain cells that have been treated in a different manner. An intuitive approach for visualization is to use the familiar spatial layout of the plate. Figure <figr fid="F9">9a</figr> shows an example of what we call a plate plot for a 96-well plate. It indicates the number of cells identified in each well. The consistently low number of cells on the edges of the plate suggests a handling problem, and subsequent analysis steps are possibly affected by this artifact. Other quantities of interest often include the average fluorescence of each well, for example to monitor expression efficiency or to detect artifactual shifts in the response.</p>
               <fig id="F9">
                  <title>
                     <p>Figure 9</p>
                  </title>
                  <caption>
                     <p>Plate plots show several aspects of the data in a format resembling a microtiter plate</p>
                  </caption>
                  <text>
                     <p>Plate plots show several aspects of the data in a format resembling a microtiter plate. This is useful for detecting spatial effects and to present concisely the data belonging to one experiment. <b>(a) </b>Quantitative values: number of cells in the well. The consistently lower number of cells at the edges of the plate indicates problems during cultivation. <b>(b) </b>Qualitative values: activators (red) and inhibitors (blue) of the process of interest. Wells that did not pass quality requirements are crossed out and wells containing cells treated with controls are indicated by capital letters. Cells in the first four rows of the plate were transfected with amino-terminally tagged expression constructs, and rows five to eight with carboxyl-terminally tagged constructs. <b>(c) </b>Comparison of results from four replicate plates. Each slice contains data from one replicate. Reproducibility between replicates is very high.</p>
                  </text>
                  <graphic file="gb-2006-7-8-r77-9"/>
               </fig>
               <p>Plate plots can also be used to present qualitative variables. Figure <figr fid="F9">9b</figr> shows the negative log transformed odds ratios from the statistical analysis of a 96-well plate from a cell proliferation assay. Negative values indicate inhibition of cell proliferation and are colored in blue, whereas positive values correspond to activation as indicated in red. The attention of the experimenter is immediately drawn to the few interesting wells and spatial regularities are easily spotted. In this example, we can compare the upper and lower halves of the plate; the top half contains cells transfected with carboxyl-terminally tagged constructs and the bottom half contains cells transfected with amino-terminally tagged constructs of the same genes. Additional information is added to the plot by using further formatting options, for instance crossing out of wells discarded from analysis or plotting additional symbols on wells with controls.</p>
               <p>The amount of information included in a plate plot can be extended further by decorating it with tool tips and hyperlinks. When viewed in a browser, a tool tip is a short textual annotation, for example a gene name, that is displayed when the mouse pointer moves over a plot element. A hyperlink can be used to display more detailed information, even a graphic, in another browser window or frame. For example, underlying each value that is displayed in a plate plot such as Figure <figr fid="F9">9b</figr> is a complex statistical analysis, the details of which can be displayed on demand by hyperlinking them to the corresponding well icons in the plate plot. The reader is directed to the online complement <abbrgrp><abbr bid="B32">32</abbr></abbrgrp> for an interactive example. Using plate plots in this way provides a powerful organizational structure for drill-down facilities because potentially interesting candidates are easily identified on a plate and the range of detailed information enables the experimenter to audit steps of the analysis procedure.</p>
            </sec>
            <sec>
               <st>
                  <p>Gene-centered visualization</p>
               </st>
               <p>Because experiments are done in replicates, another level of visualization is needed to compare multiple measurements of the same gene over several plates. For a limited number of replicates the plate plot concept can be utilized. Besides colored circles, as in Figure <figr fid="F9">9</figr> panels a and b, its implementation allows us to plot arbitrary graphs at each well position. In Figure <figr fid="F9">9c</figr> we use segmented charts to display the results from four replicate experiments (we call this a 'pizza plot'). For more extensive datasets, Figure <figr fid="F10">10</figr> shows how hyperlinked box plots can be used to display multiple relevant aspects of the data. In this example they allow exploration of the effect of the orientation of the carboxyl-terminal or amino-terminal YFP fusion in the expression vectors.</p>
               <fig id="F10">
                  <title>
                     <p>Figure 10</p>
                  </title>
                  <caption>
                     <p>Interactive box plot of effect sizes from replicate experiments for a 96-well plate</p>
                  </caption>
                  <text>
                     <p>Interactive box plot of effect sizes from replicate experiments for a 96-well plate. Proteins showing consistently high or low effect sizes can easily be identified. By clicking on the individual boxes in the upper panel, a drill-down to the underlying data is provided in the lower panel, which shows the individual measurement values for both fluorescence tags as vertical bars along the x-axis. In this example, only the expression of the amino-terminally tagged protein results in significantly elevated effect sizes.</p>
                  </text>
                  <graphic file="gb-2006-7-8-r77-10"/>
               </fig>
            </sec>
         </sec>
         <sec>
            <st>
               <p>Application</p>
            </st>
            <p>We applied our method to the dataset introduced in the section Materials and methods (below) and verified the effects of positive and negative control genes of known function for each of the three assays with high specificity (Figure <figr fid="F11">11</figr>), thus validating the approach. The positive controls for the apoptosis assay were vectors expressing CIDE3 (cell-death-inducing DFF45-like effector 3) and the Fas receptor, and the negative controls were vectors expressing cyclin-dependent kinase and YFP. Positive and negative controls for the proliferation assay were vectors expressing cyclin A and YFP, respectively. In the MAPK assay, overexpression of DUSP10 was used as a positive control, and overexpression of YFP was used as a negative control. A total of 273 open reading frames (ORFs) encoding proteins of unknown function were selected based on cancer-associated alterations in their respective mRNA transcription. These ORFs were cloned in 546 amino-terminally as well as carboxyl-terminally fused expression constructs and were subsequently screened in the three assays. Eleven inhibitors and two activators of ERK phosphorylation were identified in the MAPK assay. The proliferation screen revealed four activators and five inhibitors. Eleven activators with significant effect on programmed cell death were identified in the apoptosis screen. For further details on these proteins, see Additional data file 1. The complete dataset is freely available from our web server <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>.</p>
            <fig id="F11">
               <title>
                  <p>Figure 11</p>
               </title>
               <caption>
                  <p>Separation of positive and negative controls</p>
               </caption>
               <text>
                  <p>Separation of positive and negative controls. Top panels: effect sizes of positive and negative controls (y-axis) for individual plates (x-axis). Bottom panels: density plots of the joint effect sizes for controls across all plates. <b>(a) </b>Controls for the apoptosis assay are CIDE3 (positive) and CDK (negative). <b>(b) </b>Controls for the proliferation assay are cyclin A (positive) and YFP (negative). <b>(c) </b>Controls for the MAPK assay are DUSP10 (positive) and YFP (negative). The measured effect sizes for positive and negative controls separate well. CDK, cyclin-dependent kinase; DUSP, dual specificity protein phosphatase; MAPK, mitogen-activated protein kinase; YFP, yellow fluorescent protein.</p>
               </text>
               <graphic file="gb-2006-7-8-r77-11"/>
            </fig>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>The increasing application of high-throughput technologies in cell biology has opened the way for systematic studies to be carried out on a large scale. This will allow us to gain an understanding of complex systems such as cellular pathways, because of the ability to measure the large number of parameters needed to model and reconstruct such systems (for instance, by combinatorial perturbations or time course experiments). However, the main prerequisite is a uniform, quantitative and comparable analysis of the raw data in order to integrate efficiently the information collected. Analyzing and managing the vast amount of data generated in these studies initially seems to be a daunting task.</p>
         <p>Here, we show the complete work flow from raw flow cytometry data to a list of genes that are components of or interact with the cellular process of interest. Procedures (methodologic recommendations as well as software) for data pre-processing are presented that can be used to deal with typical sources of systematic variation. We stress the importance of monitoring crucial steps during analysis and show a range of visualization tools for quality control. Techniques are suggested to assess the data on different levels and to present results in a concise and meaningful way. By applying statistical methods, we are able to identify interesting phenotypes based on a set of objective criteria rather than relying on manual selections. Because data are available for each cell of a cell population, we are able to extract several kinds of information. Stratified statistical tests and models allow us to combine results from replicate experiments, further increasing precision.</p>
         <p>To select genes of interest we consider two parameters, a threshold for the <it>P </it>value as well as one for the effect size. It is important to note that statistical significance and effect size are independent quantities, and that we must impose conditions on both of them if we are to obtain relevant results. In our screen the main focus lies on identifying candidates out of a pool of functionally unknown genes for further, in-depth analyses; thus, specificity is given preference over sensitivity, which is reflected in a rather conservative selection of threshold values.</p>
         <p>Some of the methods described here are specific to flow cytometry measurements, but most of the visualization should also be applicable to data from other sources. Here we have only considered two simple models: binary and continuous responses. However, cell-based assays can be designed to assess almost any cellular process, and as the complexity of the observed phenotypes increases, so does that of the necessary statistical models. Nevertheless, there will always be a need to summarize and simplify data to a form that is amenable to visual inspection and that allows for drill-down to more detailed aspects. In addition to specified analyses, we also wish to provide a framework that is easily adaptable and extendable to more complex assays and phenotypes.</p>
         <p>All functionality is implemented using the statistical programming language R and is available as the software package prada through the open source Bioconductor project <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>.</p>
      </sec>
      <sec>
         <st>
            <p>Materials and methods</p>
         </st>
         <p>A total of 273 ORFs encoding proteins of unknown function were selected based on cancer-associated alterations in their respective mRNA transcription <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>. HEK 293T cells were transfected with expression constructs of the respective genes of interest fused to YFP under the control of a cytomegalovirus promoter <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. The amino-terminal or carboxyl-terminal fluorescence tags allowed us to monitor the level of expression along with the detection of induced effects. Cells were fixed 48 hours (MAPK and DNA replication assays) or 72 hours (apoptosis assay) after transfection and stained intracellularly with specific antibodies. Different antibodies were used for the different assays, each specifically measuring the phenotype of interest. In the case of cell proliferation, the antibody detected the incorporation of the thymidine analog BrdU into the replicated DNA. An antibody specific for the activated form of the caspase-3 apoptosis regulator was employed in the apoptosis assay; a phospho-specific antibody detecting phosphorylated ERK2 was used to measure activation of MAPK signaling. The same secondary antibody coupled to allophycocyanin (APC) was used for immunostaining in all three assays. Flow cytometry data were acquired using an automated FACS instrument (BD FACS Calibur, Becton Dickinson Biosciences, 2350 Qume Drive, San Jose, CA, USA).</p>
      </sec>
      <sec>
         <st>
            <p>Additional data file</p>
         </st>
         <p>The following additional data are included with the online version of this article: The vignette of the accompanying R data package containing code samples and a more detailed description of the individual computational analysis steps, as well as tables of the candidates from our dataset identified in the three assays (Additional data file <supplr sid="S1">1</supplr>).</p>
         <suppl id="S1">
            <title>
               <p>Additional data file 1</p>
            </title>
            <caption>
               <p>Sample analysis of cell-based screens</p>
            </caption>
            <text>
               <p>The vignette of the accompanying R data package containing code samples and a more detailed description of the individual computational analysis steps, as well as tables of the candidates from our dataset identified in the three assays.</p>
            </text>
            <file name="gb-2006-7-8-r77-S1.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We thank Sarah Dyer for critical reading of the manuscript. This work was supported by the Bundesministerium f&#252;r Bildung und Forschung (BMBF) grant 01GR0420 (National Genome Research Network), the European Commission Programme '6th Framework', Marie Curie Host Fellowship, contract number MEST-CT-2004-513973, and a PhD fellowship of the German Cancer Research Center (DKFZ).</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Systematic genome-wide screens of gene function.</p>
            </title>
            <aug>
               <au>
                  <snm>Carpenter</snm>
                  <fnm>AE</fnm>
               </au>
               <au>
                  <snm>Sabatini</snm>
                  <fnm>DM</fnm>
               </au>
            </aug>
            <source>Nat Rev Genet</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <fpage>11</fpage>
            <lpage>22</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nrg1248</pubid>
                  <pubid idtype="pmpid" link="fulltext">14708012</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Building mammalian signalling pathways with RNAi screens.</p>
            </title>
            <aug>
               <au>
                  <snm>Moffat</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Sabatini</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Nat Rev Mol Cell Biol</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <fpage>177</fpage>
            <lpage>187</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nrm1860</pubid>
                  <pubid idtype="pmpid" link="fulltext">16496020</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>From ORFeome to biology: a functional genomics pipeline.</p>
            </title>
            <aug>
               <au>
                  <snm>Wiemann</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Arlt</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Huber</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Wellenreuther</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Schleeger</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Mehrle</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Bechtel</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Sauermann</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Korf</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Pepperkok</snm>
                  <fnm>R</fnm>
               </au>
               <etal/>
            </aug>
            <source>Genome Res</source>
            <pubdate>2004</pubdate>
            <volume>14</volume>
            <fpage>2136</fpage>
            <lpage>2144</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">528930</pubid>
                  <pubid idtype="pmpid" link="fulltext">15489336</pubid>
                  <pubid idtype="doi">10.1101/gr.2576704</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Generation and initial analysis of more than 15,000 full-length human and mouse cDNA sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Strausberg</snm>
                  <fnm>RL</fnm>
               </au>
               <au>
                  <snm>Feingold</snm>
                  <fnm>EA</fnm>
               </au>
               <au>
                  <snm>Grouse</snm>
                  <fnm>LH</fnm>
               </au>
               <au>
                  <snm>Derge</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Klausner</snm>
                  <fnm>RD</fnm>
               </au>
               <au>
                  <snm>Collins</snm>
                  <fnm>FS</fnm>
               </au>
               <au>
                  <snm>Wagner</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Shenmen</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Schuler</snm>
                  <fnm>GD</fnm>
               </au>
               <au>
                  <snm>Altschul</snm>
                  <fnm>SF</fnm>
               </au>
               <etal/>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2002</pubdate>
            <volume>99</volume>
            <fpage>16899</fpage>
            <lpage>16903</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">139241</pubid>
                  <pubid idtype="pmpid" link="fulltext">12477932</pubid>
                  <pubid idtype="doi">10.1073/pnas.242603899</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Complete sequencing and characterization of 21,243 full-length human cDNAs.</p>
            </title>
            <aug>
               <au>
                  <snm>Ota</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Nishikawa</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Otsuki</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Sugiyama</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Irie</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Wakamatsu</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Hayashi</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Sato</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Nagai</snm>
                  <fnm>K</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2004</pubdate>
            <volume>36</volume>
            <fpage>40</fpage>
            <lpage>45</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1285</pubid>
                  <pubid idtype="pmpid" link="fulltext">14702039</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>A <it>Drosophila </it>full-length cDNA resource.</p>
            </title>
            <aug>
               <au>
                  <snm>Stapleton</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Carlson</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Brokstein</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Yu</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Champe</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>George</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Guarin</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Kronmiller</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Pacleb</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Park</snm>
                  <fnm>S</fnm>
               </au>
               <etal/>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <fpage>RESEARCH0080</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">151182</pubid>
                  <pubid idtype="pmpid" link="fulltext">12537569</pubid>
                  <pubid idtype="doi">10.1186/gb-2002-3-12-research0080</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs.</p>
            </title>
            <aug>
               <au>
                  <snm>Okazaki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Furuno</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kasukawa</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Adachi</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bono</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Kondo</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Nikaido</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Osato</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Saito</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>H</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2002</pubdate>
            <volume>420</volume>
            <fpage>563</fpage>
            <lpage>573</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature01266</pubid>
                  <pubid idtype="pmpid" link="fulltext">12466851</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Functional profiling of the <it>Saccharomyces cerevisiae </it>genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Giaever</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Chu</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Ni</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Connelly</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Riles</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Veronneau</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Dow</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Lucau-Danila</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Anderson</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Andre</snm>
                  <fnm>B</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2002</pubdate>
            <volume>418</volume>
            <fpage>387</fpage>
            <lpage>391</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature00935</pubid>
                  <pubid idtype="pmpid" link="fulltext">12140549</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Systematic functional analysis of the <it>Caenorhabditis elegans </it>genome using RNAi.</p>
            </title>
            <aug>
               <au>
                  <snm>Kamath</snm>
                  <fnm>RS</fnm>
               </au>
               <au>
                  <snm>Fraser</snm>
                  <fnm>AG</fnm>
               </au>
               <au>
                  <snm>Dong</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Poulin</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Durbin</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Gotta</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kanapin</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Bot</snm>
                  <fnm>NL</fnm>
               </au>
               <au>
                  <snm>Moreno</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Sohrmann</snm>
                  <fnm>M</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2003</pubdate>
            <volume>421</volume>
            <fpage>231</fpage>
            <lpage>237</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature01278</pubid>
                  <pubid idtype="pmpid" link="fulltext">12529635</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Genome-wide RNAi analysis of growth and viability in <it>Drosophila </it>cells.</p>
            </title>
            <aug>
               <au>
                  <snm>Boutros</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kiger</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Armknecht</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kerr</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Hild</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Koch</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Haas</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <cnm>Heidelberg Fly Array Consortium</cnm>
               </au>
               <au>
                  <snm>Paro</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Perrimon</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2004</pubdate>
            <volume>303</volume>
            <fpage>832</fpage>
            <lpage>835</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1091266</pubid>
                  <pubid idtype="pmpid" link="fulltext">14764878</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>A resource for large-scale RNA-interference-based screens in mammals.</p>
            </title>
            <aug>
               <au>
                  <snm>Paddison</snm>
                  <fnm>PJ</fnm>
               </au>
               <au>
                  <snm>Silva</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Conklin</snm>
                  <fnm>DS</fnm>
               </au>
               <au>
                  <snm>Schlabach</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Aruleba</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Balija</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>O'Shaughnessy</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Gnoj</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Scobie</snm>
                  <fnm>K</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2004</pubdate>
            <volume>428</volume>
            <fpage>427</fpage>
            <lpage>431</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature02370</pubid>
                  <pubid idtype="pmpid" link="fulltext">15042091</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>A large-scale RNAi screen in human cells identifies new components of the p53 pathway.</p>
            </title>
            <aug>
               <au>
                  <snm>Berns</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Hijmans</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Mullenders</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Brummelkamp</snm>
                  <fnm>TR</fnm>
               </au>
               <au>
                  <snm>Velds</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Heimerikx</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kerkhoven</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Madiredjo</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Nijkamp</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Weigelt</snm>
                  <fnm>B</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2004</pubdate>
            <volume>428</volume>
            <fpage>431</fpage>
            <lpage>437</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature02371</pubid>
                  <pubid idtype="pmpid" link="fulltext">15042092</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Small molecule inhibitor of mitotic spindle bipolarity identified in a phenotype-based screen.</p>
            </title>
            <aug>
               <au>
                  <snm>Mayer</snm>
                  <fnm>TU</fnm>
               </au>
               <au>
                  <snm>Kapoor</snm>
                  <fnm>TM</fnm>
               </au>
               <au>
                  <snm>Haggarty</snm>
                  <fnm>SJ</fnm>
               </au>
               <au>
                  <snm>King</snm>
                  <fnm>RW</fnm>
               </au>
               <au>
                  <snm>Schreiber</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Mitchison</snm>
                  <fnm>TJ</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1999</pubdate>
            <volume>286</volume>
            <fpage>971</fpage>
            <lpage>974</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.286.5441.971</pubid>
                  <pubid idtype="pmpid" link="fulltext">10542155</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Functional profiling: from microarrays via cell-based assays to novel tumor relevant modulators of the cell cycle.</p>
            </title>
            <aug>
               <au>
                  <snm>Arlt</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Huber</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Liebel</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Schmidt</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Majety</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sauermann</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Rosenfelder</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Bechtel</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Mehrle</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Hahne</snm>
                  <fnm>F</fnm>
               </au>
               <etal/>
            </aug>
            <source>Cancer Res</source>
            <pubdate>2005</pubdate>
            <volume>65</volume>
            <fpage>7733</fpage>
            <lpage>7742</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1158/0008-5472.CAN-04-3544</pubid>
                  <pubid idtype="pmpid" link="fulltext">16140941</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Flow cytometry smaller and better.</p>
            </title>
            <aug>
               <au>
                  <snm>Bonetta</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Nat Methods</source>
            <pubdate>2005</pubdate>
            <volume>2</volume>
            <fpage>785</fpage>
            <lpage>795</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1038/nmeth1005-785</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <aug>
               <au>
                  <cnm>Tree Star Inc</cnm>
               </au>
            </aug>
            <source>FlowJo</source>
            <publisher>Ashland, OR: Tree Star Inc</publisher>
            <pubdate>2006</pubdate>
         </bibl>
         <bibl id="B17">
            <aug>
               <au>
                  <cnm>BD Biosciences</cnm>
               </au>
            </aug>
            <source>CellQuest Pro</source>
            <publisher>San Jose, CA: BD Biosciences</publisher>
            <pubdate>2005</pubdate>
         </bibl>
         <bibl id="B18">
            <aug>
               <au>
                  <cnm>De Novo Software</cnm>
               </au>
            </aug>
            <source>FCS Express</source>
            <publisher>Thornhill, Ontario, Canada: De Novo Software</publisher>
            <pubdate>2006</pubdate>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Bioconductor</p>
            </title>
            <url>http://www.bioconductor.org</url>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Rapid caspase-3 activation during apoptosis revealed using fluorescence-resonance energy transfer.</p>
            </title>
            <aug>
               <au>
                  <snm>Tyas</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Brophy</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Pope</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Rivett</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Tavare</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>EMBO Rep</source>
            <pubdate>2000</pubdate>
            <volume>1</volume>
            <fpage>266</fpage>
            <lpage>270</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1083724</pubid>
                  <pubid idtype="pmpid" link="fulltext">11256610</pubid>
                  <pubid idtype="doi">10.1093/embo-reports/kvd050</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Meaningful relationships: the regulation of the Ras/Raf/MEK/ERK pathway by protein interactions.</p>
            </title>
            <aug>
               <au>
                  <snm>Kolch</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>Biochem J</source>
            <pubdate>2000</pubdate>
            <volume>351</volume>
            <fpage>289</fpage>
            <lpage>305</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1221363</pubid>
                  <pubid idtype="pmpid" link="fulltext">11023813</pubid>
                  <pubid idtype="doi">10.1042/0264-6021:3510289</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Bayesian inference on biopolymer models.</p>
            </title>
            <aug>
               <au>
                  <snm>Liu</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>CE</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>1999</pubdate>
            <volume>15</volume>
            <fpage>38</fpage>
            <lpage>52</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/15.1.38</pubid>
                  <pubid idtype="pmpid">10068691</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>The logic of inductive inference.</p>
            </title>
            <aug>
               <au>
                  <snm>Fisher</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>J Roy Stat Soc</source>
            <pubdate>1935</pubdate>
            <volume>98</volume>
            <fpage>39</fpage>
            <lpage>54</lpage>
            <xrefbib>
               <pubid idtype="doi">10.2307/2342435</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <aug>
               <au>
                  <snm>Loader</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Local Regression and Likelihood</source>
            <publisher>New York, NY: Springer</publisher>
            <pubdate>1999</pubdate>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Phospholipase C delta-4 overexpression upregulates ErbB1/2 expression, Erk signaling pathway, and proliferation in MCF-7 cells.</p>
            </title>
            <aug>
               <au>
                  <snm>Leung</snm>
                  <fnm>DW</fnm>
               </au>
               <au>
                  <snm>Tompkins</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Brewer</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Ball</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Coon</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Morris</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Waggoner</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Singer</snm>
                  <fnm>JW</fnm>
               </au>
            </aug>
            <source>Mol Cancer</source>
            <pubdate>2004</pubdate>
            <volume>3</volume>
            <fpage>15</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">420486</pubid>
                  <pubid idtype="pmpid" link="fulltext">15140260</pubid>
                  <pubid idtype="doi">10.1186/1476-4598-3-15</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>A comparison of statistical methods for meta-analysis.</p>
            </title>
            <aug>
               <au>
                  <snm>Brockwell</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Gordon</snm>
                  <fnm>IR</fnm>
               </au>
            </aug>
            <source>Stat Med</source>
            <pubdate>2001</pubdate>
            <volume>20</volume>
            <fpage>825</fpage>
            <lpage>840</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/sim.650</pubid>
                  <pubid idtype="pmpid">11252006</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <aug>
               <au>
                  <snm>Agresti</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Categorical Data Analysis</source>
            <publisher>Hoboken, NJ: Wiley</publisher>
            <edition>2</edition>
            <pubdate>2002</pubdate>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Strong control, conservative point estimation, and simultaneous conservative consistency of false discovery rates: A unified approach.</p>
            </title>
            <aug>
               <au>
                  <snm>Storey</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Taylor</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Siegmund</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>J Roy Stat Soc Ser B</source>
            <pubdate>2004</pubdate>
            <volume>66</volume>
            <fpage>187</fpage>
            <lpage>205</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1111/j.1467-9868.2004.00439.x</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Multiple testing procedures: the multtest package and applications to genomics.</p>
            </title>
            <aug>
               <au>
                  <snm>Pollard</snm>
                  <fnm>KS</fnm>
               </au>
               <au>
                  <snm>Dudoit</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>van der Laan</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>Bioinformatics and Computational Biology Solutions Using R and Bioconductor</source>
            <publisher>New York, NY: Springer</publisher>
            <edition>1</edition>
            <pubdate>2005</pubdate>
         </bibl>
         <bibl id="B30">
            <title>
               <p>A statistical paradox.</p>
            </title>
            <aug>
               <au>
                  <snm>Lindley</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Biometrika</source>
            <pubdate>1957</pubdate>
            <volume>44</volume>
            <fpage>187</fpage>
            <lpage>192</lpage>
            <xrefbib>
               <pubid idtype="doi">10.2307/2333251</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Enhancing scatterplots with smoothed densities.</p>
            </title>
            <aug>
               <au>
                  <snm>Eilers</snm>
                  <fnm>PHC</fnm>
               </au>
               <au>
                  <snm>Goeman</snm>
                  <fnm>JJ</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>20</volume>
            <fpage>623</fpage>
            <lpage>628</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btg454</pubid>
                  <pubid idtype="pmpid" link="fulltext">15033868</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Statistical methods and software for the analysis of high throughput reverse genetic assays using flow cytometry readouts: web complement</p>
            </title>
            <aug>
               <au>
                  <snm>Hahne</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Arlt</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Sauermann</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Majety</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Wiemann</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Poustka</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Huber</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <url>http://www.dkfz.de/mga2/GBcomplement</url>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Gene expression in kidney cancer is associated with cytogenetic abnormalities, metastasis formation, and patient survival.</p>
            </title>
            <aug>
               <au>
                  <snm>S&#252;ltmann</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>von Heydebreck</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Huber</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Kuner</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Buness</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Vogt</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Gunawan</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Vingron</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>F&#252;zesi</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Poustka</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Clin Cancer Res</source>
            <pubdate>2005</pubdate>
            <volume>11</volume>
            <fpage>646</fpage>
            <lpage>655</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15701852</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Systematic subcellular localization of novel proteins identified by large-scale cDNA sequencing.</p>
            </title>
            <aug>
               <au>
                  <snm>Simpson</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Wellenreuther</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Poustka</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Pepperkok</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Wiemann</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>EMBO Rep</source>
            <pubdate>2000</pubdate>
            <volume>1</volume>
            <fpage>287</fpage>
            <lpage>292</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1083732</pubid>
                  <pubid idtype="pmpid" link="fulltext">11256614</pubid>
                  <pubid idtype="doi">10.1093/embo-reports/kvd058</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
