<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
	<ui>gb-2005-6-8-r65</ui>
	<ji>GBJ</ji>
	<fm>
		<dochead>Research</dochead>
		<bibl>
			<title>
				<p>The molecular portrait of <it>in vitro </it>growth by meta-analysis of gene-expression profiles</p>
			</title>
			<aug>
				<au id="A1" ca="yes">
					<snm>Sandberg</snm>
					<fnm>Rickard</fnm>
					<insr iid="I1"/>
					<insr iid="I2"/>
					<email>Rickard.Sandberg@mtc.ki.se</email>
				</au>
				<au id="A2">
					<snm>Ernberg</snm>
					<fnm>Ingemar</fnm>
					<insr iid="I1"/>
					<email>Ingemar.Ernberg@mtc.ki.se</email>
				</au>
			</aug>
			<insg>
				<ins id="I1">
					<p>Microbiology and Tumor Biology Center (MTC), Karolinska Institutet, S-171 77 Stockholm, Sweden</p>
				</ins>
				<ins id="I2">
					<p>Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA</p>
				</ins>
			</insg>
			<source>Genome Biology</source>
			<issn>1465-6906</issn>
			<pubdate>2005</pubdate>
			<volume>6</volume>
			<issue>8</issue>
			<fpage>R65</fpage>
			<url>http://genomebiology.com/2005/6/8/R65</url>
			<xrefbib>
				<pubidlist><pubid idtype="pmpid">16086847</pubid><pubid idtype="doi">10.1186/gb-2005-6-8-r65</pubid>
				</pubidlist></xrefbib>
		</bibl>
		<history>
			<rec>
				<date>
					<day>27</day>
					<month>1</month>
					<year>2005</year>
				</date>
			</rec>
			<revrec>
				<date>
					<day>21</day>
					<month>4</month>
					<year>2005</year>
				</date>
			</revrec>
			<acc>
				<date>
					<day>21</day>
					<month>6</month>
					<year>2005</year>
				</date>
			</acc>
			<pub>
				<date>
					<day>27</day>
					<month>7</month>
					<year>2005</year>
				</date>
			</pub>
		</history>
		<cpyrt>
			<year>2005</year>
			<collab>Sandberg and Ernberg; licensee BioMed Central Ltd.</collab>
			<note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
		</cpyrt>
		<shorttitle>
			<p>Expression profiling of <it>in vitro </it>growth</p>
		</shorttitle>
		<shortabs>
			<p>A meta-analysis comparing 60 tumor cell lines, 135 normal tissue samples and 176 tumor tissue samples in humans shows significant differential expression between cell lines and tissues of around 30% of the 7,000 genes analyzed.</p>
		</shortabs>
		<abs>
			<sec>
				<st>
					<p>Abstract</p>
				</st>
				<sec>
					<st>
						<p>Background</p>
					</st>
					<p>Cell lines as model systems of tumors and tissues are essential in molecular biology, although they only approximate the properties of <it>in vivo </it>cells in tissues. Cell lines have been selected under <it>in vitro </it>conditions for a long period of time, affecting many specific cellular pathways and processes.</p>
				</sec>
				<sec>
					<st>
						<p>Results</p>
					</st>
					<p>To identify the transcriptional changes caused by long-term <it>in vitro </it>selection, we performed a gene-expression meta-analysis and compared 60 tumor cell lines (of nine tissue origins) to 135 human tissue and 176 tumor tissue samples. Using significance analysis of microarrays we demonstrated that cell lines showed statistically significant differential expression of approximately 30% of the approximately 7,000 genes investigated compared to the tissues. Most of the differences were associated with the higher proliferation rate and the disrupted tissue organization <it>in vitro</it>. Thus, genes involved in cell-cycle progression, macromolecule processing and turnover, and energy metabolism were upregulated in cell lines, whereas cell adhesion molecules and membrane signaling proteins were downregulated.</p>
				</sec>
				<sec>
					<st>
						<p>Conclusion</p>
					</st>
					<p>Detailed molecular understanding of how cells adapt to the <it>in vitro </it>environment is important, as it will both increase our understanding of tissue organization and result in a refined molecular portrait of proliferation. It will further indicate when to use immortalized cell lines, or when it is necessary to instead use three-dimensional cultures, primary cell cultures or tissue biopsies.</p>
				</sec>
			</sec>
		</abs>
	</fm>
	<meta>
		<classifications>
			<classification type="BMC" subtype="man_spc_id" id="30010003">Cancer</classification>
			<classification type="BMC" subtype="man_spc_id" id="30010010">Genome studies</classification>
		</classifications>
	</meta>
	<bdy>
		<sec>
			<st>
				<p>Background</p>
			</st>
			<p>How different are cells grown <it>in vitro </it>from cells that are part of a tissue? Human tissues and tumors are complex and heterogeneous as they are composed of different cell types that influence each other through paracrine signaling pathways and interactions with extracellular matrix (ECM). Cell lines, on the other hand, consist of more or less clonal cell populations that lack interactions with other cell types and interact with an artificial support such as plastic. Cell adaptation to <it>in vitro </it>microenvironments has probably involved recalibrations of many cellular pathways through genetic alterations <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>, transcriptional alterations <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>, different post-transcriptional regulation <abbrgrp><abbr bid="B3">3</abbr></abbrgrp> and changed signaling networks <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. Thus, the degree to which cell lines are representative of the specific cell types they were derived from varies <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr></abbrgrp>. Furthermore, among cell lines established for <it>in vitro </it>growth there is an overwhelming bias for tumor-derived cells. It has been very hard to establish non-transformed cells for long-term <it>in vitro </it>growth. Detailed comparisons of the genotypic and phenotypic characteristics of <it>in vitro </it>grown cells with a panel of normal and tumor tissues may reveal how cell lines have adapted to <it>in vitro </it>environments. Moreover, comparisons of cell lines with both tumors and the normal tissues they were derived from are needed to assess how well they represent their tissue of origin and which of their features may have been acquired <it>in vitro</it>.</p>
			<p>Analyses of mRNA expression levels using DNA microarrays have contributed to an increasingly detailed understanding of patterns of gene expression in different tissues <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp> and also of how <it>in vitro </it>selection and adaptation affect basic cellular processes. So far, these studies have focused on single cell types. Cell lines of colon <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>, breast <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>, lymphoma <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>, leukemia <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>, and lung origin <abbrgrp><abbr bid="B12">12</abbr></abbrgrp> have been compared to their corresponding <it>in vivo </it>malignancies. These studies have consistently demonstrated that different cell lines of the same tissue origin are more similar to each other than to the tumors they were derived from. These gene-expression studies have also repeatedly shown that genes associated with proliferation <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp> and ribosomal activity <abbrgrp><abbr bid="B9">9</abbr></abbrgrp> are upregulated in cell lines. However, no study so far has addressed whether the same genes are perturbed by the <it>in vitro </it>environment in cell lines derived from tumors of different tissue origins, that is, whether there may be an '<it>in vitro </it>expression profile'.</p>
			<p>Developing meta-analytical tools for comparing gene-expression data generated in different studies and laboratories is important. Some meta-analyses of gene-expression profiles of multiple tumors and normal tissues have been pursued, identifying common upregulated genes in neoplastic transformation and in relation to tumor differentiation status <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>. Moreover, a collection of gene-expression data from different tumor types has been used to identify upregulated or repressed modules of genes with coherent expression profiles in specific tumors <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. In both these studies, gene-expression data were gathered from multiple platforms and laboratories, although the data were analyzed independently (that is, for each dataset separately). In the first study, the expression levels in each array were normalized independently to unit length (a median expression of zero and a standard deviation of one) <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>. In the second study, the mean expression level across the samples in each dataset was subtracted from each gene <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. Subsequently, genes that were consistently up- or downregulated could be identified in comparisons within multiple datasets <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>.</p>
			<p>In this study, we describe a cross-site approach to quantitatively integrate gene-expression profiles from three laboratories <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr></abbrgrp> comprising 60 cell lines and 311 tissue samples. We integrated gene-expression data from cell lines derived from tumors of nine different tissue origins (NCI60 cell lines) with two large gene-expression datasets of human tissues and human tumors. All these studies used the same platform and array type (Affymetrix Hu6800). Using a meta-analysis we defined the transcriptional changes observed in all cell lines compared to both normal and tumor tissues, independent of tissue origin. The cell lines showed statistically significant differential expression of approximately 30% of the approximately 7,000 genes investigated. Among the upregulated genes we consistently found - not surprisingly - many genes involved in macromolecular turnover, cell-cycle progression, energy metabolism, and histone modifications. Adhesion molecules and membrane signaling proteins were enriched among the downregulated genes, a possible consequence of the disrupted tissue organization <it>in vitro</it>. The origin-independent transcriptional alterations defined in this study are probably the consequence of <it>in vitro </it>adaptation and selection. As such, our data will be important for improving our understanding of the biological consequences of <it>in vitro </it>growth and thus of how well cell lines correspond to <it>in vivo </it>tissues and tumors.</p>
		</sec>
		<sec>
			<st>
				<p>Results</p>
			</st>
			<sec>
				<st>
					<p>Normalization of gene-expression profiles from multiple sources</p>
				</st>
				<p>To study the expression signature of <it>in vitro </it>growth, we collected gene-expression profiles from 60 cancer cell lines <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>, 135 normal tissue samples <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr></abbrgrp> and 176 tumor tissue samples <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> generated using the same Affymetrix Hu6800 array platform (dataset I). The cell lines were derived from nine different tumor types, the normal tissue samples from 19 tissues and the tumor samples from 13 different tissues (see Materials and methods). As a control, we also used gene-expression data from an independent study, in which both cell lines and tissues were profiled within the same study <abbrgrp><abbr bid="B18">18</abbr></abbrgrp> using Affymetrix HGU95A arrays (dataset II). Dataset II was more limited, however, as 21 of the 25 cell-line samples were of lymphoid origin. Together, these two datasets (Table <tblr tid="T1">1</tblr>) were considered well suited for systematically evaluating how well cell lines in general approximate their tissues of origin and thus their validity as biological model systems.</p>
				<tbl id="T1">
					<title>
						<p>Table 1</p>
					</title>
					<caption>
						<p>Sources of gene-expression data</p>
					</caption>
					<tblbdy cols="6">
						<r>
							<c ca="left">
								<p>Source</p>
							</c>
							<c ca="center">
								<p>Number of cell lines</p>
							</c>
							<c ca="center">
								<p>Number of normal tissue samples</p>
							</c>
							<c ca="center">
								<p>Number of tumor samples</p>
							</c>
							<c ca="center">
								<p>Dataset</p>
							</c>
							<c ca="center">
								<p>Platform</p>
							</c>
						</r>
						<r>
							<c cspan="6">
								<hr/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>[15]</p>
							</c>
							<c ca="center">
								<p>60</p>
							</c>
							<c ca="center">
								<p>-</p>
							</c>
							<c ca="center">
								<p>-</p>
							</c>
							<c ca="center">
								<p>I</p>
							</c>
							<c ca="center">
								<p>Hu6800</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>[17]</p>
							</c>
							<c ca="center">
								<p>-</p>
							</c>
							<c ca="center">
								<p>59</p>
							</c>
							<c ca="center">
								<p>-</p>
							</c>
							<c ca="center">
								<p>I</p>
							</c>
							<c ca="center">
								<p>Hu6800</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>[16]</p>
							</c>
							<c ca="center">
								<p>-</p>
							</c>
							<c ca="center">
								<p>60</p>
							</c>
							<c ca="center">
								<p>189</p>
							</c>
							<c ca="center">
								<p>I</p>
							</c>
							<c ca="center">
								<p>Hu6800</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>[18]</p>
							</c>
							<c ca="center">
								<p>25</p>
							</c>
							<c ca="center">
								<p>65</p>
							</c>
							<c ca="center">
								<p>5</p>
							</c>
							<c ca="center">
								<p>II</p>
							</c>
							<c ca="center">
								<p>HGU95A</p>
							</c>
						</r>
					</tblbdy>
				</tbl>
				<p>It must be emphasized that comparing gene-expression data from different laboratories may introduce biases resulting from different experimental conditions and protocols. To quantitatively compare gene-expression profiles from different studies, we rescaled all samples using the global scaling algorithm (see Materials and methods). After the rescaling procedure, we computed the average correlation of each sample to all other samples to check whether any samples were of questionable quality. This analysis step served two purposes: first, to investigate how similar the gene-expression patterns of the biological replicates were; second, to verify that samples of the same tissues in the different datasets were more similar to each other than to other tissues. Overall, the average correlations between samples of different tissue origins were between 0.5 and 0.6. Certain samples, however, were found to have an average correlation to other samples as low as 0.15 (Figure <figr fid="F1">1a</figr>). These samples with low average correlation also had higher scaling factors (Figure <figr fid="F1">1b</figr>), indicating that they had lower signals on the chip. This could be a result of a less successful hybridization, and it is likely that our rescaling procedure worked less efficiently for these samples. Therefore, we removed the 28 samples with an average correlation of less than 0.34 (Figure <figr fid="F1">1</figr>). The removed samples were of diverse tissue origins, and the low average correlation observed for these samples was not an effect of being a single sample from a specific tissue. We used the gene-expression profiles of the same normal tissues that were present in two of the datasets <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr></abbrgrp> as an initial evaluation of the rescaling procedure.
The expression profiles from the same tissues should be more similar to each other than to samples from other tissues, independently of the laboratory in which the data were generated. We compared the correlation between the 59 normal samples from Hsiao <it>et al</it>. <abbrgrp><abbr bid="B17">17</abbr></abbrgrp> to the 91 normal samples from Ramaswamy <it>et al. </it><abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. The matrix of correlations is presented in Figure <figr fid="F2">2</figr>. Gene-expression profiles of the same tissues gathered in the two laboratories showed in general higher correlations, indicating that tissue-specific differences within each dataset were larger than a possible systematic difference between the two datasets. There were, however, high correlations between gene-expression profiles of hormone-related tissues (for example, breast, ovary and uterus) both within and between datasets.</p>
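				<p>The outlier screen described above can be illustrated with a minimal sketch. It assumes Pearson correlation (the text does not state which correlation measure was used), takes the 0.34 cutoff from the text, and uses hypothetical sample names and toy expression values rather than any of the study data.</p>

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long profiles (neither may be constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_correlations(profiles):
    """Average correlation of each sample to all other samples."""
    result = {}
    for name, profile in profiles.items():
        others = [pearson(profile, p) for o, p in profiles.items() if o != name]
        result[name] = sum(others) / len(others)
    return result


def filter_outliers(profiles, cutoff=0.34):
    """Keep only samples whose average correlation reaches the cutoff."""
    avg = average_correlations(profiles)
    return {name: p for name, p in profiles.items() if avg[name] >= cutoff}


# Hypothetical rescaled profiles: three concordant samples and one poor array.
toy_profiles = {
    "kidney_1": [1, 2, 3, 4, 5, 6],
    "kidney_2": [2, 3, 4, 5, 6, 8],
    "kidney_3": [1, 3, 3, 5, 5, 7],
    "failed_hyb": [5, 1, 4, 2, 3, 3],
}
kept = filter_outliers(toy_profiles)
print(sorted(kept))
```

				<p>In this toy example only the sample with the negative average correlation is dropped; with real data the cutoff would be read off a sorted plot such as Figure 1a.</p>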
				<fig id="F1">
					<title>
						<p>Figure 1</p>
					</title>
					<caption>
						<p>Identification of outlier samples by correlation analysis and scaling factors</p>
					</caption>
					<text>
						<p>Identification of outlier samples by correlation analysis and scaling factors. <b>(a) </b>Plot of the average correlation for each sample from pairwise comparisons to all other samples (<it>y</it>-axis). The samples were sorted according to their average correlation (<it>x</it>-axis). We used an average correlation of 0.34 as a cutoff (marked with a dashed line). <b>(b) </b>Comparison of the average correlation (<it>x</it>-axis) with the scaling factor used in the global scaling procedure (<it>y</it>-axis). Many of the samples with low average correlations had been rescaled using high scaling factors, indicating that they might have had poor hybridizations. Again, the dashed line displays the average correlation cutoff.</p>
					</text>
					<graphic file="gb-2005-6-8-r65-1"/>
				</fig>
				<fig id="F2">
					<title>
						<p>Figure 2</p>
					</title>
					<caption>
						<p>Correlation matrix between all normal samples from two studies</p>
					</caption>
					<text>
						<p>Correlation matrix between all normal samples from two studies. The gene-expression profiles of each normal tissue sample were compared to all other normal tissue samples from the other dataset by measuring the correlation across all genes. The normal samples from Hsiao <it>et al</it>. [17] are presented along the <it>y</it>-axis and samples from Ramaswamy <it>et al</it>. [16] along the <it>x</it>-axis. The correlation matrix displays each pairwise comparison and each entry is color-coded according to the scale bar to the right of matrix. Black rectangles highlight correlation values between the samples from the same tissues in the two different datasets.</p>
					</text>
					<graphic file="gb-2005-6-8-r65-2"/>
				</fig>
			</sec>
			<sec>
				<st>
					<p>Validation of the quantitative comparison across datasets</p>
				</st>
				<p>Singular value decomposition (SVD) has been successfully used to investigate the fundamental patterns in gene-expression data <abbrgrp><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr></abbrgrp>. We analyzed our merged gene-expression data (dataset I) using SVD to assess the fundamental patterns within the data, and in particular the similarities between the expression data from the different laboratories. We projected each sample into an SVD subspace by calculating the correlation between the expression profile of each array and each of the first two eigenarrays derived from the SVD (Figure <figr fid="F3">3a</figr>). Because the first two eigenarrays are associated with the two largest singular values <abbrgrp><abbr bid="B19">19</abbr><abbr bid="B21">21</abbr></abbrgrp>, this procedure captures the largest variability in the gene-expression data in a two-dimensional plot. Importantly, the gene-expression profiles of normal tissue samples from the two different studies overlapped after the SVD projection. Moreover, normal tissue and tumor tissue samples of CNS origin, from the two different laboratories, were in proximity to each other in SVD subspace (Figure <figr fid="F3">3a</figr>). Therefore, laboratory-dependent separation of the tissue samples was not observed. However, the cell lines were distinctly separated (Figure <figr fid="F3">3a</figr>). This could reflect either a technical artifact in the merging of only the gene-expression data of the cell lines, or that the cell lines have very different gene-expression profiles compared to tissues.</p>
				<fig id="F3">
					<title>
						<p>Figure 3</p>
					</title>
					<caption>
						<p>The gene-expression profiles of cell lines compared to normal and tumor tissues</p>
					</caption>
					<text>
						<p>The gene-expression profiles of cell lines compared to normal and tumor tissues. <b>(a) </b>Projection of each sample in dataset I into SVD space drawn by the correlation of each sample to SVD eigenarray 1 (<it>x</it>-axis) and 2 (<it>y</it>-axis). The normal tissue samples of CNS origin from two laboratories (green squares, Hsiao <it>et al</it>. [17]; black squares, Ramaswamy <it>et al</it>. [16]) were overlapping, as well as the tumor tissue samples (red squares, Ramaswamy <it>et al</it>. [16]). The cell lines were separated from tissue samples by the first SVD eigenarray. Samples of lymphoma and leukemia origin were also separated in the SVD analysis. <b>(b) </b>Projection of each sample in dataset II into the SVD space drawn by the correlation of each sample to SVD eigenarray 1 (<it>x</it>-axis) and 2 (<it>y</it>-axis). The cell lines (crosses) were separated from tissue samples. Whole blood samples were distinctly clustered close to the cell lines. <b>(c) </b>Other separation of normal samples. Significance analysis of microarrays (SAM) was used to identify differentially expressed genes between cell line and tissue samples in dataset I. The number of statistically significant genes (<it>x</it>-axis) is plotted against the median and 90th percentile of the FDR (<it>y</it>-axis), estimated from 1,000 permutations. <b>(d) </b>SAM analysis of cell line versus tissue samples in dataset II. Parameters were identical to those in (c). <b>(e) </b>Plot of the degree of differential expression between cell lines and tissues for each gene in dataset I (<it>x</it>-axis) versus dataset II (<it>y</it>-axis). The degree of differential expression was measured using the signal-to-noise metric [23].</p>
					</text>
					<graphic file="gb-2005-6-8-r65-3"/>
				</fig>
				<p>Therefore, we performed the same analysis on dataset II (the validation dataset), which comprises both cell lines and tissue samples profiled within the same study. Using the identical SVD procedure, cell lines were again separated from tissues in their correlation with the first two eigenarrays (Figure <figr fid="F3">3b</figr>). This excluded the possibility that the cell line versus tissue distinction in dataset I was a technical artifact. Moreover, the separation of cell lines from tissue samples was captured by the first eigenarray in both datasets, demonstrating that this difference was the largest in the gene-expression data. Hierarchical clustering of the gene-expression data in datasets I and II also consistently separated all cell lines from normal and tumor tissues (data not shown).</p>
			</sec>
			<sec>
				<st>
					<p>Identification of origin-independent transcriptional alterations <it>in vitro</it></p>
				</st>
				<p>We next sought to estimate the number of genes that were specifically up- or downregulated in cell lines and responsible for the distinct separation of cell lines from tissue samples. We used significance analysis of microarrays (SAM) <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> to identify the number of genes with statistically significant differential expression as a function of the false discovery rate (FDR). In dataset I, using conservative criteria, we identified 1,500 genes with an estimated FDR of zero, and 2,900 genes at an FDR of 1% (Figure <figr fid="F3">3c</figr>). For example, at an FDR of 1% only 29 false positives are estimated out of the 2,900 genes identified. In dataset II we identified 1,800 genes at an FDR of zero and 3,400 genes at an FDR of 1% (Figure <figr fid="F3">3d</figr>). In total, using an FDR of 1%, we identified 41% of the genes as differentially expressed between cell lines and tissues in dataset I and 29% in dataset II. To investigate the generality of our results, we investigated whether the same genes were identified as up- or downregulated in cell lines in datasets I and II despite the sample and platform differences. Of the 2,000 most differentially expressed genes in dataset I, we found corresponding probe sets for 1,476 of the genes on the HGU95A arrays (635 upregulated and 841 downregulated genes) using a recently published map <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>. We confirmed the upregulation of 399 genes (63% of the upregulated genes; <it>p </it>&lt; 4e-70, Fisher's exact test) and the downregulation of 176 genes (21% of the downregulated genes; <it>p </it>&lt; 1e-7, Fisher's exact test) in cell lines by intersecting them with the genes showing statistically significant differential expression in dataset II (FDR of 1%). The list of genes found to be differentially expressed in both datasets is found in Additional data file 1.
Second, we also compared the score of differential expression for all genes in both datasets (Figure <figr fid="F3">3e</figr>). A correlation coefficient of 0.33 between the degree of differential expression in datasets I and II was observed, even though the data were generated using two different Affymetrix array types and the sample origins were diverse. Again, this demonstrated that the results obtained by comparing the cell lines to normal and tumor tissues in dataset I were not due to technical artifacts.</p>
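				<p>The score of differential expression compared in Figure 3e is the signal-to-noise metric. A minimal sketch follows, assuming the commonly used definition (difference of the class means divided by the sum of the class standard deviations) and the sample standard deviation; the expression values are hypothetical and chosen only for illustration.</p>

```python
from statistics import mean, stdev


def signal_to_noise(cell_line_values, tissue_values):
    """(mean_a - mean_b) / (sd_a + sd_b); positive means higher in cell lines."""
    spread = stdev(cell_line_values) + stdev(tissue_values)
    return (mean(cell_line_values) - mean(tissue_values)) / spread


# Hypothetical log-scale expression values of one gene in the two classes.
cell_lines = [9.0, 10.0, 11.0]
tissues = [5.0, 6.0, 7.0]
score = signal_to_noise(cell_lines, tissues)
print(score)
```

				<p>Computing this score per gene in each dataset, and then correlating the two score vectors over the genes shared by both array types, gives a comparison of the kind plotted in Figure 3e.</p>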
			</sec>
			<sec>
				<st>
					<p>Classification of samples based upon the <it>in vitro </it>signature</p>
				</st>
				<p>To further validate that the gene-expression differences between cell lines and tissues identified in both datasets I and II (399 upregulated and 176 downregulated genes) represent true transcriptional alterations associated with long-term cultured cell lines, we evaluated the ability to classify samples on the basis of these genes (see Materials and methods). First, as a control, we classified each sample in datasets I and II as either 'cell line' or 'tissue'. The accuracy of the classification was 99% and 100%, respectively (Table <tblr tid="T2">2</tblr>). Second, we classified each sample in three additional datasets <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B12">12</abbr><abbr bid="B24">24</abbr></abbrgrp>, again with high accuracy (Table <tblr tid="T2">2</tblr>). Plots of the distributions of scores for each dataset can be found in Additional data file 2.</p>
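				<p>One simple way to turn such a signature into a per-sample score is sketched below. This is an illustrative assumption, not the classifier described in Materials and methods: the score is the mean standardized expression of the upregulated genes minus that of the downregulated genes, and the threshold of zero is hypothetical. The gene symbols are taken from Figure 4, but all expression values are invented for the example.</p>

```python
from statistics import mean


def signature_score(sample, up_genes, down_genes):
    """Mean expression of upregulated genes minus mean of downregulated genes."""
    up = mean(sample[g] for g in up_genes)
    down = mean(sample[g] for g in down_genes)
    return up - down


def classify(sample, up_genes, down_genes, threshold=0.0):
    """Label a sample by comparing its signature score to a threshold."""
    score = signature_score(sample, up_genes, down_genes)
    return "cell line" if score > threshold else "tissue"


# Gene symbols from Figure 4; the standardized values are hypothetical.
up_genes = ["MCM2", "MCM3"]
down_genes = ["CCND2", "APPBP2"]
cultured = {"MCM2": 1.2, "MCM3": 0.8, "CCND2": -0.9, "APPBP2": -1.1}
biopsy = {"MCM2": -0.5, "MCM3": -0.7, "CCND2": 0.6, "APPBP2": 0.2}

labels = [classify(s, up_genes, down_genes) for s in (cultured, biopsy)]
print(labels)
```

				<p>Accuracy in the sense of Table 2 would then be the fraction of samples whose predicted label matches the known sample type.</p>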
				<tbl id="T2">
					<title>
						<p>Table 2</p>
					</title>
					<caption>
						<p>Classification of cell lines and tissue samples across five datasets</p>
					</caption>
					<tblbdy cols="4">
						<r>
							<c ca="left">
								<p>Dataset reference</p>
							</c>
							<c ca="center">
								<p>Accuracy (%)</p>
							</c>
							<c ca="center">
								<p>Number of cell lines</p>
							</c>
							<c ca="center">
								<p>Number of tissue samples</p>
							</c>
						</r>
						<r>
							<c cspan="4">
								<hr/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Dataset I</p>
							</c>
							<c ca="center">
								<p>99*</p>
							</c>
							<c ca="center">
								<p>60</p>
							</c>
							<c ca="center">
								<p>371</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Dataset II</p>
							</c>
							<c ca="center">
								<p>100</p>
							</c>
							<c ca="center">
								<p>25</p>
							</c>
							<c ca="center">
								<p>70</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Dataset III [8]</p>
							</c>
							<c ca="center">
								<p>100</p>
							</c>
							<c ca="center">
								<p>10</p>
							</c>
							<c ca="center">
								<p>123</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Dataset IV [24]</p>
							</c>
							<c ca="center">
								<p>95</p>
							</c>
							<c ca="center">
								<p>15</p>
							</c>
							<c ca="center">
								<p>64</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Dataset V [12]</p>
							</c>
							<c ca="center">
								<p>96</p>
							</c>
							<c ca="center">
								<p>10</p>
							</c>
							<c ca="center">
								<p>81</p>
							</c>
						</r>
					</tblbdy>
					<tblfn>
						<p>*One cell line (breast cell line HS578T) was misclassified as a tissue sample.</p>
					</tblfn>
				</tbl>
			</sec>
			<sec>
				<st>
					<p>Features of the <it>in vitro </it>gene-expression signature</p>
				</st>
				<p>We observed a qualitative difference in the expression patterns of the up- and downregulated genes in cell lines that might explain the higher degree of confirmation of upregulated genes in dataset II. Figure <figr fid="F4">4</figr> shows the general trends in the expression of differentially expressed genes in both datasets I and II across cell lines and tissues. The upregulated genes were highly expressed in all cell lines and in general expressed in lower amounts in tissue samples (Figure <figr fid="F4">4b</figr>; some exceptions are discussed below). Genes found to be downregulated in cell lines were low in all cell lines, but highly expressed in only a subset of the tissues (Figure <figr fid="F4">4a</figr>). No genes were found to be universally expressed <it>in vivo </it>but not <it>in vitro</it>. As a consequence, the identification of downregulated genes in cell lines depends on the tissue samples present in the comparison. This might explain the lower concordance between different datasets for downregulated genes compared with upregulated genes, as large differences between the types of tissue samples in datasets I and II existed (for example, no tumor samples in dataset II).</p>
				<fig id="F4">
					<title>
						<p>Figure 4</p>
					</title>
					<caption>
						<p>The gene-expression signature of <it>in vitro </it>growth</p>
					</caption>
					<text>
						<p>The gene-expression signature of <it>in vitro </it>growth. All genes found to be differentially expressed between cell lines and tissues across datasets I and II (575 genes) were subjected to hierarchical clustering (average linkage and Euclidean distance metric) using the Genesis software [43]. Before clustering, all genes were normalized to an average expression level of zero and a standard deviation of one (that is, unit length). Above the cluster image, samples are labeled as cell lines, normal tissues and tumor tissues (except for the primary cultures and FACS-sorted cells in dataset II, which were not annotated). <b>(a) </b>Top part of the cluster presents the genes found to be downregulated <it>in vitro</it>. These genes were not detected <it>in vitro </it>and were often only expressed in a subset of tissue samples. It is likely that these genes represent downregulated tissue markers from the respective tissues. <b>(b) </b>In contrast, genes found to be upregulated <it>in vitro </it>were highly expressed in all cell lines, while occasionally expressed in a few tissue samples. Specific clusters of genes in (a) and (b) are annotated on the right of the cluster image (clusters A to H). Specific groups of samples are annotated in color above the cluster image and by number below the cluster image (cluster numbers 1 to 7). Cluster number 1, kidney and liver samples; cluster number 2, lung and muscle; cluster number 3, lymphomas; cluster number 4, leukemias (ALL); cluster number 5, leukemias (AML); cluster number 6, CNS tumors (medulloblastoma and glioblastoma); cluster number 7, germinal center cells.</p>
					</text>
					<graphic file="gb-2005-6-8-r65-4"/>
				</fig>
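				<p>The per-gene normalization described in the legend of Figure 4 can be illustrated with a minimal sketch. It is not the Genesis implementation: the function names (zscore_rows, euclidean) and the toy expression values are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.</p>

```python
import math
from statistics import mean, stdev


def zscore_rows(matrix):
    """Scale each gene (row) to mean 0 and sample standard deviation 1."""
    scaled = []
    for row in matrix:
        mu = mean(row)
        sd = stdev(row)  # assumption: sample (n - 1) standard deviation
        if sd == 0:
            # A constant gene carries no pattern; map it to all zeros.
            scaled.append([0.0 for _ in row])
        else:
            scaled.append([(x - mu) / sd for x in row])
    return scaled


def euclidean(a, b):
    """Euclidean distance between two equally long expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical toy data: three genes (rows) measured in four samples (columns).
expression = [
    [2.0, 4.0, 6.0, 8.0],      # gene 0
    [20.0, 40.0, 60.0, 80.0],  # gene 1: same pattern as gene 0, larger scale
    [8.0, 6.0, 4.0, 2.0],      # gene 2: opposite pattern
]
normalized = zscore_rows(expression)

# Distances that an average-linkage clustering would start from.
d_same_pattern = euclidean(normalized[0], normalized[1])
d_opposite_pattern = euclidean(normalized[0], normalized[2])
```

				<p>After normalization, genes 0 and 1 share the same profile although their absolute expression levels differ tenfold. This is why the normalization matters for a Euclidean metric: genes are grouped by expression pattern across samples and not by signal intensity.</p>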
				<p>The genes downregulated in cell lines and only expressed in subsets of tissues and tumors were likely to represent tissue-specific genes for which the expression was lost in cell lines (Figure <figr fid="F4">4a</figr>). Indeed, examples of tissue-specific genes that were downregulated in cell lines were identified for blood cells (Figure <figr fid="F4">4</figr>, cluster A, for example, PBXIP1, ISGF3 and IkB-alpha), brain tumors (Figure <figr fid="F4">4</figr>, cluster C and sample cluster 6, for example, CCND2 and APPBP2), renal biopsies (Figure <figr fid="F4">4</figr>, cluster E, for example, hMT-If) and brain normal and tumor biopsies (Figure <figr fid="F4">4</figr>, cluster F, for example, Protocadherin 2).</p>
				<p>Leukemias (sample clusters 4 and 5 in Figure <figr fid="F4">4</figr>), lymphomas (sample cluster 3 in Figure <figr fid="F4">4</figr>), and germinal center cells (sample cluster 7 in Figure <figr fid="F4">4</figr>) had gene-expression profiles most similar to those of the cell lines. They had downregulated a large portion of the genes that were also downregulated in cell lines (Figure <figr fid="F4">4</figr>, cluster D). They also showed upregulation of genes associated with replication (cluster G, for example, TOPII, MCM2, MCM3 and MCM6) and metabolism (cluster H). Information on all genes present in Figure <figr fid="F4">4</figr>, along with their membership in the different subclusters, can be found in Additional data file 1. A high-resolution image of Figure <figr fid="F4">4</figr> with all sample names and gene identifiers can be found in Additional data file 3.</p>
			</sec>
			<sec>
				<st>
					<p>Transcriptional alterations affect multiple biological processes</p>
				</st>
				<p>Because of the considerable and consistent differential expression of genes in cell lines, we used Gene Ontology (GO) to investigate which biological processes were affected by long-term <it>in vitro </it>selection and adaptation. Using GoMiner <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>, we identified the GO categories over-represented among the differentially expressed genes defined by SAM at an FDR of 1% (735 up- and 1,699 downregulated genes). This approach identified multiple, highly overlapping GO categories that were statistically significant. GoMiner corrects the <it>p</it>-values for multiple comparisons, and we set the FDR threshold to 5% for the GO category identification. We found that upregulated genes in cell lines are over-represented in multiple GO categories relating to three main cellular processes: cell cycle; macromolecular biosynthesis, processing, modification and degradation; and energy metabolism (Table <tblr tid="T3">3</tblr>). Seven genes belonging to the 'histone modification' category were also upregulated. Interestingly, among the downregulated genes we identified many genes involved in 'cell adhesion', 'cell-cell adhesion', 'enzyme linked receptor protein signaling pathway', and 'cell-cell signaling' (Table <tblr tid="T4">4</tblr>). A similar pattern of downregulated genes involved in cell-cell communication, membrane signaling and second messenger signaling was observed in dataset II (data not shown). We also identified many downregulated genes involved in immune-system functions and antigen presentation. However, these differences were dataset dependent and were not observed in dataset II. Therefore, these categories were excluded from Table <tblr tid="T4">4</tblr> but are given in Additional data file 4.</p>
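				<p>The over-representation test behind such GO tables can be illustrated with a minimal sketch of a one-sided hypergeometric test, the calculation commonly used for this purpose. It is not GoMiner's own code, it omits the multiple-comparison correction, and the function name and all gene counts in the example are hypothetical. They are not taken from Table 3, because the size of the background gene set is not stated here.</p>

```python
from math import comb, log10


def hypergeom_upper_tail(total, in_category, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `total` genes, of which `in_category` belong to the GO category."""
    upper = min(in_category, selected)
    favorable = sum(
        comb(in_category, k) * comb(total - in_category, selected - k)
        for k in range(overlap, upper + 1)
    )
    return favorable / comb(total, selected)


# Hypothetical example: 20 genes on the array, 5 annotated to a category,
# 5 genes called upregulated, 3 of which fall in the category.
p_value = hypergeom_upper_tail(total=20, in_category=5, selected=5, overlap=3)
log10_p = log10(p_value)  # same scale as a log10(p-value) table column
```

				<p>A more negative log10 value indicates stronger over-representation. Because parent and child GO categories share many genes, closely related categories tend to return similar values, which is consistent with the highly overlapping categories reported above.</p>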
				<tbl id="T3">
					<title>
						<p>Table 3</p>
					</title>
					<caption>
						<p>Biological processes upregulated <it>in vitro</it></p>
					</caption>
					<tblbdy cols="6">
						<r>
							<c ca="left">
								<p>GO category</p>
							</c>
							<c ca="left">
								<p>Total number of genes</p>
							</c>
							<c ca="left">
								<p>Genes changed</p>
							</c>
							<c ca="left">
								<p>Log<sub>10 </sub>(<it>p</it>-value)</p>
							</c>
							<c ca="left">
								<p>FDR</p>
							</c>
							<c ca="left">
								<p>GO ID</p>
							</c>
						</r>
						<r>
							<c cspan="6">
								<hr/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>
									<b>Translation</b>
								</p>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Translation</p>
							</c>
							<c ca="left">
								<p>76</p>
							</c>
							<c ca="left">
								<p>36</p>
							</c>
							<c ca="left">
								<p>-7.95</p>
							</c>
							<c ca="left">
								<p>0.0000</p>
							</c>
							<c ca="left">
								<p>GO:0043037</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Ribosome biogenesis and assembly</p>
							</c>
							<c ca="left">
								<p>42</p>
							</c>
							<c ca="left">
								<p>19</p>
							</c>
							<c ca="left">
								<p>-4.10</p>
							</c>
							<c ca="left">
								<p>0.0037</p>
							</c>
							<c ca="left">
								<p>GO:0042254</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Ribosome biogenesis</p>
							</c>
							<c ca="left">
								<p>41</p>
							</c>
							<c ca="left">
								<p>19</p>
							</c>
							<c ca="left">
								<p>-4.27</p>
							</c>
							<c ca="left">
								<p>0.0000</p>
							</c>
							<c ca="left">
								<p>GO:0007046</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Regulation of translation</p>
							</c>
							<c ca="left">
								<p>33</p>
							</c>
							<c ca="left">
								<p>14</p>
							</c>
							<c ca="left">
								<p>-2.81</p>
							</c>
							<c ca="left">
								<p>0.0077</p>
							</c>
							<c ca="left">
								<p>GO:0006445</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Translational initiation</p>
							</c>
							<c ca="left">
								<p>23</p>
							</c>
							<c ca="left">
								<p>13</p>
							</c>
							<c ca="left">
								<p>-4.20</p>
							</c>
							<c ca="left">
								<p>0.0042</p>
							</c>
							<c ca="left">
								<p>GO:0006413</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>tRNA metabolism</p>
							</c>
							<c ca="left">
								<p>27</p>
							</c>
							<c ca="left">
								<p>12</p>
							</c>
							<c ca="left">
								<p>-2.69</p>
							</c>
							<c ca="left">
								<p>0.0070</p>
							</c>
							<c ca="left">
								<p>GO:0006399</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>tRNA modification</p>
							</c>
							<c ca="left">
								<p>23</p>
							</c>
							<c ca="left">
								<p>11</p>
							</c>
							<c ca="left">
								<p>-2.82</p>
							</c>
							<c ca="left">
								<p>0.0078</p>
							</c>
							<c ca="left">
								<p>GO:0006400</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>tRNA aminoacylation for protein translation</p>
							</c>
							<c ca="left">
								<p>21</p>
							</c>
							<c ca="left">
								<p>10</p>
							</c>
							<c ca="left">
								<p>-2.59</p>
							</c>
							<c ca="left">
								<p>0.0125</p>
							</c>
							<c ca="left">
								<p>GO:0006418</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>tRNA aminoacylation</p>
							</c>
							<c ca="left">
								<p>21</p>
							</c>
							<c ca="left">
								<p>10</p>
							</c>
							<c ca="left">
								<p>-2.59</p>
							</c>
							<c ca="left">
								<p>0.0125</p>
							</c>
							<c ca="left">
								<p>GO:0043039</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>rRNA processing</p>
							</c>
							<c ca="left">
								<p>17</p>
							</c>
							<c ca="left">
								<p>10</p>
							</c>
							<c ca="left">
								<p>-3.53</p>
							</c>
							<c ca="left">
								<p>0.0056</p>
							</c>
							<c ca="left">
								<p>GO:0006364</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>rRNA metabolism</p>
							</c>
							<c ca="left">
								<p>17</p>
							</c>
							<c ca="left">
								<p>10</p>
							</c>
							<c ca="left">
								<p>-3.53</p>
							</c>
							<c ca="left">
								<p>0.0056</p>
							</c>
							<c ca="left">
								<p>GO:0016072</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Regulation of translational initiation</p>
							</c>
							<c ca="left">
								<p>14</p>
							</c>
							<c ca="left">
								<p>8</p>
							</c>
							<c ca="left">
								<p>-2.79</p>
							</c>
							<c ca="left">
								<p>0.0075</p>
							</c>
							<c ca="left">
								<p>GO:0006446</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Translational elongation</p>
							</c>
							<c ca="left">
								<p>14</p>
							</c>
							<c ca="left">
								<p>7</p>
							</c>
							<c ca="left">
								<p>-2.07</p>
							</c>
							<c ca="left">
								<p>0.0400</p>
							</c>
							<c ca="left">
								<p>GO:0006414</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Transcription from Pol I promoter</p>
							</c>
							<c ca="left">
								<p>7</p>
							</c>
							<c ca="left">
								<p>5</p>
							</c>
							<c ca="left">
								<p>-2.44</p>
							</c>
							<c ca="left">
								<p>0.0141</p>
							</c>
							<c ca="left">
								<p>GO:0006360</p>
							</c>
						</r>
						<r>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>
									<b>Splicing</b>
								</p>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>RNA processing</p>
							</c>
							<c ca="left">
								<p>123</p>
							</c>
							<c ca="left">
								<p>52</p>
							</c>
							<c ca="left">
								<p>-9.02</p>
							</c>
							<c ca="left">
								<p>0.0000</p>
							</c>
							<c ca="left">
								<p>GO:0006396</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>RNA metabolism</p>
							</c>
							<c ca="left">
								<p>130</p>
							</c>
							<c ca="left">
								<p>52</p>
							</c>
							<c ca="left">
								<p>-8.00</p>
							</c>
							<c ca="left">
								<p>0.0000</p>
							</c>
							<c ca="left">
								<p>GO:0016070</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>mRNA metabolism</p>
							</c>
							<c ca="left">
								<p>64</p>
							</c>
							<c ca="left">
								<p>21</p>
							</c>
							<c ca="left">
								<p>-2.27</p>
							</c>
							<c ca="left">
								<p>0.0217</p>
							</c>
							<c ca="left">
								<p>GO:0016071</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>mRNA processing</p>
							</c>
							<c ca="left">
								<p>57</p>
							</c>
							<c ca="left">
								<p>20</p>
							</c>
							<c ca="left">
								<p>-2.56</p>
							</c>
							<c ca="left">
								<p>0.0123</p>
							</c>
							<c ca="left">
								<p>GO:0006397</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>RNA splicing</p>
							</c>
							<c ca="left">
								<p>41</p>
							</c>
							<c ca="left">
								<p>18</p>
							</c>
							<c ca="left">
								<p>-3.70</p>
							</c>
							<c ca="left">
								<p>0.0030</p>
							</c>
							<c ca="left">
								<p>GO:0008380</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>RNA splicing, via transesterification reactions with bulged adenosine as nucleophile</p>
							</c>
							<c ca="left">
								<p>33</p>
							</c>
							<c ca="left">
								<p>15</p>
							</c>
							<c ca="left">
								<p>-3.37</p>
							</c>
							<c ca="left">
								<p>0.0050</p>
							</c>
							<c ca="left">
								<p>GO:0000377</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>RNA splicing, via transesterification reactions</p>
							</c>
							<c ca="left">
								<p>33</p>
							</c>
							<c ca="left">
								<p>15</p>
							</c>
							<c ca="left">
								<p>-3.37</p>
							</c>
							<c ca="left">
								<p>0.0050</p>
							</c>
							<c ca="left">
								<p>GO:0000375</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Nuclear mRNA splicing, via spliceosome</p>
							</c>
							<c ca="left">
								<p>33</p>
							</c>
							<c ca="left">
								<p>15</p>
							</c>
							<c ca="left">
								<p>-3.37</p>
							</c>
							<c ca="left">
								<p>0.0050</p>
							</c>
							<c ca="left">
								<p>GO:0000398</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>RNA modification</p>
							</c>
							<c ca="left">
								<p>25</p>
							</c>
							<c ca="left">
								<p>11</p>
							</c>
							<c ca="left">
								<p>-2.46</p>
							</c>
							<c ca="left">
								<p>0.0143</p>
							</c>
							<c ca="left">
								<p>GO:0009451</p>
							</c>
						</r>
						<r>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>
									<b>Nucleotide metabolism</b>
								</p>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Nucleobase, nucleoside, nucleotide and nucleic acid metabolism</p>
							</c>
							<c ca="left">
								<p>806</p>
							</c>
							<c ca="left">
								<p>192</p>
							</c>
							<c ca="left">
								<p>-4.43</p>
							</c>
							<c ca="left">
								<p>0.0000</p>
							</c>
							<c ca="left">
								<p>GO:0006139</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Nucleotide metabolism</p>
							</c>
							<c ca="left">
								<p>61</p>
							</c>
							<c ca="left">
								<p>20</p>
							</c>
							<c ca="left">
								<p>-2.18</p>
							</c>
							<c ca="left">
								<p>0.0304</p>
							</c>
							<c ca="left">
								<p>GO:0009117</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Nucleotide biosynthesis</p>
							</c>
							<c ca="left">
								<p>45</p>
							</c>
							<c ca="left">
								<p>16</p>
							</c>
							<c ca="left">
								<p>-2.21</p>
							</c>
							<c ca="left">
								<p>0.0303</p>
							</c>
							<c ca="left">
								<p>GO:0009165</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Ribonucleotide metabolism</p>
							</c>
							<c ca="left">
								<p>28</p>
							</c>
							<c ca="left">
								<p>13</p>
							</c>
							<c ca="left">
								<p>-3.09</p>
							</c>
							<c ca="left">
								<p>0.0047</p>
							</c>
							<c ca="left">
								<p>GO:0009259</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Ribonucleotide biosynthesis</p>
							</c>
							<c ca="left">
								<p>27</p>
							</c>
							<c ca="left">
								<p>13</p>
							</c>
							<c ca="left">
								<p>-3.28</p>
							</c>
							<c ca="left">
								<p>0.0048</p>
							</c>
							<c ca="left">
								<p>GO:0009260</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Purine nucleotide metabolism</p>
							</c>
							<c ca="left">
								<p>29</p>
							</c>
							<c ca="left">
								<p>12</p>
							</c>
							<c ca="left">
								<p>-2.37</p>
							</c>
							<c ca="left">
								<p>0.0164</p>
							</c>
							<c ca="left">
								<p>GO:0006163</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Purine nucleotide biosynthesis</p>
							</c>
							<c ca="left">
								<p>26</p>
							</c>
							<c ca="left">
								<p>12</p>
							</c>
							<c ca="left">
								<p>-2.86</p>
							</c>
							<c ca="left">
								<p>0.0080</p>
							</c>
							<c ca="left">
								<p>GO:0006164</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Purine ribonucleotide metabolism</p>
							</c>
							<c ca="left">
								<p>25</p>
							</c>
							<c ca="left">
								<p>11</p>
							</c>
							<c ca="left">
								<p>-2.46</p>
							</c>
							<c ca="left">
								<p>0.0143</p>
							</c>
							<c ca="left">
								<p>GO:0009150</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Purine ribonucleotide biosynthesis</p>
							</c>
							<c ca="left">
								<p>24</p>
							</c>
							<c ca="left">
								<p>11</p>
							</c>
							<c ca="left">
								<p>-2.63</p>
							</c>
							<c ca="left">
								<p>0.0115</p>
							</c>
							<c ca="left">
								<p>GO:0009152</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Nucleoside triphosphate metabolism</p>
							</c>
							<c ca="left">
								<p>23</p>
							</c>
							<c ca="left">
								<p>10</p>
							</c>
							<c ca="left">
								<p>-2.23</p>
							</c>
							<c ca="left">
								<p>0.0299</p>
							</c>
							<c ca="left">
								<p>GO:0009141</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Ribonucleoside triphosphate metabolism</p>
							</c>
							<c ca="left">
								<p>20</p>
							</c>
							<c ca="left">
								<p>9</p>
							</c>
							<c ca="left">
								<p>-2.17</p>
							</c>
							<c ca="left">
								<p>0.0295</p>
							</c>
							<c ca="left">
								<p>GO:0009199</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Ribonucleoside triphosphate biosynthesis</p>
							</c>
							<c ca="left">
								<p>19</p>
							</c>
							<c ca="left">
								<p>9</p>
							</c>
							<c ca="left">
								<p>-2.35</p>
							</c>
							<c ca="left">
								<p>0.0167</p>
							</c>
							<c ca="left">
								<p>GO:0009201</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Nucleoside triphosphate biosynthesis</p>
							</c>
							<c ca="left">
								<p>20</p>
							</c>
							<c ca="left">
								<p>9</p>
							</c>
							<c ca="left">
								<p>-2.17</p>
							</c>
							<c ca="left">
								<p>0.0295</p>
							</c>
							<c ca="left">
								<p>GO:0009142</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Purine ribonucleoside triphosphate metabolism</p>
							</c>
							<c ca="left">
								<p>20</p>
							</c>
							<c ca="left">
								<p>9</p>
							</c>
							<c ca="left">
								<p>-2.17</p>
							</c>
							<c ca="left">
								<p>0.0295</p>
							</c>
							<c ca="left">
								<p>GO:0009205</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Purine ribonucleoside triphosphate biosynthesis</p>
							</c>
							<c ca="left">
								<p>19</p>
							</c>
							<c ca="left">
								<p>9</p>
							</c>
							<c ca="left">
								<p>-2.35</p>
							</c>
							<c ca="left">
								<p>0.0167</p>
							</c>
							<c ca="left">
								<p>GO:0009206</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Purine nucleoside triphosphate metabolism</p>
							</c>
							<c ca="left">
								<p>21</p>
							</c>
							<c ca="left">
								<p>9</p>
							</c>
							<c ca="left">
								<p>-2.00</p>
							</c>
							<c ca="left">
								<p>0.0413</p>
							</c>
							<c ca="left">
								<p>GO:0009144</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Purine nucleoside triphosphate biosynthesis</p>
							</c>
							<c ca="left">
								<p>19</p>
							</c>
							<c ca="left">
								<p>9</p>
							</c>
							<c ca="left">
								<p>-2.35</p>
							</c>
							<c ca="left">
								<p>0.0167</p>
							</c>
							<c ca="left">
								<p>GO:0009145</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Nucleoside metabolism</p>
							</c>
							<c ca="left">
								<p>14</p>
							</c>
							<c ca="left">
								<p>7</p>
							</c>
							<c ca="left">
								<p>-2.07</p>
							</c>
							<c ca="left">
								<p>0.0400</p>
							</c>
							<c ca="left">
								<p>GO:0009116</p>
							</c>
						</r>
						<r>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>
									<b>Protein modification and degradation</b>
								</p>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Protein metabolism</p>
							</c>
							<c ca="left">
								<p>836</p>
							</c>
							<c ca="left">
								<p>210</p>
							</c>
							<c ca="left">
								<p>-6.86</p>
							</c>
							<c ca="left">
								<p>0.0000</p>
							</c>
							<c ca="left">
								<p>GO:0019538</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Protein biosynthesis</p>
							</c>
							<c ca="left">
								<p>207</p>
							</c>
							<c ca="left">
								<p>72</p>
							</c>
							<c ca="left">
								<p>-7.76</p>
							</c>
							<c ca="left">
								<p>0.0000</p>
							</c>
							<c ca="left">
								<p>GO:0006412</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Intracellular transport</p>
							</c>
							<c ca="left">
								<p>176</p>
							</c>
							<c ca="left">
								<p>63</p>
							</c>
							<c ca="left">
								<p>-7.36</p>
							</c>
							<c ca="left">
								<p>0.0000</p>
							</c>
							<c ca="left">
								<p>GO:0046907</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Protein transport</p>
							</c>
							<c ca="left">
								<p>149</p>
							</c>
							<c ca="left">
								<p>52</p>
							</c>
							<c ca="left">
								<p>-5.75</p>
							</c>
							<c ca="left">
								<p>0.0000</p>
							</c>
							<c ca="left">
								<p>GO:0015031</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Intracellular protein transport</p>
							</c>
							<c ca="left">
								<p>138</p>
							</c>
							<c ca="left">
								<p>50</p>
							</c>
							<c ca="left">
								<p>-6.11</p>
							</c>
							<c ca="left">
								<p>0.0000</p>
							</c>
							<c ca="left">
								<p>GO:0006886</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Amino acid and derivative metabolism</p>
							</c>
							<c ca="left">
								<p>126</p>
							</c>
							<c ca="left">
								<p>36</p>
							</c>
							<c ca="left">
								<p>-2.31</p>
							</c>
							<c ca="left">
								<p>0.0198</p>
							</c>
							<c ca="left">
								<p>GO:0006519</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Amino acid metabolism</p>
							</c>
							<c ca="left">
								<p>98</p>
							</c>
							<c ca="left">
								<p>29</p>
							</c>
							<c ca="left">
								<p>-2.19</p>
							</c>
							<c ca="left">
								<p>0.0297</p>
							</c>
							<c ca="left">
								<p>GO:0006520</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Ubiquitin-dependent protein catabolism</p>
							</c>
							<c ca="left">
								<p>48</p>
							</c>
							<c ca="left">
								<p>26</p>
							</c>
							<c ca="left">
								<p>-7.39</p>
							</c>
							<c ca="left">
								<p>0.0000</p>
							</c>
							<c ca="left">
								<p>GO:0006511</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Modification-dependent protein catabolism</p>
							</c>
							<c ca="left">
								<p>48</p>
							</c>
							<c ca="left">
								<p>26</p>
							</c>
							<c ca="left">
								<p>-7.39</p>
							</c>
							<c ca="left">
								<p>0.0000</p>
							</c>
							<c ca="left">
								<p>GO:0019941</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Protein targeting</p>
							</c>
							<c ca="left">
								<p>70</p>
							</c>
							<c ca="left">
								<p>23</p>
							</c>
							<c ca="left">
								<p>-2.44</p>
							</c>
							<c ca="left">
								<p>0.0139</p>
							</c>
							<c ca="left">
								<p>GO:0006605</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Protein folding</p>
							</c>
							<c ca="left">
								<p>46</p>
							</c>
							<c ca="left">
								<p>22</p>
							</c>
							<c ca="left">
								<p>-5.14</p>
							</c>
							<c ca="left">
								<p>0.0000</p>
							</c>
							<c ca="left">
								<p>GO:0006457</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Ubiquitin cycle</p>
							</c>
							<c ca="left">
								<p>31</p>
							</c>
							<c ca="left">
								<p>12</p>
							</c>
							<c ca="left">
								<p>-2.10</p>
							</c>
							<c ca="left">
								<p>0.0351</p>
							</c>
							<c ca="left">
								<p>GO:0006512</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Amino acid activation</p>
							</c>
							<c ca="left">
								<p>21</p>
							</c>
							<c ca="left">
								<p>10</p>
							</c>
							<c ca="left">
								<p>-2.59</p>
							</c>
							<c ca="left">
								<p>0.0125</p>
							</c>
							<c ca="left">
								<p>GO:0043038</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Polyamine metabolism</p>
							</c>
							<c ca="left">
								<p>5</p>
							</c>
							<c ca="left">
								<p>4</p>
							</c>
							<c ca="left">
								<p>-2.27</p>
							</c>
							<c ca="left">
								<p>0.0271</p>
							</c>
							<c ca="left">
								<p>GO:0006595</p>
							</c>
						</r>
						<r>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>
									<b>Metabolism</b>
								</p>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Metabolism</p>
							</c>
							<c ca="left">
								<p>2008</p>
							</c>
							<c ca="left">
								<p>457</p>
							</c>
							<c ca="left">
								<p>-12.88</p>
							</c>
							<c ca="left">
								<p>0.0000</p>
							</c>
							<c ca="left">
								<p>GO:0008152</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Biosynthesis</p>
							</c>
							<c ca="left">
								<p>423</p>
							</c>
							<c ca="left">
								<p>119</p>
							</c>
							<c ca="left">
								<p>-6.33</p>
							</c>
							<c ca="left">
								<p>0.0000</p>
							</c>
							<c ca="left">
								<p>GO:0009058</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Energy pathways</p>
							</c>
							<c ca="left">
								<p>128</p>
							</c>
							<c ca="left">
								<p>38</p>
							</c>
							<c ca="left">
								<p>-2.74</p>
							</c>
							<c ca="left">
								<p>0.0074</p>
							</c>
							<c ca="left">
								<p>GO:0006091</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Energy derivation by oxidation of organic compounds</p>
							</c>
							<c ca="left">
								<p>89</p>
							</c>
							<c ca="left">
								<p>32</p>
							</c>
							<c ca="left">
								<p>-4.02</p>
							</c>
							<c ca="left">
								<p>0.0036</p>
							</c>
							<c ca="left">
								<p>GO:0015980</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Main pathways of carbohydrate metabolism</p>
							</c>
							<c ca="left">
								<p>56</p>
							</c>
							<c ca="left">
								<p>20</p>
							</c>
							<c ca="left">
								<p>-2.67</p>
							</c>
							<c ca="left">
								<p>0.0069</p>
							</c>
							<c ca="left">
								<p>GO:0006092</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Coenzyme and prosthetic group metabolism</p>
							</c>
							<c ca="left">
								<p>55</p>
							</c>
							<c ca="left">
								<p>18</p>
							</c>
							<c ca="left">
								<p>-2.00</p>
							</c>
							<c ca="left">
								<p>0.0419</p>
							</c>
							<c ca="left">
								<p>GO:0006731</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Coenzyme metabolism</p>
							</c>
							<c ca="left">
								<p>44</p>
							</c>
							<c ca="left">
								<p>16</p>
							</c>
							<c ca="left">
								<p>-2.32</p>
							</c>
							<c ca="left">
								<p>0.0200</p>
							</c>
							<c ca="left">
								<p>GO:0006732</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Glucose catabolism</p>
							</c>
							<c ca="left">
								<p>30</p>
							</c>
							<c ca="left">
								<p>12</p>
							</c>
							<c ca="left">
								<p>-2.23</p>
							</c>
							<c ca="left">
								<p>0.0307</p>
							</c>
							<c ca="left">
								<p>GO:0006007</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Coenzyme and prosthetic group biosynthesis</p>
							</c>
							<c ca="left">
								<p>31</p>
							</c>
							<c ca="left">
								<p>12</p>
							</c>
							<c ca="left">
								<p>-2.10</p>
							</c>
							<c ca="left">
								<p>0.0351</p>
							</c>
							<c ca="left">
								<p>GO:0046138</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Oxidative phosphorylation</p>
							</c>
							<c ca="left">
								<p>13</p>
							</c>
							<c ca="left">
								<p>11</p>
							</c>
							<c ca="left">
								<p>-6.25</p>
							</c>
							<c ca="left">
								<p>0.0000</p>
							</c>
							<c ca="left">
								<p>GO:0006119</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Coenzyme biosynthesis</p>
							</c>
							<c ca="left">
								<p>23</p>
							</c>
							<c ca="left">
								<p>10</p>
							</c>
							<c ca="left">
								<p>-2.23</p>
							</c>
							<c ca="left">
								<p>0.0299</p>
							</c>
							<c ca="left">
								<p>GO:0009108</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Cellular respiration</p>
							</c>
							<c ca="left">
								<p>11</p>
							</c>
							<c ca="left">
								<p>9</p>
							</c>
							<c ca="left">
								<p>-4.94</p>
							</c>
							<c ca="left">
								<p>0.0000</p>
							</c>
							<c ca="left">
								<p>GO:0045333</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Aerobic respiration</p>
							</c>
							<c ca="left">
								<p>9</p>
							</c>
							<c ca="left">
								<p>8</p>
							</c>
							<c ca="left">
								<p>-4.92</p>
							</c>
							<c ca="left">
								<p>0.0000</p>
							</c>
							<c ca="left">
								<p>GO:0009060</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Tricarboxylic acid cycle</p>
							</c>
							<c ca="left">
								<p>18</p>
							</c>
							<c ca="left">
								<p>8</p>
							</c>
							<c ca="left">
								<p>-1.94</p>
							</c>
							<c ca="left">
								<p>0.0462</p>
							</c>
							<c ca="left">
								<p>GO:0006099</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>ATP synthesis coupled electron transport (<it>sensu </it>Eukarya)</p>
							</c>
							<c ca="left">
								<p>6</p>
							</c>
							<c ca="left">
								<p>5</p>
							</c>
							<c ca="left">
								<p>-2.91</p>
							</c>
							<c ca="left">
								<p>0.0061</p>
							</c>
							<c ca="left">
								<p>GO:0042775</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>ATP synthesis coupled electron transport</p>
							</c>
							<c ca="left">
								<p>6</p>
							</c>
							<c ca="left">
								<p>5</p>
							</c>
							<c ca="left">
								<p>-2.91</p>
							</c>
							<c ca="left">
								<p>0.0061</p>
							</c>
							<c ca="left">
								<p>GO:0042773</p>
							</c>
						</r>
						<r>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>
									<b>Cell-cycle progression</b>
								</p>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Cell cycle</p>
							</c>
							<c ca="left">
								<p>324</p>
							</c>
							<c ca="left">
								<p>89</p>
							</c>
							<c ca="left">
								<p>-4.32</p>
							</c>
							<c ca="left">
								<p>0.0000</p>
							</c>
							<c ca="left">
								<p>GO:0007049</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Cell organization and biogenesis</p>
							</c>
							<c ca="left">
								<p>315</p>
							</c>
							<c ca="left">
								<p>83</p>
							</c>
							<c ca="left">
								<p>-3.38</p>
							</c>
							<c ca="left">
								<p>0.0054</p>
							</c>
							<c ca="left">
								<p>GO:0016043</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>DNA metabolism</p>
							</c>
							<c ca="left">
								<p>188</p>
							</c>
							<c ca="left">
								<p>64</p>
							</c>
							<c ca="left">
								<p>-6.53</p>
							</c>
							<c ca="left">
								<p>0.0000</p>
							</c>
							<c ca="left">
								<p>GO:0006259</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Mitotic cell cycle</p>
							</c>
							<c ca="left">
								<p>153</p>
							</c>
							<c ca="left">
								<p>58</p>
							</c>
							<c ca="left">
								<p>-7.84</p>
							</c>
							<c ca="left">
								<p>0.0000</p>
							</c>
							<c ca="left">
								<p>GO:0000278</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Cytoplasm organization and biogenesis</p>
							</c>
							<c ca="left">
								<p>202</p>
							</c>
							<c ca="left">
								<p>55</p>
							</c>
							<c ca="left">
								<p>-2.73</p>
							</c>
							<c ca="left">
								<p>0.0073</p>
							</c>
							<c ca="left">
								<p>GO:0007028</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>DNA replication and chromosome cycle</p>
							</c>
							<c ca="left">
								<p>83</p>
							</c>
							<c ca="left">
								<p>30</p>
							</c>
							<c ca="left">
								<p>-3.85</p>
							</c>
							<c ca="left">
								<p>0.0033</p>
							</c>
							<c ca="left">
								<p>GO:0000067</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M phase</p>
							</c>
							<c ca="left">
								<p>62</p>
							</c>
							<c ca="left">
								<p>26</p>
							</c>
							<c ca="left">
								<p>-4.68</p>
							</c>
							<c ca="left">
								<p>0.0000</p>
							</c>
							<c ca="left">
								<p>GO:0000279</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Nuclear organization and biogenesis</p>
							</c>
							<c ca="left">
								<p>79</p>
							</c>
							<c ca="left">
								<p>25</p>
							</c>
							<c ca="left">
								<p>-2.36</p>
							</c>
							<c ca="left">
								<p>0.0176</p>
							</c>
							<c ca="left">
								<p>GO:0006997</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>DNA packaging</p>
							</c>
							<c ca="left">
								<p>69</p>
							</c>
							<c ca="left">
								<p>25</p>
							</c>
							<c ca="left">
								<p>-3.31</p>
							</c>
							<c ca="left">
								<p>0.0049</p>
							</c>
							<c ca="left">
								<p>GO:0006323</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>S phase of mitotic cell cycle</p>
							</c>
							<c ca="left">
								<p>72</p>
							</c>
							<c ca="left">
								<p>25</p>
							</c>
							<c ca="left">
								<p>-3.00</p>
							</c>
							<c ca="left">
								<p>0.0043</p>
							</c>
							<c ca="left">
								<p>GO:0000084</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Chromosome organization and biogenesis (<it>sensu </it>Eukarya)</p>
							</c>
							<c ca="left">
								<p>77</p>
							</c>
							<c ca="left">
								<p>24</p>
							</c>
							<c ca="left">
								<p>-2.20</p>
							</c>
							<c ca="left">
								<p>0.0300</p>
							</c>
							<c ca="left">
								<p>GO:0007001</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>DNA replication</p>
							</c>
							<c ca="left">
								<p>67</p>
							</c>
							<c ca="left">
								<p>23</p>
							</c>
							<c ca="left">
								<p>-2.72</p>
							</c>
							<c ca="left">
								<p>0.0071</p>
							</c>
							<c ca="left">
								<p>GO:0006260</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Nuclear division</p>
							</c>
							<c ca="left">
								<p>54</p>
							</c>
							<c ca="left">
								<p>22</p>
							</c>
							<c ca="left">
								<p>-3.82</p>
							</c>
							<c ca="left">
								<p>0.0031</p>
							</c>
							<c ca="left">
								<p>GO:0000280</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Establishment and/or maintenance of chromatin architecture</p>
							</c>
							<c ca="left">
								<p>64</p>
							</c>
							<c ca="left">
								<p>21</p>
							</c>
							<c ca="left">
								<p>-2.27</p>
							</c>
							<c ca="left">
								<p>0.0217</p>
							</c>
							<c ca="left">
								<p>GO:0006325</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M phase of mitotic cell cycle</p>
							</c>
							<c ca="left">
								<p>45</p>
							</c>
							<c ca="left">
								<p>20</p>
							</c>
							<c ca="left">
								<p>-4.15</p>
							</c>
							<c ca="left">
								<p>0.0040</p>
							</c>
							<c ca="left">
								<p>GO:0000087</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>DNA repair</p>
							</c>
							<c ca="left">
								<p>59</p>
							</c>
							<c ca="left">
								<p>20</p>
							</c>
							<c ca="left">
								<p>-2.36</p>
							</c>
							<c ca="left">
								<p>0.0173</p>
							</c>
							<c ca="left">
								<p>GO:0006281</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Mitosis</p>
							</c>
							<c ca="left">
								<p>42</p>
							</c>
							<c ca="left">
								<p>19</p>
							</c>
							<c ca="left">
								<p>-4.10</p>
							</c>
							<c ca="left">
								<p>0.0037</p>
							</c>
							<c ca="left">
								<p>GO:0007067</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Microtubule-based process</p>
							</c>
							<c ca="left">
								<p>45</p>
							</c>
							<c ca="left">
								<p>19</p>
							</c>
							<c ca="left">
								<p>-3.61</p>
							</c>
							<c ca="left">
								<p>0.0059</p>
							</c>
							<c ca="left">
								<p>GO:0007017</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>DNA-dependent DNA replication</p>
							</c>
							<c ca="left">
								<p>35</p>
							</c>
							<c ca="left">
								<p>15</p>
							</c>
							<c ca="left">
								<p>-3.04</p>
							</c>
							<c ca="left">
								<p>0.0044</p>
							</c>
							<c ca="left">
								<p>GO:0006261</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Microtubule cytoskeleton organization and biogenesis</p>
							</c>
							<c ca="left">
								<p>27</p>
							</c>
							<c ca="left">
								<p>14</p>
							</c>
							<c ca="left">
								<p>-3.94</p>
							</c>
							<c ca="left">
								<p>0.0034</p>
							</c>
							<c ca="left">
								<p>GO:0000226</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>G1/S transition of mitotic cell cycle</p>
							</c>
							<c ca="left">
								<p>35</p>
							</c>
							<c ca="left">
								<p>13</p>
							</c>
							<c ca="left">
								<p>-2.06</p>
							</c>
							<c ca="left">
								<p>0.0396</p>
							</c>
							<c ca="left">
								<p>GO:0000082</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>G2/M transition of mitotic cell cycle</p>
							</c>
							<c ca="left">
								<p>21</p>
							</c>
							<c ca="left">
								<p>9</p>
							</c>
							<c ca="left">
								<p>-2.00</p>
							</c>
							<c ca="left">
								<p>0.0413</p>
							</c>
							<c ca="left">
								<p>GO:0000086</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M-phase specific microtubule process</p>
							</c>
							<c ca="left">
								<p>12</p>
							</c>
							<c ca="left">
								<p>7</p>
							</c>
							<c ca="left">
								<p>-2.56</p>
							</c>
							<c ca="left">
								<p>0.0132</p>
							</c>
							<c ca="left">
								<p>GO:0000072</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Chromosome segregation</p>
							</c>
							<c ca="left">
								<p>14</p>
							</c>
							<c ca="left">
								<p>7</p>
							</c>
							<c ca="left">
								<p>-2.07</p>
							</c>
							<c ca="left">
								<p>0.0400</p>
							</c>
							<c ca="left">
								<p>GO:0007059</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Microtubule nucleation</p>
							</c>
							<c ca="left">
								<p>9</p>
							</c>
							<c ca="left">
								<p>6</p>
							</c>
							<c ca="left">
								<p>-2.65</p>
							</c>
							<c ca="left">
								<p>0.0117</p>
							</c>
							<c ca="left">
								<p>GO:0007020</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>DNA replication initiation</p>
							</c>
							<c ca="left">
								<p>10</p>
							</c>
							<c ca="left">
								<p>6</p>
							</c>
							<c ca="left">
								<p>-2.32</p>
							</c>
							<c ca="left">
								<p>0.0190</p>
							</c>
							<c ca="left">
								<p>GO:0006270</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Spindle assembly</p>
							</c>
							<c ca="left">
								<p>8</p>
							</c>
							<c ca="left">
								<p>6</p>
							</c>
							<c ca="left">
								<p>-3.05</p>
							</c>
							<c ca="left">
								<p>0.0045</p>
							</c>
							<c ca="left">
								<p>GO:0007051</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Tubulin folding</p>
							</c>
							<c ca="left">
								<p>9</p>
							</c>
							<c ca="left">
								<p>6</p>
							</c>
							<c ca="left">
								<p>-2.65</p>
							</c>
							<c ca="left">
								<p>0.0117</p>
							</c>
							<c ca="left">
								<p>GO:0007021</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Mitotic spindle assembly</p>
							</c>
							<c ca="left">
								<p>6</p>
							</c>
							<c ca="left">
								<p>5</p>
							</c>
							<c ca="left">
								<p>-2.91</p>
							</c>
							<c ca="left">
								<p>0.0061</p>
							</c>
							<c ca="left">
								<p>GO:0007052</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Pre-replicative complex formation and maintenance</p>
							</c>
							<c ca="left">
								<p>5</p>
							</c>
							<c ca="left">
								<p>4</p>
							</c>
							<c ca="left">
								<p>-2.27</p>
							</c>
							<c ca="left">
								<p>0.0271</p>
							</c>
							<c ca="left">
								<p>GO:0006267</p>
							</c>
						</r>
						<r>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>
									<b>Chromatin modifications</b>
								</p>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Histone modification</p>
							</c>
							<c ca="left">
								<p>12</p>
							</c>
							<c ca="left">
								<p>7</p>
							</c>
							<c ca="left">
								<p>-2.56</p>
							</c>
							<c ca="left">
								<p>0.0132</p>
							</c>
							<c ca="left">
								<p>GO:0016570</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Covalent chromatin modification</p>
							</c>
							<c ca="left">
								<p>12</p>
							</c>
							<c ca="left">
								<p>7</p>
							</c>
							<c ca="left">
								<p>-2.56</p>
							</c>
							<c ca="left">
								<p>0.0132</p>
							</c>
							<c ca="left">
								<p>GO:0016569</p>
							</c>
						</r>
						<r>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>
									<b>Others</b>
								</p>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Physiological process</p>
							</c>
							<c ca="left">
								<p>2917</p>
							</c>
							<c ca="left">
								<p>574</p>
							</c>
							<c ca="left">
								<p>-3.84</p>
							</c>
							<c ca="left">
								<p>0.0032</p>
							</c>
							<c ca="left">
								<p>GO:0007582</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Macromolecule biosynthesis</p>
							</c>
							<c ca="left">
								<p>345</p>
							</c>
							<c ca="left">
								<p>100</p>
							</c>
							<c ca="left">
								<p>-5.98</p>
							</c>
							<c ca="left">
								<p>0.0000</p>
							</c>
							<c ca="left">
								<p>GO:0009059</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Response to endogenous stimulus</p>
							</c>
							<c ca="left">
								<p>77</p>
							</c>
							<c ca="left">
								<p>23</p>
							</c>
							<c ca="left">
								<p>-1.89</p>
							</c>
							<c ca="left">
								<p>0.0486</p>
							</c>
							<c ca="left">
								<p>GO:0009719</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Response to DNA damage stimulus</p>
							</c>
							<c ca="left">
								<p>71</p>
							</c>
							<c ca="left">
								<p>22</p>
							</c>
							<c ca="left">
								<p>-2.02</p>
							</c>
							<c ca="left">
								<p>0.0412</p>
							</c>
							<c ca="left">
								<p>GO:0006974</p>
							</c>
						</r>
					</tblbdy>
				</tbl>
				<tbl id="T4">
					<title>
						<p>Table 4</p>
					</title>
					<caption>
						<p>Biological processes downregulated <it>in vitro</it></p>
					</caption>
					<tblbdy cols="6">
						<r>
							<c ca="left">
								<p>GO category</p>
							</c>
							<c ca="center">
								<p>Total number of genes</p>
							</c>
							<c ca="center">
								<p>Genes changed</p>
							</c>
							<c ca="center">
								<p>Log<sub>10 </sub>(<it>p</it>-value)</p>
							</c>
							<c ca="center">
								<p>FDR</p>
							</c>
							<c ca="center">
								<p>ID</p>
							</c>
						</r>
						<r>
							<c cspan="6">
								<hr/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>
									<b>Membrane signaling and cell adhesion</b>
								</p>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Cell communication</p>
							</c>
							<c ca="center">
								<p>1088</p>
							</c>
							<c ca="center">
								<p>565</p>
							</c>
							<c ca="center">
								<p>-13.76</p>
							</c>
							<c ca="center">
								<p>0.0000</p>
							</c>
							<c ca="center">
								<p>GO:0007154</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Signal transduction</p>
							</c>
							<c ca="center">
								<p>831</p>
							</c>
							<c ca="center">
								<p>428</p>
							</c>
							<c ca="center">
								<p>-8.86</p>
							</c>
							<c ca="center">
								<p>0.0000</p>
							</c>
							<c ca="center">
								<p>GO:0007165</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Cell surface receptor linked signal transduction</p>
							</c>
							<c ca="center">
								<p>413</p>
							</c>
							<c ca="center">
								<p>232</p>
							</c>
							<c ca="center">
								<p>-8.66</p>
							</c>
							<c ca="center">
								<p>0.0000</p>
							</c>
							<c ca="center">
								<p>GO:0007166</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Cell adhesion</p>
							</c>
							<c ca="center">
								<p>257</p>
							</c>
							<c ca="center">
								<p>139</p>
							</c>
							<c ca="center">
								<p>-4.10</p>
							</c>
							<c ca="center">
								<p>0.0000</p>
							</c>
							<c ca="center">
								<p>GO:0007155</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Cell-cell signaling</p>
							</c>
							<c ca="center">
								<p>240</p>
							</c>
							<c ca="center">
								<p>132</p>
							</c>
							<c ca="center">
								<p>-4.38</p>
							</c>
							<c ca="center">
								<p>0.0000</p>
							</c>
							<c ca="center">
								<p>GO:0007267</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Cell motility</p>
							</c>
							<c ca="center">
								<p>197</p>
							</c>
							<c ca="center">
								<p>105</p>
							</c>
							<c ca="center">
								<p>-2.91</p>
							</c>
							<c ca="center">
								<p>0.0171</p>
							</c>
							<c ca="center">
								<p>GO:0006928</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>G-protein coupled receptor protein signaling pathway</p>
							</c>
							<c ca="center">
								<p>175</p>
							</c>
							<c ca="center">
								<p>102</p>
							</c>
							<c ca="center">
								<p>-4.87</p>
							</c>
							<c ca="center">
								<p>0.0000</p>
							</c>
							<c ca="center">
								<p>GO:0007186</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Enzyme linked receptor protein signaling pathway</p>
							</c>
							<c ca="center">
								<p>107</p>
							</c>
							<c ca="center">
								<p>61</p>
							</c>
							<c ca="center">
								<p>-2.78</p>
							</c>
							<c ca="center">
								<p>0.0231</p>
							</c>
							<c ca="center">
								<p>GO:0007167</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Cell-cell adhesion</p>
							</c>
							<c ca="center">
								<p>87</p>
							</c>
							<c ca="center">
								<p>53</p>
							</c>
							<c ca="center">
								<p>-3.41</p>
							</c>
							<c ca="center">
								<p>0.0000</p>
							</c>
							<c ca="center">
								<p>GO:0016337</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>G-protein signaling, coupled to IP3 second messenger (phospholipase C activating)</p>
							</c>
							<c ca="center">
								<p>35</p>
							</c>
							<c ca="center">
								<p>23</p>
							</c>
							<c ca="center">
								<p>-2.32</p>
							</c>
							<c ca="center">
								<p>0.0490</p>
							</c>
							<c ca="center">
								<p>GO:0007200</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Extracellular structure organization and biogenesis</p>
							</c>
							<c ca="center">
								<p>17</p>
							</c>
							<c ca="center">
								<p>14</p>
							</c>
							<c ca="center">
								<p>-3.02</p>
							</c>
							<c ca="center">
								<p>0.0029</p>
							</c>
							<c ca="center">
								<p>GO:0043062</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Extracellular matrix organization and biogenesis</p>
							</c>
							<c ca="center">
								<p>16</p>
							</c>
							<c ca="center">
								<p>13</p>
							</c>
							<c ca="center">
								<p>-2.73</p>
							</c>
							<c ca="center">
								<p>0.0225</p>
							</c>
							<c ca="center">
								<p>GO:0030198</p>
							</c>
						</r>
					</tblbdy>
				</tbl>
			</sec>
		</sec>
		<sec>
			<st>
				<p>Discussion</p>
			</st>
			<p>The use of immortalized cell lines as model systems of normal and pathological tissues is controversial <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B26">26</abbr><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr></abbrgrp>. There are obvious general differences between the environment of cells growing <it>in vitro </it>and that of <it>in vivo </it>tissue cells, including oxidative pressure, nutrient accessibility, cell-cell contact and interactions with ECM, as well as growth rate. These differences influence the gene expression and the phenotype of the cells grown <it>in vitro</it>. Many gene-expression studies have analyzed the differences between cell lines derived from a specific tumor tissue and the corresponding tumor tissues and primary cultures <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B10">10</abbr><abbr bid="B12">12</abbr><abbr bid="B29">29</abbr></abbrgrp>. These studies are important for assessing how well cell-line model systems have maintained the gene expression of their tumor origins, that is, their tissue identities. We have previously developed a method to assess how gene expression in individual cell lines relates to tumors of different tissue origins <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. It is, however, of equal importance to pinpoint the cellular processes affected by long-term <it>in vitro </it>growth irrespective of tissue origin. Therefore we have performed a comprehensive analysis of gene-expression profiles of 60 cell lines and 311 samples from multiple tissue origins. The analyses showed that approximately 30% of the genes investigated were differentially expressed in immortalized cell lines.</p>
			<p>We used GO to characterize the cellular processes that were transcriptionally altered in cell lines. This analysis identified the common biological processes that were transcriptionally altered in rapidly dividing cells, that is, a molecular portrait of proliferation. In support of previous findings <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B10">10</abbr></abbrgrp>, these data confirmed an upregulation of genes involved in translation, cell-cycle regulation and DNA replication. In addition, this comparison identified many other cellular processes that were upregulated (Table <tblr tid="T2">2</tblr>). Genes involved in energy metabolism, nucleotide metabolism, splicing, protein modifications and degradation, and chromatin regulation were enriched among the upregulated genes <it>in vitro</it>. As expected, many of the upregulated genes seem to be directly involved in cell division. For example, the maintenance methylation enzyme, DNA methyltransferase 1 (DNMT1), was consistently upregulated in the rapidly dividing cell lines. DNMT1 methylates newly synthesized DNA and is directly involved in the DNA replication process. The <it>de novo </it>DNA methylation enzymes DNMT3A and DNMT3B were not, however, upregulated in cell lines. Therefore, it is tempting to speculate that the list of upregulated genes is enriched in genes directly involved in the essential cellular processes for rapidly dividing cells (for example, DNA replication). The gene list might therefore be used to predict which cellular factors are general and which factors have more specialized regulatory roles. Certain histone-modifying proteins (HDAC1, EZH2, and HP1 beta and gamma subunits) were upregulated in cell lines whereas others were not. Could these factors also be directly involved in DNA replication?</p>
			<p>Among the genes downregulated <it>in vitro </it>we detected many involved in cell communication, membrane signaling, and adhesion to ECM. A downregulation of genes involved in ECM interactions was previously found in a serial analysis of gene expression (SAGE) study <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. Our results confirm that observation. We further demonstrate that additional membrane signaling proteins, working downstream of G-protein-coupled receptors, were downregulated <it>in vitro</it>. The downregulation of many proteins involved in membrane signaling, cell-cell communication and adhesion to ECM probably reflects the altered environment of cells growing <it>in vitro </it>in defined cell-culture media, in contrast to the organization of cells in tissues <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B26">26</abbr><abbr bid="B27">27</abbr></abbrgrp>. Indeed, when transplanting tumor cell lines into immunodeficient mice and analyzing the resulting tumors, genes involved in ECM and cell adhesion were again upregulated <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. The gene-expression comparison presented in this study could also be used for detailed characterization of particular pathways <abbrgrp><abbr bid="B14">14</abbr></abbrgrp> to identify which are up- or downregulated as part of the cell-line adaptation to <it>in vitro </it>conditions.</p>
			<p>This study compared immortalized cell lines to solid tumors of diverse origins. Tissues are complex, heterogeneous mixtures of cell types, whereas cell lines contain just one more-or-less clonal cell type, selected for its ability to grow under <it>in vitro </it>conditions. It is likely that the expression of genes in tumor-derived cell lines is most similar to that in the malignant cells within the tumor tissue. Thus the <it>in vitro </it>signature is a combined effect of <it>in vitro </it>adaptation and selection for subtypes of cells from the tissue. Although at present it would be methodologically very hard to establish the contribution from either of these two phenomena, some general remarks can be made. Genes more highly expressed in the malignant cell would appear upregulated in cell lines as a result of the enrichment of this cell in culture. Because the tumor samples contained at least 50% malignant cells (usually more, see Materials and methods), this 'enrichment effect' could never result in an artificial fold-change of more than 2. In our data, 344 genes (dataset I) and 1,159 genes (dataset II) were upregulated in cell lines with a fold-change exceeding 2. The enrichment effect therefore cannot explain the major part of the observed upregulation of genes <it>in vitro</it>. It could only bias the numbers to a limited extent. On the other hand, the degree of infiltration by stromal cells varies between different solid tumors <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>. There is a possibility that genes upregulated in stromal cells appear downregulated in cell lines as a result of the lack of these cells in culture. This dilution effect could potentially result in an apparent downregulation in cell lines with a fold-change exceeding 2. For a gene to appear downregulated by more than twofold in cell lines, however, its expression would have to be more than sixfold higher in a stromal compartment comprising 20% of the cells in the tumor than in the malignant cells. One extreme, but interesting, possibility would be that the cells growing <it>in vitro </it>are derived from a putative 'cancer stem cell' <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. In that case the enrichment effect could be profound, and the observed expression signature would then be a combination of the <it>in vitro </it>adaptation and selection for a common cancer stem cell signature. These intriguing issues might be resolved by using laser-capture microdissection <abbrgrp><abbr bid="B35">35</abbr></abbrgrp> on specific subpopulations of cells within the tumor, for cases where reliable stem-cell markers can be established, or by applying tissue modeling in <it>in vitro </it>three-dimensional culture systems <abbrgrp><abbr bid="B26">26</abbr><abbr bid="B27">27</abbr></abbrgrp>. It must be emphasized, however, that the tumor tissue phenotype is very much dictated by the interplay between different cell types, which is decisively interrupted by growth <it>in vitro </it><abbrgrp><abbr bid="B28">28</abbr><abbr bid="B33">33</abbr></abbrgrp>. The interplay between malignant cells and stroma can be dissected using xenografts. In a recent study, human cell lines were injected into mice and the effect of stromal components on the gene expression of the malignant cell was specifically investigated <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. Finally, it is of fundamental importance to pinpoint the common transcriptional differences and similarities of these cell lines to their tissues of origin irrespective of their causes, as in our study. These cell lines are routinely used as model systems of tumors and normal tissues. Therefore the nature and extent of effects related to <it>in vitro </it>culture are profoundly relevant.</p>
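			<p>The enrichment and dilution arguments above can be checked with a short calculation. The sketch below assumes a simple two-compartment linear mixing model, in which the tumor signal for a gene is the cell-fraction-weighted average of its expression in malignant and stromal cells, and the cell-line signal equals the expression in the malignant cells alone. This model and the function names are illustrative assumptions and are not part of the analysis pipeline used in this study.</p>

```python
def tumor_to_cell_line_ratio(malignant_fraction, stromal_to_malignant_ratio):
    """Tumor signal divided by cell-line signal for one gene.

    Assumed model (hypothetical, for illustration only): expression in the
    malignant cells is normalized to 1, so the cell-line signal is 1 and the
    tumor signal is a linear mixture of the two compartments.
    """
    stromal_fraction = 1.0 - malignant_fraction
    return malignant_fraction + stromal_fraction * stromal_to_malignant_ratio


def stromal_ratio_needed(malignant_fraction, target_ratio=2.0):
    """Stromal-to-malignant expression ratio at which the tumor signal
    reaches target_ratio times the cell-line signal."""
    stromal_fraction = 1.0 - malignant_fraction
    return (target_ratio - malignant_fraction) / stromal_fraction


# Enrichment effect: a gene expressed only in malignant cells, in a tumor
# with 50% malignant cells, appears at most twofold upregulated in cell lines.
enrichment_limit = 1.0 / tumor_to_cell_line_ratio(0.5, 0.0)

# Dilution effect: with 20% stroma, the stromal expression must exceed about
# six times the malignant expression for an apparent twofold downregulation.
dilution_threshold = stromal_ratio_needed(0.8)
```

			<p>Under these assumptions the enrichment limit is 2 and the dilution threshold is approximately 6, in line with the figures given in the text. The functions do not validate their input, so a malignant fraction of exactly 1 would cause a division by zero in the second function.</p>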
			<p>It would be interesting to investigate the temporal aspects of the establishment of the <it>in vitro </it>signature. In a recent study 6- and 24-hour primary cultures of hepatocytes were compared to liver tissues <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>. Not surprisingly, it was found that the gene-expression profiles separated gradually with time. However, the genes reported to be upregulated at 6 and 24 hours are not the same as the ones that were found to be universally upregulated in our tumor-derived cell lines, indicating that a longer period of time is needed before the <it>in vitro </it>signature is established. Other studies have identified higher expression of a limited set of proliferation-associated genes in immortalized cancer cell lines when compared with primary cultures <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B29">29</abbr></abbrgrp>. Therefore, it is likely that the extensive differential expression observed in this study occurs as a result of long-term <it>in vitro </it>selection and adaptation.</p>
			<p>This study also introduced a fruitful cross-site approach for quantitative comparison of gene-expression data from different laboratories. The growing wealth of gene-expression data available in public databases offers great opportunities for computational experiments. It must, however, be emphasized that a successful comparison of gene-expression data from different laboratories depends on the quality of the data and similarities in the experimental protocols used <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>. Therefore, careful quality controls and validations of gene-expression comparisons must always be performed. If available, raw data files (that is, CEL files) would enable additional quality controls (such as checking the image for hybridization scratches) and the use of different methods to estimate transcript levels <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>. We developed a quality-control procedure based on examining the scaling factors, the correlation between similar samples, SVD, and an independent validation dataset. This approach was successful in the analysis of gene-expression data from three different laboratories (using the same Affymetrix Hu6800 platform). Thus, quantitative comparisons of gene-expression data from different sites may be feasible.</p>
		</sec>
		<sec>
			<st>
				<p>Conclusion</p>
			</st>
			<p>This cross-site comparison of gene expression in cell lines, normal, and tumor tissues revealed a distinct <it>in vitro </it>gene-expression signature. This signature deserves attention as a biological phenomenon in itself, as it can teach us about the impressive consequences of <it>in vitro </it>selection and adaptation, with implications for tissue organization and future tissue engineering <it>in vitro</it>.</p>
		</sec>
		<sec>
			<st>
				<p>Materials and methods</p>
			</st>
			<sec>
				<st>
					<p>Gene-expression data</p>
				</st>
				<p>We compiled gene-expression data on cell lines, normal, and tumor samples from three different studies <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr></abbrgrp> that all used Affymetrix Hu6800 arrays. The National Cancer Institute NCI60 cell-line gene-expression data <abbrgrp><abbr bid="B15">15</abbr></abbrgrp> were downloaded from Cancer Program Data Sets <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>. The tab-delimited text file (NCI60_aug99_resfile.txt) contained scaled expression data together with 'absolute calls' (absent, present and marginal). The 60 cell lines came from the following tissues: lung (<it>n </it>= 9), colon (<it>n </it>= 7), breast (<it>n </it>= 8), ovary (<it>n </it>= 6), leukemia (<it>n </it>= 6), renal (<it>n </it>= 8), melanoma (<it>n </it>= 8), prostate (<it>n </it>= 2), nervous system (<it>n </it>= 6). Gene-expression data for 59 human tissue samples <abbrgrp><abbr bid="B17">17</abbr></abbrgrp> were downloaded from Human Gene Expression Index <abbrgrp><abbr bid="B40">40</abbr></abbrgrp> in an already normalized format and represented the following samples: blood (<it>n </it>= 1), brain (<it>n </it>= 11), breast (<it>n </it>= 2), colon (<it>n </it>= 1), cervix (<it>n </it>= 1), endometrium (<it>n </it>= 2), esophagus (<it>n </it>= 1), kidney (<it>n </it>= 6), liver (<it>n </it>= 6), lung (<it>n </it>= 6), muscle (<it>n </it>= 6), myometrium (<it>n </it>= 2), ovary (<it>n </it>= 2), placenta (<it>n </it>= 2), prostate (<it>n </it>= 4), spleen (<it>n </it>= 1), stomach (<it>n </it>= 1), testis (<it>n </it>= 1), vulva (<it>n </it>= 3). Gene-expression profiles of 60 normal and 189 tumor samples from 14 different tissue origins <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> were downloaded as raw (unscaled) gene-expression data (GCM_Total.res) from Cancer Program Data Sets <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>. 
Tumor tissue origins were: breast, prostate, lung, colon, lymphoma, melanoma, bladder, uterus, leukemia, kidney, ovary, mesothelioma, and central nervous system. Normal samples were from the following tissues: breast, prostate, lung, colon, germinal center, bladder, uterus, peripheral blood, kidney, pancreas, ovary and central nervous system. All tumors were biopsy specimens from primary sites, obtained before any treatment and enriched for at least 50% malignant cells <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. For further details see <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>.</p>
				<p>An independent validation dataset (dataset II) that contained both <it>in vivo </it>samples (<it>n </it>= 70) and cell lines (<it>n </it>= 25) hybridized to Affymetrix HGU95A arrays <abbrgrp><abbr bid="B18">18</abbr></abbrgrp> was downloaded from the Gene Expression Atlas <abbrgrp><abbr bid="B41">41</abbr></abbrgrp>. The gene-expression data had previously been scaled using the GeneChip Global Scaling algorithm to a target intensity of 200.</p>
				<p>Three datasets were used to assess our ability to classify samples into either cell lines or tissues. Dataset III comprised 10 cell lines and 123 tissue samples <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>. Genes were matched between U133A and HGU95A on the basis of best-match spreadsheets from Affymetrix NetAffx <abbrgrp><abbr bid="B42">42</abbr></abbrgrp>. Dataset IV comprised 15 cell lines and 64 tumors (mostly lymphomas) <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. Dataset V comprised 10 cell lines and 81 lung tumors and normal biopsies <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>, and we used UniGene identifiers to map their genes to our Affymetrix array identifiers. Only a limited number of genes (<it>n </it>= 36) of the 576 had a UniGene match. Nevertheless, using only these 36 genes, most samples were correctly classified as cell lines or tissues. The HUVEC cells of unknown passage from dataset II and FACS-purified cells were excluded from this classification of cell lines and tissues.</p>
			</sec>
			<sec>
				<st>
					<p>Normalization</p>
				</st>
				<p>To compare the gene-expression data generated in different laboratories we rescaled each sample to equal global chip intensity. The global scaling factor for each sample was calculated from its positive average difference values, excluding the top and bottom 2% of these values. A reference sample (lung-derived cell line: NSCLC_H460) was chosen on the basis of its average percent present and its average global chip intensity before rescaling. All other samples were rescaled to the same average chip intensity as the reference sample. We thereafter 'thresholded' the data using a ceiling of 16,000 units and a floor of 20 units.</p>
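				<p>The rescaling and thresholding steps can be sketched in a few lines of Python. This is a minimal illustration using present-day NumPy, not the code used in this study: the function names are hypothetical, the demonstration values are synthetic, and trimming 2% of the values by count at each end is an assumption about how the exclusion is applied.</p>

```python
import numpy as np


def trimmed_mean_positive(values, trim=0.02):
    """Mean of the positive values after dropping the top and bottom `trim` fraction."""
    positive = np.sort(values[values > 0])
    k = int(round(trim * positive.size))
    return positive[k:positive.size - k].mean()


def rescale_to_reference(sample, reference, floor=20.0, ceiling=16000.0):
    """Scale `sample` to the reference's trimmed mean intensity, then threshold."""
    factor = trimmed_mean_positive(reference) / trimmed_mean_positive(sample)
    return np.clip(sample * factor, floor, ceiling)


# Synthetic demonstration: a "sample" that is uniformly twice as bright as the reference.
rng = np.random.default_rng(0)
reference = rng.lognormal(mean=6.0, sigma=1.0, size=1000)
sample = 2.0 * reference
rescaled = rescale_to_reference(sample, reference)
```

				<p>In this synthetic case the rescaled sample coincides with the reference after both have been limited to the range of 20 to 16,000 units.</p>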
			</sec>
			<sec>
				<st>
					<p>Singular value decomposition</p>
				</st>
				<p>Singular value decomposition (SVD) is a standard method in linear algebra and the mathematical details of SVD for gene-expression analysis have been described in detail elsewhere <abbrgrp><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr></abbrgrp>. In brief, SVD decomposes a gene-expression matrix (with genes as rows and arrays as columns) into the product of three matrices, USV<sup>T</sup>. The left singular vectors (hereafter called eigenarrays) are the columns of matrix U, the diagonal elements of S are the singular values, and the rows of V<sup>T </sup>are the right singular vectors. We projected the gene-expression pattern of each sample into a two-dimensional SVD subspace by measuring the correlation between the gene expression of each sample and each of the first two eigenarrays. Before SVD calculation we pre-processed the expression data for each gene independently to an average expression level of zero and a standard deviation of one. We used the SVD implementation in Numerical Python (version 23.1) for Python 2.3.3.</p>
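				<p>The projection described above can be sketched as follows. This is an illustration only: it uses present-day NumPy rather than Numerical Python 23.1, a random matrix stands in for real expression data, and the helper name is hypothetical.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
expression = rng.normal(size=(200, 12))  # hypothetical matrix: 200 genes x 12 arrays

# Standardize each gene (row) to mean zero and standard deviation one.
centered = expression - expression.mean(axis=1, keepdims=True)
standardized = centered / expression.std(axis=1, keepdims=True)

# Thin SVD: standardized = U @ diag(s) @ Vt; the columns of U are the eigenarrays.
U, s, Vt = np.linalg.svd(standardized, full_matrices=False)


def correlation_with_eigenarrays(matrix, eigenarrays):
    """Pearson correlation of every array (column) with every eigenarray."""
    n_arrays = matrix.shape[1]
    both = np.corrcoef(matrix.T, eigenarrays.T)
    return both[:n_arrays, n_arrays:]


# One row per array, one column per eigenarray: coordinates in the 2-D SVD subspace.
coords = correlation_with_eigenarrays(standardized, U[:, :2])
```

				<p>Each row of the resulting array gives the position of one sample in the two-dimensional subspace, which is the form of the data needed for a scatter plot of samples.</p>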
			</sec>
			<sec>
				<st>
					<p>Significance analysis of microarrays</p>
				</st>
				<p>We used the significance analysis of microarrays (SAM) <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> available as an Excel add-in (version 1.21) to identify the number of differentially expressed genes as a function of the false discovery rate (FDR). We identified statistically significant genes at estimated FDRs of zero and 1% (based on 1,000 permutations), using a fold-change cutoff of 1.5.</p>
			</sec>
			<sec>
				<st>
					<p>Classification of gene-expression profiles</p>
				</st>
				<p>We used the genes identified as differentially expressed in datasets I and II (<it>n </it>= 576) to assess whether we could classify samples in five different datasets into either 'cell lines' or 'tissues'. Datasets I and II correspond to the datasets detailed above (Table <tblr tid="T1">1</tblr>) and were used as initial controls. Before calculation we pre-processed the expression data for each gene independently to an average expression level of zero and a standard deviation of one, for each dataset separately. For each dataset, we then calculated the mean expression level of each gene across all cell lines and across all tissues. The resulting average cell line and tissue expression profiles within each dataset are referred to as the 'cell line centroid' and 'tissue centroid'. We then calculated the Euclidean distance (D<sub>e</sub>) between each sample and the cell line centroid and tissue centroid, respectively. We integrated the two distances into a simple score by subtracting the Euclidean distance to the cell line centroid from the Euclidean distance to the tissue centroid. Thus, samples that resemble cell lines more than tissues have a short Euclidean distance to the cell line centroid and a longer distance to the tissue centroid, and therefore get a positive score. For all datasets a bimodal distribution of scores was observed (see Additional data file 2 for the distributions of scores for samples in the five datasets). We defined a threshold for each dataset that gave equal numbers of false positives and false negatives. All scores above the threshold were then classified as 'cell line' and all scores below the threshold as 'tissue'. The performance of the classification was reported as the accuracy, that is, the sum of the true positives and true negatives divided by the total number of predictions for each dataset.</p>
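				<p>The scoring scheme above can be sketched as follows. This is a minimal illustration on synthetic data, not the code used in this study: the function names are hypothetical, and searching the midpoints between consecutive scores for the threshold with the smallest difference between false positives and false negatives is one assumed way of obtaining a balanced threshold.</p>

```python
import numpy as np


def centroid_scores(data, is_cell_line):
    """Distance to the tissue centroid minus distance to the cell-line centroid.

    data: genes x samples array; is_cell_line: boolean array, one entry per sample.
    """
    z = (data - data.mean(axis=1, keepdims=True)) / data.std(axis=1, keepdims=True)
    cell_centroid = z[:, is_cell_line].mean(axis=1, keepdims=True)
    tissue_centroid = z[:, ~is_cell_line].mean(axis=1, keepdims=True)
    d_cell = np.linalg.norm(z - cell_centroid, axis=0)
    d_tissue = np.linalg.norm(z - tissue_centroid, axis=0)
    return d_tissue - d_cell


def balanced_threshold(scores, is_cell_line):
    """Midpoint threshold with the smallest |false positives - false negatives|."""
    ordered = np.sort(scores)
    candidates = (ordered[:-1] + ordered[1:]) / 2.0

    def imbalance(threshold):
        predicted = scores > threshold
        fp = np.sum(np.logical_and(predicted, np.logical_not(is_cell_line)))
        fn = np.sum(np.logical_and(np.logical_not(predicted), is_cell_line))
        return abs(int(fp) - int(fn))

    return min(candidates, key=imbalance)


def accuracy(scores, is_cell_line, threshold):
    """Fraction of samples whose predicted class matches the known class."""
    return float(np.mean((scores > threshold) == is_cell_line))


# Synthetic demonstration: 50 genes, 8 "cell lines" shifted upwards, 12 "tissues".
rng = np.random.default_rng(1)
is_cell_line = np.array([True] * 8 + [False] * 12)
data = rng.normal(size=(50, 20))
data[:, is_cell_line] += 2.0

scores = centroid_scores(data, is_cell_line)
threshold = balanced_threshold(scores, is_cell_line)
print(accuracy(scores, is_cell_line, threshold))
```

				<p>As in the procedure described above, each centroid is computed from all samples of its class within the dataset, so the sample being scored contributes to its own centroid.</p>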
			</sec>
			<sec>
				<st>
					<p>GO analysis</p>
				</st>
				<p>We used GoMiner <abbrgrp><abbr bid="B25">25</abbr></abbrgrp> to analyze the lists of up- and downregulated genes for GO categories that were statistically significantly over-represented. We used the second-generation GoMiner program, which first estimates a <it>p</it>-value for each category using Fisher's exact test and then corrects the <it>p</it>-values for multiple comparisons by estimating the FDR. We reported only GO categories that had corrected <it>p</it>-values of less than 0.05.</p>
			</sec>
		</sec>
		<sec>
			<st>
				<p>Additional data files</p>
			</st>
			<p>The following additional data are available with the online version of this paper. Additional data file <supplr sid="S1">1</supplr> lists the genes found to be differentially expressed in cell lines versus tissues in both datasets, with corresponding gene names, probe identifiers, SAM <it>d </it>scores and fold-change values. The order of the genes in this table is identical to that in Figure <figr fid="F4">4</figr>. Additional data file <supplr sid="S2">2</supplr> contains a figure with a graph of the distribution of scores for all samples in each of the five different datasets. Additional data file <supplr sid="S3">3</supplr> is a high-resolution image of Figure <figr fid="F4">4</figr> in which all sample names and gene identifiers can be found. Additional data file <supplr sid="S4">4</supplr> lists the dataset-specific GO categories downregulated only in cell lines from dataset I. These categories were mainly of immunological processes and are listed with corresponding statistics and GO identifiers. Additional data file <supplr sid="S5">5</supplr> describes the calculations used in the discussion to estimate cell composition effects on gene-expression comparisons.</p>
			<suppl id="S1">
				<title>
					<p>Additional Data File 1</p>
				</title>
				<caption>
					<p>A table listing the genes found to be differentially expressed in cell lines versus tissues in both datasets</p>
				</caption>
				<text>
					<p>Corresponding gene names, probe identifiers, SAM <it>d </it>scores and fold-change values are given. The order of the genes in this table is identical to that in Figure <figr fid="F4">4</figr>.</p>
				</text>
				<file name="gb-2005-6-8-r65-S1.xls">
					<p>Click here for file</p>
				</file>
			</suppl>
			<suppl id="S2">
				<title>
					<p>Additional Data File 2</p>
				</title>
				<caption>
					<p>A figure with a graph of the distribution of scores for all samples in the five different datasets respectively</p>
				</caption>
				<text>
					<p>A figure with a graph of the distribution of scores for all samples in the five different datasets respectively.</p>
				</text>
				<file name="gb-2005-6-8-r65-S2.pdf">
					<p>Click here for file</p>
				</file>
			</suppl>
			<suppl id="S3">
				<title>
					<p>Additional Data File 3</p>
				</title>
				<caption>
					<p>A high-resolution image of Figure <figr fid="F4">4</figr> in which all sample names and gene identifiers can be found</p>
				</caption>
				<text>
					<p>A high-resolution image of Figure <figr fid="F4">4</figr> in which all sample names and gene identifiers can be found.</p>
				</text>
				<file name="gb-2005-6-8-r65-S3.png">
					<p>Click here for file</p>
				</file>
			</suppl>
			<suppl id="S4">
				<title>
					<p>Additional Data File 4</p>
				</title>
				<caption>
					<p>A table listing the dataset-specific GO categories downregulated only in cell lines from dataset I</p>
				</caption>
				<text>
					<p>These categories were mainly of immunological processes and are listed with corresponding statistics and GO identifiers.</p>
				</text>
				<file name="gb-2005-6-8-r65-S4.xls">
					<p>Click here for file</p>
				</file>
			</suppl>
			<suppl id="S5">
				<title>
					<p>Additional Data File 5</p>
				</title>
				<caption>
					<p>A description of the calculations used in the discussion to estimate cell composition effects on gene-expression comparisons</p>
				</caption>
				<text>
					<p>A description of the calculations used in the discussion to estimate cell composition effects on gene-expression comparisons.</p>
				</text>
				<file name="gb-2005-6-8-r65-S5.doc">
					<p>Click here for file</p>
				</file>
			</suppl>
		</sec>
	</bdy>
	<bm>
		<ack>
			<sec>
				<st>
					<p>Acknowledgements</p>
				</st>
				<p>We thank Alexey Kutsenko, Ola Larsson and Ebba Brakenhielm for comments on earlier versions of the manuscript and members of the Ernberg lab for fruitful discussions. The research was funded by the Swedish Knowledge Foundation, the Swedish Cancer Society and the Swedish Research Council.</p>
			</sec>
		</ack>
		<refgrp>
			<bibl id="B1">
				<title>
					<p>Karyotypic complexity of the NCI-60 drug-screening panel.</p>
				</title>
				<aug>
					<au>
						<snm>Roschke</snm>
						<fnm>AV</fnm>
					</au>
					<au>
						<snm>Tonon</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Gehlhaus</snm>
						<fnm>KS</fnm>
					</au>
					<au>
						<snm>McTyre</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Bussey</snm>
						<fnm>KJ</fnm>
					</au>
					<au>
						<snm>Lababidi</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Scudiero</snm>
						<fnm>DA</fnm>
					</au>
					<au>
						<snm>Weinstein</snm>
						<fnm>JN</fnm>
					</au>
					<au>
						<snm>Kirsch</snm>
						<fnm>IR</fnm>
					</au>
				</aug>
				<source>Cancer Res</source>
				<pubdate>2003</pubdate>
				<volume>63</volume>
				<fpage>8634</fpage>
				<lpage>8647</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">14695175</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B2">
				<title>
					<p>Systematic variation in gene expression patterns in human cancer cell lines.</p>
				</title>
				<aug>
					<au>
						<snm>Ross</snm>
						<fnm>DT</fnm>
					</au>
					<au>
						<snm>Scherf</snm>
						<fnm>U</fnm>
					</au>
					<au>
						<snm>Eisen</snm>
						<fnm>MB</fnm>
					</au>
					<au>
						<snm>Perou</snm>
						<fnm>CM</fnm>
					</au>
					<au>
						<snm>Rees</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Spellman</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Iyer</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Jeffrey</snm>
						<fnm>SS</fnm>
					</au>
					<au>
						<snm>Van de Rijn</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Waltham</snm>
						<fnm>M</fnm>
					</au>
					<etal/>
				</aug>
				<source>Nat Genet</source>
				<pubdate>2000</pubdate>
				<volume>24</volume>
				<fpage>227</fpage>
				<lpage>235</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/73432</pubid>
						<pubid idtype="pmpid" link="fulltext">10700174</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B3">
				<title>
					<p>Genome-wide survey of human alternative pre-mRNA splicing with exon junction microarrays.</p>
				</title>
				<aug>
					<au>
						<snm>Johnson</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Castle</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Garrett-Engele</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Kan</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Loerch</snm>
						<fnm>PM</fnm>
					</au>
					<au>
						<snm>Armour</snm>
						<fnm>CD</fnm>
					</au>
					<au>
						<snm>Santos</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Schadt</snm>
						<fnm>EE</fnm>
					</au>
					<au>
						<snm>Stoughton</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Shoemaker</snm>
						<fnm>DD</fnm>
					</au>
				</aug>
				<source>Science</source>
				<pubdate>2003</pubdate>
				<volume>302</volume>
				<fpage>2141</fpage>
				<lpage>2144</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.1090100</pubid>
						<pubid idtype="pmpid" link="fulltext">14684825</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B4">
				<title>
					<p>Single cell profiling of potentiated phospho-protein networks in cancer cells.</p>
				</title>
				<aug>
					<au>
						<snm>Irish</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Hovland</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Krutzik</snm>
						<fnm>PO</fnm>
					</au>
					<au>
						<snm>Perez</snm>
						<fnm>OD</fnm>
					</au>
					<au>
						<snm>Bruserud</snm>
						<fnm>O</fnm>
					</au>
					<au>
						<snm>Gjertsen</snm>
						<fnm>BT</fnm>
					</au>
					<au>
						<snm>Nolan</snm>
						<fnm>GP</fnm>
					</au>
				</aug>
				<source>Cell</source>
				<pubdate>2004</pubdate>
				<volume>118</volume>
				<fpage>217</fpage>
				<lpage>228</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/j.cell.2004.06.028</pubid>
						<pubid idtype="pmpid" link="fulltext">15260991</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B5">
				<title>
					<p>Human cancer cell lines: fact and fantasy.</p>
				</title>
				<aug>
					<au>
						<snm>Masters</snm>
						<fnm>JR</fnm>
					</au>
				</aug>
				<source>Nat Rev Mol Cell Biol</source>
				<pubdate>2000</pubdate>
				<volume>1</volume>
				<fpage>233</fpage>
				<lpage>236</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/35043102</pubid>
						<pubid idtype="pmpid" link="fulltext">11252900</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B6">
				<title>
					<p>Taking the study of cancer cell survival to a new dimension.</p>
				</title>
				<aug>
					<au>
						<snm>Jacks</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Weinberg</snm>
						<fnm>RA</fnm>
					</au>
				</aug>
				<source>Cell</source>
				<pubdate>2002</pubdate>
				<volume>111</volume>
				<fpage>923</fpage>
				<lpage>925</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0092-8674(02)01229-1</pubid>
						<pubid idtype="pmpid" link="fulltext">12507419</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B7">
				<title>
					<p>Regional and strain-specific gene expression mapping in the adult mouse brain.</p>
				</title>
				<aug>
					<au>
						<snm>Sandberg</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Yasuda</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Pankratz</snm>
						<fnm>DG</fnm>
					</au>
					<au>
						<snm>Carter</snm>
						<fnm>TA</fnm>
					</au>
					<au>
						<snm>Del Rio</snm>
						<fnm>JA</fnm>
					</au>
					<au>
						<snm>Wodicka</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Mayford</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Lockhart</snm>
						<fnm>DJ</fnm>
					</au>
					<au>
						<snm>Barlow</snm>
						<fnm>C</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>2000</pubdate>
				<volume>97</volume>
				<fpage>11038</fpage>
				<lpage>11043</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">27144</pubid>
						<pubid idtype="pmpid" link="fulltext">11005875</pubid>
						<pubid idtype="doi">10.1073/pnas.97.20.11038</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B8">
				<title>
					<p>A gene atlas of the mouse and human protein-encoding transcriptomes.</p>
				</title>
				<aug>
					<au>
						<snm>Su</snm>
						<fnm>AI</fnm>
					</au>
					<au>
						<snm>Wiltshire</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Batalov</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Lapp</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Ching</snm>
						<fnm>KA</fnm>
					</au>
					<au>
						<snm>Block</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Soden</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Hayakawa</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Kreiman</snm>
						<fnm>G</fnm>
					</au>
					<etal/>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>2004</pubdate>
				<volume>101</volume>
				<fpage>6062</fpage>
				<lpage>6067</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">395923</pubid>
						<pubid idtype="pmpid" link="fulltext">15075390</pubid>
						<pubid idtype="doi">10.1073/pnas.0400782101</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B9">
				<title>
					<p>Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays.</p>
				</title>
				<aug>
					<au>
						<snm>Alon</snm>
						<fnm>U</fnm>
					</au>
					<au>
						<snm>Barkai</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Notterman</snm>
						<fnm>DA</fnm>
					</au>
					<au>
						<snm>Gish</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Ybarra</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Mack</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Levine</snm>
						<fnm>AJ</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>1999</pubdate>
				<volume>96</volume>
				<fpage>6745</fpage>
				<lpage>6750</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">21986</pubid>
						<pubid idtype="pmpid" link="fulltext">10359783</pubid>
						<pubid idtype="doi">10.1073/pnas.96.12.6745</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B10">
				<title>
					<p>Distinctive gene expression patterns in human mammary epithelial cells and breast cancers.</p>
				</title>
				<aug>
					<au>
						<snm>Perou</snm>
						<fnm>CM</fnm>
					</au>
					<au>
						<snm>Jeffrey</snm>
						<fnm>SS</fnm>
					</au>
					<au>
						<snm>van de Rijn</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Rees</snm>
						<fnm>CA</fnm>
					</au>
					<au>
						<snm>Eisen</snm>
						<fnm>MB</fnm>
					</au>
					<au>
						<snm>Ross</snm>
						<fnm>DT</fnm>
					</au>
					<au>
						<snm>Pergamenschikov</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Williams</snm>
						<fnm>CF</fnm>
					</au>
					<au>
						<snm>Zhu</snm>
						<fnm>SX</fnm>
					</au>
					<au>
						<snm>Lee</snm>
						<fnm>JC</fnm>
					</au>
					<etal/>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>1999</pubdate>
				<volume>96</volume>
				<fpage>9212</fpage>
				<lpage>9217</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">17759</pubid>
						<pubid idtype="pmpid" link="fulltext">10430922</pubid>
						<pubid idtype="doi">10.1073/pnas.96.16.9212</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B11">
				<title>
					<p>Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling.</p>
				</title>
				<aug>
					<au>
						<snm>Alizadeh</snm>
						<fnm>AA</fnm>
					</au>
					<au>
						<snm>Eisen</snm>
						<fnm>MB</fnm>
					</au>
					<au>
						<snm>Davis</snm>
						<fnm>RE</fnm>
					</au>
					<au>
						<snm>Ma</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Lossos</snm>
						<fnm>IS</fnm>
					</au>
					<au>
						<snm>Rosenwald</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Boldrick</snm>
						<fnm>JC</fnm>
					</au>
					<au>
						<snm>Sabet</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Tran</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Yu</snm>
						<fnm>X</fnm>
					</au>
					<etal/>
				</aug>
				<source>Nature</source>
				<pubdate>2000</pubdate>
				<volume>403</volume>
				<fpage>503</fpage>
				<lpage>511</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/35000501</pubid>
						<pubid idtype="pmpid" link="fulltext">10676951</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B12">
				<title>
					<p>Integrated classification of lung tumors and cell lines by expression profiling.</p>
				</title>
				<aug>
					<au>
						<snm>Virtanen</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Ishikawa</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Honjoh</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Kimura</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Shimane</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Miyoshi</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Nomura</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Jones</snm>
						<fnm>MH</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>2002</pubdate>
				<volume>99</volume>
				<fpage>12357</fpage>
				<lpage>12362</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">129449</pubid>
						<pubid idtype="pmpid" link="fulltext">12218176</pubid>
						<pubid idtype="doi">10.1073/pnas.192240599</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B13">
				<title>
					<p>Large-scale meta-analysis of cancer microarray data identifies common transcriptional profiles of neoplastic transformation and progression.</p>
				</title>
				<aug>
					<au>
						<snm>Rhodes</snm>
						<fnm>DR</fnm>
					</au>
					<au>
						<snm>Yu</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Shanker</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Deshpande</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Varambally</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Ghosh</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Barrette</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Pandey</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Chinnaiyan</snm>
						<fnm>AM</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>2004</pubdate>
				<volume>101</volume>
				<fpage>9309</fpage>
				<lpage>9314</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">438973</pubid>
						<pubid idtype="pmpid" link="fulltext">15184677</pubid>
						<pubid idtype="doi">10.1073/pnas.0401994101</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B14">
				<title>
					<p>A module map showing conditional activity of expression modules in cancer.</p>
				</title>
				<aug>
					<au>
						<snm>Segal</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Friedman</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Koller</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Regev</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>Nat Genet</source>
				<pubdate>2004</pubdate>
				<volume>36</volume>
				<fpage>1090</fpage>
				<lpage>1098</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">15448693</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B15">
				<title>
					<p>Chemosensitivity prediction by transcriptional profiling.</p>
				</title>
				<aug>
					<au>
						<snm>Staunton</snm>
						<fnm>JE</fnm>
					</au>
					<au>
						<snm>Slonim</snm>
						<fnm>DK</fnm>
					</au>
					<au>
						<snm>Coller</snm>
						<fnm>HA</fnm>
					</au>
					<au>
						<snm>Tamayo</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Angelo</snm>
						<fnm>MJ</fnm>
					</au>
					<au>
						<snm>Park</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Scherf</snm>
						<fnm>U</fnm>
					</au>
					<au>
						<snm>Lee</snm>
						<fnm>JK</fnm>
					</au>
					<au>
						<snm>Reinhold</snm>
						<fnm>WO</fnm>
					</au>
					<au>
						<snm>Weinstein</snm>
						<fnm>JN</fnm>
					</au>
					<etal/>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>2001</pubdate>
				<volume>98</volume>
				<fpage>10787</fpage>
				<lpage>10792</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">58553</pubid>
						<pubid idtype="pmpid" link="fulltext">11553813</pubid>
						<pubid idtype="doi">10.1073/pnas.191368598</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B16">
				<title>
					<p>Multiclass cancer diagnosis using tumor gene expression signatures.</p>
				</title>
				<aug>
					<au>
						<snm>Ramaswamy</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Tamayo</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Rifkin</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Mukherjee</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Yeang</snm>
						<fnm>CH</fnm>
					</au>
					<au>
						<snm>Angelo</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Ladd</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Reich</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Latulippe</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Mesirov</snm>
						<fnm>JP</fnm>
					</au>
					<etal/>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>2001</pubdate>
				<volume>98</volume>
				<fpage>15149</fpage>
				<lpage>15154</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">64998</pubid>
						<pubid idtype="pmpid" link="fulltext">11742071</pubid>
						<pubid idtype="doi">10.1073/pnas.211566398</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B17">
				<title>
					<p>A compendium of gene expression in normal human tissues.</p>
				</title>
				<aug>
					<au>
						<snm>Hsiao</snm>
						<fnm>LL</fnm>
					</au>
					<au>
						<snm>Dangond</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Yoshida</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Hong</snm>
						