As a service to the research community, Genome Biology used to publish non-peer-reviewed articles in a 'preprint' depository to which any research could be submitted and which all individuals can access free of charge. From January 2006 Genome Biology no longer publishes new articles in this section. Any article could be submitted by authors, who have sole responsibility for the article's content. The only screening process was to ensure relevance of the preprint to Genome Biology's scope and to avoid abusive, libellous or indecent articles. Articles in this section of the journal have not been peer-reviewed. Each preprint has a permanent URL, by which it can be cited. Research submitted to the preprint depository may be simultaneously or subsequently submitted to Genome Biology or any other publication for peer review; the only requirement is an explicit citation of, and link to, the preprint in the article that is eventually published. If possible, Genome Biology will provide a reciprocal link from the preprint depository to the published article.

Deposited research article

A tool for comparing different statistical methods on identifying differentially expressed genes

Paul Fogel*1, Li Liu*2, Bruno Dumas3 and Nanxiang Ge2

1Paul Fogel Consultant, 4 rue Le Goff, 75005 Paris, France

2Biometrics and Data Management, Sanofi-Aventis, Mail Stop B-203A, PO Box 6800, 1041 Route 202-206, Bridgewater, NJ 08873, USA

3Yeast Genomics, Functional Genomics, Sanofi-Aventis, 13 Quai Jules Guesde, 94403 Vitry sur Seine Cedex, France

* Contributed equally

Genome Biology 2004, 6:P2 doi:10.1186/gb-2004-6-1-p2



This was the first version of this article to be made available publicly.

Subject areas: Methods, Bioinformatics

The electronic version of this article is the complete one and can be found online at: http://genomebiology.com/2004/6/1/P2

Received: 7 December 2004
Posted: 8 December 2004

© 2004 BioMed Central Ltd

Abstract

Background

Many statistical methods have been developed for two-group comparison microarray experiments. Often, whether a substantial number of genes are selected depends on which method is used, so practical guidance on applying these methods is required. We developed a bootstrap-based procedure and an associated criterion for viewing and quantifying differences between method-dependent selections. We applied this procedure to three datasets covering a range of possible sample sizes to compare three well-known methods: t-test, LPE and SAM.
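As a rough illustration of how method-dependent selections can differ, the following sketch counts the genes shared by each pair of methods. The gene identifiers and selections are invented for the example, and the Jaccard index is only one possible overlap measure; it is not the criterion proposed in this article.

```python
from itertools import combinations

def pairwise_overlap(selections):
    """Return {(method_a, method_b): (n_shared, jaccard)} for each pair of methods."""
    result = {}
    for a, b in combinations(sorted(selections), 2):
        shared = selections[a] & selections[b]
        union = selections[a] | selections[b]
        # Two empty selections are treated as identical.
        jaccard = len(shared) / len(union) if union else 1.0
        result[(a, b)] = (len(shared), jaccard)
    return result

# Hypothetical selections; real ones would come from running each method.
selections = {
    "t-test": {"g1", "g2", "g3", "g4"},
    "LPE": {"g2", "g3", "g5"},
    "SAM": {"g1", "g2", "g3", "g6"},
}
overlaps = pairwise_overlap(selections)
for pair, (n_shared, jaccard) in overlaps.items():
    print(pair, n_shared, round(jaccard, 2))
```

A table of such counts shows how much the methods disagree, but not which method is preferable for a given dataset.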

Results

Our visualization method and the associated variability conformation rate (VCR) criterion show that the standard t-test is appropriate for large sample sizes, which allow accurate variance estimates. LPE borrows strength from neighboring genes to estimate the variances and is therefore more appropriate for small sample sizes, provided gene variances are similar at similar gene intensity levels. SAM combines two advantages: it considers gene-specific variance, like the t-test, and it adjusts for multiple testing through a permutation-based false discovery rate. However, for small sample sizes and in cases of numerous expressed genes, the distribution based on permuted datasets may not approximate the null distribution well, resulting in an inaccurate false discovery rate. Moreover, genes with low variances may be filtered out because of the fudge factor.
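The effect of the fudge factor on low-variance genes can be seen in a small sketch. It compares an ordinary pooled-variance t statistic with a SAM-like statistic that adds a constant s0 to the denominator. This is a simplification under stated assumptions: the intensity values are invented, s0 is fixed arbitrarily here (SAM estimates it from the data), and the permutation-based false discovery rate is not reproduced.

```python
import math
import statistics

def pooled_se(treated, control):
    """Standard error of the mean difference, using the pooled variance."""
    n1, n2 = len(treated), len(control)
    pooled_var = ((n1 - 1) * statistics.variance(treated)
                  + (n2 - 1) * statistics.variance(control)) / (n1 + n2 - 2)
    return math.sqrt(pooled_var * (1 / n1 + 1 / n2))

def t_statistic(treated, control):
    """Two-sample t statistic with gene-specific pooled variance."""
    diff = statistics.mean(treated) - statistics.mean(control)
    return diff / pooled_se(treated, control)

def sam_like_statistic(treated, control, s0):
    """Same numerator as the t statistic, with s0 added to the denominator."""
    diff = statistics.mean(treated) - statistics.mean(control)
    return diff / (pooled_se(treated, control) + s0)

# Hypothetical log-intensities: a small but very consistent change,
# and a larger but noisier change.
low_var = ([10.4, 10.5, 10.6], [10.0, 10.1, 10.2])
high_var = ([12.0, 13.0, 14.0], [10.0, 11.0, 12.0])
s0 = 0.5  # arbitrary value chosen for illustration

t_low, t_high = t_statistic(*low_var), t_statistic(*high_var)
d_low, d_high = (sam_like_statistic(*low_var, s0),
                 sam_like_statistic(*high_var, s0))
print(f"t:        low-variance {t_low:.2f}, high-variance {t_high:.2f}")
print(f"SAM-like: low-variance {d_low:.2f}, high-variance {d_high:.2f}")
```

With these values the low-variance gene ranks first by the t statistic but second by the SAM-like statistic, which is how such a gene can drop out of a selection.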

Conclusions

We propose using VCR to assess the different statistical methods available for analyzing microarray data. The criterion is based on a bootstrap method that we developed to estimate the 2-d distribution of treated vs. control gene intensity levels under the null hypothesis that there is no difference between the treatment and control groups. Biological evaluation of the genes selected by each method confirmed that this criterion is appropriate for helping to identify the most suitable method.
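The abstract does not specify the resampling scheme, so the following is only a minimal sketch of one way to build a null 2-d distribution by bootstrap, not the authors' procedure. It assumes that both pseudo-groups for a gene are resampled with replacement from that gene's control values, so any difference between them reflects noise alone. The function name, parameters and intensity values are hypothetical.

```python
import random
import statistics

def bootstrap_null_points(control, n_treated, n_control, n_boot=1000, seed=0):
    """Return (pseudo-control mean, pseudo-treated mean) pairs for one gene.

    Both pseudo-groups are drawn with replacement from the same control
    values, which mimics the null hypothesis of no treatment effect.
    """
    rng = random.Random(seed)
    points = []
    for _ in range(n_boot):
        pseudo_control = rng.choices(control, k=n_control)
        pseudo_treated = rng.choices(control, k=n_treated)
        points.append((statistics.mean(pseudo_control),
                       statistics.mean(pseudo_treated)))
    return points

# Hypothetical control log-intensities for a single gene.
control = [7.9, 8.1, 8.0, 8.2, 7.8]
points = bootstrap_null_points(control, n_treated=3, n_control=3)

# Empirical 95% range of the treated-minus-control difference under the null.
diffs = sorted(treated - ctrl for ctrl, treated in points)
lower, upper = diffs[int(0.025 * len(diffs))], diffs[int(0.975 * len(diffs))]
print(f"null 95% range of the difference: [{lower:.3f}, {upper:.3f}]")
```

Plotting such points for all genes gives a cloud around the diagonal of the treated vs. control plane, against which the genes selected by a given method can be viewed.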

Additional data files

The following additional data files are provided with this article: Additional data file 1, depicting a table showing the overlap among different methods for the yeast data; Additional data file 2, showing additional Figure 1; Additional data file 3, showing additional Figure 2; Additional data file 4, showing additional Figure 3; Additional data file 5, showing additional Figure 4.

Additional data file 1. A table showing the overlap among different methods for the yeast data

Format: PDF Size: 106KB

Additional data file 2. Additional figure 1

Format: PNG Size: 18KB

Additional data file 3. Additional figure 2

Format: PNG Size: 16KB

Additional data file 4. Additional figure 3

Format: PNG Size: 16KB

Additional data file 5. Additional figure 4

Format: PNG Size: 17KB
