|
Figure 2.
Type-I error control. The panels show empirical cumulative distribution functions (ECDFs) for P values from a comparison of one replicate from condition A of the fly RNA-Seq data with the other one. No genes are truly differentially expressed,
and the ECDF curves (blue) should remain below the diagonal (gray). Panel (a): top
row corresponds to DESeq, middle row to edgeR and bottom row to a Poisson-based χ² test. The right column shows the distributions for all genes; the left and middle
columns show them separately for genes with a mean below and above 100. Panel (b) shows
the same data, but zooms into the range of small P values. The plots indicate that edgeR and DESeq control type I error at (and in fact slightly below) the nominal rate, while the Poisson-based
χ² test fails to do so. edgeR has an excess of small P values for low counts: the blue line lies above the diagonal. This excess is, however,
compensated by the method being more conservative for high counts. All methods show
a point mass at p = 1; this is due to the discreteness of the data, whose effect is particularly evident
at low counts.
Anders and Huber Genome Biology 2010 11:R106 doi:10.1186/gb-2010-11-10-r106 |
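
The check described in the caption, whether the ECDF of null P values stays at or below the diagonal, can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' code: the function names and the toy P values are hypothetical, and the thresholds are chosen arbitrarily.

```python
from bisect import bisect_right


def ecdf(p_values, thresholds):
    """Evaluate the empirical CDF of the P values at each threshold,
    i.e. the fraction of P values less than or equal to it."""
    ordered = sorted(p_values)
    n = len(ordered)
    return [bisect_right(ordered, t) / n for t in thresholds]


def exceeds_diagonal(p_values, thresholds):
    """Return the thresholds at which the ECDF lies above the diagonal
    y = x, meaning more rejections than the nominal rate allows."""
    fractions = ecdf(p_values, thresholds)
    return [t for t, f in zip(thresholds, fractions) if f > t]


# Hypothetical toy P values (not taken from the fly RNA-Seq data).
conservative = [0.08, 0.2, 0.35, 0.5, 0.6, 0.75, 0.9, 1.0, 1.0, 1.0]
anticonservative = [0.001, 0.004, 0.01, 0.03, 0.2, 0.4, 0.6, 0.8, 1.0, 1.0]
thresholds = [0.01, 0.05, 0.1]

print(exceeds_diagonal(conservative, thresholds))
print(exceeds_diagonal(anticonservative, thresholds))
```

An empty result means the ECDF never rises above the diagonal at the tested thresholds, which is the behaviour expected of a test that controls type I error when no genes are truly differentially expressed.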