|
Figure 1.
Method summary. Figure 1A: Data preprocessing. High-confidence TF target genes of four primary classes are
collected from several datasets (Steps 1-2) and merged into composite lists with four
extra classes for multiple lines of evidence (Step 3). Process-specific gene lists
(Step 4) and TF target genes are assembled into a regulatory matrix (Step 5) such
that unrelated genes are assigned to an additional "baseline" class. Figure 1B: TF significance tests with multinomial logistic regression models. For a given TF,
the alternative model H1 is a univariate multinomial regression model that relates the response (process genes)
to one predictor (TF target genes), such that TF targets are linearly associated
with the probabilities of process gene classes (Step 6). The null model H0 relates the response (process genes) only to the relative frequencies of its classes in the dataset (Step 7).
A log-likelihood ratio test assesses whether H1 provides a better fit to the data than the simpler H0 model (Step 8). All TFs are subject to independent testing (Step 9) and subsequent
multiple testing correction (Step 10). ΔTF, transcription factor knockout strain; ChIP,
chromatin immunoprecipitation; TF, transcription factor; TFBS, transcription factor
binding site.
Reimand et al. Genome Biology 2012 13:R55 doi:10.1186/gb-2012-13-6-r55 |
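
The testing procedure in Figure 1B (Steps 6-10) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the TF target annotation is treated as a categorical predictor with an intercept; in that case the maximum-likelihood fitted probabilities of a multinomial logistic regression equal the observed class frequencies within each predictor level, so both log-likelihoods can be computed from counts without an iterative fit. The gene labels, the TF names `TF_A` and `TF_B`, the helper names, and the Bonferroni correction are all assumptions made for illustration; the caption does not name the correction method.

```python
import math
from collections import Counter


def chi2_sf(x, df):
    """Upper-tail chi-square probability P(X >= x).

    Uses the power series for the regularized lower incomplete gamma
    function; adequate for the moderate statistics in this sketch.
    """
    if x <= 0:
        return 1.0
    a, half_x = df / 2.0, x / 2.0
    term = total = 1.0 / a
    for n in range(1, 10000):
        term *= half_x / (a + n)
        total += term
        if term < total * 1e-16:
            break
    lower = total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))
    return min(1.0, max(0.0, 1.0 - lower))


def lrt_test(process_class, tf_class):
    """Log-likelihood ratio test of H1 (process ~ TF targets) against H0.

    Assumption: the predictor is categorical, so the H1 fit reduces to
    per-level class frequencies. Returns (statistic, p_value).
    """
    n = len(process_class)
    joint = Counter(zip(tf_class, process_class))
    by_tf = Counter(tf_class)
    by_process = Counter(process_class)
    # H0 (Step 7): class probabilities are the overall relative frequencies.
    ll_h0 = sum(c * math.log(c / n) for c in by_process.values())
    # H1 (Step 6): class probabilities depend on the TF target class.
    ll_h1 = sum(c * math.log(c / by_tf[t]) for (t, _), c in joint.items())
    # Step 8: likelihood ratio statistic, chi-square under H0.
    stat = 2.0 * (ll_h1 - ll_h0)
    df = (len(by_tf) - 1) * (len(by_process) - 1)
    if df == 0:
        return stat, 1.0
    return stat, chi2_sf(stat, df)


# Hypothetical toy data: 12 genes, one process class label per gene.
process = ["G1"] * 4 + ["S"] * 4 + ["baseline"] * 4
# Hypothetical TF target annotations for the same 12 genes.
tf_targets = {
    "TF_A": ["ChIP"] * 4 + ["baseline"] * 8,              # matches the G1 genes
    "TF_B": ["ChIP", "ChIP", "baseline", "baseline"] * 3,  # unrelated to process
}

# Step 9: test every TF independently.
results = {tf: lrt_test(process, targets) for tf, targets in tf_targets.items()}

# Step 10: multiple testing correction (Bonferroni, chosen here as an assumption).
m = len(results)
adjusted = {tf: min(1.0, p * m) for tf, (_, p) in results.items()}

for tf, (stat, p) in results.items():
    print(f"{tf}: statistic={stat:.3f}, p={p:.3g}, adjusted p={adjusted[tf]:.3g}")
```

In this toy data, the `TF_A` targets coincide with one process class and receive a small p-value, while the `TF_B` targets are distributed identically across the process classes, so H1 fits no better than H0.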