## Figure 1.
Method summary. Figure 1A: Data preprocessing. High-confidence TF target genes of four primary classes are
collected from several datasets (Steps 1-2) and merged into composite lists with four
extra classes for multiple lines of evidence (Step 3). Process-specific gene lists
(Step 4) and TF target genes are assembled into a regulatory matrix (Step 5) such
that unrelated genes are assigned to an additional "baseline" class. Figure 1B: TF significance tests with multinomial logistic regression models. For a given TF,
the alternative model $H_1$ is a univariate multinomial regression model that relates the response (process genes)
to one predictor (TF target genes), such that TF targets are linearly associated
with the probabilities of process gene classes (Step 6). The null model $H_0$ relates the response (process genes) only to the relative frequencies of its classes in the dataset (Step
7). A log-likelihood ratio test measures whether $H_1$ fits the data better than the simpler $H_0$ model (Step 8). All TFs are subject to independent testing (Step 9) and subsequent
multiple testing correction (Step 10). ΔTF, transcription factor knockout strain; ChIP,
chromatin immunoprecipitation; TF, transcription factor; TFBS, transcription factor
binding site.
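
The comparison in Steps 6-8 can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the single predictor is treated as a categorical factor, in which case the fitted class probabilities of the multinomial model equal the observed class frequencies within each predictor level, so both log-likelihoods follow directly from counts. It also assumes that the test statistic is compared against a chi-square distribution, which the caption does not state. The function names (`log_likelihood`, `chi2_sf`, `lr_test`) and the toy labels are hypothetical, and the multiple testing correction of Step 10 is not covered.

```python
import math
from collections import Counter


def log_likelihood(groups):
    """Multinomial log-likelihood at the maximum-likelihood estimate.

    Each element of `groups` is a Counter of response classes; the fitted
    probability of a class is its observed frequency within that group.
    """
    ll = 0.0
    for counts in groups:
        total = sum(counts.values())
        for n in counts.values():
            if n > 0:
                ll += n * math.log(n / total)
    return ll


def chi2_sf(x, df):
    """Upper tail of the chi-square distribution for integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc term plus a finite sum over half-integer powers.
    total = 0.0
    term = math.sqrt(half) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        total += term
        term *= half / (i + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


def lr_test(process_labels, target_labels):
    """Likelihood ratio test of H1 (response depends on TF target class)
    against H0 (response follows its overall class frequencies)."""
    null_counts = Counter(process_labels)
    by_level = {}
    for proc, targ in zip(process_labels, target_labels):
        by_level.setdefault(targ, Counter())[proc] += 1
    ll0 = log_likelihood([null_counts])
    ll1 = log_likelihood(by_level.values())
    stat = max(0.0, 2.0 * (ll1 - ll0))
    # Assumed degrees of freedom: (response classes - 1) * (predictor levels - 1)
    df = (len(null_counts) - 1) * (len(by_level) - 1)
    p_value = chi2_sf(stat, df) if df > 0 else 1.0
    return stat, df, p_value


# Toy data (hypothetical): one label pair per gene.
process = ["G1", "G1", "baseline", "baseline"]
targets = ["bound", "bound", "none", "none"]
stat, df, p_value = lr_test(process, targets)
print(stat, df, p_value)
```

In this sketch each TF would be tested by calling `lr_test` once with its own target labels, which corresponds to the independent testing in Step 9.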
Reimand |