Table 2. Top features in CoreBoost

| Classifier type | | Features |
|---|---|---|
| CpG | P versus U | Log-likelihood ratios from third order Markov chain, log-likelihood ratios from TSS weight matrix |
| | | GC-box score, weighted score of transcription factor NFY, weighted energy score at position +1 |
| | | Weighted score of transcription factor YY1, TATA score, weighted score of transcription factor ELK1 |
| | | MTE score, weighted score of transcription factor CREB |
| | P versus D | Log-likelihood ratios from third order Markov chain, GC-box score |
| | | Weighted score of transcription factor NFY |
| | | Log-likelihood ratios from TSS weight matrix |
| | | Difference between the energy score around positions -25 and +1 and the average from surroundings |
| | | Log-likelihood ratios from transcription factor ELK1, frequency of G+C |
| | | Log-likelihood ratios from transcription factor YY1, TATA score, frequency of G |
| Non-CpG | P versus U | Correlation between vector of energy scores and empirical average energy profile |
| | | Log-likelihood ratios from third order Markov chain, TATA score |
| | | Difference between the energy score around positions -25 and +1 and the average from surroundings |
| | | Weighted energy at position +1 |
| | | Proportion of Inr and GC-box pair within 10 bp of observed distance, Inr score |
| | P versus D | Correlation between vector of energy scores and empirical average energy profile, TATA score |
| | | Log-likelihood ratios from third order Markov chain |
| | | Weighted energy at position +1 |
| | | Correlation between vector of flexibility scores and empirical average flexibility profile, Inr score |
| | | Difference between the flexibility score around position +1 and the average from surroundings, GC-box score |

bp, base pairs; D, immediate downstream sequence; P, promoter; TSS, transcription start site; U, immediate upstream sequence.

Zhao et al. Genome Biology 2007 8:R17 doi:10.1186/gb-2007-8-2-r17