Table 1

Summary of sequence correction models

| Model | Formalism | Number of parameters | Average R² | Average adjusted R² | Number of transcriptionally active regions |
|---|---|---|---|---|---|
| Uncorrected | NA | NA | NA | NA | 47,463 |
| GC | $\log I = \beta_0 + \beta_{GC}(N_C + N_G)$ | 2 | 0.0293 | 0.0284 | 52,384 |
| Nucleotide-specific | $\log I = \beta_0 + \beta_A N_A + \beta_C N_C + \beta_G N_G$ | 4 | 0.0412 | 0.0373 | 53,982 |
| Bilinear | $\log I = \beta_0 + \ldots$ | 41 = 36 + 4 + 1 | 0.0980 | 0.0604 | 61,731 |
| Full Position-specific | $\log I = \beta_0 + \ldots$ | 109 = 36 × 3 + 1 | 0.1709 | 0.0703 | 71,400 |

Overview of the models used to relate probe sequence to signal intensity. The Full Position-specific model has the highest R² and also the highest adjusted R², which suggests that its improved fit is not simply a result of overfitting. The rightmost column shows the number of probed loci classified as transcriptionally active, which varies greatly with the sequence model used, from 47,463 without correction to 71,400 under the Full Position-specific model. NA, not applicable.

Halasz et al. Genome Biology 2006 7:R59 doi:10.1186/gb-2006-7-7-r59
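
The two simplest models in Table 1 and the relationship between R² and adjusted R² can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' code or data: the probes and log-intensities are synthetic, the probe length of 36 is inferred from the parameter counts in the table, and the helper names `count_design` and `fit_r2` are hypothetical. The adjusted R² uses the standard formula 1 − (1 − R²)(n − 1)/(n − k), where k counts every fitted parameter including the intercept, matching the "Number of parameters" column.

```python
import numpy as np

def count_design(probes, groups):
    """Design matrix: an intercept column plus one count column per group.

    Each group is a string of nucleotides whose counts are pooled,
    e.g. "CG" gives the single GC-content column N_C + N_G.
    """
    cols = [np.ones(len(probes))]
    for group in groups:
        cols.append(np.array(
            [sum(p.count(ch) for ch in group) for p in probes], dtype=float))
    return np.column_stack(cols)

def fit_r2(X, y):
    """Ordinary least squares fit; returns (R2, adjusted R2)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    n, k = X.shape  # k includes the intercept
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k)
    return r2, adj_r2

# Synthetic data (hypothetical): random 36-mers with a weak sequence effect.
rng = np.random.default_rng(0)
probes = ["".join(rng.choice(list("ACGT"), size=36)) for _ in range(2000)]
n_c = np.array([p.count("C") for p in probes], dtype=float)
n_g = np.array([p.count("G") for p in probes], dtype=float)
log_i = 5.0 + 0.05 * n_c + 0.02 * n_g + rng.normal(0.0, 1.0, size=len(probes))

# GC model: log I = b0 + b_GC (N_C + N_G), 2 parameters.
X_gc = count_design(probes, ["CG"])
r2_gc, adj_gc = fit_r2(X_gc, log_i)

# Nucleotide-specific model: log I = b0 + b_A N_A + b_C N_C + b_G N_G, 4 parameters.
X_nt = count_design(probes, ["A", "C", "G"])
r2_nt, adj_nt = fit_r2(X_nt, log_i)

print(f"GC:                  R2={r2_gc:.4f}  adjusted R2={adj_gc:.4f}")
print(f"Nucleotide-specific: R2={r2_nt:.4f}  adjusted R2={adj_nt:.4f}")
```

Because the GC model is nested in the nucleotide-specific one, the larger model can never have a lower R² on the same data; the adjusted R² penalizes the extra parameters, which is why the table reports both. The count for T is omitted from the design because, for fixed-length probes, it is determined by the other three counts and the intercept.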