Table 1

Summary of sequence correction models

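| Model | Formalism | Number of parameters | Average R² | Average adjusted R² | Number of transcriptionally active regions |
| --- | --- | --- | --- | --- | --- |
| Uncorrected | NA | NA | NA | NA | 47,463 |
| GC | $\log I = \beta_0 + \beta_{GC}(N_C + N_G)$ | 2 | 0.0293 | 0.0284 | 52,384 |
| Nucleotide-specific | $\log I = \beta_0 + \beta_A N_A + \beta_C N_C + \beta_G N_G$ | 4 | 0.0412 | 0.0373 | 53,982 |
| Bilinear | $\log I = \beta_0 +$ [equation not shown] | 41 = 36 + 4 + 1 | 0.0980 | 0.0604 | 61,731 |
| Full Position-specific | $\log I = \beta_0 +$ [equation not shown] | 109 = 36 × 3 + 1 | 0.1709 | 0.0703 | 71,400 |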

Overview of the models used to relate probe sequence to signal intensity. The Full Position-specific model has the highest R² and also the highest adjusted R², indicating that overfitting is not a concern. The rightmost column shows the number of probed loci classified as transcriptionally active, which varies greatly with the sequence model used. NA, not applicable.
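
To make the parameter counts and the R² versus adjusted R² comparison concrete, the following is a minimal sketch, not the authors' implementation. It fits the GC and Nucleotide-specific models from the table, plus one plausible reading of the Full Position-specific model, by ordinary least squares on synthetic data. Several things here are assumptions: the probe length of 36 is inferred from the count 36 × 3 + 1, the position-specific encoding (one indicator per position for A, C and G, with T as the reference base) is a guess consistent with that count, the simulated intensities are invented, and all function names are hypothetical. The Bilinear model is omitted because its formula is not given here.

```python
import numpy as np

PROBE_LEN = 36   # assumption: inferred from the 36 x 3 + 1 parameter count
BASES = "ACGT"

def count_matrix(probes):
    """Per-probe counts of A, C, G, T (columns in that order)."""
    return np.array([[p.count(b) for b in BASES] for p in probes], dtype=float)

def design_gc(probes):
    """Intercept plus combined G+C count: 2 parameters."""
    n = count_matrix(probes)
    return np.column_stack([np.ones(len(probes)), n[:, 1] + n[:, 2]])

def design_nucleotide(probes):
    """Intercept plus separate A, C, G counts: 4 parameters."""
    n = count_matrix(probes)
    return np.column_stack([np.ones(len(probes)), n[:, :3]])

def design_position(probes):
    """Intercept plus A/C/G indicators at every position (T as reference).

    This encoding is an assumption; it yields 36 * 3 + 1 = 109 columns.
    """
    codes = np.array([[BASES.index(b) for b in p] for p in probes])
    onehot = (codes[:, :, None] == np.arange(3)).astype(float)
    return np.column_stack([np.ones(len(probes)),
                            onehot.reshape(len(probes), -1)])

def fit_r2(X, y):
    """Ordinary least squares; returns (R^2, adjusted R^2)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    r2 = 1.0 - ss_res / ss_tot
    n, p = X.shape                      # p counts the intercept as well
    adj = 1.0 - (1.0 - r2) * (n - 1) / (n - p)
    return r2, adj

# Synthetic probes and log intensities (invented data, for illustration only).
rng = np.random.default_rng(0)
probes = ["".join(rng.choice(list(BASES), PROBE_LEN)) for _ in range(2000)]
log_intensity = (6.0 + 0.05 * design_gc(probes)[:, 1]
                 + rng.normal(0.0, 1.0, len(probes)))

designs = {
    "GC": design_gc(probes),
    "Nucleotide-specific": design_nucleotide(probes),
    "Full Position-specific": design_position(probes),
}
results = {name: fit_r2(X, log_intensity) for name, X in designs.items()}

for name, (r2, adj) in results.items():
    print(f"{name:24s} params={designs[name].shape[1]:3d} "
          f"R2={r2:.4f} adjusted R2={adj:.4f}")
```

Because these three models are nested under this encoding, R² can only increase as parameters are added, whereas adjusted R² penalizes each extra parameter. That is why the caption treats the adjusted value as the relevant check against overfitting.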

Halasz et al. Genome Biology 2006, 7:R59. doi:10.1186/gb-2006-7-7-r59