Summary of prediction accuracy results

| Data | Parameters | EWUSC | USC | SC | Published results |
|---|---|---|---|---|---|
| NCI 60 data* | ρ0 | NA | 0.6 | 1.0 | NA |
| | Δ | NA | 0.6 | 0.9 | NA |
| | # relevant genes | NA | 2,116 (2,315) | 3,998 | 200 |
| | Prediction accuracy | NA | 72% | 72% | ~40–60% [4] |
| Multiple tumor data (estimated optimal parameters)† | ρ0 | 0.8 | 0.8 | 1.0 | NA |
| | Δ | 4.8 (5.6) | 4 (5.6) | 8.8 | NA |
| | # relevant genes | 241 (680) | 356 (735) | 3,902 | All genes |
| | Prediction accuracy | 93% | 82% (85%) | 63% (78%) | 78% [5] |
| Multiple tumor data (global optimal parameters)‡ | ρ0 | 0.9 | 0.9 | 1.0 | NA |
| | Δ | 0 | 0 | 0.4 | NA |
| | # relevant genes | 1,622 (1,626) | 1,634 | 7,129 | All genes |
| | Prediction accuracy | 74% (78%) | 74% | 59% (74%) | 78% [5] |
| Breast cancer data | ρ0 | 0.6 (0.7) | 0.6 | 1.0 | NA |
| | Δ | 0.80 | 0.55 (1.15) | 0.5 (1.1) | NA |
| | # relevant genes | 189 (271) | 1,114 (82) | 3,193 (187) | 70 |
| | Prediction accuracy | 84% (89%) | 84% (79%) | 84% | 89% [6] |

Results that differ from those previously reported are highlighted in bold, with the previous results in parentheses. Results that improve on those previously reported are in italic, while results that are worse are underlined. The table summarizes the optimal parameters (ρ0 and Δ), the number of relevant genes chosen, and the prediction accuracy for the NCI 60 data, the multiple tumor data and the breast cancer data. Both EWUSC (error-weighted, uncorrelated shrunken centroid) and USC (uncorrelated shrunken centroid) were motivated by SC (shrunken centroid) [2]. Both EWUSC and USC take advantage of the interdependence between genes by removing highly correlated relevant genes. EWUSC additionally makes use of error estimates or variability over repeated measurements. SC [2] is equivalent to USC at ρ0 = 1. The optimal parameters (Δ, ρ0) for EWUSC are estimated from the cross-validation results of EWUSC, while the optimal parameters (Δ, ρ0) for USC are independently estimated from the cross-validation results of USC.

*Since no repeated measurements or error estimates are available, EWUSC is not applicable to the NCI 60 data. In addition, no separate test set is available for the NCI 60 data, so typical results of random partitions of the original 61 samples into training and test sets are shown.

†The prediction accuracy and number of relevant genes are produced using optimal parameters (Δ, ρ0) estimated by visual observation of 'bends' in the random cross-validation curves.

‡The prediction accuracy and number of relevant genes are produced using global optimal parameters, that is, the (Δ, ρ0) pair that produces the minimum average number of cross-validation errors over all Δ and all ρ0.

Yeung and Bumgarner Genome Biology 2005 6:405 doi:10.1186/gb-2005-6-13-405
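
The ‡ footnote describes the global optimal parameters as the (Δ, ρ0) pair with the minimum average number of cross-validation errors over the whole parameter grid. A minimal Python sketch of that selection rule follows. The function name, the dictionary layout, the tie-breaking rule and the error counts are all assumptions made for illustration; none of them come from the paper or its software.

```python
from statistics import mean


def global_optimal_parameters(cv_errors):
    """Return the (delta, rho0) pair with the lowest mean cross-validation error count.

    cv_errors maps (delta, rho0) -> list of error counts, one per
    cross-validation run. This layout is hypothetical.
    """
    # Rank candidates by mean error count. Ties are broken by the smaller
    # (delta, rho0) tuple, which is an arbitrary choice made for this sketch.
    return min(cv_errors, key=lambda pair: (mean(cv_errors[pair]), pair))


# Made-up error counts for a tiny 2 x 2 grid (not data from the table).
cv_errors = {
    (0.0, 0.9): [10, 12, 11],
    (0.4, 0.9): [12, 13, 12],
    (0.0, 1.0): [15, 14, 16],
    (0.4, 1.0): [14, 13, 15],
}

best_delta, best_rho0 = global_optimal_parameters(cv_errors)
print(f"global optimum: delta={best_delta}, rho0={best_rho0}")
```

The estimated optimal parameters marked † are instead chosen by visually inspecting 'bends' in the cross-validation curves, so a rule like this one would not reproduce them.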