Table 4

Biomarker of tobacco smoke exposure constructed using the 28 irreversible genes

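| | Training set: Never | Training set: Former | Training set: All | Test set: Current | GSE4115: Former | GSE4115: Current | GSE4115: All | GSE5372: Non-smokers | GSE5372: Smokers | GSE5372: All |
|---|---|---|---|---|---|---|---|---|---|---|
| Number Classified Correctly | 21 | 31 | 52 | 46 | 38 | 38 | 76 | 4 | 4 | 8 |
| Total Number | 21 | 31 | 52 | 52 | 47 | 38 | 85 | 4 | 5 | 9 |
| Accuracy | 100.0% | 100.0% | 100.0% | 88.5% | 80.9% | 100.0% | 89.4% | 100.0% | 80.0% | 88.9% |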

Mean of Random sets: 52.5%, 59.2%, 59.2%, 50.1%

P value: 0, 0.102, 0.013, 0.001


The accuracy of the biomarker is reported for the training set samples, the test set samples, samples from dataset GSE4115 that do not overlap with the present study, and samples from GSE5372. Each P value is the proportion of 1,000 random training sets whose accuracy on the tested samples equals or exceeds that of the actual biomarker.
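The two quantities described in this note can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `accuracy` and `empirical_p_value` are hypothetical, and the list of random-set accuracies is invented toy data standing in for the 1,000 random training sets, whose individual accuracies are not reported here. Only the counts 46 of 52 and 76 of 85 come from the table.

```python
from typing import Sequence


def accuracy(correct: int, total: int) -> float:
    """Fraction of samples classified correctly."""
    if total <= 0:
        raise ValueError("total must be positive")
    return correct / total


def empirical_p_value(observed: float, random_accuracies: Sequence[float]) -> float:
    """Proportion of random-set accuracies that equal or exceed the observed one."""
    if not random_accuracies:
        raise ValueError("need at least one random accuracy")
    at_least_as_good = sum(1 for a in random_accuracies if a >= observed)
    return at_least_as_good / len(random_accuracies)


# Counts from the table: test set (current smokers), 46 of 52 correct.
test_accuracy = accuracy(46, 52)

# Hypothetical accuracies for 5 random training sets (illustrative only;
# the study used 1,000 and their values are not listed in the table).
toy_random = [0.50, 0.60, 0.90, 0.55, 0.45]
toy_p = empirical_p_value(test_accuracy, toy_random)

print(f"Test set accuracy: {test_accuracy:.1%}")
print(f"Toy empirical P value: {toy_p}")
```

Under this definition a P value of 0 means that none of the random training sets matched the accuracy of the actual biomarker on those samples.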

Beane et al. Genome Biology 2007 8:R201   doi:10.1186/gb-2007-8-9-r201
