Table 2

Benchmarking the performance of ngLOC (7-gram) against its confidence score

| Confidence score | 0 | 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 |
|---|---|---|---|---|---|---|---|---|---|---|
| % of dataset | 0.0 | 2.4 | 11.8 | 6.1 | 4.4 | 4.5 | 5.8 | 9.3 | 18.1 | 37.5 |
| % overall accuracy | 0.0 | 56.2 | 41.4 | 70.1 | 88.3 | 93.0 | 97.0 | 98.1 | 99.2 | 99.8 |
| Cumulative % of data | 100.0 | 100.0 | 97.6 | 85.7 | 79.6 | 75.2 | 70.7 | 64.9 | 55.6 | 37.5 |
| Cumulative % overall accuracy | 88.8 | 88.8 | 89.6 | 96.2 | 98.3 | 98.8 | 99.2 | 99.4 | 99.6 | 99.8 |

This table shows how the confidence score associated with each prediction relates to the overall accuracy. In general, the higher the score, the more likely the prediction is to be correct. The cumulative rows cover all sequences scoring at or above the given confidence score. For example, sequences scoring 90 or better (37.5% of the dataset) were predicted with an accuracy of 99.8%, and about 80% of the dataset (79.6%) scored 40 or higher, with a cumulative accuracy of 98.3%.

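The relation between the per-score rows and the cumulative rows can be sketched in a few lines of Python. This is an illustration only, under the assumption that each cumulative accuracy is the dataset-weighted average of the per-score accuracies at or above that score; the source does not state how the cumulative rows were computed, and the function name `cumulative_rows` is hypothetical. With the rounded values from the table, this assumption reproduces the published cumulative rows to within about 0.1 percentage points.

```python
# Values transcribed from Table 2; columns are confidence scores 0, 10, ..., 90.
SCORES = list(range(0, 100, 10))
PCT_OF_DATASET = [0.0, 2.4, 11.8, 6.1, 4.4, 4.5, 5.8, 9.3, 18.1, 37.5]
PCT_ACCURACY = [0.0, 56.2, 41.4, 70.1, 88.3, 93.0, 97.0, 98.1, 99.2, 99.8]


def cumulative_rows(pct, acc):
    """Return (cumulative % of data, cumulative % accuracy) per column.

    Assumption (not stated in the source): the cumulative accuracy at a
    given score is the average accuracy over all columns at or above it,
    weighted by each column's share of the dataset.
    """
    cum_pct, cum_acc = [], []
    for i in range(len(pct)):
        share = sum(pct[i:])  # % of data scoring at or above column i
        weighted = sum(p * a for p, a in zip(pct[i:], acc[i:]))
        cum_pct.append(share)
        cum_acc.append(weighted / share if share else 0.0)
    return cum_pct, cum_acc


if __name__ == "__main__":
    cum_pct, cum_acc = cumulative_rows(PCT_OF_DATASET, PCT_ACCURACY)
    for score, p, a in zip(SCORES, cum_pct, cum_acc):
        print(f"score >= {score:2d}: {p:5.1f}% of data, {a:5.1f}% accurate")
```

Small differences from the published rows are expected, because the inputs here are already rounded to one decimal place.
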
King and Guda, Genome Biology 2007, 8:R68. doi:10.1186/gb-2007-8-5-r68