Table 2

Benchmarking the performance of ngLOC (7-gram) against its confidence score


Confidence score                     0     10     20     30     40     50     60     70     80     90
% of dataset                       0.0    2.4   11.8    6.1    4.4    4.5    5.8    9.3   18.1   37.5
% overall accuracy                 0.0   56.2   41.4   70.1   88.3   93.0   97.0   98.1   99.2   99.8
Cumulative % of data             100.0  100.0   97.6   85.7   79.6   75.2   70.7   64.9   55.6   37.5
Cumulative % overall accuracy     88.8   88.8   89.6   96.2   98.3   98.8   99.2   99.4   99.6   99.8

This table shows how the confidence score associated with each prediction relates to the overall accuracy. In general, the higher the score, the more likely the prediction is to be correct. For example, sequences scoring 90 or better were predicted with an accuracy of 99.8%. About 80% of the dataset (79.6%) scored 40 or higher, with a cumulative accuracy of 98.3%.
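The relationship between the per-score rows and the cumulative rows can be illustrated with a short sketch. It assumes (the table does not state this) that each cumulative entry covers all predictions at or above the given score, and that cumulative accuracy is the average of the per-score accuracies weighted by each score's share of the dataset. The function name `cumulative_rows` is hypothetical. Because the inputs are the rounded values printed in the table, the recomputed figures agree with the published ones only to within about 0.1.

```python
# Per-score values copied from Table 2 (already rounded to one decimal).
scores = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
pct_data = [0.0, 2.4, 11.8, 6.1, 4.4, 4.5, 5.8, 9.3, 18.1, 37.5]
accuracy = [0.0, 56.2, 41.4, 70.1, 88.3, 93.0, 97.0, 98.1, 99.2, 99.8]


def cumulative_rows(pct_data, accuracy):
    """Recompute the cumulative rows from the per-score rows.

    Assumption: entry i aggregates every score bin from i upward, and
    cumulative accuracy is weighted by each bin's share of the dataset.
    """
    cum_data, cum_acc = [], []
    for i in range(len(pct_data)):
        share = sum(pct_data[i:])  # % of data at or above this score
        weighted = sum(p * a for p, a in zip(pct_data[i:], accuracy[i:]))
        cum_data.append(share)
        cum_acc.append(weighted / share if share else 0.0)
    return cum_data, cum_acc


cum_data, cum_acc = cumulative_rows(pct_data, accuracy)
for s, d, a in zip(scores, cum_data, cum_acc):
    print(f"score >= {s:2d}: {d:5.1f}% of data, {a:5.1f}% accuracy")
```

Read this way, the cumulative rows describe the trade-off of applying a score cutoff: raising the cutoff discards a larger share of the predictions but leaves a more accurate remainder.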

King and Guda Genome Biology 2007 8:R68   doi:10.1186/gb-2007-8-5-r68