Table 1

Classification of sequences by linear discriminant analysis (LDA)


Prediction                         Known trans-splicing regions       Known coding regions
Predicted trans-splicing region    True positive: 20 (14-24)          False positive: 0.9 (0-3)
Predicted coding region            False negative: 0.7 (0-2)          True negative: 18.5 (10-31)
Summary                            Sensitivity: 0.97 (0.91-1.00)      Specificity: 0.96 (0.88-1.00)

Accuracy: 0.96 (0.90-1.00)

The overall performance of the LDA method after tenfold cross-validation using 214 known trans-splicing regions and 198 coding regions is shown here. The average across all ten testing sets is reported, with the range of values indicated in parentheses for each class of sequence. Each test dataset had on average 20.7 known trans-splicing regions and 19.4 known coding regions.
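As an illustration of how the summary statistics relate to the four cells of the table, the short sketch below applies the standard definitions of sensitivity, specificity and accuracy to the mean counts reported in Table 1. The function name `confusion_metrics` is hypothetical and is not taken from the paper. The published values appear to be averages of the per-fold statistics, so figures computed from the mean counts may differ from them in the last digit.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    Standard definitions; this helper is illustrative, not the authors' code.
    """
    sensitivity = tp / (tp + fn)                 # fraction of known trans-splicing regions recovered
    specificity = tn / (tn + fp)                 # fraction of known coding regions correctly rejected
    accuracy = (tp + tn) / (tp + fp + fn + tn)   # fraction of all test sequences classified correctly
    return sensitivity, specificity, accuracy


# Mean counts per test set, as reported in Table 1.
sens, spec, acc = confusion_metrics(tp=20.0, fp=0.9, fn=0.7, tn=18.5)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
# prints: sensitivity=0.966 specificity=0.954 accuracy=0.960
```

The row sums of the inputs also reproduce the average test-set sizes given in the legend: 20 + 0.7 = 20.7 known trans-splicing regions and 0.9 + 18.5 = 19.4 known coding regions.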

Gopal et al. Genome Biology 2005 6:R95 doi:10.1186/gb-2005-6-11-r95