Table 2 |
||||||
|
Assessment, by keyword recovery, of the functional linkages established by the Operon method at various distance thresholds |
||||||
| Threshold (bp) | Functional links between SwissProt Annotated Proteins | Functional links with no keywords in common | Correct keywords recovered | Total keywords | Maximum false positive fraction* | Keyword recovery† |
|---|---|---|---|---|---|---|
| 0 | 308 | 78 | 446 | 883 | 0.25 | 0.51 |
| 25 | 642 | 180 | 856 | 1766 | 0.28 | 0.48 |
| 50 | 818 | 254 | 1044 | 2226 | 0.31 | 0.47 |
| 75 | 912 | 326 | 1080 | 2453 | 0.36 | 0.44 |
| 100 | 1044 | 362 | 1224 | 2726 | 0.35 | 0.45 |
||||||
|
*The maximum false positive fractions were calculated as the fraction of pairwise links that do not have any SWISS-PROT keywords in common (ignoring the keywords 'hypothetical protein', 'three-dimensional structure', 'transmembrane' and 'complete proteome').

†Keyword recovery was calculated by comparing the SWISS-PROT keyword annotation between each pair of linked M. tuberculosis genes. The keyword recovery of all linkages was calculated as:

$$\text{keyword recovery} = \frac{1}{X} \sum_{i=1}^{Y} \sum_{j=1}^{x} n_j$$

where X is the total number of query protein keywords, Y is the total number of linked gene pairs, x is the number of SWISS-PROT keywords of the query protein in a given pair, and n_j is the number of times the query protein keyword j occurs in the linked protein.

At 0 bp the keyword recovery is about 50% (0.51), while the maximum false positive fraction is about 25%. As the distance threshold increases from 0 bp to 100 bp, the keyword recovery falls from 0.51 to 0.45 and the maximum false positive fraction rises from 0.25 to 0.35, although neither trend is strictly monotonic between 75 bp and 100 bp. |
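The two footnoted measures can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. The function names (`informative`, `assess_links`, `ratios_from_counts`) are hypothetical, and keywords are assumed to be held as sets, so each n_j is 0 or 1 and each linked pair is scored in one direction only. The second half simply recomputes the last two columns of the table from its four count columns.

```python
# Keywords the footnote says to ignore when comparing annotations.
IGNORED_KEYWORDS = {
    "hypothetical protein",
    "three-dimensional structure",
    "transmembrane",
    "complete proteome",
}


def informative(keywords):
    """Lower-case the keywords and drop the ignored ones."""
    return {k.lower() for k in keywords} - IGNORED_KEYWORDS


def assess_links(links):
    """Score an iterable of (query_keywords, linked_keywords) pairs.

    Assumption: keywords are treated as sets, so a query keyword is
    counted at most once per linked protein.
    """
    n_links = no_shared = recovered = total = 0
    for query_kw, linked_kw in links:
        query = informative(query_kw)
        shared = query & informative(linked_kw)
        n_links += 1
        if not shared:
            no_shared += 1          # link with no keywords in common
        recovered += len(shared)    # correct keywords recovered
        total += len(query)         # total query keywords (X)
    return {
        "max_false_positive_fraction": no_shared / n_links if n_links else 0.0,
        "keyword_recovery": recovered / total if total else 0.0,
    }


# Count columns of the table, keyed by distance threshold (bp):
# (links, links with no shared keywords, keywords recovered, total keywords)
TABLE_2_COUNTS = {
    0: (308, 78, 446, 883),
    25: (642, 180, 856, 1766),
    50: (818, 254, 1044, 2226),
    75: (912, 326, 1080, 2453),
    100: (1044, 362, 1224, 2726),
}


def ratios_from_counts(links, no_shared, recovered, total):
    """Return (maximum false positive fraction, keyword recovery), rounded."""
    return round(no_shared / links, 2), round(recovered / total, 2)


if __name__ == "__main__":
    for threshold, counts in TABLE_2_COUNTS.items():
        print(threshold, ratios_from_counts(*counts))
```

Under these assumptions, the ratios computed from the count columns agree with the published fractions at every threshold, which supports reading keyword recovery as correct keywords recovered divided by total keywords.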
||||||
|
Strong et al. Genome Biology 2003 4:R59 doi:10.1186/gb-2003-4-9-r59 |
||||||