|
Figure 1.
bioPIXIE network recovery evaluation. (a-c) Typical network recovery performance for three KEGG pathways. For each pathway,
ten proteins were randomly picked from the pathway as a query set. The results of
100 independent query set samplings are shown. The fraction of the total known process
components recovered is plotted versus the size of the graph grown from the query
set. (d-f) An average over 31 KEGG pathways, GO biological processes, and MIPS complexes. Performance
is reported as the trade-off between precision (the ratio of correct
pathway components returned to the total size of the returned network) and recall
(the ratio of correct pathway components returned to the total number of non-query
pathway proteins). Precision and recall are derived from true positives (TP), false
positives (FP), and false negatives (FN) as noted in the axis labels. (d) The improvement gained by using our network prediction algorithm on a Bayesian integration
of genomic evidence compared to separate evidence types. With the integrated evidence,
bioPIXIE recovers considerably more known member proteins, and predicts members with
considerably higher precision, than with any individual evidence type.
(e) The improved network recovery offered by the bioPIXIE algorithm versus more naïve
approaches to integration and graph search. Specifically, we plot the performance
of bioPIXIE on integrated data against a naïve binary approach for which information
from all evidence types is used but only as a binary 'yes' or 'no' relationship, and
a more sophisticated approach where overlapping evidence receives higher weights and
connected proteins are recovered in order of confidence. (f) Comparison of the performance of bioPIXIE to two existing methods for query-based
protein complex recovery [13,14].
Myers et al. Genome Biology 2005 6:R114 doi:10.1186/gb-2005-6-13-r114 |
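The precision and recall measures described in the caption can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `precision_recall` and the toy protein names are hypothetical, and the sketch assumes that query proteins are excluded from both the returned network and the set of pathway proteins to be recovered, which the caption does not state explicitly for precision.

```python
def precision_recall(returned, pathway, query):
    """Precision and recall for one recovered network.

    returned: proteins in the network grown from the query set
    pathway:  all known members of the pathway (gold standard)
    query:    proteins used to seed the search
    Assumption: query proteins are not counted as predictions or targets.
    """
    query = set(query)
    targets = set(pathway) - query      # non-query pathway proteins
    predicted = set(returned) - query   # returned proteins, minus the query

    tp = len(predicted & targets)       # correct pathway components returned
    fp = len(predicted - targets)       # returned proteins not in the pathway
    fn = len(targets - predicted)       # pathway proteins that were missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy example with made-up protein names.
pathway = {"A", "B", "C", "D", "E"}
query = {"A", "B"}
returned = ["A", "B", "C", "X", "D", "Y"]
p, r = precision_recall(returned, pathway, query)
print(f"precision={p:.2f} recall={r:.2f}")
```

Evaluating this function at increasing network sizes, and averaging over repeated random query sets, would trace out a precision-recall trade-off of the kind plotted in panels (d-f).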