|
Figure 1.
Schematic representation of the data integration and data mining methodology. (a) Public databases with heterogeneous biomedical relations are integrated into a common
network. (b) Illustratively, genes (green circles), diseases (red boxes) and protein domains (blue
diamonds) are related through gene-disease associations, gene-gene interactions and
gene-domain annotations and integrated into a unified graph. (c) The a priori accessibility of each concept is computed by performing stochastic random walks to
detect highly connected hubs in the network (area of a node scales with its rank score).
(d) The a posteriori rank of each concept with respect to a source concept, in this case disease A, is
computed by performing random walks with restarts at the source. (e) The posterior probabilities are adjusted using the prior probabilities to score the
importance of each concept relative to the source concept (area of a node scales with
the log of its rank score). Genes (green circles) are ranked according to this score, gene
1 being most specific to disease A and gene 8 least specific.
Liekens et al. Genome Biology 2011 12:R57 doi:10.1186/gb-2011-12-6-r57 |
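Steps (c) to (e) of the caption can be illustrated with a minimal sketch on a toy graph. This is not the authors' implementation. The node names, the edge list, the damping value of 0.85, the fixed iteration count and, in particular, the adjustment in step (e) are assumptions made for illustration: the caption does not state how the posterior is adjusted by the prior, so the sketch uses the log of their ratio.

```python
import math

# Toy undirected graph; every node name here is hypothetical.
EDGES = [
    ("disease_A", "gene_1"),
    ("disease_A", "gene_2"),
    ("gene_2", "gene_3"),
    ("gene_3", "gene_4"),
    ("gene_3", "gene_5"),
    ("gene_3", "disease_B"),
    ("gene_4", "disease_B"),
]


def build_adjacency(edges):
    """Map each node to the set of its neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def random_walk(adj, restart, damping=0.85, iterations=200):
    """Approximate the stationary distribution of a walk that follows a
    random edge with probability `damping` and otherwise jumps to a node
    drawn from `restart`. Assumes every node has at least one neighbour."""
    p = dict(restart)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * restart[n] for n in adj}
        for n, neighbours in adj.items():
            share = damping * p[n] / len(neighbours)
            for m in neighbours:
                nxt[m] += share
        p = nxt
    return p


adj = build_adjacency(EDGES)
uniform = {n: 1.0 / len(adj) for n in adj}
at_source = {n: 1.0 if n == "disease_A" else 0.0 for n in adj}

# (c) Prior: walk with uniform jumps, which favours highly connected hubs.
prior = random_walk(adj, uniform)
# (d) Posterior: walk that always restarts at the source concept.
posterior = random_walk(adj, at_source)
# (e) Assumed adjustment: log ratio of posterior to prior.
scores = {n: math.log(posterior[n] / prior[n]) for n in adj}

ranked_genes = sorted(
    (n for n in adj if n.startswith("gene_")), key=scores.get, reverse=True
)
print(ranked_genes)
```

Dividing by the prior keeps a hub such as the hypothetical `gene_3` from ranking highly merely because it is well connected; a gene scores above zero only when it is visited more often from the source than it would be in general.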