Figure 5.
Hierarchical correction using BFS sub-networks. (a) Schematic of BFS sub-networks. Each GO term is represented by a blank node while the
SVM classifier output for that GO node is represented by a shaded node. Starting from
each GO term (Y1), we performed a BFS to construct the local hierarchy until a maximum
of 30 terms had been included. In this process, we considered only GO terms with five
or more annotations in the training set. (b) Improvement in AUC on the novel set for the HIER-BFS classifier relative to single
SVM predictions, shown for selected biological process terms of size 101 to 300
(size being the number of genes annotated to the GO term in the training set). For each GO term, the best-performing
sub-hierarchy was selected, and those that outperformed the single SVM (as judged
by held-out values in the training set) are plotted. (c) Median improvement in predictions for selected GO terms across different biological
process GO term sizes. Again, among the selected terms, hierarchical correction using the BFS structure performs
better for smaller terms. AUC, area under the receiver operating
characteristic curve; BFS, breadth-first search; GO, Gene Ontology; HIER-BFS, breadth-first
search hierarchical correction; SVM, support vector machine.
Guan et al. Genome Biology 2008 9(Suppl 1):S3 doi:10.1186/gb-2008-9-s1-s3
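
The sub-network construction described in panel (a) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The caption gives only the 30-term cap and the five-annotation threshold. Everything else here is an assumption: the function name `bfs_subnetwork`, the `neighbors` adjacency map (taken to list both parents and children of a term), the toy term labels and counts, and the choice not to traverse through terms that fall below the annotation threshold.

```python
from collections import deque


def bfs_subnetwork(start, neighbors, annotation_counts,
                   max_terms=30, min_annotations=5):
    """Collect up to max_terms GO terms around `start` in breadth-first order.

    neighbors: dict mapping a term to its adjacent terms (hypothetical
        layout; assumed to include both parents and children).
    annotation_counts: dict mapping a term to its number of training-set
        annotations.
    """
    # The start term itself must meet the annotation threshold.
    if annotation_counts.get(start, 0) < min_annotations:
        return []

    selected = [start]
    seen = {start}
    queue = deque([start])

    while queue and len(selected) < max_terms:
        term = queue.popleft()
        for nxt in neighbors.get(term, ()):
            if nxt in seen:
                continue
            seen.add(nxt)
            # Assumption: under-annotated terms are skipped and not expanded.
            if annotation_counts.get(nxt, 0) < min_annotations:
                continue
            selected.append(nxt)
            queue.append(nxt)
            if len(selected) == max_terms:
                break

    return selected


# Toy hierarchy with made-up labels and counts, for illustration only.
toy_neighbors = {
    "Y1": ["Y2", "Y3"],
    "Y2": ["Y1", "Y4"],
    "Y3": ["Y1", "Y5"],
    "Y4": ["Y2"],
    "Y5": ["Y3"],
}
toy_counts = {"Y1": 120, "Y2": 40, "Y3": 3, "Y4": 8, "Y5": 50}

subnetwork = bfs_subnetwork("Y1", toy_neighbors, toy_counts)
print(subnetwork)
```

In this toy case Y3 has fewer than five annotations, so it is excluded, and under the stated assumption Y5 is never reached through it. A different reading of the caption, in which the search passes through under-annotated terms without including them, would only require still queueing such terms for expansion.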