Figure 2.
Effect of different clustering algorithms using all available microarray experiments.
We compared the ability of different clustering algorithms to identify co-regulated
genes using all 273 microarray experiments from the subset of compendium data with
215 genes. The clustering algorithms we compared include hierarchical average-link
and complete-link clustering, each with correlation and with Euclidean distance as
the similarity measure, and the model-based clustering algorithms MCLUST and IMM
on standardized data. A high z-score indicates
a high proportion of co-regulated genes from clustering results compared to those
from random partitions with the same numbers of clusters and cluster size distributions.
Since the optimal number of clusters is not known, we compared the performance of
clustering algorithms over a range of different numbers of clusters (from 5 to 100).
(a) The transcription factor database SCPD is used as the evaluation criterion for co-regulated
genes. (b) ChIP data is used as the evaluation criterion for co-regulated genes. The model-based
clustering algorithm MCLUST produces relatively high z-scores using either SCPD or
ChIP as our evaluation criterion.
Yeung et al. Genome Biology 2004 5:R48 doi:10.1186/gb-2004-5-7-r48
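The z-score described in the caption can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the score of a partition is the fraction of same-cluster gene pairs that are annotated as co-regulated, and that random partitions are generated by shuffling the cluster labels, which keeps the number of clusters and the cluster size distribution fixed. The function names, the boolean `coregulated` matrix, and the number of random partitions are all hypothetical.

```python
import numpy as np


def coregulated_fraction(labels, coregulated):
    """Fraction of same-cluster gene pairs annotated as co-regulated.

    labels      : cluster label per gene (length n)
    coregulated : symmetric boolean n x n matrix (hypothetical input);
                  entry [i, j] is True if genes i and j are co-regulated
                  according to the evaluation criterion (e.g. SCPD or ChIP).
    """
    labels = np.asarray(labels)
    coregulated = np.asarray(coregulated, dtype=bool)
    upper = np.triu_indices(len(labels), k=1)  # each unordered pair once
    same_cluster = (labels[:, None] == labels[None, :])[upper]
    n_same = same_cluster.sum()
    if n_same == 0:
        return 0.0
    return (coregulated[upper] & same_cluster).sum() / n_same


def partition_zscore(labels, coregulated, n_random=1000, seed=0):
    """z-score of a clustering against label-shuffled random partitions.

    Shuffling the labels preserves the number of clusters and the
    cluster sizes (an assumption about how random partitions are drawn).
    """
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    observed = coregulated_fraction(labels, coregulated)
    random_scores = np.array([
        coregulated_fraction(rng.permutation(labels), coregulated)
        for _ in range(n_random)
    ])
    spread = random_scores.std(ddof=1)
    if spread == 0:
        return 0.0  # degenerate case: all random partitions score the same
    return (observed - random_scores.mean()) / spread
```

Under these assumptions, a clustering that groups co-regulated genes together scores well above the shuffled partitions and so receives a high z-score, while a clustering unrelated to the annotation receives a z-score near zero.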