Figure 3.
Overview of the FuzzyK method. Genes are represented as points in space, where genes
that are similarly expressed are close together. (a) In the first fuzzy-clustering cycle, k/3 centroids are defined as the k/3 most informative eigenvectors identified by principal component analysis (PCA) of the input dataset (large colored circles). (b) The centroids are refined by iteratively calculating the gene-cluster memberships
and updating the centroid positions until convergence (see Figure 2b). (c,d) Genes with a correlation greater than 0.7 to the identified centroids are removed from the dataset,
gene and array weights are recalculated, and the entire fuzzy k-means clustering process
is repeated on the data subset for an additional k/3 clusters (see Materials and methods for details). (e,f) Steps c and d are repeated for a third round of fuzzy clustering. (g) The output of the algorithm is a list of unique centroids and a table of gene-cluster
memberships.
Gasch and Eisen, Genome Biology 2002, 3:research0059.1. doi:10.1186/gb-2002-3-11-research0059
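
The cycle described in the caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. Memberships use inverse squared Euclidean distance (a standard fuzzy c-means update with fuzzifier 2), which the caption does not specify. The gene and array weights of panels (c,d) are omitted. The PCA seeding of panel (a) is replaced by a hypothetical caller-supplied function `seed(genes, n)`. Centroids are not de-duplicated at the end. All function names are illustrative.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def memberships(gene, centroids, eps=1e-12):
    """Fuzzy membership of one gene in every cluster (values sum to 1).
    Assumption: inverse squared Euclidean distance, i.e. fuzzifier m = 2."""
    inv = [1.0 / (sum((g - c) ** 2 for g, c in zip(gene, cen)) + eps)
           for cen in centroids]
    total = sum(inv)
    return [v / total for v in inv]

def refine(genes, centroids, tol=1e-6, max_iter=100):
    """Panel (b): alternate membership and centroid updates until the
    centroids move by less than tol (or max_iter is reached)."""
    dims = len(genes[0])
    for _ in range(max_iter):
        u = [memberships(g, centroids) for g in genes]
        new = []
        for j in range(len(centroids)):
            w = [row[j] ** 2 for row in u]  # weights u^m with m = 2
            wsum = sum(w)
            new.append([sum(wi * g[d] for wi, g in zip(w, genes)) / wsum
                        for d in range(dims)])
        shift = max((abs(a - b) for old, cur in zip(centroids, new)
                     for a, b in zip(old, cur)), default=0.0)
        centroids = new
        if shift < tol:
            break
    return centroids

def remove_correlated(genes, centroids, threshold=0.7):
    """Panels (c,d): keep only genes whose correlation to every centroid
    is at or below the threshold. Weight recalculation is omitted here."""
    return [g for g in genes
            if all(pearson(g, c) <= threshold for c in centroids)]

def fuzzyk_sketch(genes, seed, k, rounds=3):
    """Run up to `rounds` cycles of k // rounds clusters each.
    `seed(genes, n)` is a hypothetical initializer returning n starting
    centroids; it stands in for the PCA step of panel (a)."""
    per_round = k // rounds
    if per_round < 1:
        raise ValueError("k must be at least as large as rounds")
    remaining, centroids = list(genes), []
    for _ in range(rounds):
        if not remaining:
            break
        found = refine(remaining, seed(remaining, per_round))
        centroids.extend(found)
        remaining = remove_correlated(remaining, found)
    # Panel (g): memberships of every input gene in every centroid found.
    table = [memberships(g, centroids) for g in genes]
    return centroids, table
```

Calling `fuzzyk_sketch(genes, seed, k=9)` would run three rounds of three clusters each and return the centroid list together with a gene-by-cluster membership table, one row per input gene.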