MotifCluster: an interactive online tool for clustering and visualizing sequences using shared motifs
* Corresponding author: Rob Knight rob@spot.colorado.edu
1 Department of Computer Science, University of Colorado, Boulder, CO 80309, USA
2 Department of Chemistry and Biochemistry, University of Colorado, Boulder, CO 80309, USA
3 Department of Molecular, Cellular and Developmental Biology and Cooperative Institute for Research in Environmental Sciences (CIRES), University of Colorado, Boulder, CO 80309, USA
Genome Biology 2008, 9:R128 doi:10.1186/gb-2008-9-8-r128
Published: 15 August 2008
Abstract
MotifCluster finds related motifs in a set of sequences and clusters the sequences into families using the motifs they contain. MotifCluster, at http://bmf.colorado.edu/motifcluster, lets users test whether proteins are related, cluster sequences by shared conserved motifs, and visualize motifs mapped onto trees, sequences and three-dimensional structures. We demonstrate MotifCluster's accuracy using gold-standard protein superfamilies: with the recommended settings, families were assigned to the correct superfamilies with 0.17% false-positive and no false-negative assignments.