Genome Biology

Open Access, Highly Accessed, Method

Evidence-ranked motif identification

Stoyan Georgiev (1,2), Alan P Boyle (1,2), Karthik Jayasurya (1,2), Xuan Ding (2), Sayan Mukherjee (3,4,2,5) and Uwe Ohler (3,2,6)*

Author Affiliations

1 Program for Computational Biology and Bioinformatics, Duke University, 102 North Building, Durham, NC 27708, USA

2 Institute for Genome Sciences and Policy, Duke University, 101 Science Drive, Durham, NC 27708, USA

3 Department of Computer Science, Duke University, 450 Research Drive, Durham, NC 27708, USA

4 Department of Statistical Science, Duke University, 214 Old Chemistry Building, Durham, NC 27708, USA

5 Mathematics Department, Duke University, 102 Science Drive, Durham, NC 27708, USA

6 Department of Biostatistics and Bioinformatics, Duke University, Duke University School of Medicine, 2424 Erwin Road, Durham NC 27710, USA

Genome Biology 2010, 11:R19 doi:10.1186/gb-2010-11-2-r19

Published: 15 February 2010

Abstract

cERMIT is a computationally efficient motif discovery tool based on analyzing genome-wide quantitative regulatory evidence. Instead of pre-selecting promising candidate sequences, it uses information across all sequence regions to search for high-scoring motifs. We apply cERMIT to a range of direct binding and overexpression datasets; it substantially outperforms state-of-the-art approaches on curated ChIP-chip datasets and scales easily to current mammalian ChIP-seq experiments with data on thousands of non-coding regions.
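
To make the idea of scoring motifs against evidence from all sequence regions concrete, the following is a minimal sketch under stated assumptions. It is not cERMIT's implementation: the function name `kmer_scores`, the exact-match k-mer motif model, the toy data, and the standardized-mean statistic are all hypothetical choices made for illustration. The only point it tries to show is that every sequence contributes its quantitative evidence value, with no threshold used to pre-select a "bound" subset.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, pstdev


def kmer_scores(sequences, evidence, k):
    """Score each k-mer by the standardized mean evidence of the sequences containing it.

    Illustrative statistic only (an assumption, not the scoring function of the paper).
    """
    mu = mean(evidence)
    sigma = pstdev(evidence)
    if sigma == 0:
        raise ValueError("evidence values must not all be equal")

    # Map each k-mer to the set of sequence indices in which it occurs.
    hits = defaultdict(set)
    for idx, seq in enumerate(sequences):
        for start in range(len(seq) - k + 1):
            hits[seq[start:start + k]].add(idx)

    # Compare the mean evidence of each k-mer's sequences with the global mean,
    # scaled by sqrt(n) so that k-mers found in more sequences carry more weight.
    scores = {}
    for kmer, idxs in hits.items():
        n = len(idxs)
        avg = sum(evidence[i] for i in idxs) / n
        scores[kmer] = (avg - mu) * sqrt(n) / sigma
    return scores


# Toy data (hypothetical): one evidence value per sequence, e.g. a binding signal.
sequences = ["ACGTGA", "TTACGT", "GGGCCC", "CCCGGG"]
evidence = [2.0, 1.5, -1.0, -0.5]

scores = kmer_scores(sequences, evidence, k=4)
best = max(scores, key=scores.get)
```

In this toy input, the k-mer shared by the two high-evidence sequences ranks first, while k-mers found only in low-evidence sequences receive negative scores. A realistic tool would also need degenerate motifs, both strands, and a search strategy over the motif space, none of which this sketch attempts.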