XB-ART-46556
Bioinformatics. January 15, 2011; 27 (2): 270-1.
GimmeMotifs: a de novo motif prediction pipeline for ChIP-sequencing experiments.
Accurate prediction of transcription factor binding motifs that are enriched in a collection of sequences remains a computational challenge. Here we report on GimmeMotifs, a pipeline that incorporates an ensemble of computational tools to predict motifs de novo from ChIP-sequencing (ChIP-seq) data. Similar redundant motifs are compared using the weighted information content (WIC) similarity score and clustered using an iterative procedure. A comprehensive output report is generated with several different evaluation metrics to compare and evaluate the results. Benchmarks show that the method performs well on human and mouse ChIP-seq datasets. GimmeMotifs consists of a suite of command-line scripts that can be easily implemented in a ChIP-seq analysis pipeline. GimmeMotifs is implemented in Python and runs on Linux. The source code is freely available for download at http://email@example.com. Supplementary data are available at Bioinformatics online.
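As background for the information-content terminology used in the abstract, the following minimal sketch computes the per-position information content of a motif in bits, the quantity displayed as stack height in a sequence logo. It is not the WIC similarity score, whose definition is given in the article and its Supplementary Material. The function names, the uniform background, and the absence of a small-sample correction are assumptions made for illustration only.

```python
import math


def column_information_content(counts):
    """Information content (bits) of one motif position.

    `counts` holds the A, C, G, T counts for that position.
    Assumes a uniform background (0.25 per nucleotide) and applies
    no small-sample correction; both are simplifying assumptions.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("column must contain at least one count")
    ic = 2.0  # maximum for a four-letter alphabet: log2(4)
    for c in counts:
        p = c / total
        if p > 0:
            ic += p * math.log2(p)  # subtracts the column entropy
    return ic


def motif_information_content(pfm):
    """Per-position information content for a position frequency matrix.

    `pfm` is a list of positions, each a list of four counts (A, C, G, T).
    """
    return [column_information_content(col) for col in pfm]


# Hypothetical three-position motif, for illustration only
example_pfm = [
    [10, 0, 0, 0],  # fully conserved position
    [5, 5, 0, 0],   # two equally likely nucleotides
    [5, 5, 5, 5],   # uninformative position
]
per_position_bits = motif_information_content(example_pfm)
```

A fully conserved position carries 2 bits and a position with equal counts for all four nucleotides carries 0 bits, which is why conserved positions dominate a sequence logo.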
PubMed ID: 21081511
PMC ID: PMC3018809
Article link: Bioinformatics.
Grant support: R01 HD054356-04 NICHD NIH HHS, R01HD054356 NICHD NIH HHS
Genes referenced: tbx2 tp63
Article Images:
Fig. 1. An example of the GimmeMotifs output for p63 (Kouwenhoven et al., 2010). Shown are the sequence logo of the predicted motif (Schneider and Stephens, 1990), the best matching motif in the JASPAR database (Sandelin et al., 2004), the ROC curve, the positional preference plot and several statistics to evaluate the motif performance. See the Supplementary Material for a complete example.
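The ROC curve mentioned in the caption is commonly summarized by its area under the curve (AUC). The sketch below shows one standard way to obtain that area from motif scores: it equals the probability that a randomly chosen positive sequence (for example a ChIP-seq peak) scores higher than a randomly chosen background sequence. The function name and the example scores are hypothetical, and this is not presented as the GimmeMotifs implementation.

```python
def roc_auc(positive_scores, background_scores):
    """Area under the ROC curve from two lists of motif scores.

    Uses the rank-based (Mann-Whitney) formulation: the fraction of
    positive/background pairs in which the positive sequence scores
    higher, counting ties as half. Runs in O(n * m) time, which is
    adequate for a small illustration.
    """
    if not positive_scores or not background_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in positive_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(positive_scores) * len(background_scores))


# Hypothetical best-match motif scores, for illustration only
peak_scores = [0.9, 0.4]
background = [0.5, 0.1]
auc = roc_auc(peak_scores, background)
```

An AUC of 0.5 corresponds to a motif that does not distinguish peaks from background, and an AUC of 1.0 to perfect separation.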