Peak calling with different methods and consensus peak identification
Annotate peaks
Differential binding analysis
Gene set enrichment analysis
Motif prediction to identify putative TF binding sites
Challenge Project: Comparison of motif enrichment and finding methods
Run the workflow from start to finish (steps 1-8) on the ChIP-Seq data set from Kaufmann et al. (2010)
Challenge project tasks
Rank peaks by their FDR from the differential binding analysis
Parse peak sequences from genome
Determine which A. thaliana motifs in the JASPAR database (MotifDb) show the highest enrichment in the peak sequences. The motif enrichment tests can be performed with the PWMEnrich package. Basic starter code for accomplishing these tasks is provided here.
Optional: use different statistical methods to test for motif enrichment.
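The tasks above can be sketched on toy data as follows. This is a minimal, language-neutral illustration in plain Python and not the PWMEnrich or MotifDb API. All peak coordinates, FDR values, genome and background sequences are made up. As a simplifying assumption, a motif "hit" is a regular-expression match to the CArG-box consensus CC[A/T]6GG rather than a PWM score, and enrichment is tested with a one-sided hypergeometric test, which is only one of several possible statistics.

```python
import re
from math import comb

# Toy genome: four 20 bp segments on one chromosome (made-up sequence).
genome = {
    "chr1": "ACGTACGTCCATATATGGAC" "GGGCGCGCATCGATCGGCGC"
            "TTGACCAAAAATGGTCAGCA" "GCTAGCTAGGCTAGCTAGCT"
}

# Hypothetical peaks with FDR values from a differential binding analysis.
# Coordinates are 1-based and inclusive.
peaks = [
    {"id": "A", "chrom": "chr1", "start": 1,  "end": 20, "fdr": 0.001},
    {"id": "B", "chrom": "chr1", "start": 21, "end": 40, "fdr": 0.2},
    {"id": "C", "chrom": "chr1", "start": 41, "end": 60, "fdr": 0.0001},
    {"id": "D", "chrom": "chr1", "start": 61, "end": 80, "fdr": 0.04},
]

# Hypothetical background sequences (e.g. regions without peaks).
background = [
    "GCGCGCATATGCGCGCATGC", "CCTTTTTTGGACGTACGTAC", "ACGTACGTACGTACGTACGT",
    "GGGGCCCCGGGGCCCCGGGG", "TTTTAAAACCGGTTTTAAAA", "CATGCATGCATGCATGCATG",
]

CARG_BOX = r"CC[AT]{6}GG"  # simplified consensus, stands in for a PWM


def rank_peaks_by_fdr(peaks):
    """Return peaks sorted by ascending FDR (most significant first)."""
    return sorted(peaks, key=lambda p: p["fdr"])


def extract_sequence(genome, peak):
    """Slice the peak sequence out of the genome (1-based, inclusive)."""
    return genome[peak["chrom"]][peak["start"] - 1 : peak["end"]]


def count_sequences_with_motif(seqs, pattern):
    """Count sequences containing at least one match to the pattern."""
    return sum(1 for s in seqs if re.search(pattern, s))


def hypergeom_enrichment_pvalue(hits_fg, n_fg, hits_bg, n_bg):
    """One-sided P(X >= hits_fg) when n_fg sequences are drawn at random
    from the pooled foreground and background set."""
    total = n_fg + n_bg
    total_hits = hits_fg + hits_bg
    upper = min(n_fg, total_hits)
    numerator = sum(
        comb(total_hits, k) * comb(total - total_hits, n_fg - k)
        for k in range(hits_fg, upper + 1)
    )
    return numerator / comb(total, n_fg)


# 1. Rank peaks by FDR and keep the two most significant as foreground.
ranked = rank_peaks_by_fdr(peaks)
top_peaks = ranked[:2]

# 2. Parse the corresponding sequences from the genome.
fg_seqs = [extract_sequence(genome, p) for p in top_peaks]

# 3. Test whether the motif occurs more often in peaks than in background.
hits_fg = count_sequences_with_motif(fg_seqs, CARG_BOX)
hits_bg = count_sequences_with_motif(background, CARG_BOX)
pval = hypergeom_enrichment_pvalue(hits_fg, len(fg_seqs), hits_bg, len(background))

print([p["id"] for p in ranked], hits_fg, hits_bg, round(pval, 4))
```

In a real analysis the pattern match would be replaced by scoring each JASPAR motif against the peak sequences, and the test would be repeated per motif with a correction for multiple testing. Swapping hypergeom_enrichment_pvalue for another statistic is one way to approach the optional task of comparing enrichment methods.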
References
Alipanahi, B, A Delong, M T Weirauch, and B J Frey. 2015. “Predicting the Sequence Specificities of DNA- and RNA-Binding Proteins by Deep Learning.” Nature Biotechnology 33: 831–38.
Frith, Martin C., Yutao Fu, Liqun Yu, Jiang-fan Chen, Ulla Hansen, and Zhiping Weng. 2004. “Detection of Functional DNA Motifs via Statistical Over-representation.” Nucleic Acids Research 32 (4): 1372–81.
Kaufmann, K, F Wellmer, J M Muiño, T Ferrier, S E Wuest, V Kumar, A Serrano-Mislata, et al. 2010. “Orchestration of Floral Initiation by APETALA1.” Science 328 (5974): 85–89.
Machanick, P, and T L Bailey. 2011. “MEME-ChIP: Motif Analysis of Large DNA Datasets.” Bioinformatics 27: 1696–97.
McLeay, Robert C, and Timothy L Bailey. 2010. “Motif Enrichment Analysis: A Unified Framework and an Evaluation on ChIP Data.” BMC Bioinformatics 11: 165.
Tompa, M, N Li, T L Bailey, G M Church, B De Moor, E Eskin, A V Favorov, et al. 2005. “Assessing Computational Tools for the Discovery of Transcription Factor Binding Sites.” Nature Biotechnology 23 (1): 137–44.