Perform read counting for required ranges (e.g. exonic gene ranges)
Normalization of read counts
Identification of differentially expressed genes (DEGs)
Clustering of gene expression profiles
Gene set enrichment analysis
Challenge Project: Comparison of DEG analysis methods
Run the workflow from start to finish (steps 1-7) on the RNA-Seq data set from Howard et al. (2013)
Challenge project tasks
Run at least 2 RNA-Seq DEG analysis methods (e.g. edgeR, DESeq2, limma/voom) and compare the results as follows:
Analyze the similarities and differences in the DEG lists obtained from the chosen methods
Do the differences between the DEG lists affect the results of the downstream gene set enrichment analysis?
Plot the performance of the DEG methods in the form of a ROC curve. The DEG set from Howard et al. (2013) can be used as the benchmark (true result).
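The comparison tasks above can be sketched in a few lines. The example below is a minimal illustration in plain Python, not part of the course workflow (which is based on R/Bioconductor): the gene identifiers, adjusted p-values and benchmark set are made-up toy values, the 0.05 cutoff is an assumed FDR threshold, and the helper names `jaccard`, `roc_points` and `auc` are hypothetical. It measures the overlap of two DEG lists and computes ROC points for each method by ranking genes by adjusted p-value against a benchmark DEG set.

```python
def jaccard(a, b):
    """Jaccard similarity of two DEG sets: size of intersection / size of union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def roc_points(scores, truth):
    """Return ROC points as (false positive rate, true positive rate) pairs.

    scores: dict mapping gene -> adjusted p-value (smaller = stronger evidence)
    truth:  set of benchmark DEGs, treated here as the true positives
    """
    pos = sum(1 for gene in scores if gene in truth)
    neg = len(scores) - pos
    if pos == 0 or neg == 0:
        raise ValueError("scored genes must include benchmark DEGs and non-DEGs")
    ranked = sorted(scores, key=scores.get)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        # Genes with identical scores share one threshold, so count them together.
        j = i
        while j < len(ranked) and scores[ranked[j]] == scores[ranked[i]]:
            if ranked[j] in truth:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy data (assumption): adjusted p-values from two hypothetical DEG methods.
benchmark = {"g1", "g2", "g3"}
method_a = {"g1": 0.001, "g2": 0.01, "g3": 0.20, "g4": 0.04, "g5": 0.60, "g6": 0.90}
method_b = {"g1": 0.002, "g2": 0.03, "g3": 0.04, "g4": 0.30, "g5": 0.50, "g6": 0.80}

fdr_cutoff = 0.05  # assumed significance threshold
degs_a = {g for g, p in method_a.items() if p <= fdr_cutoff}
degs_b = {g for g, p in method_b.items() if p <= fdr_cutoff}

print("Shared DEGs:", sorted(degs_a & degs_b))
print("Only method A:", sorted(degs_a - degs_b))
print("Only method B:", sorted(degs_b - degs_a))
print("Jaccard similarity:", jaccard(degs_a, degs_b))
print("AUC method A:", round(auc(roc_points(method_a, benchmark)), 3))
print("AUC method B:", round(auc(roc_points(method_b, benchmark)), 3))
```

The list returned by `roc_points` can be passed to any plotting library to draw the curve, with the false positive rate on the x-axis and the true positive rate on the y-axis. Ranking by adjusted p-value is only one possible choice of score; a different statistic would change the curve.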
References
Howard BE et al. (2013) High-throughput RNA sequencing of Pseudomonas-infected Arabidopsis reveals hidden transcriptome complexity and novel splice variants. PLoS One 8(10): e74183
Guo Y, Li C-I, Ye F, Shyr Y (2013) Evaluation of read count based RNAseq analysis methods. BMC Genomics 14 Suppl 8: S2
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139–140
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 550
Zhou X, Lindsay H, Robinson MD (2014) Robustly detecting differential expression in RNA sequencing data using observation weights. Nucleic Acids Res 42: e91