Cluster and Network Analysis Methods

2 minute read



RNA-Seq Workflow

  1. Read quality assessment, filtering and trimming
  2. Map reads against reference genome
  3. Perform read counting for required ranges (e.g. exonic gene ranges)
  4. Normalization of read counts
  5. Identification of differentially expressed genes (DEGs)
  6. Clustering of gene expression profiles
  7. Gene set enrichment analysis

Challenge Projects

1. Cluster and network analysis methods

  • Run workflow from start to finish (steps 1-7) on RNA-Seq data set from Howard et al. (2013)
  • Challenge project tasks
    • Compare at least 2-3 cluster analysis methods (e.g. hierarchical, k-means, Fuzzy C-Means, WGCNA, other) and assess the performance differences as follows:
      • Analyze the similarities and differences in the cluster groupings obtained from the two methods.
      • Do the differences affect the results of the downstream functional enrichment analysis?
      • Plot the performance of the clustering methods in form of ROC curves and/or record their AUC values. Functional annotations (e.g. GO, KEGG, Pfam) could be used as a benchmark for defining true results.

2. Cluster and network analysis methods

  • Similar as above but with different combination of clustering methods and/or performance testing approach.

References

  • Howard, B.E. et al., 2013. High-throughput RNA sequencing of pseudomonas-infected Arabidopsis reveals hidden transcriptome complexity and novel splice variants. PloS one, 8(10), p.e74183. PubMed
  • Langfelder P, Luo R, Oldham MC, Horvath S (2011) Is my network module preserved and reproducible? PLoS Comput Biol 7: e1001057. PubMed
  • Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9: 559–559. PubMed
  • Rodriguez MZ, Comin CH, Casanova D, Bruno OM, Amancio DR, Costa L da F, Rodrigues FA (2019) Clustering algorithms: A comparative approach. PLoS One 14: e0210236. PubMed
Last modified 2021-06-02: some edits (3094a6c30)