Challenge Project: Identification of coding variants affecting conserved protein residues
Run the workflow from start to finish (steps 1-8) on the data set from Lu et al. (2012)
Challenge project tasks
Map all coding variants to one or both of the following protein features:
Pfam domains
Prosite motifs
Rank the variants that map to the above protein features by the degree of conservation of the affected amino acid (AA) residues
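The two tasks above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the workflow's prescribed method: it assumes each variant has already been converted to a 1-based residue position on a protein, that Pfam/Prosite hits are available as residue intervals, and that an alignment column is available for each affected residue. Conservation is scored here as one minus normalized Shannon entropy, which is one common choice among several. All names (Feature, Variant, map_variants, column_conservation, rank_hits) and the toy data are hypothetical.

```python
import math
from collections import Counter
from typing import NamedTuple


class Feature(NamedTuple):
    protein_id: str
    source: str   # "Pfam" or "Prosite"
    name: str
    start: int    # 1-based, inclusive residue coordinates (assumed convention)
    end: int


class Variant(NamedTuple):
    variant_id: str
    protein_id: str
    aa_pos: int   # 1-based position of the affected residue


def map_variants(variants, features):
    """Return (variant, feature) pairs where the residue lies inside the feature."""
    by_protein = {}
    for f in features:
        by_protein.setdefault(f.protein_id, []).append(f)
    hits = []
    for v in variants:
        for f in by_protein.get(v.protein_id, []):
            if f.start <= v.aa_pos <= f.end:
                hits.append((v, f))
    return hits


def column_conservation(column):
    """Score an alignment column from 0 (variable) to 1 (invariant).

    Uses 1 - H / log2(20), where H is the Shannon entropy of the
    residues in the column; gap characters are ignored.
    """
    residues = [c for c in column.upper() if c not in "-."]
    if not residues:
        return 0.0
    n = len(residues)
    counts = Counter(residues)
    entropy = -sum((k / n) * math.log2(k / n) for k in counts.values())
    return 1.0 - entropy / math.log2(20)


def rank_hits(hits, columns):
    """Sort hits by conservation, most conserved first.

    columns maps (protein_id, aa_pos) to the alignment column string.
    """
    scored = [
        (column_conservation(columns.get((v.protein_id, v.aa_pos), "")), v, f)
        for v, f in hits
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored


# Toy data, invented for illustration only.
features = [
    Feature("protA", "Pfam", "domain_A", 10, 50),
    Feature("protA", "Prosite", "motif_A", 20, 30),
    Feature("protB", "Pfam", "domain_B", 5, 40),
]
variants = [
    Variant("v1", "protA", 25),   # inside both domain_A and motif_A
    Variant("v2", "protA", 60),   # outside every feature
    Variant("v3", "protB", 7),    # inside domain_B
]
columns = {
    ("protA", 25): "KKKKKKKK",    # invariant column
    ("protB", 7): "ACDEFGHI",     # highly variable column
}

hits = map_variants(variants, features)
ranked = rank_hits(hits, columns)
for score, v, f in ranked:
    print(f"{v.variant_id}\t{f.source}\t{f.name}\t{score:.3f}")
```

A variant that overlaps both a Pfam domain and a Prosite motif is reported once per feature, so the ranked table can be filtered on either feature type or on both.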
References
Lu P, Han X, Qi J, Yang J, Wijeratne AJ, Li T, Ma H (2012) Analysis of Arabidopsis genome-wide variations before and after meiosis and meiotic recombination by resequencing Landsberg erecta and all four products of a single meiosis. Genome Res 22: 508–518
DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis AA, del Angel G, Rivas MA, Hanna M, et al (2011) A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet 43: 491–498
Shihab HA, Rogers MF, Gough J, Mort M, Cooper DN, Day INM, Gaunt TR, Campbell C (2015) An integrative approach to predicting the functional effects of non-coding and coding sequence variation. Bioinformatics 31: 1536–1543