Professor of Bioinformatics
Institute for Integrative Genome Biology (IIGB)
Department of Botany and Plant Sciences
1207F Genomics Building
University of California, Riverside, CA 92521
Phone: (951) 732-7072
Postdoctorate, Michigan State University
My research focuses on the development of computational data analysis methods for genome biology and small molecule discovery. This includes discovery-oriented data mining projects, as well as algorithm and software development projects for data types from a variety of high-throughput technologies such as next generation sequencing (NGS), genome-wide profiling approaches and chemical genomics. Because my field is multidisciplinary, I frequently collaborate with experimental scientists on data analysis projects involving complex biological networks. Another important activity is the development of integrated data analysis systems for the open source software projects R and Bioconductor. The following gives a short summary of a few selected projects in my group.
Tools for Analyzing Next Generation Sequence Data
Reference-Assisted Transcriptome Assembly
Modeling Gene Expression Networks from RNA-Seq and ChIP-Seq Data
As part of several collaborative research projects, my group has developed a variety of data analysis pipelines for profiling data from next generation sequencing projects (e.g. RNA-Seq and ChIP-Seq), microarray experiments and high-throughput small molecule screens. Most of the data analysis resources developed by these projects are described in the associated online manuals for next generation data analysis. Recent research publications from these projects include Yang et al., 2013, Zou et al., 2013, Yadav et al., 2013, Yadav et al., 2011 and Mustroph et al., 2009, among others.
Software Resources for Small Molecule Discovery and Chemical Genomics
Figure 3: Selectivity Analysis with ChemmineR and bioassayR
Functional Annotation of Gene and Protein Sequences
Computational methods for characterizing the functions of protein sequences play an important role in the discovery of novel molecular, biochemical and regulatory activities. To facilitate this process, we have developed the sub-HMM algorithm that extends the application spectrum of profile HMMs to motif discovery and active site prediction in protein sequences (Horan et al. 2010). Its most interesting utility is the identification of the functionally relevant residues in proteins of known and unknown function (Figure 4). Additionally, sub-HMMs can be used for highly localized sequence similarity searches that focus on shorter conserved features rather than entire domains or global similarities. As part of this study we have predicted a comprehensive set of putative active sites for all protein families available in the Pfam database. This set is intended as a knowledge resource for characterizing protein functions in the future.
Figure 4: Illustration of the sub-HMM extraction process from conserved protein domains, here Pfam desaturase domain (PF00487).
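The idea of focusing on a short conserved feature rather than a whole domain can be illustrated with a toy sketch. The code below is not the sub-HMM algorithm from Horan et al. 2010. It only scores each column of a small, made-up protein alignment by its information content and picks the most conserved window of a fixed width. The alignment, the function names and the window width are all hypothetical and chosen for illustration.

```python
import math
from collections import Counter

def column_information(alignment):
    """Per-column information content in bits (log2(20) minus Shannon entropy).

    Gap characters ('-') are ignored; an all-gap column scores 0.
    """
    n_cols = len(alignment[0])
    info = []
    for j in range(n_cols):
        residues = [seq[j] for seq in alignment if seq[j] != "-"]
        if not residues:
            info.append(0.0)
            continue
        total = len(residues)
        counts = Counter(residues)
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        info.append(math.log2(20) - entropy)
    return info

def most_conserved_window(info, width):
    """Return (start, end) of the window with the highest summed information."""
    starts = range(len(info) - width + 1)
    best = max(starts, key=lambda s: sum(info[s:s + width]))
    return best, best + width

# Hypothetical toy alignment (not real Pfam data): columns 2 and 5 are
# fully conserved histidines, the remaining columns vary.
toy_alignment = [
    "MKHAAHLT",
    "LRHGGHVS",
    "MAHSSHIT",
    "IKHTTHLA",
]

info = column_information(toy_alignment)
start, end = most_conserved_window(info, width=4)
print(start, end, [seq[start:end] for seq in toy_alignment])
```

A real profile HMM also models insertions, deletions and background residue frequencies, so this column-wise score is only a rough stand-in for the kind of conservation signal that such a method would exploit.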
This position requires several years of experience in computational biology, statistics, database design and data mining, with a strong publication record in several of these areas. The candidate should be proficient in at least one of the common programming languages that are used in bioinformatics: C, Python, Java, Perl or R. Experience with web and database programming is also beneficial. To apply for this position, please email your CV and a statement of research interests to firstname.lastname@example.org.
Postdoctoral position in cheminformatics - filled
This position requires experience in computational chemistry, drug informatics, QSAR/pharmacophore modeling and data mining, with a strong publication record in several of these areas. The candidate should be proficient in at least one of the common programming languages that are used in cheminformatics: C, Python, Java, Perl or R. Experience with web and database programming is also beneficial. To apply for this position, please email your CV and a statement of research interests to email@example.com.
Undergraduate research positions - filled
Several positions are available for undergraduate students who are interested in participating in challenging database and software development projects for various research projects in bioinformatics and cheminformatics. The required skills are experience with open-source databases (MySQL/PostgreSQL) and with application and web programming in Python, Perl and/or Java. The minimum time commitment is 15-20 hours per week for a period of at least 18 months. Full-time employment during the summer break is possible. To apply for these positions, please email your CV with a detailed outline of your computational skills to firstname.lastname@example.org.