```{r style, echo = FALSE, results = 'asis'} BiocStyle::markdown() options(width=100, max.print=1000) knitr::opts_chunk$set( eval=as.logical(Sys.getenv("KNITR_EVAL", "TRUE")), cache=as.logical(Sys.getenv("KNITR_CACHE", "TRUE"))) ``` ```{r setup, echo=FALSE, messages=FALSE, warnings=FALSE} suppressPackageStartupMessages({ library(limma) library(ggplot2) }) ``` Alternative formats of this tutorial: [`.Rmd HTML`](http://girke.bioinformatics.ucr.edu/manuals/vignettes/Rbasics/Rbasics.html), [`.Rmd Source`](http://girke.bioinformatics.ucr.edu/manuals/vignettes/Rbasics/Rbasics.Rmd), [`.R Source`](http://girke.bioinformatics.ucr.edu/manuals/vignettes/Rbasics/Rbasics.R), [`PDF Slides`](http://faculty.ucr.edu/~tgirke/HTML_Presentations/Manuals/Workshop_Dec_5_8_2014/Rbasics/Introduction_into_R.pdf) # Overview ## What is R? [R](http://cran.at.r-project.org) is a powerful statistical environment and programming language for the analysis and visualization of data. The associated [Bioconductor](http://bioconductor.org/) and CRAN package repositories provide many additional R packages for statistical data analysis for a wide array of research areas. The R software is free and runs on all common operating systems. ## Why Using R? * Complete statistical environment and programming language * Efficient functions and data structures for data analysis * Powerful graphics * Access to fast growing number of analysis packages * Most widely used language in bioinformatics * Is standard for data mining and biostatistical analysis * Technical advantages: free, open-source, available for all OSs ## Books and Documentation * simpleR - Using R for Introductory Statistics (John Verzani, 2004) \href{http://cran.r-project.org/doc/contrib/Verzani-SimpleR.pdf}} * Bioinformatics and Computational Biology Solutions Using R and Bioconductor (Gentleman et al., 2005) \href{http://www.bioconductor.org/help/publications/books/bioinformatics-and-computational-biology-solutions/}} * More on this see "Finding Help" section in UCR Manual \href{http://manuals.bioinformatics.ucr.edu/home/R\_BioCondManual\#TOC-Finding-Help}} ## R Working Environments ## Working environments (IDEs) for R
R Projects and Interfaces
Some R working environments with support for syntax highlighting and utilities to send code to the R console: * [RStudio](https://www.rstudio.com/products/rstudio/features): excellent choice for beginners ([Cheat Sheet](http://www.rstudio.com/wp-content/uploads/2016/01/rstudio-IDE-cheatsheet.pdf)) * Basic R code editors provided by Rguis * [gedit](https://wiki.gnome.org/Apps/Gedit), [Rgedit](http://rgedit.sourceforge.net/), [RKWard](https://rkward.kde.org/), [Eclipse](http://www.walware.de/goto/statet), [Tinn-R](http://www.sciviews.org/Tinn-R/), [Notepad++](https://notepad-plus-plus.org/), [NppToR](http://sourceforge.net/projects/npptor/) * [Vim-R-Tmux](http://manuals.bioinformatics.ucr.edu/home/programming-in-r/vim-r): R working environment based on vim and tmux * [Emacs](http://www.xemacs.org/Download/index.html) ([ESS add-on package](http://ess.r-project.org/)) ### Example: RStudio New integrated development environment (IDE) for [R](http://www.rstudio.com/ide/download/). Highly functional for both beginners and advanced.
RStudio IDE
Some userful shortcuts: `Ctrl+Enter` (send code), `Ctrl+Shift+C` (comment/uncomment), `Ctrl+1/2` (switch window focus) ### Example: Vim-R-Tmux Terminal-based Working Environment for R: [Vim-R-Tmux](http://manuals.bioinformatics.ucr.edu/home/programming-in-r/vim-r)
Vim-R-Tmux IDE for R
# R Package Repositories * CRAN (>8,000 packages) general data analysis \href{http://cran.at.r-project.org/}} * Bioconductor (>1,100 packages) bioscience data analysis \href{http://www.bioconductor.org/}} * Omegahat (>90 packages) programming interfaces \href{http://www.omegahat.org/}} # Installation of R and Add-on Packages 1. Install R for your operating system from [CRAN](http://cran.at.r-project.org/). 2. Install RStudio from [RStudio](http://www.rstudio.com/ide/download). 3. Install CRAN Packages from R console like this: ```{r install_cran, eval=FALSE} install.packages(c("pkg1", "pkg2")) install.packages("pkg.zip", repos=NULL) ``` 4. Install Bioconductor packages as follows: ```{r install_bioc, eval=FALSE} source("http://www.bioconductor.org/biocLite.R") library(BiocInstaller) BiocVersion() biocLite() biocLite(c("pkg1", "pkg2")) ``` 5. For more details consult the [Bioc Install page](http://www.bioconductor.org/install/) and [BiocInstaller](http://www.bioconductor.org/packages/release/bioc/html/BiocInstaller.html) package. # Getting Around ## Startup and Closing Behavior * __Starting R__: The R GUI versions, including RStudio, under Windows and Mac OS X can be opened by double-clicking their icons. Alternatively, one can start it by typing `R` in a terminal (default under Linux). * __Startup/Closing Behavior__: The R environment is controlled by hidden files in the startup directory: `.RData`, `.Rhistory` and `.Rprofile` (optional). * __Closing R__: ```{r closing_r, eval=FALSE} q() Save workspace image? [y/n/c]: ``` * __Note__: When responding with `y`, then the entire R workspace will be written to the `.RData` file which can become very large. Often it is sufficient to just save an analysis protocol in an R source file. This way one can quickly regenerate all data sets and objects. ## Navigating directories Create an object with the assignment operator `<-` or `=` ```{r r_assignment, eval=FALSE} object <- ... ``` List objects in current R session ```{r r_ls, eval=FALSE} ls() ``` Return content of current working directory ```{r r_dirshow, eval=FALSE} dir() ``` Return path of current working directory ```{r r_dirpath, eval=FALSE} getwd() ``` Change current working directory ```{r r_setwd, eval=FALSE} setwd("/home/user") ``` # Basic Syntax General R command syntax ```{r r_syntax, eval=FALSE} object <- function_name(arguments) object <- object[arguments] ``` Finding help ```{r r_find_help, eval=FALSE} ?function_name ``` Load a library/package ```{r r_package_load, eval=FALSE} library("my_library") ``` List functions defined by a library ```{r r_package_functions, eval=FALSE} library(help="my_library") ``` Load library manual (PDF or HTML file) ```{r r_load_vignette, eval=FALSE} vignette("my_library") ``` Execute an R script from within R ```{r r_execute_script, eval=FALSE} source("my_script.R") ``` Execute an R script from command-line (the first of the three options is preferred) ```{sh sh_execute_script, eval=FALSE} $ Rscript my_script.R $ R CMD BATCH my_script.R $ R --slave < my_script.R ``` # Data Types __Numeric data__: `1, 2, 3, ...` ```{r r_numeric_data, eval=TRUE} x <- c(1, 2, 3) x is.numeric(x) as.character(x) ``` __Character data__: `"a", "b", "c", ...` ```{r r_character_data, eval=TRUE} x <- c("1", "2", "3") x is.character(x) as.numeric(x) ``` __Complex data__: mix of both ```{r r_complex_data, eval=TRUE} c(1, "b", 3) ``` __Logical data__: `TRUE` of `FALSE` ```{r r_logical_data, eval=TRUE} x <- 1:10 < 5 x !x which(x) # Returns index for the 'TRUE' values in logical vector ``` # Data objects ## Object types __Vectors (1D)__: `numeric` or `character` ```{r r_vector_object, eval=TRUE} myVec <- 1:10; names(myVec) <- letters[1:10] myVec[1:5] myVec[c(2,4,6,8)] myVec[c("b", "d", "f")] ``` __Factors (1D)__: vectors with grouping information ```{r r_factor_object, eval=TRUE} factor(c("dog", "cat", "mouse", "dog", "dog", "cat")) ``` __Matrices (2D)__: two dimensional structures with data of same type ```{r r_matrix_object, eval=TRUE} myMA <- matrix(1:30, 3, 10, byrow = TRUE) class(myMA) myMA[1:2,] myMA[1, , drop=FALSE] ``` __Data Frames (2D)__: two dimensional objects with data of variable types ```{r r_dataframe_object, eval=TRUE} myDF <- data.frame(Col1=1:10, Col2=10:1) myDF[1:2, ] ``` __Arrays__: data structure with one, two or more dimensions __Lists__: containers for any object type ```{r r_list_object, eval=TRUE} myL <- list(name="Fred", wife="Mary", no.children=3, child.ages=c(4,7,9)) myL myL[[4]][1:2] ``` __Functions__: piece of code ```{r r_function_object, eval=FALSE} myfct <- function(arg1, arg2, ...) { function_body } ``` ## Subsetting of data objects __Subsetting by positive or negative index/position numbers__ ```{r r_subset_by_index, eval=TRUE} myVec <- 1:26; names(myVec) <- LETTERS myVec[1:4] ``` __Subsetting by same length logical vectors__ ```{r r_subset_by_logical, eval=TRUE} myLog <- myVec > 10 myVec[myLog] ``` __Subsetting by field names__ ```{r r_subset_by_names, eval=TRUE} myVec[c("B", "K", "M")] ``` __Subset with `$` sign__: references a single column or list component by its name ```{r r_subset_by_dollar, eval=TRUE} iris$Species[1:8] ``` # Graphics example Plotting example ```{r plot_example, eval=TRUE} barplot(1:10, col="green") ``` # Session Info ```{r sessionInfo} sessionInfo() ```