May 30, 2023


Review Topics

  • Course project presentations
    • Structure: on course site here
    • Schedule: see Course Planning Sheet linked on Canvas here
  • Graphics overview here
    • Specialty Graphics: ROC
    • Genome Graphics
  • More on scheduler and projects: slides and tutorial
    • Simple slurm submission
    • Slurm from R via batchtools
    • Submit workflow from command-line to HPCC cluster (see here)

Outline

  • Overview
  • Package Development with R Base et al.
  • Package Development with devtools et al.
  • References

Overview

Motivation for building R packages

  1. Organization
    • Consolidates functions with related utilities in a single place
    • Interdependencies among less complex functions make coding more efficient
    • Minimizes duplication
  2. Documentation
    • Help page infrastructure improves documentation of functions
    • Big picture of utilities provided by package vignettes (manuals)
  3. Sharability
    • Package can be shared more easily with colleagues and the public
    • Increases code accessibility for other users
  4. Extendibility
    • Makes software more extensible and maintainable

Package development environments



The following introduces two approaches for building R packages:

  1. R Base and related functionalities
  2. devtools and related packages (e.g. usethis, roxygen2 and sinew)



The sample code provided below creates, for each method, a simple test package that can be installed and loaded on a user’s system. The instructions for the second approach are more detailed since it is likely to provide the most practical solution for newer users of R.

R Base et al.

  • R packages can be built with the package.skeleton function. The most comprehensive documentation on package development is provided by the Writing R Extensions page on CRAN.
  • The basic workflow example below will create a directory named mypackage containing the skeleton of the package for all functions, methods and classes defined in the R script(s) passed on to the code_files argument.
  • The basic structure of the package directory is described here.
  • The package directory will also contain a file named Read-and-delete-me with instructions for completing the package:
## Download R script (here pkg_build_fct.R) containing two sample functions                                                                                                         
download.file("https://raw.githubusercontent.com/tgirke/GEN242/main/content/en/tutorials/rpackages/helper_functions/pkg_build_fct.R", "pkg_build_fct.R")                           
## Build package skeleton based on functions in pkg_build_fct.R                                                                                                                     
package.skeleton(name="mypackage", code_files=c("pkg_build_fct.R")) 
  • Once a package skeleton is available, one can build the package from the command-line (Linux/OS X).
  • This will create a tarball of the package with its version number encoded in the file name. Subsequently, the package tarball needs to be checked for errors with:

system("R CMD build mypackage")
system("R CMD check mypackage_1.0.tar.gz") 

Install package from source

install.packages("mypackage_1.0.tar.gz", repos=NULL) 
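
To see what package.skeleton leaves on disk without touching the working directory, the following minimal sketch runs it in a temporary directory. The file demo_fct.R and the function add_one are hypothetical stand-ins for pkg_build_fct.R and its functions. The version is read back from the generated DESCRIPTION file, which is where the version number in the tarball name comes from.

```r
# Minimal sketch: inspect a package skeleton in a throwaway directory.
# 'demo_fct.R' and 'add_one' are hypothetical stand-ins, not part of the course material.
tmp <- tempfile("skeleton_demo_")
dir.create(tmp)

code_file <- file.path(tmp, "demo_fct.R")
writeLines("add_one <- function(x) x + 1", code_file)

# Same kind of call as above, but writing the skeleton below 'tmp'
# instead of the current working directory
package.skeleton(name = "mypackage", code_files = code_file, path = tmp)

# Files generated for the skeleton
pkg_files <- list.files(file.path(tmp, "mypackage"), recursive = TRUE)
print(pkg_files)

# Version that 'R CMD build' will encode in the tarball name
pkg_version <- read.dcf(file.path(tmp, "mypackage", "DESCRIPTION"),
                        fields = "Version")[1, 1]
print(pkg_version)
```

Working below tempfile() keeps the experiment separate from a real package directory, so the skeleton can be regenerated as often as needed.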

R devtools et al.

Several package development routines of the traditional method outlined above are manual, such as updating the NAMESPACE file and documenting functions in separate help (*.Rd) files. This process can be simplified and partially automated by taking advantage of a more recent R package development environment composed of several helper packages including devtools, usethis, roxygen2 and sinew (Wickham and Bryan, n.d.). Many books and web sites document this process in more detail. Here is a small selection of useful online documentation about R package development:

Workflow for building R packages

The following outlines the basic workflow for building, testing and extending R packages with the package development environment functionalities outlined above.

(a) Create package skeleton

library("devtools"); library("roxygen2"); library("usethis"); library("sinew") # If not available, install these packages with 'install.packages(...)'
create("myfirstpkg") # Creates package skeleton. The chosen name (here myfirstpkg) will be the name of the package.
setwd("myfirstpkg") # Set working directory of R session to package directory 'myfirstpkg'
use_mit_license() # Add license information to DESCRIPTION file (here MIT). To look up alternatives, do ?use_mit_license
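
The DESCRIPTION file touched by the steps above is a plain 'Field: value' file that can be read with base R. A minimal sketch, assuming a hypothetical DESCRIPTION with placeholder values (the file actually generated by usethis contains more fields):

```r
# Hypothetical DESCRIPTION content; field values are placeholders for illustration only
desc_file <- tempfile("DESCRIPTION_")
writeLines(c(
  "Package: myfirstpkg",
  "Title: What the Package Does",
  "Version: 0.0.0.9000",
  "License: MIT + file LICENSE"
), desc_file)

# read.dcf() parses the 'Field: value' format into a character matrix
# with one column per field
desc <- read.dcf(desc_file)
print(desc[1, c("Package", "Version", "License")])
```

In a real package the same call, read.dcf("DESCRIPTION"), can be used from the package directory to confirm that the license entry was added.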

(b) Add R functions

Next, R functions can be added to *.R file(s) under the R directory of the new package. Several functions can be organized in one *.R file, each in its own file, or any combination of the two. For demonstration purposes, the following will download an R file (pkg_build_fct.R from here) defining two functions (named myMAcomp and talkToMe) and save it to the R directory of the package.

download.file("https://raw.githubusercontent.com/tgirke/GEN242/main/content/en/tutorials/rpackages/helper_functions/pkg_build_fct.R", "R/pkg_build_fct.R")

(c) Auto-generate roxygen comment lines

The makeOxygen function from the sinew package creates roxygen2 comment skeletons based on the information from each function (shown below for myMAcomp). The roxygen comment lines need to be added above the code of each function. This can be done by copy and paste from the R console or by writing the output to a temporary file (below via writeLines). Alternatively, the makeOxyFile function can be used to create a roxygenized copy of an R source file, where the roxygen comment lines have been added above all functions automatically. Next, the default text in the comment lines needs to be replaced by meaningful text describing the utility and usage of each function. This documentation can be completed and/or revised at any time.

load_all() # Loads package in a simulated way without installing it. 
writeLines(makeOxygen(myMAcomp), "myroxylines") # This creates a 'myroxylines' file in current directory. Delete this file after adding its content to the corresponding functions.
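
For orientation, a filled-in roxygen block might look like the following sketch. The function scale_to_unit is a hypothetical example and is not one of the functions in pkg_build_fct.R. The tags shown (@param, @return, @examples, @export) are standard roxygen2 tags.

```r
#' Rescale a numeric vector to the range 0 to 1
#'
#' Hypothetical example used only to illustrate roxygen comment lines.
#'
#' @param x A numeric vector with at least two distinct values.
#' @return A numeric vector of the same length as \code{x}.
#' @examples
#' scale_to_unit(c(2, 4, 6))
#' @export
scale_to_unit <- function(x) {
  rng <- range(x, na.rm = TRUE)      # minimum and maximum of x
  (x - rng[1]) / (rng[2] - rng[1])   # shift to 0, then divide by the spread
}

# The roxygen lines are ordinary comments, so the function runs as usual
scale_to_unit(c(2, 4, 6))
```

Because the comment lines sit directly above the function, the documentation and the code stay in the same file and are edited together.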

(d) Autogenerate help files

The document function auto-generates one *.Rd file per function in the man directory of the package. The content of the *.Rd help files is based on the information in the roxygen comment lines generated in the previous step. In addition, all relevant export/import instructions are added to the NAMESPACE file. Importantly, when a package uses roxygen-based documentation, the NAMESPACE and *.Rd files should not be edited manually, since manual changes will be overwritten the next time roxygen2 regenerates these files.

document() # Auto-generates/updates *.Rd files under man directory (here: myMAcomp.Rd and talkToMe.Rd)
tools::Rd2txt("man/myMAcomp.Rd") # Renders Rd file from source
tools::checkRd("man/myMAcomp.Rd") # Checks Rd file for problems
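
The two tools calls above can also be tried outside of a package. The following minimal sketch writes a small hand-made Rd file to a temporary location, checks it and renders it to text. The Rd content and the function name add_one are hypothetical; files produced by document() are longer but use the same markup.

```r
# Hypothetical, hand-written Rd content for illustration only
rd_lines <- c(
  "\\name{add_one}",
  "\\alias{add_one}",
  "\\title{Add one to a numeric vector}",
  "\\description{Hypothetical example function.}",
  "\\usage{add_one(x)}",
  "\\arguments{\\item{x}{A numeric vector.}}",
  "\\value{A numeric vector of the same length as \\code{x}.}"
)
rd_file <- tempfile(fileext = ".Rd")
writeLines(rd_lines, rd_file)

# Check the Rd file; problems are returned as an object of class 'checkRd'
problems <- tools::checkRd(rd_file)
print(problems)

# Render the Rd file to plain text and read the result back in
txt_file <- tempfile(fileext = ".txt")
tools::Rd2txt(rd_file, out = txt_file, options = list(underline_titles = FALSE))
rendered <- readLines(txt_file)
cat(rendered, sep = "\n")
```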

(e) Add a vignette

A vignette template can be auto-generated with the use_vignette function from the usethis package. The *.Rmd source file of the vignette will be located under a new vignettes directory. Additional vignettes can be manually added to this directory as needed.

use_vignette("introduction", title="Introduction to this package")

(f) Check, install and build package

Now the package can be checked for problems. All warnings and errors should be addressed prior to submission to a public repository. After this, it can be installed on a user’s system with the install command. In addition, the build function assembles the package into a *.tar.gz file. The latter is often important for sharing packages and/or submitting them to public repositories.

setwd("..") # Redirect R session to parent directory
check("myfirstpkg") # Check package for problems, when in pkg dir one can just use check()
# remove.packages("myfirstpkg") # Optional. Removes test package if already installed
install("myfirstpkg", build_vignettes=TRUE) # Installs package  
build("myfirstpkg") # Creates *.tar.gz file for package required for submission to CRAN/Bioc

(g) Using the new package

After installing and loading the package its functions, help files and vignettes can be accessed as follows.

library("myfirstpkg")
library(help="myfirstpkg")
?myMAcomp
vignette("introduction", "myfirstpkg")

Another very useful development function is test for evaluating the test code of a package.

(h) Share package on GitHub

To host and share the new package myfirstpkg on GitHub, one can use the following steps:

  1. Create an empty target GitHub repository online (e.g. named mypkg_repos) as outlined here.
  2. Clone the new GitHub repository to the local system with git clone https://github.com/<github_username>/<repo name> (here from command-line)
  3. Copy the root directory of the package into the cloned repository with cp -r myfirstpkg mypkg_repos
  4. Next cd into mypkg_repos, and then add all files and directories of the package to the staging area with git add -A :/.
  5. Commit and push the changes to GitHub with: git commit -am "first commit"; git push.
  6. After this the package should be live on the corresponding GitHub web page.
  7. Assuming the package is public, it can be installed directly from GitHub by anyone as shown below (from within R). Installs of private packages require a personal access token (PAT) that needs to be assigned to the auth_token argument. PATs can be created here.
devtools::install_github("<github_user_name>/<mypkg_repos>", subdir="myfirstpkg") # If the package is in the root directory of the repos, then the 'subdir' argument can be dropped.

Session Info

sessionInfo()
## R version 4.3.0 (2023-04-21)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Debian GNU/Linux 11 (bullseye)
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/Los_Angeles
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## loaded via a namespace (and not attached):
##  [1] digest_0.6.31   R6_2.5.1        fastmap_1.1.1   xfun_0.39       cachem_1.0.8    knitr_1.42      htmltools_0.5.5 rmarkdown_2.21  cli_3.6.1       sass_0.4.6      jquerylib_0.1.4 compiler_4.3.0  tools_4.3.0    
## [14] evaluate_0.21   bslib_0.4.2     yaml_2.3.7      rlang_1.1.1     jsonlite_1.8.4

References