Experiment definition provided by targets file
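The targets file defines all FASTQ files and sample comparisons of the analysis workflow.
Each row describes one FASTQ file; where a SampleReference is given, it names the sample
that the row is compared against.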
targetspath <- system.file("extdata", "targets_chip.txt", package="systemPipeR")
targets <- read.delim(targetspath, comment.char = "#")
targets[1:4,-c(5,6)]
##                   FileName SampleName Factor SampleLong SampleReference
## 1 ./data/SRR446027_1.fastq        M1A     M1  Mock.1h.A
## 2 ./data/SRR446028_1.fastq        M1B     M1  Mock.1h.B
## 3 ./data/SRR446029_1.fastq        A1A     A1   Avr.1h.A             M1A
## 4 ./data/SRR446030_1.fastq        A1B     A1   Avr.1h.B             M1B
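To illustrate how the SampleReference column encodes comparisons, the following minimal
sketch rebuilds the four rows shown above as a data frame (typed in by hand here, not read
from the package) and lists each sample together with its reference. The helper name
samplePairs is hypothetical and not part of systemPipeR.

```r
# Toy copy of the four targets rows printed above (typed in by hand for illustration)
toy_targets <- data.frame(
    FileName = c("./data/SRR446027_1.fastq", "./data/SRR446028_1.fastq",
                 "./data/SRR446029_1.fastq", "./data/SRR446030_1.fastq"),
    SampleName = c("M1A", "M1B", "A1A", "A1B"),
    Factor = c("M1", "M1", "A1", "A1"),
    SampleReference = c("", "", "M1A", "M1B"),
    stringsAsFactors = FALSE
)

# Hypothetical helper: return sample/reference pairs for rows that name a reference
samplePairs <- function(targets) {
    has_ref <- !is.na(targets$SampleReference) & targets$SampleReference != ""
    data.frame(Sample = targets$SampleName[has_ref],
               Reference = targets$SampleReference[has_ref],
               stringsAsFactors = FALSE)
}

ref_pairs <- samplePairs(toy_targets)
print(ref_pairs)
```

Rows with an empty SampleReference (the two mock samples here) do not produce a pair of
their own.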
Read quality filtering and trimming
The following example shows how one can design a custom read
preprocessing function using utilities provided by the ShortRead package, and then
apply it with preprocessReads in batch mode to all FASTQ samples referenced in the
corresponding SYSargs instance (args object below). More detailed information on
read preprocessing is provided in systemPipeR's main vignette.
args <- systemArgs(sysma="param/trim.param", mytargets="targets_chip.txt")
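filterFct <- function(fq, cutoff=20, Nexceptions=0) {
    # Count, per read, the bases with a Phred score at or below the cutoff;
    # na.rm=TRUE ignores the NA padding that appears when reads differ in length
    qcount <- rowSums(as(quality(fq), "matrix") <= cutoff, na.rm=TRUE)
    fq[qcount <= Nexceptions] # Retains reads with at most Nexceptions bases at Phred score <= cutoff
}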
preprocessReads(args=args, Fct="filterFct(fq, cutoff=20, Nexceptions=0)", batchsize=100000)
writeTargetsout(x=args, file="targets_chip_trim.txt", overwrite=TRUE)
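The filtering rule itself can be tried out without ShortRead or any FASTQ file. The
following is a minimal base R sketch, not the package's implementation: it assumes
Phred+33 encoded quality strings, uses three made-up reads, and the helper names
decodePhred and keepRead are hypothetical.

```r
# Decode a Phred+33 quality string into integer scores (Phred+33 encoding is an assumption)
decodePhred <- function(qstring) utf8ToInt(qstring) - 33L

# Three toy reads of equal length with made-up quality strings
# 'I' decodes to 40, '5' to 20 and '#' to 2
quals <- c(read1 = "IIIIII", read2 = "II5III", read3 = "I##III")
qmat <- t(sapply(quals, decodePhred))  # rows = reads, columns = cycles

# Same rule as filterFct: count bases with score <= cutoff per read,
# then keep reads with at most Nexceptions such bases
keepRead <- function(qmat, cutoff = 20, Nexceptions = 0) {
    qcount <- rowSums(qmat <= cutoff, na.rm = TRUE)
    qcount <= Nexceptions
}

keep0 <- keepRead(qmat, cutoff = 20, Nexceptions = 0)
keep1 <- keepRead(qmat, cutoff = 20, Nexceptions = 1)
print(rbind(keep0, keep1))
```

Because the comparison is <=, a base with a score exactly equal to the cutoff counts as a
low-quality base: read2 is dropped with Nexceptions=0 and kept with Nexceptions=1, while
read3 has two low-quality bases and is dropped in both cases.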
FASTQ quality report
The seeFastq and seeFastqPlot functions compute and plot a series of quality
statistics for a set of FASTQ files, including per-cycle quality box
plots, base proportions, base-level quality trends, relative k-mer
diversity, length and occurrence distributions of reads, the number of reads
above quality cutoffs, and the mean quality distribution. The results are
written to a PDF file named fastqReport.pdf in the results directory. In the
code below, the statistics are computed in parallel on a computer cluster,
with one job per FASTQ file.
args <- systemArgs(sysma="param/tophat.param", mytargets="targets_chip.txt")
library(BiocParallel); library(BatchJobs)
f <- function(x) {
    library(systemPipeR)
    args <- systemArgs(sysma="param/tophat.param", mytargets="targets_chip.txt")
    seeFastq(fastq=infile1(args)[x], batchsize=100000, klength=8)
}
funs <- makeClusterFunctionsSLURM("slurm.tmpl")
param <- BatchJobsParam(length(args),
                        resources=list(walltime="00:20:00", ntasks=1, ncpus=1, memory="2G"),
                        cluster.functions=funs)
register(param)
fqlist <- bplapply(seq(along=args), f)
pdf("./results/fastqReport.pdf", height=18, width=4*length(fqlist))
seeFastqPlot(unlist(fqlist, recursive=FALSE))
dev.off()
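To make two of the reported statistics more concrete, the following base R sketch computes
a per-cycle five-number summary (the numbers behind a quality box plot) and the fraction of
reads whose mean quality exceeds a cutoff. It uses a made-up Phred score matrix and a
cutoff of 30 chosen for illustration; it does not reproduce how seeFastq computes its
statistics.

```r
# Toy Phred score matrix: rows = reads, columns = sequencing cycles (made-up values)
qmat <- rbind(c(40, 38, 30, 20),
              c(40, 36, 28, 10),
              c(38, 34, 32, 12),
              c(40, 40, 30, 18))
colnames(qmat) <- paste0("cycle", 1:4)

# Per-cycle five-number summary, the values a box plot is drawn from
cycle_stats <- apply(qmat, 2, fivenum)
rownames(cycle_stats) <- c("min", "lower", "median", "upper", "max")
print(cycle_stats)

# Mean quality per read and the fraction of reads with mean quality above the cutoff
read_means <- rowMeans(qmat)
frac_above <- mean(read_means > 30)
print(frac_above)
```

In this toy matrix the median quality drops from 40 in the first cycle to 15 in the last,
the kind of downward trend toward the read end that a per-cycle box plot makes visible.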

