Quality filtering and adaptor trimming

The following custom function trims adaptors hierarchically, from the longest to the shortest match, at the right end of the reads. If internalmatch=TRUE, internal matches trigger the same behavior. The argument minpatternlength defines the shortest adaptor match considered in this iterative process. In addition, the function removes reads containing Ns or homopolymer regions. More detailed information on read preprocessing is provided in systemPipeR's main vignette.
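To make the longest-to-shortest idea concrete, here is a minimal base-R sketch that works on plain character strings. The function iter_trim_sketch and its exact filtering rules are assumptions made for illustration only. It uses exact matching and is not the implementation of .iterTrimbatch1, which operates on FASTQ objects.

```r
# Simplified illustration of iterative 3' adaptor trimming on plain strings.
# iter_trim_sketch() is a hypothetical helper, not part of systemPipeR.
iter_trim_sketch <- function(reads, adaptor, minpatternlength = 6,
                             polyhomo = 50, minreadlength = 16) {
  stopifnot(minpatternlength >= 1, minpatternlength <= nchar(adaptor),
            polyhomo >= 2)
  trimmed <- rep(FALSE, length(reads))
  # Try the full adaptor first, then progressively shorter prefixes of it.
  for (len in seq(nchar(adaptor), minpatternlength)) {
    prefix <- substr(adaptor, 1, len)
    # Only reads not yet trimmed whose right end equals this prefix are cut.
    hit <- !trimmed & endsWith(reads, prefix)
    reads[hit] <- substr(reads[hit], 1, nchar(reads[hit]) - len)
    trimmed <- trimmed | hit
  }
  # Drop reads with any N, with a homopolymer run of at least 'polyhomo'
  # bases, or shorter than 'minreadlength' after trimming.
  has_n <- grepl("N", reads, fixed = TRUE)
  has_homo <- grepl(sprintf("(.)\\1{%d,}", polyhomo - 1), reads, perl = TRUE)
  reads[!has_n & !has_homo & nchar(reads) >= minreadlength]
}

# Toy reads (made up for this example)
reads <- c(paste0("GGGATTACA", "ACACGTCT"),      # ends with the full adaptor
           paste0("TTTGGGCCC", "ACACGT"),        # ends with a 6-base adaptor prefix
           "ACGTNACGT",                          # contains an N
           paste0("CCCC", strrep("A", 8), "TT"), # contains a homopolymer run
           "ACGTACGTAC")                         # no adaptor, kept as is
kept <- iter_trim_sketch(reads, adaptor = "ACACGTCT",
                         polyhomo = 8, minreadlength = 5)
kept
```

The thresholds in the toy call are deliberately small so that each rule is triggered by one of the example reads.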

First, we construct a SYSargs2 object from the cwl and yml parameter files and the targets file.

dir_path <- system.file("extdata/cwl/preprocessReads/trim-se", 
    package = "systemPipeR")
trim <- loadWorkflow(targets = targetspath, wf_file = "trim-se.cwl", 
    input_file = "trim-se.yml", dir_path = dir_path)
trim <- renderWF(trim, inputvars = c(FileName = "_FASTQ_PATH1_", 
    SampleName = "_SampleName_"))
trim
output(trim)[1:2]
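The inputvars argument above pairs targets columns with placeholder strings. As a rough mental model, and assuming that rendering amounts to substituting each placeholder with the matching column value for every sample, the idea can be sketched in base R as follows. The targets table, the file paths and the template line are invented for illustration. Only the two placeholder names come from the call above, and the real renderWF does considerably more.

```r
# Hypothetical targets table; paths and sample names are made up.
targets <- data.frame(FileName = c("./data/S1.fastq.gz", "./data/S2.fastq.gz"),
                      SampleName = c("S1", "S2"),
                      stringsAsFactors = FALSE)

# Mapping of targets columns to placeholders, as in the renderWF() call.
inputvars <- c(FileName = "_FASTQ_PATH1_", SampleName = "_SampleName_")

# Hypothetical one-line stand-in for the content of a yml parameter file.
template <- "fq1: _FASTQ_PATH1_ ; sample: _SampleName_"

# Replace every placeholder with the value from one row of the targets table.
fill_template <- function(row) {
  out <- template
  for (col in names(inputvars)) {
    out <- gsub(inputvars[[col]], row[[col]], out, fixed = TRUE)
  }
  out
}

rendered <- vapply(seq_len(nrow(targets)),
                   function(i) fill_template(targets[i, ]),
                   character(1))
names(rendered) <- targets$SampleName
rendered
```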

Next, we execute the code for trimming all the raw data.

fctpath <- system.file("extdata", "custom_Fct.R", package = "systemPipeR")
source(fctpath)
iterTrim <- ".iterTrimbatch1(fq, pattern='ACACGTCT', internalmatch=FALSE, minpatternlength=6, Nnumber=1, polyhomo=50, minreadlength=16, maxreadlength=101)"
preprocessReads(args = trim, Fct = iterTrim, batchsize = 1e+05, 
    overwrite = TRUE, compress = TRUE)
writeTargetsout(x = trim, file = "targets_trim.txt", step = 1, 
    new_col = "FileName", new_col_output_index = 1, overwrite = TRUE)

FASTQ quality report

The following seeFastq and seeFastqPlot functions generate and plot a series of useful quality statistics for a set of FASTQ files, including per cycle quality box plots, base proportions, base-level quality trends, relative k-mer diversity, length and occurrence distribution of reads, number of reads above quality cutoffs, and mean quality distribution. The results are written to a PNG file named fastqReport.png.

fqlist <- seeFastq(fastq = infile1(trim), batchsize = 10000, 
    klength = 8)
png("./results/fastqReport.png", height = 18, width = 4 * length(fqlist), 
    units = "in", res = 72)
seeFastqPlot(fqlist)
dev.off()

Figure 1: FASTQ quality report. To zoom in, right click image and open it in a separate browser tab.
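For readers unfamiliar with per cycle quality statistics, the following base-R sketch shows how quality strings can be decoded into Phred scores and summarized per cycle. It assumes Phred+33 encoding and reads of equal length. The function per_cycle_quality and the three quality strings are made up for illustration and are unrelated to the internals of seeFastq.

```r
# Hypothetical helper: decode Phred+33 quality strings into a
# reads-by-cycles matrix of integer quality scores.
per_cycle_quality <- function(quals) {
  n <- nchar(quals[1])
  stopifnot(all(nchar(quals) == n))  # sketch assumes equal read lengths
  # utf8ToInt() gives the ASCII code of each character; subtracting 33
  # yields the Phred score under the Phred+33 assumption.
  scores <- vapply(quals, function(q) utf8ToInt(q) - 33L, integer(n),
                   USE.NAMES = FALSE)
  t(scores)  # rows = reads, columns = cycles
}

# Toy quality strings (made up)
quals <- c("IIIIHHGG", "IIHHGGFF", "IHGFEDCB")
phred <- per_cycle_quality(quals)

# Mean quality at each cycle; boxplot(as.data.frame(phred)) would draw
# the corresponding per cycle box plots.
cycle_means <- colMeans(phred)
cycle_means
```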


