## Read mapping with Bowtie2

The NGS reads of this project will be aligned with Bowtie2 (Langmead et al., 2012) against the reference genome sequence. The parameter settings of the aligner are defined in the bowtieSE.param file. In ChIP-Seq experiments it is usually more appropriate to eliminate reads that map to multiple locations. To achieve this, remove the argument setting -k 50 --non-deterministic from the bowtieSE.param file.

The following submits 18 alignment jobs via a scheduler to a computer cluster.

args <- systemArgs(sysma="param/bowtieSE.param", mytargets="targets_chip_trim.txt")
sysargs(args)[1] # Command-line parameters for first FASTQ file
moduleload(modules(args)) # Skip if a module system is not used
system("bowtie2-build ./data/tair10.fasta ./data/tair10.fasta") # Indexes reference genome
resources <- list(walltime="1:00:00", ntasks=1, ncpus=cores(args), memory="10G")
reg <- clusterRun(args, conffile=".BatchJobs.R", template="slurm.tmpl", Njobs=18, runid="01",
                  resourceList=resources)
waitForJobs(reg)
writeTargetsout(x=args, file="targets_bam.txt", overwrite=TRUE)


Alternatively, one can run the alignments sequentially on a single system.

runCommandline(args)


Check whether all BAM files have been created.

file.exists(outpaths(args))
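
When some outputs are missing, it is often more useful to see which paths are absent than to scan a vector of TRUE/FALSE values. The following is a minimal base R sketch of that idea. The helper name missing_outputs is hypothetical and not part of systemPipeR, and the demonstration uses empty temporary files rather than real BAM files.

```r
# Hypothetical helper (not part of systemPipeR): return the expected
# output paths that do not exist on disk.
missing_outputs <- function(paths) {
  paths <- as.character(paths)
  paths[!file.exists(paths)]
}

# Self-contained demonstration with temporary files instead of real BAMs.
demo_dir <- tempfile("bam_demo_")
dir.create(demo_dir)
present <- file.path(demo_dir, "sampleA.bam")
absent <- file.path(demo_dir, "sampleB.bam")
file.create(present)

# Only the path that was never created is reported.
missing_outputs(c(present, absent))
```

In the workflow above, the same helper could presumably be applied to the character vector returned by outpaths(args).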


The following provides an overview of the number of reads in each sample and how many of them aligned to the reference.

read_statsDF <- alignStats(args=args)
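
The resulting table can be inspected and saved like any other data frame. The sketch below uses a small made-up data frame in place of read_statsDF. The column names (FileName, Nreads, Nalign), the counts, and the 80% threshold are assumptions chosen for illustration and may not match what alignStats returns in a given systemPipeR version.

```r
# Made-up stand-in for read_statsDF; column names and counts are
# illustrative assumptions, not real results.
stats_demo <- data.frame(
  FileName = c("sampleA", "sampleB"),
  Nreads = c(1000, 2000),
  Nalign = c(900, 1500),
  stringsAsFactors = FALSE
)

# Percentage of reads that aligned, per sample.
stats_demo$PercAligned <- round(100 * stats_demo$Nalign / stats_demo$Nreads, 1)

# Flag samples below a hypothetical threshold of 80% aligned reads.
low_mapping <- stats_demo$FileName[stats_demo$PercAligned < 80]

# Save the table as a tab-delimited file (a temporary location is used here).
out_file <- tempfile(fileext = ".tsv")
write.table(stats_demo, out_file, row.names = FALSE, quote = FALSE, sep = "\t")
```

Samples listed in low_mapping would be candidates for a closer look at read quality or the choice of reference before downstream analysis.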

The symLink2bam function creates symbolic links to view the BAM alignment files in a genome browser such as IGV without moving these large files to a local system. The corresponding URLs are written to a file with a path specified under urlfile, here IGVurl.txt.
symLink2bam(sysargs=args, htmldir=c("~/.html/", "somedir/"),