## A. Choice of Sequence Type

• Task 1: Which sequence type, amino acid or nucleotide, is more appropriate for searching databases for remotely related sequences? Provide at least three reasons for your decision.

## B. Dynamic Programming for Pairwise Alignments

• Task 2: Create one global and one local alignment for the following two protein sequences, either manually or with an R script, using the Needleman-Wunsch and Smith-Waterman algorithms, respectively:

In each case, use BLOSUM50 as the substitution matrix and a penalty of 8 for both gap opening and gap extension. Note that the initial matrix can also be created programmatically in R and then uploaded to a spreadsheet program. Alternatively, solve the entire homework by writing an R script. Your answers should contain the following components:

1. Manually populated dynamic programming matrices
2. The optimal pairwise alignments created by traceback
3. The final scores of the alignments
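
As a starting point for the dynamic programming matrices, the initialization step could be sketched in base R roughly as follows. This is a minimal sketch, not the official helper code for this assignment: the function name `init_dp_matrix` and the two short sequences are hypothetical placeholders, and the sketch assumes that equal opening and extension penalties of 8 amount to a linear gap cost of 8 per gap position. Only the first row and column are filled; all interior cells are left as `NA` to be populated by the recurrence of the respective algorithm.

```r
# Hypothetical placeholder sequences; replace them with the two
# protein sequences given in Task 2.
seq1 <- "HEAGAWGHEE"
seq2 <- "PAWHEAE"
gap  <- 8  # assumed linear gap cost (opening = extension = 8)

# Hypothetical helper: builds an empty DP matrix with seq1 along the
# columns and seq2 along the rows, and fills only the first row/column.
init_dp_matrix <- function(s1, s2, gap, type = c("global", "local")) {
  type <- match.arg(type)
  a <- strsplit(s1, "")[[1]]
  b <- strsplit(s2, "")[[1]]
  m <- matrix(NA_real_, nrow = length(b) + 1, ncol = length(a) + 1,
              dimnames = list(c("-", b), c("-", a)))
  if (type == "global") {
    # Needleman-Wunsch: leading gaps accumulate the gap cost
    m[1, ] <- -gap * (0:length(a))
    m[, 1] <- -gap * (0:length(b))
  } else {
    # Smith-Waterman: scores never drop below zero
    m[1, ] <- 0
    m[, 1] <- 0
  }
  m
}

nw <- init_dp_matrix(seq1, seq2, gap, type = "global")
sw <- init_dp_matrix(seq1, seq2, gap, type = "local")

# Optional export for a spreadsheet program (file name is arbitrary):
# write.csv(nw, "nw_init.csv", na = "")
```

Printing `nw` and `sw` shows the two starting matrices side by side. The only difference at this stage is the first row and column, which is where the global and local algorithms begin to diverge.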

## C. Alignments with Different Substitution Matrices

• Task 3: Load the Biostrings package in R, import the two cytochrome P450 sequences O15528 and P98187 from NCBI (save them as myseq.fasta), and create a global alignment with the pairwiseAlignment function from Biostrings as follows:

Assemble the results from this homework in one PDF file (HW4.pdf) and upload it to your private GitHub repository under Homework/HW4/HW4.pdf.