HW1 - Online Exercise and Basic GitHub Usage

A. Online Exercise: Databases and Software Tools

This is an easy warm-up homework exposing students to a variety of online databases and software tools.

  1. Go to http://www.ncbi.nlm.nih.gov, select the Protein database in the dropdown, and run the query: P450 & hydroxylase & human [organism]. Then select UniProtKB/Swiss-Prot under Source databases.
    1. Report final query syntax from Search Details field.

  1. Save the GIs of the final query result to a file. To do this, select the GI List format under the Send to dropdown.
    1. Report the number of retrieved GIs.

  1. Retrieve the corresponding sequences through Batch-Entrez using the GI list file as query input, then save the sequences in FASTA format.

  1. Generate a multiple alignment and tree of these sequences using MultAlin
    1. Save the multiple alignment and tree to a file
    2. Identify the putative heme-binding cysteine in the multiple alignment

  1. Open the UniProt page and search for the first P450 sequence in your list.
    1. Compare the putative heme-binding cysteine with the consensus pattern from the Prosite database (Syntax)
    2. Report the corresponding Pfam ID

  1. Run BLASTP against the PDB database (again using the first P450 in your list); on the result page click the first entry in the BLAST hit list (here 3K9V_A); then select ‘Identify Conserved Domains’ on the side bar; click the grey bar labelled ‘p450’; then select ‘Interactive View’ under the ‘Structure’ menu, which will download a file named ‘pfam00067.cn3’.
    1. Compare the resulting alignment with the result from MultAlin
    2. View the 3D structure (pfam00067.cn3) in Cn3D*, highlight the heme-binding cysteine, and save the structure (screenshot). Note, Cn3D* can be downloaded from here.

*If there are problems installing Cn3D in the last step (6.2), then please use this online-only alternative: (i) on the 3K9V_A page, click ‘Protein 3D Structure’ instead of ‘Identify Conserved Domains’; (ii) choose one of the two structure entries provided on the subsequent page; (iii) select the option “full-featured 3D viewer” in the bottom right corner of the structure image; (iv) choose the ‘Details’ tab on the right; (v) the structure of the protein is now shown on the left and the underlying protein sequence on the right; (vi) highlight the heme-binding cysteine in the structure by selecting it in the sequence; and (vii) save the structure view to a PNG file or take a screenshot.
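
The bookkeeping in steps 2, 3, and 5 can be double-checked with a few lines of Python. The sketch below is only an illustration and not part of the required submission. It counts the identifiers in a GI list, parses FASTA text so the number of retrieved sequences can be compared against that count, and converts a PROSITE-style pattern into a regular expression so a motif can be located in a sequence. All function names are made up for this example, and the converter is a simplification that handles only the common pattern elements (x, [..], {..}, repeat counts, and the < and > anchors outside brackets).

```python
import re


def read_gi_list(text):
    """Return the non-empty lines of a GI list (one identifier per line)."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there is one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regex (simplified)."""
    regex = pattern.strip().rstrip(".")   # patterns may end with a period
    regex = regex.replace("-", "")        # '-' only separates positions
    regex = regex.replace("x", ".")       # x = any residue
    regex = regex.replace("{", "[^").replace("}", "]")  # {P} = any but P
    regex = regex.replace("(", "{").replace(")", "}")   # x(2,4) = repeats
    regex = regex.replace("<", "^").replace(">", "$")   # sequence termini
    return regex


def find_motif(sequence, prosite_pattern):
    """Return the 0-based start offset of every match in the sequence."""
    regex = re.compile(prosite_to_regex(prosite_pattern))
    return [match.start() for match in regex.finditer(sequence)]
```

As a usage example with a toy pattern chosen purely for illustration (it is not the Prosite entry for P450), `find_motif("MKCAALGT", "C-x(2)-[LIV]-{P}")` returns `[2]`, the 0-based offset where the match begins. For the homework itself, the real pattern would be copied from the Prosite entry, and the saved files (hypothetical names such as `gi_list.txt` and `sequences.fasta`) would be read with `open(...).read()` before being passed to `read_gi_list` and `read_fasta`; the two resulting lists should have the same length.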

B. Homework Submission to a Private GitHub Repository

Please assemble the results of this homework in one PDF file and upload it to your private course GitHub repository under Homework/HW1/HW1.pdf.

Due date

Most homework assignments will be due one week after they are assigned. This one is due on Thu, April 7th at 6:00 PM. You have unlimited attempts: files can be edited and re-uploaded anytime before the deadline.

Homework solution

A solution for this homework is not required since the tasks are identical to the steps described above under sections HW1A-B.

Last modified 2022-03-31: some edits (4c571b9ba)