HW1 - Online Exercise and Basic GitHub Usage

4 minute read

A. Online Excercise: Databases and Software Tools

This is an easy warm-up homework exposing students to a variety of online databases and software tools.

  1. Go to http://www.ncbi.nlm.nih.gov, select Protein database in dropdown, and then run query: P450 & hydroxylase & human [organism], select under Source databases UniProtKB/Swiss-Prot
    1. Report final query syntax from Search Details field.

  1. Save GIs of the final query result to a file. For this select under Send to dropdown GI List format.
    1. Report the number of retrieved GIs.

  1. Retrieve the corresponding sequences through Batch-Entrez using GI list file as query input -> save sequences in FASTA format

  1. Generate multiple alignment and tree of these sequences using MultAalin
    1. Save multiple alignment and tree to file
    2. Identify putative heme binding cysteine in multiple alignment

  1. Open corresponding UniProt page and search for first P450 sequence in your list.
    1. Compare putative heme binding cysteine with consensus pattern from Prosite database (Syntax)
    2. Report corresponding Pfam ID

  1. BLASTP against PDF database (use again first P450 in your list); on result page click first entry in BLAST hit list (here 3K9V_A); then select ‘Identify Conserved Domains’ on side bar; click blue bar labelled ‘CYP24A1’; then select ‘Interactive View’ which will download ‘cd20645.cn3’ file.
    1. Compare resulting alignment with result from MultAlin
    2. View 3D structure in Cn3D*, save structure (screen shot) and highlight heme binding cysteine. Note, Cn3D* can be downloaded from here.

*If there are problems in the last step (6.2) with the install of Cn3D, then please use this online only alternative: (i) click in the 3K9V_A page ‘Protein 3D Structure’ instead of ‘Identify Conserved Domains’; (ii) choose one of the two structure entries provided on the subsequent page; (iii) select option “full-featured 3D viewer” in the bottom right corner of the structure image; (iv) choose the ‘Details’ tab on the right; (v) after this the structure of the protein is shown on the left and the underlying protein sequence on the right; (vi) highlight the heme binding cysteine in the structure by selecting it in the sequence; and (vii) then save the structure view to a PNG file or take a screenshot.

B. Homework Submission to a Private GitHub Repository

To learn the basics of GitHub, this homework assignment will be uploaded to a private GitHub respository that will be shared with the instructors. To do so, follow these steps: In this homework, you are asked to create a github private repository.

  1. Login to your GitHub account.
  2. Click the “New” button in the sidebar on the left.
  3. Under “Repository name”, use “learn-github” as the name.
  4. Choose “Private”.
  5. Choose to add a README file (optional).
  6. Click “Create repository”.
  7. You should redirected to your new repo’s page. In the new page, click “Settings”.
  8. Click “Manage access”, and the choose “Invite a collaborator”.
  9. Invite both “tgirke” and “lz100” and send out the invitation.

Note, the creator of this demo cannot search for an invitation created by the same user (here lz100). Thus, the message ‘could not find lz100’ at the end of the video. This will not be the case for students in the class.

Please assemble the results of part A of HW1 in one PDF file named hw1.pdf and upload it to your private GitHub repository generated in part B. To do so, follow the upload instructions here.

C. Homework Submission via GitHub Classroom

To also submit your homework to GitHub Classroom click this link to accept the homework. This will create a private homework repository in GitHub Classroom for you. Note: this new repository is different from the repository in part A. It belongs to the GEN242-2021 Github classroom that will be used for all subsequent homework submissions.

Due date

Most homework will be due one week after they are assigned. This one is due on Thu, April 8th at 6:00 PM. You have unlimited attempts. Students can edit and re-upload files anytime before the deadline.

Homework solution

A solution for this homework is not required since the tasks are identical to the steps described above under sections HW1A-B.

Last modified 2021-04-10: some edits (ebac67cdc)