ENS210 - Fall 2023
Instructor
Name: Ogun Adebali
E-mail: oadebali@sabanciuniv.edu
Office: FENS-1055
Office hours: Fri 10:40-11:30 (by appointment only)
Teaching Assistants
TA | E-mail | Office Day | Office Hours | Office |
---|---|---|---|---|
Veysel Ogulcan Kaya | vogulcan@sabanciuniv.edu | Monday | 10:40-12:30 | online |
Yagmur Sozeri | yagmur.sozeri@sabanciuniv.edu | Tuesday | 14:40-16:30 | FENS-L038 |
Cem Azgari | cemazgari@sabanciuniv.edu | Wednesday | 14:40-16:30 | FENS-L038 |
Learning Assistants
LA | E-mail |
---|---|
Bahar Sevgin | baharsevgin@sabanciuniv.edu |
Durmuş Erdem Kertmen | ekertmen@sabanciuniv.edu |
Deniz Muratli | denizm@sabanciuniv.edu |
Class hours
- Wed 16:40-17:30 FASS-G022 (lecture/prelab)
- Wed 17:40-19:30 FASS-G022 (lab)
- Fri 12:40-14:30 FASS-G022 (lectures)
Course Description
Have you ever considered how the code in each of your cells determines your physical appearance, disease risk, and even your behavior? Do you know why you and the annoying fly buzzing in the middle of the night are unique? Why does a diet work well for you but not for others? It is all in the genome! If the genome contains so much information, why can’t we design person-specific drugs, diets, and treatments? Because we don’t yet understand exactly what the code means. Reading the code is no longer a barrier, but analyzing it is. In this course, we will learn the basics of computational genomics, with the aim of building a foundation in bioinformatics applications. We will learn to use publicly available tools and to write custom Python scripts to answer biological questions. More details on the content, grading, and policies can be found below.
Learning objectives
- Explain why bioinformatics is necessary today.
- Use UNIX environment to parse genome data files.
- Write Python scripts to perform basic DNA and protein sequence analyses.
- Find hypothetical genes in a given DNA sequence.
- Translate a given DNA sequence into a protein sequence.
- Use regular expressions to find protein motifs and visualize them on protein structure.
- Understand what homology is and how homology information can be used in protein function identification.
- Build and interpret multiple sequence alignments.
- Build, visualize and analyze phylogenetic trees.
- Understand what protein domains are and how they are predicted from a given protein sequence.
- Know a variety of NGS methods and what they are designed for.
- Build NGS analysis pipelines.
Requirements and expectations
- There is no official textbook for the class. Slides will be made available after each class. The best way to succeed is to learn in class and take notes.
- Being active in lectures and lab sessions is encouraged.
- There is NO stupid question. Do not hesitate to ask any question.
- Bring a laptop to every class and lab.
- Late work will not be accepted.
- Lab work should be completed within lab hours. The assignment system will have a firm deadline unless your instructor (or TA) agrees that extra time is required. If extra time is given, the new due date will be midnight. Please plan your schedule accordingly.
Honesty
- All work should be completed individually unless stated otherwise. You will be assigned a single group project in which you are expected to collaborate; the rest will be individual assignments. For the group assignment, groups may not share their code with other groups; individuals may share their work (code) only with members of their own group.
- Plagiarism will NOT be tolerated. This does not mean that you are not allowed to use the internet. However, you may not copy and paste any code from the internet. You need to cite references/websites properly whenever you draw on them; otherwise, your work will be treated as plagiarism.
- You are not allowed to share code under any circumstances (except for the group assignment).
Attendance
- Attendance is required. If you are unable to attend, send me an e-mail stating your excuse before the class. 6 (for lectures) or 2 (for labs) unexcused absences will be considered grounds for grade reduction.
- Make-ups are only given for midterms and the final examination. A medical report must be brought.
- No make-up will be given for any missed lab.
Group presentations
- At the time of group presentations, you must be present and ready to present your group work in class. One of the group members will randomly be called to give their presentation. The group members might receive different grades as personal contribution is a component (20%).
- Late work will not be accepted.
Academic Integrity
To uphold the Sabanci University Academic Integrity Statement:
I will not lie and cheat in my academic work.
I will act (by letting the instructor know) if the academic integrity is compromised.
I will not share the video conference link and lecture records with anyone else.
By being registered in this class, you will be assumed to have accepted the rules written in this syllabus.
Evaluation
Component | Weight |
---|---|
Lab/quiz/homework/participation | 30% |
Group project | 10% |
Midterm I | 15% |
Midterm II | 15% |
Final | 30% |
Each lab, homework and announced quiz will have a weight of 2 units; a pop-up quiz will have a weight of 1 unit.
You may receive Tophat questions throughout the course.
Your lowest lab score will be dropped. No make-ups will be given for missed labs. No points will be given for unexcused missed labs.
Each lab will be evaluated out of 10 points. Homework and quizzes will be evaluated out of 10 points. Tophat questions will be evaluated based on the points assigned in the Tophat system. All the points will be summed up at the end of the semester. The total will comprise 30% of your total score.
Attendance and active participation are expected. Each of you will receive a participation score at the end of the semester (extra 2 points out of 100). The participation score will contribute to the Lab/quiz/homework/participation segment; it won’t be dropped. Please note that the participation score will be subjective and will be given by the instructor in light of your participation in lectures and labs.
Enrol in Tophat
Please go to this link to enrol in the Tophat classroom.
If you miss an exam (midterm or final) or more than two labs you will automatically fail and get NA.
Objections
After the results are announced for each exam, the objection days and time slots will be announced. You will only be able to object on the announced days. If the time slots don’t fit your schedule, you are expected to request an appointment from the instructor on the same day the objection dates are announced.
If you miss the objection period, you won’t be given a second chance to see your exam papers.
Grading
The grading will be based on the class performance. Curve-based grading will be applied.
There will be no extra homework/project to increase grades at the end of semester. This is not negotiable.
Individual graduation situations will not be taken into account, and they won’t change your letter grade at the end of the semester.
ANY kind of misconduct including code sharing, plagiarism, cheating etc will NOT be tolerated. You will fail the course. Disciplinary actions will be taken.
Course Plan
The course plan given below is subject to change.
Week # | Date | Topic |
---|---|---|
1 | 4 Oct | Course introduction (starts at 17:40 for this day only) |
1 | 6 Oct | Lab 0: Git setup |
2 | 11 Oct | Introduction to Git + UNIX |
2 | 11 Oct | Lab 1: Analyze Files in Linux |
2 | 13 Oct | Introduction to Genomics |
3 | 18 Oct | PROJECT description |
3 | 18 Oct | Useful command line tools |
3 | 18 Oct | Lab 2: Analyze Genomic Files in Linux |
3 | 20 Oct | What is a gene? |
4 | 25 Oct | Introduction to Python |
4 | 25 Oct | Lab 3: Sequence processing in Python |
4 | 27 Oct | What is a gene? |
5 | 1 Nov | Codon tables |
5 | 1 Nov | Lab 4: Finding a gene |
5 | 3 Nov | From DNA to Protein |
6 | 8 Nov | Compare two sequences |
6 | 8 Nov | Lab 5: DNA to Protein |
6 | 10 Nov | Homology |
7 | 15 Nov | Lab 6: Protein to DNA |
7 | 15 Nov | DEADLINE: Project first report (by 23:59) |
7 | 17 Nov | Project - Variant Calling Results (presentations) |
8 | 22 Nov | BLAST |
8 | 22 Nov | Lab 7: BLAST |
8 | 24 Nov | Homology - Multiple sequence comparison |
9 | 29 Nov | How to align multiple sequences |
9 | 29 Nov | Lab 8: Multiple Sequence Alignment |
9 | 1 Dec | Midterm |
10 | 6 Dec | Conservation analysis from multiple sequence alignment |
10 | 6 Dec | Lab 9: Measure Conservation |
10 | 8 Dec | Multiple sequence alignment algorithms |
11 | 13 Dec | Phylogenetic Trees |
11 | 13 Dec | Lab 10: Phylogenetics |
11 | 15 Dec | Protein Domains and Motifs |
12 | 20 Dec | Lab Midterm |
12 | 20 Dec | Lab Midterm |
12 | 22 Dec | Phylogenetics |
13 | 27 Dec | Project Q&A |
13 | 27 Dec | Lab 11: Molecular Docking |
13 | 29 Dec | Phylogenetics, NGS methods |
13 | 1 Jan | DEADLINE: Project final report (by 23:59) |
14 | 3 Jan | Group presentations I |
14 | 3 Jan | Group presentations II |
14 | 5 Jan | Group presentations II |
Week 1
Setup for the lab
Lab-0
- Go to the following assignment.
- Accept the assignment.
- Copy the link of your repository.
- Clone the repo to your local machine with `git clone REPOSITORY_LINK`.
- Follow the instructions in the `readme.md` file in your cloned repository.
Week 2
Wednesday
- Introduction to the Course - slides
- Introduction to Genomics - slides
- Genome statistics
- Central Dogma of Biology
- Chargaff’s First Parity Rule
- Structure of Nucleic Acids DNA and RNA
- DNA structure discovery
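As a small illustration of Chargaff's first parity rule from the topic list above, the sketch below counts bases on a single strand and then on both strands of the duplex. The example strand is made up for illustration and the helper names are not taken from the course material. Because every A pairs with a T and every G with a C, the counts are equal once both strands are included.

```python
from collections import Counter

# Translation table that maps each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand):
    """Return the complementary strand, read 5' to 3'."""
    return strand.upper().translate(COMPLEMENT)[::-1]


def base_counts(sequence):
    """Count A, C, G and T in a sequence, ignoring any other symbol."""
    counts = Counter(sequence.upper())
    return {base: counts[base] for base in "ACGT"}


strand = "AAAGGC"  # hypothetical single strand, deliberately unbalanced
duplex = strand + reverse_complement(strand)  # both strands of the double helix

print(base_counts(strand))  # single strand: A and T counts differ
print(base_counts(duplex))  # duplex: A == T and G == C
```

On the single strand the counts are unbalanced (three A, no T), while on the duplex each base occurs three times.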
Lab-1
- Go to the following assignment. This link will also be available on SUcourse at 16:40 on the class day.
- Accept the assignment.
- Copy the link of your repository.
- Clone the repo to your local machine with `git clone REPOSITORY_LINK`.
- Follow the instructions in the `readme.md` file in your cloned repository.
Extra-material (required)
Command Line Basics - PDF
Friday
- What is a gene? - slides
- Gene definition
- Variation effect at different levels
- Eukaryotic vs Prokaryotic cells
- Gene structure
- Eukaryotic genes vs Prokaryotic genes
- Alternative splicing
- Epigenetic regulation
- Operon structure
- Lac operon
Week 3
Wednesday
- Introduction to the Group Project
Lab-2
- Go to the following assignment. This link will also be available on SUcourse at 16:40 on the class day.
- Accept the assignment.
- Copy the link of your repository.
- Clone the repo to your local machine with `git clone REPOSITORY_LINK`.
- Follow the instructions in the `readme.md` file in your cloned repository.
Friday
- Introduction to Genomics II - slides
- DNA vs RNA
- RNA structures
- How to predict RNA structure
- Sanger sequencing
- Gel electrophoresis
- Shotgun sequencing
- How to calculate the size of a genome in bytes
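For the last topic above, one simple way to estimate the size of a genome in bytes is to count the bits needed per base. With four symbols (A, C, G, T), two bits per base are enough, so four bases fit in one byte. The sketch below assumes this dense, uncompressed encoding; the function name and the approximate human genome length of 3.1 billion bases are illustrative assumptions, not values from the slides.

```python
import math


def genome_size_bytes(n_bases, alphabet_size=4):
    """Estimate storage for a sequence using the fewest whole bits per base."""
    bits_per_base = math.ceil(math.log2(alphabet_size))  # 4 symbols -> 2 bits
    return math.ceil(n_bases * bits_per_base / 8)        # 8 bits per byte


human_genome = 3_100_000_000  # approximate haploid length, for illustration
print(genome_size_bytes(human_genome))                   # 2 bits per base
print(genome_size_bytes(human_genome, alphabet_size=5))  # 3 bits if N is allowed
```

Storing the same sequence as plain text takes one byte per base, so the two-bit encoding is about four times smaller than a text file of the same sequence.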
Week 4
Wednesday
- Group Project - Progress of groups
Lab-3
- Go to the following assignment. This link will also be available on SUcourse at 16:40 on the class day.
- Accept the assignment.
- Copy the link of your repository.
- Clone the repo to your local machine with `git clone REPOSITORY_LINK`.
- Follow the instructions in the `readme.md` file in your cloned repository.
Friday
- What is a gene? - slides
- Key points in transcription.
- Sense vs antisense strand, coding vs non-coding strand, template vs non-template strand, transcribed vs non-transcribed strand
- How to find a gene or motif on both strands
- How to predict a eukaryotic and prokaryotic gene?
- What is genome annotation?
- Genome size and complexity discussion.
- Gene size differences across species.
- How to measure the performance of a tool?
- Why are CpG islands relevant in gene prediction?
- How to find a CpG island.
- Nussinov algorithm - slides
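For the CpG island topics in the list above, a common starting point is to compute the GC content and the observed/expected CpG ratio of a sequence window. The sketch below is a minimal version under stated assumptions: the function name is hypothetical, and the thresholds in the comment (GC content above 50% and ratio above 0.6 over at least 200 bp) are the commonly cited ones, which may differ from those used in class.

```python
def cpg_stats(window):
    """Return (GC content, observed/expected CpG ratio) for a DNA window."""
    window = window.upper()
    n = len(window)
    if n == 0:
        return 0.0, 0.0
    c_count = window.count("C")
    g_count = window.count("G")
    cpg_count = window.count("CG")    # observed CpG dinucleotides
    gc_content = (c_count + g_count) / n
    expected = c_count * g_count / n  # expected CpGs if C and G were independent
    ratio = cpg_count / expected if expected else 0.0
    return gc_content, ratio


# Commonly cited island criteria (an assumption here): window of at least
# 200 bp, GC content > 0.5 and observed/expected ratio > 0.6.
print(cpg_stats("CGCGCGCG"))
print(cpg_stats("ATATATAT"))
```

In practice the function would be applied to a sliding window along the genome, and consecutive windows that pass the criteria would be merged into candidate islands.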
Week 5
Wednesday
- Group Project - Progress of groups
- How to read and write files in Python
- Introduction to FASTA format
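The two topics above, file handling and the FASTA format, fit together naturally. In a FASTA file each record starts with a header line beginning with `>`, followed by one or more lines of sequence. The sketch below is one possible way to read and write such files with plain Python; the function names and the 60-character line width are illustrative choices, not part of the lab material.

```python
def read_fasta(path):
    """Read a FASTA file into a dict mapping header text to sequence."""
    records = {}
    header = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue                     # skip blank lines
            if line.startswith(">"):
                header = line[1:]            # drop the leading '>'
                records[header] = []
            elif header is not None:
                records[header].append(line)  # sequence may span several lines
    return {name: "".join(parts) for name, parts in records.items()}


def write_fasta(records, path, width=60):
    """Write a dict of header -> sequence as FASTA, wrapping long sequences."""
    with open(path, "w") as handle:
        for name, sequence in records.items():
            handle.write(f">{name}\n")
            for start in range(0, len(sequence), width):
                handle.write(sequence[start:start + width] + "\n")
```

Note that this sketch keeps only the last record if two headers are identical, which is acceptable for small exercises but worth checking on real data.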
Lab-4
- Go to the following assignment.
- Accept the assignment.
- Copy the link of your repository.
- Clone the repo to your local machine with `git clone REPOSITORY_LINK`.
- Follow the instructions in the `readme.md` file in your cloned repository.
Friday
- How to apply Nussinov algorithm
- How to deal with bifurcation (Nussinov)
- DNA to Protein - slides
- Genetic code
- Features of the genetic code
- Stop codon introduction
- Steps of translation
- tRNA
- Codon vs anti-codon
- Translation-transcription coupling
- Wobble pairing
- Codon usage
- Amino acid structure
- Amino acid groupings
- Protein structure
- Membrane proteins
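The genetic code and translation topics above can be sketched in a few lines of Python. The example below builds the standard genetic code as a dictionary and translates a coding sequence codon by codon. It assumes the standard code only, reads in frame from the first base, and marks unknown codons with `X`; the function name and these conventions are illustrative and may differ from what the lab asks for.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, to_stop=True):
    """Translate a DNA (or RNA) coding sequence, starting at its first base."""
    dna = dna.upper().replace("U", "T")
    protein = []
    for start in range(0, len(dna) - 2, 3):           # full codons only
        amino_acid = CODON_TABLE.get(dna[start:start + 3], "X")
        if amino_acid == "*" and to_stop:
            break                                     # stop codon ends translation
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCTAA"))                 # MA
print(translate("ATGGCCTAA", to_stop=False))  # MA*
```

The table makes two features of the genetic code visible: it is degenerate (64 codons map to 20 amino acids plus stop), and three codons (TAA, TAG, TGA) act as stop signals.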
Week 6
Wednesday
- Group Project - Progress of groups
- How to use a `while` loop in Python?
- How to use the `find` function?
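The two questions above combine well in one small exercise: `str.find` returns the index of the first match at or after a given start position, or `-1` when there is none, so a `while` loop can keep calling it until it returns `-1`. The sketch below collects every occurrence of a motif, including overlapping ones; the function name is an illustrative choice.

```python
def find_all(sequence, motif):
    """Return the start index of every occurrence of motif in sequence."""
    positions = []
    start = sequence.find(motif)
    while start != -1:                             # -1 means no further match
        positions.append(start)
        start = sequence.find(motif, start + 1)    # resume one base further on
    return positions


print(find_all("ATGATGATG", "ATG"))  # [0, 3, 6]
print(find_all("AAAA", "AA"))        # [0, 1, 2]
```

Advancing by one position rather than by the motif length is what allows overlapping matches to be reported.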
Lab-5
- Go to the following assignment.
- Accept the assignment.
- Copy the link of your repository.
- Clone the repo to your local machine with `git clone REPOSITORY_LINK`.
- Follow the instructions in the `readme.md` file in your cloned repository.
Friday
- Homology - slides
- Similarity vs homology
- Pairwise sequence comparison
- Sequence variations
- Insertions, deletions and protein structure
- The space of global alignment
- Gap penalty functions
- How to score an alignment
- How can we find the best alignment?
- Dynamic programming
- Global alignment
- Needleman-Wunsch Algorithm
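To make the dynamic programming idea behind global alignment concrete, the sketch below fills the Needleman-Wunsch score matrix and returns the optimal global alignment score. It assumes a simple scoring scheme (match +1, mismatch -1, linear gap penalty -1) chosen for illustration; the values used in class may differ, and the traceback step that recovers the alignment itself is omitted.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # First column and row: aligning a prefix against nothing costs only gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            score[i][j] = max(diagonal, up, left)

    return score[-1][-1]                    # bottom-right cell holds the answer


print(needleman_wunsch_score("ACGT", "ACGT"))  # 4
print(needleman_wunsch_score("ACGT", "AGT"))   # 2
```

Each cell depends only on its three neighbours, which is why the whole alignment space can be searched in time proportional to the product of the two sequence lengths.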
Week 7
Wednesday
Lab-6
- Go to the following assignment.
- Accept the assignment.
- Copy the link of your repository.
- Clone the repo to your local machine with `git clone REPOSITORY_LINK`.
- Follow the instructions in the `readme.md` file in your cloned repository.
Friday
- Group Projects - Stage 1
Week 8
Wednesday
Lab-7
- Go to the following assignment.
- Accept the assignment.
- Copy the link of your repository.
- Clone the repo to your local machine with `git clone REPOSITORY_LINK`.
- Follow the instructions in the `readme.md` file in your cloned repository.
Friday
- Discussion on Patient X’s gender. How to reveal it with WES?
- Why does Patient X have many variants for spermatogenesis?
- Homology (continued) - slides
- Local alignment
- Smith-Waterman Algorithm
- BLAST algorithm
- Substitution Matrices
- Local vs Global Alignment
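The difference between local and global alignment shows up clearly in code. In the Smith-Waterman recurrence, a cell score is never allowed to drop below zero, and the result is the highest value anywhere in the matrix rather than the bottom-right cell. The sketch below computes only the best local score; the scoring values (match +2, mismatch -1, linear gap -1) are assumptions for illustration, and a substitution matrix could replace the match/mismatch test.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b."""
    cols = len(b) + 1
    previous = [0] * cols              # row i-1 of the score matrix
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * cols           # row i; column 0 stays at zero
        for j in range(1, cols):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The 0 lets a new local alignment start at any cell.
            current[j] = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            best = max(best, current[j])
        previous = current
    return best


print(smith_waterman_score("TTACGTT", "GGACGGG"))  # 6, from the shared ACG
print(smith_waterman_score("AAAA", "TTTT"))        # 0, nothing in common
```

Only two rows are kept in memory because the score alone is needed; recovering the aligned region would require keeping the full matrix and tracing back from the best cell.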
Week 9
Wednesday
Lab-8
- Go to the following assignment.
- Accept the assignment.
- Copy the link of your repository.
- Clone the repo to your local machine with `git clone REPOSITORY_LINK`.
- Follow the instructions in the `readme.md` file in your cloned repository.
Friday
- Midterm I
Week 10
Wednesday
Lab-9
- Go to the following assignment.
- Accept the assignment.
- Copy the link of your repository.
- Clone the repo to your local machine with `git clone REPOSITORY_LINK`.
- Follow the instructions in the `readme.md` file in your cloned repository.
Friday
Week 11
Wednesday
Lab-10
- Go to the following assignment.
- Accept the assignment.
- Copy the link of your repository.
- Clone the repo to your local machine with `git clone REPOSITORY_LINK`.
- Follow the instructions in the `readme.md` file in your cloned repository.
Friday
- Multiple Sequence Alignment - slides
- Assumption of MSA
- The use of MSA
- Why is MSA useful compared to pairwise sequence alignment
- How to score MSA
- Dynamic Programming and its complexity
- Star Alignment and its Problems
- Progressive Alignment
- Iterative Alignment
- Progressive alignment; get pairs from newick tree.
- Template-based alignment
- MUSCLE
- MAFFT
- Protein Domain and Motif - slides
- Protein Domain vs Motif
- Domain evolution
- Sequence-based domain identification
Week 12
Wednesday
Lab Midterm
- Go to the following assignment.
- Accept the assignment.
- Copy the link of your repository.
- Clone the repo to your local machine with `git clone REPOSITORY_LINK`.
- Follow the instructions in the `readme.md` file in your cloned repository.
Friday
- PSSM
- CDD
- PSI-BLAST
- RPS-BLAST
- Three ways of identifying domains
- Consensus sequence
- Advantages/Disadvantages of PSI-BLAST
- RPS-BLAST
- HMM
- HMMER tools
- Databases: Pfam, CDD, TIGRFAM
- HHsearch
- How to build phylogenetic trees
- Rooting trees
- Species tree vs Gene tree
- Horizontal Gene Transfer
- How to interpret phylogenetic trees
- Maximum parsimony
Week 13
Wednesday
Check your mailbox.
Friday
- UPGMA
- Neighbor joining
- Maximum likelihood
- Bootstrapping
- Reconciled trees
- Paralogy/Orthology
- Differential gene loss
- Next Generation Sequencing
- PCR
- Sanger Sequencing
- Whole Genome Sequencing
- Coverage concept
- How much coverage do we need?
- Exome sequencing
- Microarray vs RNA-seq
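The coverage concept listed above comes down to simple arithmetic: the average depth is the total number of sequenced bases divided by the genome size, that is, number of reads times read length over genome length. The sketch below expresses this and its inverse; the function names and the example numbers are illustrative assumptions, and real experiments also lose some reads to duplicates and unmapped sequence.

```python
import math


def mean_coverage(n_reads, read_length, genome_size):
    """Average sequencing depth: total sequenced bases / genome size."""
    return n_reads * read_length / genome_size


def reads_needed(target_coverage, read_length, genome_size):
    """Smallest number of reads that reaches the target average depth."""
    return math.ceil(target_coverage * genome_size / read_length)


# Hypothetical bacterial genome of 5 Mb sequenced with 150 bp reads.
print(mean_coverage(1_000_000, 150, 5_000_000))  # 30.0
print(reads_needed(30, 150, 5_000_000))          # 1000000
```

The same formula explains why exome sequencing reaches a high depth with far fewer reads than whole genome sequencing: the target region is much smaller.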
Week 14
Wednesday
- RNA-seq normalization
- RNA-seq pipeline
- NET-seq, GRO-seq
- ChIP-seq
- ATAC-seq
- DNase-seq and others
- Hi-C-seq
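As one example of the RNA-seq normalization topic in the list above, the sketch below computes TPM (transcripts per million). Read counts are first divided by gene length in kilobases, so that long genes are not favoured, and the resulting rates are then scaled to sum to one million. TPM is only one of several normalization schemes and may not be the one emphasized in class; the function name and the toy numbers are assumptions for illustration.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts to TPM given gene lengths in base pairs."""
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    # Step 1: reads per kilobase corrects for gene length.
    rates = [count / (length / 1000) for count, length in zip(counts, lengths_bp)]
    total = sum(rates)
    if total == 0:
        return [0.0] * len(rates)
    # Step 2: scale so that the values of one sample sum to one million.
    return [rate / total * 1_000_000 for rate in rates]


# Gene B has twice the reads of gene A but is also twice as long,
# so both end up with the same TPM.
print(tpm([10, 20], [1000, 2000]))  # [500000.0, 500000.0]
```

Because every sample sums to the same total, TPM values can be read as relative proportions within a sample.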