COS598G: Topics in Computational Biology: Spring 2017
Outline of Topics (Tentative)
Our guiding application will be cancer genomics and immunogenomics. However, the computational approaches that we study have a variety of applications in computational biology and related fields.
Topic 1: Phylogenetic tree reconstruction [3.5 weeks; 4 lectures, 3 discussions]
- Review/introduction: Algorithms for phylogenetic tree reconstruction
- Characters vs. distance
- Small phylogeny problem vs. large phylogeny problem
- Maximum parsimony (Sankoff’s algorithm)
- Maximum likelihood (Felsenstein’s algorithm)
- Large phylogeny: searching through tree space
- Important special cases of maximum parsimony
- Perfect phylogeny (Infinite sites assumption)
- Multi-state perfect phylogeny (Infinite alleles assumption)
- Cladistic multi-state perfect phylogeny vs. two-state perfect phylogeny
- Dollo parsimony
- Sequencing of cancer genomes
- Papers and discussion
- Sequencing mixtures; bulk sequencing of tumors
- Perfect phylogeny with errors; single-cell sequencing
- Copy number aberrations and multi-state phylogeny
Topic 2: Population genetics models [3 weeks; 3 lectures, 3 discussions]
- Review/introduction: Population genetics
- Forward-time models: Wright-Fisher and Moran models
- Backward-time models and the coalescent
- Methods to detect selection
- Exponentially growing populations
- Papers and discussion
- Review:
- Biology/CompBio: Cumulative Haploinsufficiency and Triplosensitivity Drive Aneuploidy Patterns and Shape the Cancer Genome
- CompBio: Identification of neutral tumor evolution across cancer types
- Math/CompBio: Population Genetics of Neutral Mutations in Exponentially Growing Cancer Cell Populations
- CompBio: Quantification of subclonal selection in cancer from bulk sequencing data
- CompBio (no cancer): Learning natural selection from the site frequency spectrum
- CompBio (no cancer): Predicting Carriers of Ongoing Selective Sweeps without Knowledge of the Favored Allele
Topic 3: Network analysis of biological data [3.5 weeks; 4 lectures, 3 discussions]
- Random walks and diffusion on graphs
- Lazy random walk: symmetric versus asymmetric
- Random walk with restart, PageRank
- Heat equation and heat kernels
- Graph partitioning
- Graph conductance
- Community detection and modularity
- Papers and discussion
- Review: Network propagation: a universal amplifier of genetic associations
- Review: Understanding Genotype-Phenotype Effects in Cancer via Network Approaches
- Review: Computational Solutions for Omics Data
- Review: Network Approaches to Complex Disease
- CompBio: Walking the Interactome for Prioritization of Candidate Disease Genes
- CompBio: Network-Based Integration of Disparate Omic Data To Identify "Silent Players" in Cancer
- CompBio: Discovering causal pathways linking genomic events to transcriptional states using Tied Diffusion Through Interacting Events (TieDIE)
- CompBio: Exploiting ontology graph for predicting sparsely annotated gene function
- CompBio: Compact Integration of Multi-Network Topology for Functional Analysis of Genes
- Math/Stats: Near-optimal Anomaly Detection in Graphs using Lovász Extended Scan Statistic
- CS: Vertex Neighborhoods, Low Conductance Cuts, and Good Seeds for Local Community Methods
Topic 4: Immunogenomics [2 weeks; 2 lectures, 2 discussions]
- Introduction: Components of the immune system: B-cells; T-cells; MHC
- Sequencing of B-cell and T-cell repertoires
- Modeling of clonal expansions of B-cells
- Epitope prediction
- Papers and discussion
- Review: The promise and challenge of high-throughput sequencing of the antibody repertoire. Nature Biotechnology (2014)
- Review: Solving Immunology
- Review: The Diversity and Molecular Evolution of B-Cell Receptors during Infection
- CompBio: Quantifying selection in high-throughput Immunoglobulin sequencing data sets. Kleinstein, Nucleic Acids Research (2012)
- CompBio: Identifying T Cell Receptors from High-Throughput Sequencing: Dealing with Promiscuity in TCRα and TCRβ Pairing. PLOS Computational Biology (2017)
- CompBio: Single-cell TCRseq: paired recovery of entire T-cell alpha and beta chain transcripts in T-cell receptors from single-cell RNAseq
- CompBio: Gapped sequence alignment using artificial neural networks: application to the MHC class I system. Bioinformatics (2016)
- CompBio: MHCflurry Hammerbacher (2016): http://biorxiv.org/content/biorxiv/early/2016/05/23/054775.full.pdf