1 of 32

Computational Network Biology

BMI/CS 775, Fall 2022

Sushmita Roy and Anthony Gitter

https://compnetbiocourse.discovery.wisc.edu

Sep 8th 2022

2 of 32

Goals for today

  • Administrivia
  • Course topics
  • Short survey of interests/background

3 of 32

BMI/CS 775 Computational Network Biology

  • Course home page: https://compnetbiocourse.discovery.wisc.edu
  • Instructors:
    • Prof. Sushmita Roy
      • sroy@biostat.wisc.edu
      • Office hours: Tuesday, 2:30-3:30 pm or by appointment
      • Office: room 3168, Wisconsin Institute for Discovery
    • Prof. Anthony Gitter
      • gitter@biostat.wisc.edu
      • Office hours: Thursday, 2:30-3:30 pm or by appointment
      • Office: room 3268, Wisconsin Institute for Discovery
  • Teaching assistant:
    • Krittisak Chaiyakul
      • chaiyakul@wisc.edu
      • Office hours: Wednesday, 2:30-3:30 pm
      • Office: conference room 4765, Medical Sciences Center (MSC)

4 of 32

Finding our offices

Discovery Building

WID 3168/3268

Engineering Hall

Your WISC cards will be enabled for upper floor access

Medical Sciences Center

MSC 4765

5 of 32

Networks are powerful representations of complex systems

Internet

Image credit: Wikipedia, Wikimedia, The cellmap, Euroscientist, https://extensionaus.com.au/extension-practice/social-network-analysis/

Yeast genetic interaction network

Social network

6 of 32

Learning goals of this class

  • Gain a broad overview of the application areas and computational solutions in network biology
  • Gain a deeper understanding of one or two areas
  • Apply the computational concepts to similar problems in biology and complex systems
  • Understand and critique scientific articles
  • Enable self learning and deeper study of related topics

Overall goal: provide students an introduction to different computational problems in biological networks, key algorithms to solve these problems, and in-depth case studies showing practical applications of these concepts.

7 of 32

Course organization

  • Tentative schedule: https://compnetbiocourse.discovery.wisc.edu/schedule/
  • Material in this course is organized into five major topics
    • Presented via instructor lectures
    • Most of the material is from published papers and review articles
    • Syllabus: https://compnetbiocourse.discovery.wisc.edu/syllabus/
  • Assessments: written critiques, written/programming assignments, and project
  • Last two weeks will be special seminar and project presentations

8 of 32

Websites and resources

    • Course website
      • Assignment instructions, lecture schedule, links to slides, syllabus
      • https://compnetbiocourse.discovery.wisc.edu/
    • Piazza
      • Announcements, discussion, questions
      • Email invitation and https://piazza.com/class/l7jkkqsf2cf5uu
    • Canvas
      • Written assignment and project submission, grading
      • https://canvas.wisc.edu/courses/324465
  • BMI department servers
    • Computing resources (more later), assignment files, code submission
  • Top Hat
    • Considering in-class polling, any opinions?

9 of 32

Lectures and readings

  • Either Prof. Gitter or Prof. Roy will give lectures
  • Lecture slides will be made available on the schedule page (https://compnetbiocourse.discovery.wisc.edu/schedule) shortly before class
  • Readings will be made available online on the schedule page and Canvas

10 of 32

COVID-19 protocol

  • We will meet in person
  • Masks are strongly encouraged
  • Stay home if you are sick or quarantining and contact instructors

11 of 32

Course grading

Component                                  Proportion of grade

Critiques                                  20%
Written and implementation assignments     30%
Projects
  Project proposal                         10%
  Project report                           20%
  Project presentation                     15%
Class participation (*)                    5%

* Includes asking/answering questions in class and discussions on Piazza

Late submission policy: All students have up to 5 working days total to hand in late assignments

12 of 32

Critiques

  • 1-2 page analysis of selected papers read in each topic
    • Specific papers to be included in the critique will be mentioned
  • The critique should have the following components
    • Overview of the problem area
    • Approaches discussed
    • Strengths and weaknesses
    • Extensions to any of the approaches

13 of 32

Projects

  • There are three main components to the project
    • Proposal, in-class presentation, project report
  • Project proposal (Oct 20th)
  • Project presentations in last week and a half of class
  • Project report due (Dec 13th)
    • Last day of lecture
  • Last month reserved entirely for project, no critiques or assignments

14 of 32

Computational resources for this class

  • Linux servers available through the BMI department
  • Accounts for all enrolled students in the class have been requested
  • Watch Piazza for server names and login instructions

15 of 32

Recommended background

  • Computer science
    • Introductory courses in data structures or machine learning are good, but not required
  • Statistics
    • Good if you have had at least one course, but not required
  • Molecular biology
    • Good if you have had some introductory course
    • An interest in learning some basic molecular biology
  • Programming background
    • Familiarity with a Linux environment
    • Be able to run programs on data on the command line
    • Be able to write code to do some data analysis, computations, implement simple algorithms
    • Some assignments will be in Python with structured code

16 of 32

Goals for today

  • Administrivia
  • Course topics
  • Short survey of background and interests

17 of 32

What is network biology?

  • The term “network biology” was likely coined by Albert-László Barabási & Zoltán N. Oltvai in 2004
    • ~18 years old
  • Intersects with computer science, statistics, physics, molecular biology
  • A collection of algorithms and tools to build, interpret, and use graph representations of interacting molecular entities in biological and biomedical problems (adapted from Winterbach et al., 2013, BMC Systems Biology)
  • Related/overlapping areas
    • Bioinformatics, Systems biology, Complex systems, Biological network analysis, Network science, Machine learning on graphs

18 of 32

Why network biology?

  • Living systems are complex systems
    • A complex system: many components that interact to determine overall function
    • Networks are natural representations of complex systems
  • Provides a framework and important tools for integration, interpretation and discovery
  • Many applications, e.g.:
    • Understanding biological processes at the molecular level
    • Understanding how organisms process environmental signals
    • Predictive models of cellular function
    • Gene function prediction and prioritization
    • Disease prognosis
    • Interpretation of genetic variation

19 of 32

Why network biology?

“... plays a central role in the modeling of biological systems, complemented by the highly complex datasets generated across a myriad of multi-omics programs.” Camacho et al., Cell 2018

20 of 32

Overview of lecture topics

Biological problem

    • Mapping regulatory network structure
    • Dynamics and context specificity of networks
    • Modularity in biological networks
    • Comparison of biological networks
    • Identification of important genes
    • Integrating different types of molecular genomic data
    • Predicting protein interfaces

Computational approaches

    • Probabilistic graphical models
    • Graph structure learning
    • Graph clustering
    • Graph alignment
    • Diffusion on graphs
    • Deep learning
    • Generative graph models
    • Matrix factorization

Course material is organized by the biological problem and computational approaches to address the problem

21 of 32

Network inference: How do molecular entities interact within a cell?

Amit et al., Nat. Rev. Immunology, 2011

22 of 32

Network structure inference and dynamics

[Figure: gene expression data (samples) and biological knowledge bases are passed to an algorithm that infers a network over nodes X1, X2, X5, Y1, Y2]

Computational concepts

  1. Different types of graphical models for network representation
  2. Learning graphical models from data
  3. Integrating prior information into models
  4. Modeling dynamics in networks

[Figure: network inference and network dynamics across Context C1 and Context C2]

Contexts can be different time points, cell types, disease states, or organisms

23 of 32

Deep learning in network biology

Computational concepts

  1. Graph neural networks
  2. Node and edge embeddings
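
The two concepts above can be illustrated with a single message-passing step, the basic building block of many graph neural networks. The sketch below is a minimal, generic example and not the architecture of any paper cited on this slide: each node averages its own features with those of its neighbors, applies a linear map, and passes the result through a ReLU. The protein names, feature vectors, and weight matrix are hypothetical values chosen for illustration.

```python
def gnn_layer(adj, features, weight):
    """One mean-aggregation message-passing step, then a linear map and ReLU.

    adj:      dict mapping node -> list of neighbor nodes
    features: dict mapping node -> list of input feature values
    weight:   list of rows; each row has one entry per input feature
    """
    embeddings = {}
    for node, neighbors in adj.items():
        group = [node] + list(neighbors)  # include the node itself (self-loop)
        dim = len(features[node])
        # Average each feature over the node and its neighbors.
        mean = [sum(features[n][d] for n in group) / len(group) for d in range(dim)]
        # Linear transform followed by ReLU gives the new node embedding.
        embeddings[node] = [
            max(0.0, sum(w * x for w, x in zip(row, mean))) for row in weight
        ]
    return embeddings


# Hypothetical toy network: three proteins on a path p1 - p2 - p3.
adj = {"p1": ["p2"], "p2": ["p1", "p3"], "p3": ["p2"]}
features = {"p1": [1.0, 0.0], "p2": [0.0, 1.0], "p3": [1.0, 1.0]}
weight = [[1.0, -1.0], [0.5, 0.5]]  # assumed fixed weights; normally learned

embeddings = gnn_layer(adj, features, weight)
for node in sorted(embeddings):
    print(node, embeddings[node])
```

In practice the weights are learned by gradient descent and several such layers are stacked, so that each node's embedding reflects a larger neighborhood.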

[Figure panels: predicting protein interfaces; multi-layer neural network; predicting protein function using multiple networks; embedding nodes in d dimensions]

Fout et al., NIPS 2017; Eraslan et al., Nat Rev Genetics 2019; Zitnik & Leskovec, Bioinformatics 2017; Deep Learning in Network Biology, ISMB 2018 workshop by Zitnik and Leskovec

24 of 32

Graph clustering: functional and disease module identification

Computational concepts

  1. Graph clustering
  2. Modularity measures
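
As one concrete example of a modularity measure, the sketch below computes the Newman modularity Q of a given partition of an undirected, unweighted graph: for each community, the fraction of edges inside it minus the fraction expected from node degrees alone. This is one common definition, offered as an illustration rather than the specific measure used in the cited reviews. The gene names and the toy graph are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges:     list of (u, v) pairs, each undirected edge listed once
    community: dict mapping node -> community label
    """
    m = len(edges)
    degree = defaultdict(int)
    internal = defaultdict(int)  # number of edges with both ends in a community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    degree_sum = defaultdict(int)  # total degree of the nodes in each community
    for node, deg in degree.items():
        degree_sum[community[node]] += deg
    # Q = sum over communities of (internal edge fraction - expected fraction)
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


# Hypothetical toy graph: two triangles joined by one bridge edge (g3 - g4).
edges = [
    ("g1", "g2"), ("g2", "g3"), ("g1", "g3"),
    ("g4", "g5"), ("g5", "g6"), ("g4", "g6"),
    ("g3", "g4"),
]
two_modules = {"g1": 0, "g2": 0, "g3": 0, "g4": 1, "g5": 1, "g6": 1}
one_module = {gene: 0 for gene in two_modules}

q_two = modularity(edges, two_modules)
q_one = modularity(edges, one_module)
print(q_two, q_one)
```

Splitting the graph into its two triangles scores higher than placing all genes in one module, which is the behavior a clustering algorithm that maximizes modularity would exploit.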

Barabasi et al., Nat Rev Genetics 2011

Mitra et al., Nat Rev Genetics 2013

25 of 32

Graph alignment: What parts of networks from two species are similar?

Computational concepts

  1. Graph alignment
  2. Clustering on graphs
  3. Matrix factorization

Kelley et al., PNAS 2003

Pairwise alignment

Multi-way alignment

26 of 32

Integrating different high-dimensional datasets

Computational concepts

  1. Matrix factorization
  2. k-nearest-neighbor (kNN) graphs
  3. Graph clustering
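
To make the kNN graph concept concrete, the sketch below builds an undirected k-nearest-neighbor graph from a small set of points using Euclidean distance. It is a generic, brute-force illustration and not the integration method of the paper cited on this slide. The points stand in for samples (for example, cells) in a low-dimensional space and are made up for this example.

```python
import math


def knn_graph(points, k):
    """Undirected kNN graph: connect each point to its k nearest neighbors.

    points: list of coordinate tuples
    Returns a set of (i, j) index pairs with i < j.
    """
    edges = set()
    for i, p in enumerate(points):
        # Rank all other points by Euclidean distance to point i.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        for j in others[:k]:
            edges.add((min(i, j), max(i, j)))  # store each edge once
    return edges


# Hypothetical data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
edges = knn_graph(points, k=2)
print(sorted(edges))
```

With k=2 every point connects only to points in its own group, so graph clustering on the result would recover the two groups. Brute-force search costs quadratic time in the number of points; large datasets typically use approximate nearest-neighbor search instead.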

Hie et al., Nature Biotechnology 2019

27 of 32

Graph diffusion: Which genes are most important?

Koehler et al., AJHG 2008

Computational concepts

  1. Random walks on graphs
  2. Graph diffusion kernels
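
A common way to rank genes with random walks is the random walk with restart: a walker starts at known seed genes, moves to a random neighbor at each step, and with some probability jumps back to a seed. The steady-state visiting probabilities then score every gene by its proximity to the seeds. The sketch below is a minimal version under stated assumptions (undirected graph, every node has at least one neighbor, restart probability 0.3) and is not necessarily the exact formulation in the cited paper. The gene names and the network are hypothetical.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Steady-state scores of a random walk with restart, by power iteration.

    adj:     dict mapping node -> list of neighbors (each node needs >= 1)
    seeds:   set of seed nodes the walker restarts from
    restart: probability of jumping back to a seed at each step
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart mass goes back to the seeds.
        nxt = {n: restart * p0[n] for n in nodes}
        # Remaining mass is split evenly among each node's neighbors.
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network: a path geneA - geneB - geneC - geneD.
adj = {
    "geneA": ["geneB"],
    "geneB": ["geneA", "geneC"],
    "geneC": ["geneB", "geneD"],
    "geneD": ["geneC"],
}
scores = random_walk_with_restart(adj, seeds={"geneA"})
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Genes closer to the seed receive higher scores, which is the basis for prioritizing candidate genes near known disease genes.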

28 of 32

Graph diffusion: What pathways are perturbed in cancer?

Leiserson et al., Nature Genetics 2014

HotNet2 subnetworks include genes with a wide range of mutation frequencies

Computational concepts

  1. Heat kernel
  2. Subgraph analysis
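
The heat kernel of a graph is H = exp(-tL), where L = D - A is the graph Laplacian and t is the diffusion time. Entry H[i][j] gives the amount of heat that has flowed from node i to node j after time t. The sketch below approximates H with a truncated Taylor series in plain Python. It illustrates the generic heat kernel only and is not a reimplementation of HotNet2. The toy graph, the value of t, and the number of series terms are assumptions chosen for illustration.

```python
def laplacian(adj_matrix):
    """Graph Laplacian L = D - A for a symmetric adjacency matrix."""
    n = len(adj_matrix)
    return [
        [(sum(adj_matrix[i]) if i == j else 0.0) - adj_matrix[i][j] for j in range(n)]
        for i in range(n)
    ]


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def heat_kernel(adj_matrix, t=0.5, terms=40):
    """Approximate H = exp(-t L) by summing the first `terms` Taylor terms."""
    n = len(adj_matrix)
    lap = laplacian(adj_matrix)
    scaled = [[-t * lap[i][j] for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # current term, starts as the identity
    for k in range(1, terms):
        # term_k = term_{k-1} * (-tL) / k, i.e. (-tL)^k / k!
        term = [[x / k for x in row] for row in matmul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


# Hypothetical toy graph: a path with four nodes, 0 - 1 - 2 - 3.
A = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
H = heat_kernel(A, t=0.5)
for row in H:
    print([round(x, 4) for x in row])
```

Each row of H sums to 1, so heat is conserved, and heat placed on node 0 reaches nearer nodes more strongly than distant ones. The series approach is only suitable for small graphs and small t; numerical libraries provide more robust matrix exponentials.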

29 of 32

Plan for the semester

When               What

Week 2-Week 4      Representing and learning networks from data
Week 5-Week 7      Deep learning in network biology
Week 8-Week 9      Graph topology and modules
Week 10-Week 11    Network-based integration and interpretation
Week 12-Week 13    Graph alignment

30 of 32

Plan for next week

  • Sep 13th, 15th
    • Background on graph theory and probability
    • Probabilistic graphical models for molecular networks
    • Learning static and dynamic Bayesian networks
  • Background reading
    • L. Hunter. Life and Its Molecules: A Brief Introduction. AI Magazine 25(1):9-22, 2004.
    • Winterbach et al., Topology of molecular interaction networks. BMC Systems Biology, 2013
  • Additional background reading linked from the course website schedule

31 of 32

Goals for today

  • Administrivia
  • Course topics
  • Short survey of interests/background

32 of 32

Short survey of background and interests

  • Survey collects a bit of information about your background and interests
  • Please complete the following survey before you leave class: https://uwmadison.co1.qualtrics.com/jfe/form/SV_ea3kSxbapwWEtlc