Input Output HMMs for modeling network dynamics
Oct 12th and 14th, 2021
BMI 826-23 Computational Network Biology, Fall 2021
Anthony Gitter
The style of the IOHMM slides is adapted from Prof. Mark Craven's lectures on HMMs
Original IOHMM slides created by Prof. Sushmita Roy
Goals for today
Motivation
DREM: Dynamic Regulatory Events Miner
Ernst et al., 2007, Mol Sys Biol
Recall Markov chains
A three state Markov chain
[Figure: a three-state Markov chain over the states high, medium, and low, with transition probabilities (0.6, 0.7, 0.2, 0.1, ...) labeled on the edges.]
P(X_{t+1} = high | X_t = low) = 0.1
These define the transition probabilities
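The transition probabilities can be used directly to simulate the chain. A minimal Python sketch; the full transition matrix here is an illustrative assumption except for P(high | low) = 0.1 from the slide:

```python
import random

# States of the three-state Markov chain from the slide.
states = ["high", "medium", "low"]

# Transition matrix P(X_{t+1} = column state | X_t = row state).
# Only P(high | low) = 0.1 comes from the slide; the remaining
# entries are hypothetical placeholders that sum to 1 per row.
trans = {
    "high":   {"high": 0.6, "medium": 0.3, "low": 0.1},
    "medium": {"high": 0.2, "medium": 0.6, "low": 0.2},
    "low":    {"high": 0.1, "medium": 0.2, "low": 0.7},
}

def sample_chain(start, length, seed=0):
    """Sample a state sequence of the given length starting from `start`."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(length - 1):
        weights = [trans[path[-1]][s] for s in states]
        path.append(rng.choices(states, weights=weights)[0])
    return path

path = sample_chain("high", 5)
```

Because the next state depends only on the current state, each step needs just one row of the matrix.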
Notation for a Markov chain
Hidden Markov Models
Murphy 2000
What does an HMM do?
Defining an HMM
Notation
An HMM for an occasionally dishonest casino
[Figure: a two-state HMM with hidden states Fair and Loaded. Transition probabilities on the edges: Fair→Fair 0.95, Fair→Loaded 0.05, Loaded→Loaded 0.9, Loaded→Fair 0.1. Each state has emission probabilities over the die rolls 1-6.]
What is hidden?
What is observed?
An HMM for an occasionally dishonest casino
[Figure: the same two-state Fair/Loaded HMM, with transition probabilities on the edges and emission probabilities within each state.]
What is hidden? Which die is rolled
What is observed? The number (1-6) on the die
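The hidden/observed distinction can be made concrete by simulating the casino. A sketch using the transition and emission probabilities from the DBN slide later in the lecture; starting in the Fair state is an assumption, since the slides do not give an initial distribution:

```python
import random

# Occasionally dishonest casino: transition and emission probabilities.
trans = {"F": {"F": 0.95, "L": 0.05}, "L": {"F": 0.1, "L": 0.9}}
# P(roll = 1..6 | state): the fair die is uniform, the loaded die favors 6.
emit = {"F": [1 / 6] * 6, "L": [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]}

def simulate(n_rolls, seed=0):
    """Return (hidden state path, observed rolls).

    Starting in state F is an assumption not stated on the slides."""
    rng = random.Random(seed)
    state, hidden, rolls = "F", [], []
    for _ in range(n_rolls):
        hidden.append(state)  # hidden: which die is rolled
        rolls.append(rng.choices(range(1, 7), weights=emit[state])[0])  # observed
        state = rng.choices(["F", "L"],
                            weights=[trans[state]["F"], trans[state]["L"]])[0]
    return hidden, rolls

hidden, rolls = simulate(10)
```

The caller sees only `rolls`; recovering `hidden` from them is exactly the inference problem the next slides address.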
Goals for today
Recall a DBN for p variables and T time points
[Figure: a DBN over p variables unrolled across T time points. Variables X_1^1, X_2^1, ..., X_p^1 capture the dependency at the first time point; at each later time t (X^t denotes the p variables at time t), the variables X_1^t, X_2^t, ..., X_p^t depend on the variables at time t-1, up to t=T.]
HMM represented as a DBN
DBN
Kevin Murphy, 2000
DBN version of the occasional dishonest casino
Transition probabilities P(S_t | S_{t-1}):
|   | F    | L    |
| F | 0.95 | 0.05 |
| L | 0.1  | 0.9  |

Emission probabilities P(x_t | S_t):
|   | 1   | 2   | 3   | 4   | 5   | 6   |
| F | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 |
| L | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.5 |
Three important questions in HMMs
Computing the probability of a sequence from an HMM
P(x_1, ..., x_T, π) = ∏_t e_{π_t}(x_t) · a_{π_{t-1}, π_t}
e_{π_t}(x_t): emitting symbol x_t
a_{π_{t-1}, π_t}: state transition between consecutive time points
Computing the probability of a sequence from an HMM
Sum over all paths
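Summing over all hidden paths naively is exponential in the sequence length; the forward algorithm computes the same sum with dynamic programming. A sketch for the casino HMM, where the uniform 0.5/0.5 initial state distribution is an assumption:

```python
# Forward algorithm: P(x_1..x_T) summed over all hidden paths, computed
# by dynamic programming. CPTs are the casino HMM's; `init` is an
# assumed uniform initial state distribution (not given on the slides).
trans = {"F": {"F": 0.95, "L": 0.05}, "L": {"F": 0.1, "L": 0.9}}
emit = {"F": [1 / 6] * 6, "L": [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]}
init = {"F": 0.5, "L": 0.5}

def sequence_probability(rolls):
    # f[k] = P(x_1..x_t, state at t = k), updated one time point at a time
    f = {k: init[k] * emit[k][rolls[0] - 1] for k in "FL"}
    for x in rolls[1:]:
        f = {l: emit[l][x - 1] * sum(f[k] * trans[k][l] for k in "FL")
             for l in "FL"}
    return sum(f.values())

p = sequence_probability([1, 2, 6, 6])
```

Each update costs O(K^2) for K states, so the whole sum over paths takes O(T·K^2) time instead of O(K^T).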
Goals for today
Input output Hidden Markov Models (IOHMM)
Bengio & Frasconi, IEEE Trans on Neural Networks 1996
Input output Hidden Markov Models (IOHMM)
Inputs
Outputs
Hidden states
Transition and emissions are dependent upon a set of external stimuli
Formally defining an IOHMM
Three important questions in IOHMMs
Computing the probability of a sequence from an IOHMM
P(x_1, ..., x_T, π | u_1, ..., u_T) = ∏_t e_{π_t}(x_t | u_t) · a_{π_{t-1}, π_t}(u_t)
e_{π_t}(x_t | u_t): emitting symbol x_t
a_{π_{t-1}, π_t}(u_t): state transition between consecutive time points
As in the case of HMMs
Sum over all paths
How likely is a given sequence: Forward algorithm
f_k(t): the probability of emitting x_1..x_t and ending in state k at time t, given inputs u_1..u_t
Steps of the Forward algorithm
Working through an example
Mapping to an IOHMM
Inputs
Outputs
Hidden states
We need to specify three CPTs
The CPTs that we will use
Initial state probabilities P(S_1 = k | u_1):
| u_1 | A   | B   | C | D |
| 0   | 0.5 | 0.5 | 0 | 0 |
| 1   | 0.5 | 0.5 | 0 | 0 |

Transition probabilities P(S_t = l | S_{t-1} = k, u_t) (rows: input and previous state; columns: next state):
| u | prev | A   | B    | C    | D   |
| 0 | A    | 0.6 | 0.2  | 0.2  | 0   |
| 0 | B    | 0.2 | 0.6  | 0    | 0.2 |
| 0 | C    | 0.1 | 0    | 0.8  | 0.1 |
| 0 | D    | 0   | 0.25 | 0.25 | 0.5 |
| 1 | A    | 0.8 | 0.1  | 0.1  | 0   |
| 1 | B    | 0.8 | 0.1  | 0    | 0.1 |
| 1 | C    | 0.1 | 0    | 0.8  | 0.1 |
| 1 | D    | 0   | 0.1  | 0.8  | 0.1 |

Emission probabilities P(x_t | S_t = k, u_t):
| u | state | r1   | r2   | r3   |
| 0 | A     | 0.8  | 0.1  | 0.1  |
| 0 | B     | 0.2  | 0.6  | 0.2  |
| 0 | C     | 0.25 | 0.5  | 0.25 |
| 0 | D     | 0.2  | 0.2  | 0.6  |
| 1 | A     | 0.2  | 0.6  | 0.2  |
| 1 | B     | 0.25 | 0.5  | 0.25 |
| 1 | C     | 0.2  | 0.2  | 0.6  |
| 1 | D     | 0.5  | 0.25 | 0.25 |
Suppose we observed the following sequences
Input:  0 1 1 0
Output: r1 r1 r2 r3
How likely is this observation from our IOHMM?
Transition probabilities encode some independencies
[Figure: state transition graph over A, B, C, and D showing only the edges with nonzero probability.]

Transition probabilities P(S_t = l | S_{t-1} = k, u_t):
| u | prev | A   | B    | C    | D   |
| 0 | A    | 0.6 | 0.2  | 0.2  | 0   |
| 0 | B    | 0.2 | 0.6  | 0    | 0.2 |
| 0 | C    | 0.1 | 0    | 0.8  | 0.1 |
| 0 | D    | 0   | 0.25 | 0.25 | 0.5 |
| 1 | A    | 0.8 | 0.1  | 0.1  | 0   |
| 1 | B    | 0.8 | 0.1  | 0    | 0.1 |
| 1 | C    | 0.1 | 0    | 0.8  | 0.1 |
| 1 | D    | 0   | 0.1  | 0.8  | 0.1 |
Applying the forward algorithm
Input:  0  | 1  | 1  | 0
Output: r1 | r1 | r2 | r3

| state | t=1 | t=2 | t=3 | t=4 |
| A     |     |     |     |     |
| B     |     |     |     |     |
| C     |     |     |     |     |
| D     |     |     |     |     |
Working through some calculations: t=1
(Initial, transition, and emission CPTs repeated from the earlier slide.)
Input:  0 1 1 0
Output: r1 r1 r2 r3
Working through some calculations: t=2
(Initial, transition, and emission CPTs repeated from the earlier slide.)
Input:  0 1 1 0
Output: r1 r1 r2 r3
Result of applying the forward algorithm
Input:  0  | 1  | 1  | 0
Output: r1 | r1 | r2 | r3

| state | t=1 | t=2    | t=3       | t=4          |
| A     | 0.4 | 0.08   | 0.04488   | 0.0033861    |
| B     | 0.1 | 0.0125 | 0.033125  | 0.021860825  |
| C     | 0   | 0.008  | 0.00308   | 0.0028790625 |
| D     | 0   | 0.01   | 0.0007625 | 0.00438855   |
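The forward table can be reproduced mechanically from the CPTs. A sketch under the convention that both the transition into time t and the emission at time t condition on input u_t; input-indexing conventions vary across presentations, so individual cells can differ under other conventions:

```python
# Worked IOHMM example: forward algorithm over states A, B, C, D.
# Convention assumed: the transition into time t and the emission at
# time t both condition on input u_t.
states = range(4)  # 0=A, 1=B, 2=C, 3=D
init = [[0.5, 0.5, 0.0, 0.0],   # P(S_1 = k | u_1 = 0)
        [0.5, 0.5, 0.0, 0.0]]   # P(S_1 = k | u_1 = 1)
# trans[u][k][l] = P(S_t = l | S_{t-1} = k, u_t = u)
trans = [
    [[0.6, 0.2, 0.2, 0.0], [0.2, 0.6, 0.0, 0.2],
     [0.1, 0.0, 0.8, 0.1], [0.0, 0.25, 0.25, 0.5]],
    [[0.8, 0.1, 0.1, 0.0], [0.8, 0.1, 0.0, 0.1],
     [0.1, 0.0, 0.8, 0.1], [0.0, 0.1, 0.8, 0.1]],
]
# emit[u][k][x] = P(x_t = r_{x+1} | S_t = k, u_t = u)
emit = [
    [[0.8, 0.1, 0.1], [0.2, 0.6, 0.2], [0.25, 0.5, 0.25], [0.2, 0.2, 0.6]],
    [[0.2, 0.6, 0.2], [0.25, 0.5, 0.25], [0.2, 0.2, 0.6], [0.5, 0.25, 0.25]],
]

def forward(inputs, outputs):
    """f[t][k] = P(x_1..x_t, S_t = k | u_1..u_t)."""
    f = [[init[inputs[0]][k] * emit[inputs[0]][k][outputs[0]] for k in states]]
    for t in range(1, len(outputs)):
        u, x = inputs[t], outputs[t]
        prev = f[-1]
        f.append([emit[u][l][x] * sum(prev[k] * trans[u][k][l] for k in states)
                  for l in states])
    return f

# Input 0 1 1 0, output r1 r1 r2 r3 (outputs encoded as indices 0, 1, 2)
f = forward([0, 1, 1, 0], [0, 0, 1, 2])
```

For example, f_A(1) = 0.5 · 0.8 = 0.4 and f_A(2) = 0.2 · (0.4 · 0.8 + 0.1 · 0.8) = 0.08, matching the table.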
Learning an IOHMM from data
The expectation maximization algorithm
Learning without hidden information
Number of transitions from state k to state l given input p
Number of times c is emitted from k given input p
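With fully observed state paths, these counts give maximum likelihood estimates by simple normalization. A sketch with hypothetical training data, where each sample provides an input sequence and its observed state path:

```python
from collections import Counter

def mle_transitions(samples):
    """Estimate P(S_t = l | S_{t-1} = k, u_t = p) from observed state paths.

    samples: list of (inputs, state path) pairs of equal-length sequences.
    """
    counts = Counter()   # n(p, k, l): transitions k -> l under input p
    totals = Counter()   # n(p, k): all transitions out of k under input p
    for inputs, path in samples:
        for t in range(1, len(path)):
            counts[(inputs[t], path[t - 1], path[t])] += 1
            totals[(inputs[t], path[t - 1])] += 1
    return {key: n / totals[key[:2]] for key, n in counts.items()}

# Hypothetical fully observed sample: inputs 0 1 1 0, state path A B B A
est = mle_transitions([([0, 1, 1, 0], "ABBA")])
```

Emission probabilities are estimated the same way, counting how often each symbol is emitted from state k under input p.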
The expectation step
Bengio & Frasconi, IEEE Trans on Neural Networks 1996
Computing
Obtained from the forward algorithm
Computing
Forward algorithm fk(t)
Backward algorithm bk(t)
Steps of the backward algorithm
Trying out an example with backward algorithm
Input:  0 1 1
Output: r1 r2 r2
Results from applying the backward algorithm
Input:  0 1 1
Output: r1 r2 r2

| state | t=1 | t=2   | t=3 |
| A     |     | 0.55  | 1   |
| B     |     | 0.555 | 1   |
| C     |     |       | 1   |
| D     |     |       | 1   |
(Transition and emission CPTs repeated from the earlier slides.)
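The backward table can likewise be reproduced from the CPTs. A sketch under the convention that the transition from t to t+1 and the emission at t+1 both condition on input u_{t+1}, which reproduces the 0.55 and 0.555 entries above:

```python
# Worked IOHMM example: backward algorithm over states A, B, C, D.
# Convention assumed: the transition from t to t+1 and the emission at
# t+1 both condition on input u_{t+1}.
states = range(4)  # 0=A, 1=B, 2=C, 3=D
# trans[u][k][l] = P(S_t = l | S_{t-1} = k, u_t = u)
trans = [
    [[0.6, 0.2, 0.2, 0.0], [0.2, 0.6, 0.0, 0.2],
     [0.1, 0.0, 0.8, 0.1], [0.0, 0.25, 0.25, 0.5]],
    [[0.8, 0.1, 0.1, 0.0], [0.8, 0.1, 0.0, 0.1],
     [0.1, 0.0, 0.8, 0.1], [0.0, 0.1, 0.8, 0.1]],
]
# emit[u][k][x] = P(x_t = r_{x+1} | S_t = k, u_t = u)
emit = [
    [[0.8, 0.1, 0.1], [0.2, 0.6, 0.2], [0.25, 0.5, 0.25], [0.2, 0.2, 0.6]],
    [[0.2, 0.6, 0.2], [0.25, 0.5, 0.25], [0.2, 0.2, 0.6], [0.5, 0.25, 0.25]],
]

def backward(inputs, outputs):
    """b[t][k] = P(x_{t+1}..x_T | S_t = k, u_{t+1}..u_T)."""
    T = len(outputs)
    b = [[1.0] * 4 for _ in range(T)]  # base case: b[T-1][k] = 1 for all k
    for t in range(T - 2, -1, -1):
        u, x = inputs[t + 1], outputs[t + 1]
        b[t] = [sum(trans[u][k][l] * emit[u][l][x] * b[t + 1][l]
                    for l in states) for k in states]
    return b

# Input 0 1 1, output r1 r2 r2 (outputs encoded as indices 0, 1, 2)
b = backward([0, 1, 1], [0, 1, 1])
```

For example, b_A(2) = 0.8 · 0.6 + 0.1 · 0.5 + 0.1 · 0.2 = 0.55, matching the table.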
Computing
Computing
Putting it all together
Baum-Welch one iteration
Baum-Welch one iteration M step
Contribution from sample 1
Sample 2 does not contribute because it contains no relevant input and state configuration
Baum-Welch one iteration M step
Contribution from sample 1
Contribution from sample 2
Goals for today
Bifurcation events
Dynamic Regulatory Events Miner (DREM)
Ernst et al., 2007 Mol Sys Biol
DREM inputs and output
Bar-Joseph et al. 2012 Nat Rev Genet
Time
t1
t2
tN
TF-gene binding interactions
Time series gene expression
log2 fold change
DREM key idea
Ernst et al., 2007, Mol Sys Biol
IOHMM model in DREM
Defining the transition probability in DREM
Input associated with the ith gene: collection of binding sites on gene i's promoter
State-specific parameters
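In DREM, the probability of taking a particular path out of a split state is a function of the gene's binding input vector. A minimal sketch assuming a logistic (sigmoid) parameterization of one split's transition; the binding vector and the state-specific weights here are hypothetical:

```python
import math

def split_probability(binding, weights, bias):
    """P(taking one child state at a split | gene's TF-binding input).

    binding: 0/1 indicators of TF binding sites on the gene's promoter.
    weights, bias: state-specific parameters for the split (hypothetical
    values here; in practice they are learned from the data).
    """
    score = bias + sum(w * b for w, b in zip(weights, binding))
    return 1.0 / (1.0 + math.exp(-score))  # logistic function of the inputs

# A gene bound by TFs 1 and 3 but not TF 2, with illustrative weights
p = split_probability([1, 0, 1], [2.0, -1.0, 0.5], -1.0)
```

Genes with similar binding inputs therefore tend to follow the same path through a bifurcation, which is how the input data shape the learned expression paths.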
Structure learning in DREM
Results
DREM application in yeast amino acid starvation
DREM identified 11 paths and associated important amino acid (AA) related TFs with each split
Does condition non-specific data help?
Yes: adding non-condition-specific data helped explain more splits and identified more TFs per split
Validation of INO4 binding
Genes associated with INO4 split profile
INO4 occupancy is much higher in AA starvation compared to normal (SCD)
More genes are bound genome-wide in AA starvation
Stronger binding in AA starvation of genes in this path
Does integration help?
Take away points