Hidden Markov models
Lior Pachter
California Institute of Technology
1
Lecture 15
Caltech Bi/BE/CS183
Spring 2023
These slides are distributed under the CC BY 4.0 license
Laws of motion
2
Single-cell snapshots of cellular motion
3
Measuring cell states across time
4
Pseudotime assignment
5
Paul Magwene
Pseudotime assignment
6
Paul Magwene
Measuring cell states across time
7
Determining when genes are active
Hidden Markov model
Hidden Markov model
8
EM algorithm
Dynamic programming
A (discrete time) Markov chain
9
A (discrete time) Markov chain
10
A two state Markov chain
11
A two state Markov chain
12
BAABAA
A lattice view
13
A two-state gene expression model
14
A hidden Markov model
15
A hidden Markov model
16
Solved with the forward algorithm.
A hidden Markov model
17
What is the most likely sequence of states of the Markov chain to have resulted in 1,4,3,6,6,4?
Solved with the Viterbi algorithm.
A hidden Markov model
18
What is the probability that X3 = B if Y0=1,Y1=4,Y2=3,Y3=6,Y4=6,Y5=4 ?
Solved with the forward-backward algorithm.
A hidden Markov model
19
Given multiple sequences of numbers (observations of Y), estimate parameters for the model
Solved with the Baum-Welch algorithm.
A lattice view
20
Observed sequence
Hidden sequence
A two-state gene expression model
21
A hidden Markov model for gene expression during differentiation
22
EM algorithm
Dynamic programming
The Viterbi algorithm I
23
Observed sequence
Tij
The Viterbi algorithm II
24
Observed sequence
T’ij
The Viterbi algorithm III
25
Observed sequence
Tij
Summary
Markov chain: a memoryless stochastic process
Hidden Markov model: a hidden Markov chain from which observations are generated.
Viterbi algorithm: dynamic programming algorithm for finding the most likely sequence of hidden states to have generated an observed sequence.
Baum-Welch algorithm: an expectation-maximization algorithm for learning parameters of a hidden Markov model.
26
Recall (Lecture 7)
27
Hidden Markov models (HMMs) as graphical models
28
A chain of length 4
29
Two complementary ways to think about graphical models
30
Application to gene finding
31
Is there a gene in this sequence?
Recall (Lecture 1)
32
isoforms
A hidden Markov model for gene finding
33
T
A
A
T
A
T
G
T
C
C
A
C
G
G
T
T
G
T
A
C
A
C
G
G
C
A
G
G
T
A
T
T
G
A
G
G
T
A
T
T
G
A
G
A
T
G
T
A
A
C
T
G
A
A
Observed sequence:
Predict genes by running the Viterbi algorithm for an HMM:
Hidden state space
34
The sparsity pattern for the transition matrix for hidden states
Observed sequence
35
Observed sequence
36
A generative view of the HMM
37
The lattice view
38
Intron length distribution in human
39
Waiting times in a Markov chain
40
A
1-p
p
p
duration
Recall (Lecture 10): geometric distribution II
42
A full gene-finding HMM
43
Summary
Hidden markov models are useful for annotating features in sequences, or for annotating time series data.
Graphical models: hidden Markov models are special instances of graphical models.
Model choice: while HMMs are restrictive in that they make a strong Markovian assumption for the hidden random variables, associated inference algorithms are efficient.
Waiting times are geometrically distributed, which is suitable for introns. Exons require an extension of HMMs called generalized HMMs.
44
Additional References
45