
Probability

9.6 Markov Chain Monte Carlo (MCMC)

Alex Tsun


Agenda

  • Discrete-Time Stochastic Processes
  • Markov Chains
  • Markov Chain Monte Carlo (MCMC)

Discrete-Time Stochastic Processes

A discrete-time stochastic process is a sequence of random variables X0, X1, X2, … indexed by discrete time steps, where Xt is the (random) state of the process at time t.

[Figure: timeline of the random variables X0, X1, X2, X3, X4, X5.]

Random Walk on a Graph Example

Consider a random walk on an undirected graph with states {1, 2, 3, 4, 5}: at each time step, we move from the current node to one of its neighbors, chosen uniformly at random.

[Figure: the five-node graph, with the walk built up over several slides; the steps are annotated with the definition of expectation.]
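To make the walk concrete, here is a minimal simulation sketch in Python. The edges 1-2, 1-3, and 2-4 are implied by transition probabilities that appear in the next slides; the remaining edges in the adjacency list are assumed for illustration.

```python
import random

# Undirected graph on states 1..5. Edges 1-2, 1-3, 2-4 come from the slides;
# the rest (3-4, 3-5, 4-5) are assumed for illustration.
neighbors = {
    1: [2, 3],
    2: [1, 4],
    3: [1, 4, 5],
    4: [2, 3, 5],
    5: [3, 4],
}

def random_walk(start: int, steps: int) -> list[int]:
    """At each step, move to a uniformly random neighbor of the current node."""
    path = [start]
    for _ in range(steps):
        path.append(random.choice(neighbors[path[-1]]))
    return path

print(random_walk(start=1, steps=10))  # one sampled trajectory X0, ..., X10
```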

Markov Chains

A Markov chain is a discrete-time stochastic process with the Markov property: conditioned on the present state, the future is independent of the past. In symbols, for all times t and states i, j:

P(Xt+1 = j | Xt = i, Xt-1 = it-1, …, X0 = i0) = P(Xt+1 = j | Xt = i)

The random walk above is a Markov chain: where we go next depends only on where we are now.

Transition Probability Matrix Example

A Markov chain on states s1, …, s5 is summarized by its transition probability matrix (TPM) P: the entry in row "From si" and column "To sj" is

P(Xt+1 = j | Xt = i)    (the same matrix works at any time t)

so each row of P sums to 1. For the random walk on the graph above:

  • From state 1: P(Xt+1 = 2 | Xt = 1) = P(Xt+1 = 3 | Xt = 1) = 1/2, since states 2 and 3 are the neighbors of state 1.
  • From state 2: P(Xt+1 = 1 | Xt = 2) = P(Xt+1 = 4 | Xt = 2) = 1/2, and the rest of that row is 0.

Filling in the remaining rows the same way gives the full 5×5 matrix.

[Figure: the graph next to the 5×5 table with rows "From s1…s5" and columns "To s1…s5", filled in row by row.]
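As a sketch, the full matrix can be built mechanically from the graph: each row spreads probability 1/deg(i) over the neighbors of state i. The adjacency list is the same partially assumed one from the simulation sketch above.

```python
import numpy as np

# Same adjacency as before (edges beyond 1-2, 1-3, 2-4 are assumed).
neighbors = {1: [2, 3], 2: [1, 4], 3: [1, 4, 5], 4: [2, 3, 5], 5: [3, 4]}

# P[i-1][j-1] = P(Xt+1 = j | Xt = i): 1/deg(i) for each neighbor, else 0.
P = np.zeros((5, 5))
for i, nbrs in neighbors.items():
    P[i - 1, [j - 1 for j in nbrs]] = 1 / len(nbrs)

print(P[0])                            # row "From s1": [0, 0.5, 0.5, 0, 0]
assert np.allclose(P.sum(axis=1), 1)   # every row sums to 1
```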

Computing Probability Example

With the TPM in hand, we can compute the probability of any particular trajectory: by the Markov property, we just multiply the one-step transition probabilities along the path. One fact we use repeatedly: from state 2, we can only go to states 1 and 4 with nonzero probability.

[Figure: the graph, with the trajectory being computed highlighted step by step.]
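A sketch of both kinds of computation with the (partially assumed) matrix above: the probability of one specific trajectory, and a k-step transition probability via the matrix power P^k, whose (i, j) entry is P(Xt+k = j | Xt = i).

```python
import numpy as np

neighbors = {1: [2, 3], 2: [1, 4], 3: [1, 4, 5], 4: [2, 3, 5], 5: [3, 4]}
P = np.zeros((5, 5))
for i, nbrs in neighbors.items():
    P[i - 1, [j - 1 for j in nbrs]] = 1 / len(nbrs)

# One specific trajectory, e.g. 1 -> 2 -> 4: by the Markov property,
# multiply the one-step transition probabilities along the path.
print(P[0, 1] * P[1, 3])               # (1/2) * (1/2) = 1/4

# All two-step probabilities at once: entry (i, j) of P^2 sums over
# every intermediate state m of P(i -> m) * P(m -> j).
P2 = np.linalg.matrix_power(P, 2)
print(P2[0, 3])                        # P(X2 = 4 | X0 = 1)
```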


Applying A Transition Matrix

Suppose we start at a random state, with probabilities given by the row vector v = [P(X0 = 1), P(X0 = 2), P(X0 = 3), P(X0 = 4), P(X0 = 5)] (these sum to 1). What is my belief distribution for where I am next?

Let's compute the matrix product vP (we'll see why soon!). Its j-th entry is:

(vP)j = Σi vi Pij                             [def of v; def of TPM (works for any time)]
      = Σi P(X0 = i) P(X1 = j | X0 = i)
      = P(X1 = j)                             [LTP, conditioning on X0]

So vP = [P(X1 = 1), P(X1 = 2), P(X1 = 3), P(X1 = 4), P(X1 = 5)]: right-multiplying by P gives the belief distribution at the next time step!

By induction (repeatedly applying this matrix P), vP^n is the distribution after n time steps.
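A sketch of the update in code (the starting distribution v is an arbitrary example; P is the partially assumed matrix from earlier):

```python
import numpy as np

neighbors = {1: [2, 3], 2: [1, 4], 3: [1, 4, 5], 4: [2, 3, 5], 5: [3, 4]}
P = np.zeros((5, 5))
for i, nbrs in neighbors.items():
    P[i - 1, [j - 1 for j in nbrs]] = 1 / len(nbrs)

v = np.full(5, 0.2)        # example starting beliefs P(X0 = i), summing to 1

print(v @ P)                               # [P(X1 = 1), ..., P(X1 = 5)]
print(v @ np.linalg.matrix_power(P, 10))   # distribution after 10 steps
```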

Stationary Distribution of a Markov Chain

If the belief distribution stops changing when we apply P, we have found a stationary distribution of the chain: a distribution π (a row vector with nonnegative entries summing to 1) satisfying π = πP. For well-behaved chains (irreducible and aperiodic), vP^n converges to the same π no matter which starting distribution v we pick.
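One way to find π numerically is exactly the induction above: keep right-multiplying by P until the vector stops changing. A minimal sketch, again with the partially assumed matrix:

```python
import numpy as np

neighbors = {1: [2, 3], 2: [1, 4], 3: [1, 4, 5], 4: [2, 3, 5], 5: [3, 4]}
P = np.zeros((5, 5))
for i, nbrs in neighbors.items():
    P[i - 1, [j - 1 for j in nbrs]] = 1 / len(nbrs)

pi = np.full(5, 0.2)           # any starting distribution works here
while not np.allclose(pi @ P, pi, atol=1e-12):
    pi = pi @ P                # one more time step; converges because this
                               # chain is irreducible and aperiodic

print(pi)                      # stationary: satisfies pi = pi @ P
```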

Markov Chain Monte Carlo (MCMC)

The idea of MCMC: when we want samples from a distribution that is hard to sample directly (or want to search an enormous state space), we design a Markov chain whose stationary distribution is the distribution we care about, run the chain for many iterations, and use the states it visits.

The Knapsack Problem

We have n items, where item i has value vi and weight wi, and a knapsack that can carry total weight at most W. The goal: choose the subset of items with maximum total value whose total weight is at most W. There are 2^n subsets to consider, so brute force quickly becomes intractable.

Greedy Algorithm for The Knapsack Problem?

A natural greedy heuristic: sort the items by value/weight ratio, and take items in that order as long as they still fit. Consider this example:

Item   | Value | Weight | Value / Weight
-------|-------|--------|---------------
Laptop | 156   | 52     | 3
iPad   | 100   | 50     | 2
Tablet | 98    | 49     | 2

Suppose the capacity is, say, 100. Greedy takes the Laptop first (best ratio), using weight 52; now neither the iPad (weight 50) nor the Tablet (weight 49) fits, so greedy stops with value 156. But taking the iPad and the Tablet together uses weight 99 for value 198. Greedy is not optimal!
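A sketch of the greedy heuristic on this table (the capacity of 100 is an assumed illustrative value):

```python
# Items from the slide: (name, value, weight).
items = [("Laptop", 156, 52), ("iPad", 100, 50), ("Tablet", 98, 49)]
capacity = 100   # assumed illustrative capacity

# Greedy: take items in decreasing value/weight order while they still fit.
chosen, total_value, total_weight = [], 0, 0
for name, value, weight in sorted(items, key=lambda t: t[1] / t[2], reverse=True):
    if total_weight + weight <= capacity:
        chosen.append(name)
        total_value += value
        total_weight += weight

print(chosen, total_value)   # ['Laptop'] 156, but iPad + Tablet would give 198
```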

MCMC for the Knapsack Problem

We'll define a Markov chain whose state space is of size 2^n:

Binary vectors x of length n (which can only have 0/1 entries), representing whether or not we have each item.

We'll transition only to “good” states - ones that are feasible (satisfy the weight constraint), while keeping track of the best solution so far.
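Here is a minimal sketch of that chain: propose flipping one uniformly random item in or out, move only if the resulting state is feasible, and remember the best feasible state visited. The item data and iteration count are illustrative.

```python
import random

def mcmc_knapsack(values, weights, capacity, iters=100_000, seed=0):
    """Walk over feasible 0/1 vectors, tracking the best solution seen."""
    rng = random.Random(seed)
    n = len(values)
    x = [0] * n                          # start at the empty, feasible state
    weight = value = 0
    best_value, best_x = 0, x[:]
    for _ in range(iters):
        i = rng.randrange(n)             # propose flipping item i in or out
        dw = weights[i] if x[i] == 0 else -weights[i]
        if weight + dw <= capacity:      # transition only to feasible states
            x[i] ^= 1
            weight += dw
            value += values[i] if x[i] else -values[i]
            if value > best_value:
                best_value, best_x = value, x[:]
    return best_value, best_x

# The three items from the greedy example, with the assumed capacity 100.
print(mcmc_knapsack([156, 100, 98], [52, 50, 49], capacity=100))
# typically (198, [0, 1, 1]): iPad + Tablet, beating greedy's 156
```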


By running many iterations, the chain approaches its stationary distribution.

The stationary distribution has higher probabilities for “good” or “feasible” solutions, so we are likely to get a good solution!

Note: This is one version of MCMC for this knapsack problem, but it may not be the best. It would be good to transition to “better” solutions (higher value), and not just to any feasible solution.

Alex Tsun
Joshua Fan