
Probability

9.6 Markov Chain Monte Carlo (MCMC)

Alex Tsun


Agenda

  • Discrete-Time Stochastic Processes
  • Markov Chains
  • Markov Chain Monte Carlo (MCMC)

Discrete-Time Stochastic Processes

A discrete-time stochastic process is a sequence of random variables X0, X1, X2, … indexed by discrete time steps, where Xt is the (random) state of the process at time t.

[Figure: timeline of the random variables X0, X1, X2, X3, X4, X5.]

Random Walk on a Graph Example

Consider a random walk on an undirected graph with states {1, 2, 3, 4, 5}: at each time step, we move from the current node to one of its neighbors, chosen uniformly at random.

[Figure: the five-node graph, with the walk built up over several slides; the steps are annotated with the definition of expectation.]
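To make the walk concrete, here is a minimal simulation sketch in Python. The edges 1-2, 1-3, and 2-4 are implied by transition probabilities that appear in the next slides; the remaining edges in the adjacency list are assumed for illustration.

```python
import random

# Undirected graph on states 1..5. Edges 1-2, 1-3, 2-4 come from the slides;
# the rest (3-4, 3-5, 4-5) are assumed for illustration.
neighbors = {
    1: [2, 3],
    2: [1, 4],
    3: [1, 4, 5],
    4: [2, 3, 5],
    5: [3, 4],
}

def random_walk(start: int, steps: int) -> list[int]:
    """At each step, move to a uniformly random neighbor of the current node."""
    path = [start]
    for _ in range(steps):
        path.append(random.choice(neighbors[path[-1]]))
    return path

print(random_walk(start=1, steps=10))  # one sampled trajectory X0, ..., X10
```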

Markov Chains

A Markov chain is a discrete-time stochastic process with the Markov property: conditioned on the present state, the future is independent of the past. In symbols, for all times t and states i, j:

P(Xt+1 = j | Xt = i, Xt-1 = it-1, …, X0 = i0) = P(Xt+1 = j | Xt = i)

The random walk above is a Markov chain: where we go next depends only on where we are now.

Transition Probability Matrix Example

A Markov chain on states s1, …, s5 is summarized by its transition probability matrix (TPM) P: the entry in row "From si" and column "To sj" is

P(Xt+1 = j | Xt = i)    (the same matrix works at any time t)

so each row of P sums to 1. For the random walk on the graph above:

  • From state 1: P(Xt+1 = 2 | Xt = 1) = P(Xt+1 = 3 | Xt = 1) = 1/2, since states 2 and 3 are the neighbors of state 1.
  • From state 2: P(Xt+1 = 1 | Xt = 2) = P(Xt+1 = 4 | Xt = 2) = 1/2, and the rest of that row is 0.

Filling in the remaining rows the same way gives the full 5×5 matrix.

[Figure: the graph next to the 5×5 table with rows "From s1…s5" and columns "To s1…s5", filled in row by row.]
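As a sketch, the full matrix can be built mechanically from the graph: each row spreads probability 1/deg(i) over the neighbors of state i. The adjacency list is the same partially assumed one from the simulation sketch above.

```python
import numpy as np

# Same adjacency as before (edges beyond 1-2, 1-3, 2-4 are assumed).
neighbors = {1: [2, 3], 2: [1, 4], 3: [1, 4, 5], 4: [2, 3, 5], 5: [3, 4]}

# P[i-1][j-1] = P(Xt+1 = j | Xt = i): 1/deg(i) for each neighbor, else 0.
P = np.zeros((5, 5))
for i, nbrs in neighbors.items():
    P[i - 1, [j - 1 for j in nbrs]] = 1 / len(nbrs)

print(P[0])                            # row "From s1": [0, 0.5, 0.5, 0, 0]
assert np.allclose(P.sum(axis=1), 1)   # every row sums to 1
```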

Computing Probability Example

With the TPM in hand, we can compute the probability of any particular trajectory: by the Markov property, we just multiply the one-step transition probabilities along the path. One fact we use repeatedly: from state 2, we can only go to states 1 and 4 with nonzero probability.

[Figure: the graph, with the trajectory being computed highlighted step by step.]
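A sketch of both kinds of computation with the (partially assumed) matrix above: the probability of one specific trajectory, and a k-step transition probability via the matrix power P^k, whose (i, j) entry is P(Xt+k = j | Xt = i).

```python
import numpy as np

neighbors = {1: [2, 3], 2: [1, 4], 3: [1, 4, 5], 4: [2, 3, 5], 5: [3, 4]}
P = np.zeros((5, 5))
for i, nbrs in neighbors.items():
    P[i - 1, [j - 1 for j in nbrs]] = 1 / len(nbrs)

# One specific trajectory, e.g. 1 -> 2 -> 4: by the Markov property,
# multiply the one-step transition probabilities along the path.
print(P[0, 1] * P[1, 3])               # (1/2) * (1/2) = 1/4

# All two-step probabilities at once: entry (i, j) of P^2 sums over
# every intermediate state m of P(i -> m) * P(m -> j).
P2 = np.linalg.matrix_power(P, 2)
print(P2[0, 3])                        # P(X2 = 4 | X0 = 1)
```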


Applying A Transition Matrix

Suppose we start at a random state, with probabilities given by the row vector v = [P(X0 = 1), P(X0 = 2), P(X0 = 3), P(X0 = 4), P(X0 = 5)] (these sum to 1). What is my belief distribution for where I am next?

Let's compute the matrix product vP (we'll see why soon!). Its j-th entry is:

(vP)j = Σi vi Pij                             [def of v; def of TPM (works for any time)]
      = Σi P(X0 = i) P(X1 = j | X0 = i)
      = P(X1 = j)                             [LTP, conditioning on X0]

So vP = [P(X1 = 1), P(X1 = 2), P(X1 = 3), P(X1 = 4), P(X1 = 5)]: right-multiplying by P gives the belief distribution at the next time step!

By induction (repeatedly applying this matrix P), vP^n is the distribution after n time steps.
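A sketch of the update in code (the starting distribution v is an arbitrary example; P is the partially assumed matrix from earlier):

```python
import numpy as np

neighbors = {1: [2, 3], 2: [1, 4], 3: [1, 4, 5], 4: [2, 3, 5], 5: [3, 4]}
P = np.zeros((5, 5))
for i, nbrs in neighbors.items():
    P[i - 1, [j - 1 for j in nbrs]] = 1 / len(nbrs)

v = np.full(5, 0.2)        # example starting beliefs P(X0 = i), summing to 1

print(v @ P)                               # [P(X1 = 1), ..., P(X1 = 5)]
print(v @ np.linalg.matrix_power(P, 10))   # distribution after 10 steps
```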

Stationary Distribution of a Markov Chain

If the belief distribution stops changing when we apply P, we have found a stationary distribution of the chain: a distribution π (a row vector with nonnegative entries summing to 1) satisfying π = πP. For well-behaved chains (irreducible and aperiodic), vP^n converges to the same π no matter which starting distribution v we pick.
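One way to find π numerically is exactly the induction above: keep right-multiplying by P until the vector stops changing. A minimal sketch, again with the partially assumed matrix:

```python
import numpy as np

neighbors = {1: [2, 3], 2: [1, 4], 3: [1, 4, 5], 4: [2, 3, 5], 5: [3, 4]}
P = np.zeros((5, 5))
for i, nbrs in neighbors.items():
    P[i - 1, [j - 1 for j in nbrs]] = 1 / len(nbrs)

pi = np.full(5, 0.2)           # any starting distribution works here
while not np.allclose(pi @ P, pi, atol=1e-12):
    pi = pi @ P                # one more time step; converges because this
                               # chain is irreducible and aperiodic

print(pi)                      # stationary: satisfies pi = pi @ P
```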

Markov Chain Monte Carlo (MCMC)

The idea of MCMC: when we want samples from a distribution that is hard to sample directly (or want to search an enormous state space), we design a Markov chain whose stationary distribution is the distribution we care about, run the chain for many iterations, and use the states it visits.

The Knapsack Problem

We have n items, where item i has value vi and weight wi, and a knapsack that can carry total weight at most W. The goal: choose the subset of items with maximum total value whose total weight is at most W. There are 2^n subsets to consider, so brute force quickly becomes intractable.

Greedy Algorithm for The Knapsack Problem?

A natural greedy heuristic: sort the items by value/weight ratio, and take items in that order as long as they still fit. Consider this example:

Item   | Value | Weight | Value / Weight
-------|-------|--------|---------------
Laptop | 156   | 52     | 3
iPad   | 100   | 50     | 2
Tablet | 98    | 49     | 2

Suppose the capacity is, say, 100. Greedy takes the Laptop first (best ratio), using weight 52; now neither the iPad (weight 50) nor the Tablet (weight 49) fits, so greedy stops with value 156. But taking the iPad and the Tablet together uses weight 99 for value 198. Greedy is not optimal!
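A sketch of the greedy heuristic on this table (the capacity of 100 is an assumed illustrative value):

```python
# Items from the slide: (name, value, weight).
items = [("Laptop", 156, 52), ("iPad", 100, 50), ("Tablet", 98, 49)]
capacity = 100   # assumed illustrative capacity

# Greedy: take items in decreasing value/weight order while they still fit.
chosen, total_value, total_weight = [], 0, 0
for name, value, weight in sorted(items, key=lambda t: t[1] / t[2], reverse=True):
    if total_weight + weight <= capacity:
        chosen.append(name)
        total_value += value
        total_weight += weight

print(chosen, total_value)   # ['Laptop'] 156, but iPad + Tablet would give 198
```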

MCMC for the Knapsack Problem

We'll define a Markov chain whose state space is of size 2^n:

Binary vectors x of length n (which can only have 0/1 entries), representing whether or not we have each item.

We'll transition only to “good” states - ones that are feasible (satisfy the weight constraint), while keeping track of the best solution so far.
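Here is a minimal sketch of that chain: propose flipping one uniformly random item in or out, move only if the resulting state is feasible, and remember the best feasible state visited. The item data and iteration count are illustrative.

```python
import random

def mcmc_knapsack(values, weights, capacity, iters=100_000, seed=0):
    """Walk over feasible 0/1 vectors, tracking the best solution seen."""
    rng = random.Random(seed)
    n = len(values)
    x = [0] * n                          # start at the empty, feasible state
    weight = value = 0
    best_value, best_x = 0, x[:]
    for _ in range(iters):
        i = rng.randrange(n)             # propose flipping item i in or out
        dw = weights[i] if x[i] == 0 else -weights[i]
        if weight + dw <= capacity:      # transition only to feasible states
            x[i] ^= 1
            weight += dw
            value += values[i] if x[i] else -values[i]
            if value > best_value:
                best_value, best_x = value, x[:]
    return best_value, best_x

# The three items from the greedy example, with the assumed capacity 100.
print(mcmc_knapsack([156, 100, 98], [52, 50, 49], capacity=100))
# typically (198, [0, 1, 1]): iPad + Tablet, beating greedy's 156
```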


By running many iterations, the chain approaches its stationary distribution.

The stationary distribution has higher probabilities for “good” or “feasible” solutions, so we are likely to get a good solution!

Note: This is one version of MCMC for this knapsack problem, but it may not be the best. It would be good to transition to “better” solutions (higher value), and not just to any feasible solution.

Alex Tsun
Joshua Fan