1 of 14

Lecture 9. Actor-Critic Design Decisions

Sookyung Kim

2 of 14

Taxonomy of RL algorithms

Actor-Critic

3 of 14

Value Function Fitting
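
A minimal sketch of the two regression targets commonly used when fitting a value function, assuming the standard setup in which a critic is trained by regression onto per-step targets. The function names, the `gamma` parameter, and the toy numbers are illustrative assumptions, not taken from the lecture.

```python
def monte_carlo_targets(rewards, gamma=1.0):
    """Reward-to-go targets: y_t = sum over t' >= t of gamma^(t'-t) * r_t'."""
    targets = [0.0] * len(rewards)
    running = 0.0
    # Walk the trajectory backwards so each target reuses the one after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        targets[t] = running
    return targets


def bootstrapped_targets(rewards, next_values, dones, gamma=1.0):
    """One-step targets: y_t = r_t + gamma * V(s_{t+1}), with V = 0 after a terminal step."""
    return [
        r + gamma * (0.0 if done else v_next)
        for r, v_next, done in zip(rewards, next_values, dones)
    ]


# Toy trajectory (hypothetical numbers, for illustration only).
rewards = [1.0, 0.0, 2.0]
mc = monte_carlo_targets(rewards, gamma=0.5)
bs = bootstrapped_targets(
    rewards,
    next_values=[4.0, 2.0, 9.0],   # critic estimates V(s_{t+1}), assumed given
    dones=[False, False, True],
    gamma=0.5,
)
```

The Monte Carlo targets use only observed rewards, so they are unbiased but noisy. The bootstrapped targets lean on the current critic, which usually lowers variance at the cost of bias when the critic is inaccurate.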

4 of 14

From Evaluation to Actor Critic

5 of 14

Actor-critic algorithm (with discount)
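
As a rough sketch of the quantities a discounted actor-critic update typically needs, assuming the common one-step advantage estimate A(s, a) = r + gamma * V(s') - V(s). The helper names and the toy batch are hypothetical, and the gradient step itself (which would need an autodiff library) is not shown.

```python
def td_advantages(rewards, values, next_values, dones, gamma=0.99):
    """One-step advantage estimates: A_t = r_t + gamma * V(s_{t+1}) - V(s_t)."""
    advantages = []
    for r, v, v_next, done in zip(rewards, values, next_values, dones):
        # No bootstrapping past a terminal transition.
        target = r + gamma * (0.0 if done else v_next)
        advantages.append(target - v)
    return advantages


def policy_surrogate_loss(log_probs, advantages):
    """Mean of -log pi(a_t | s_t) * A_t, with the advantages treated as constants.

    Minimizing this value with respect to the policy parameters (via autodiff,
    not shown here) follows the actor-critic policy gradient estimate.
    """
    terms = [lp * adv for lp, adv in zip(log_probs, advantages)]
    return -sum(terms) / len(terms)


# Toy batch (hypothetical numbers, for illustration only).
adv = td_advantages(
    rewards=[1.0, 0.0],
    values=[2.0, 1.0],
    next_values=[1.0, 5.0],
    dones=[False, True],
    gamma=0.5,
)
loss = policy_surrogate_loss(log_probs=[-1.0, -2.0], advantages=adv)
```

With a discount below 1, the bootstrapped target stays bounded even for long or infinite horizons, which is the usual motivation for adding it.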

6 of 14

Actor-critic Design Decisions

7 of 14

Architecture Design

8 of 14

Online Actor-critic in practice

9 of 14

Can we remove the on-policy assumption entirely?
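
One common way to relax the on-policy requirement is to store past transitions in a replay buffer and train on random samples from it. The sketch below assumes that approach; the `ReplayBuffer` class, its method names, and the transition layout are hypothetical and not taken from the lecture.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity, seed=None):
        # A deque with maxlen silently drops the oldest item once it is full.
        self._storage = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        self._storage.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Uniformly sample distinct stored transitions (without replacement)."""
        return self._rng.sample(list(self._storage), batch_size)

    def __len__(self):
        return len(self._storage)


# Fill past capacity so the two oldest transitions are evicted.
buffer = ReplayBuffer(capacity=3, seed=0)
for step in range(5):
    buffer.add(state=step, action=0, reward=1.0, next_state=step + 1, done=False)

batch = buffer.sample(2)
```

Transitions drawn this way were generated by older policies, so the stored actions and the value targets built from them no longer match the current policy. That mismatch is what an off-policy variant has to correct for.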

10 of 14

Let's see what that looks like

11 of 14

Let's see what that looks like

12 of 14

Fixing the policy update

Remember

13 of 14

Fixing the policy update

14 of 14

Some implementation details

DDPG in the next chapter
