Lecture 7: Deep Deterministic Policy Gradient (DDPG) Method

Instructor: Ercan Atam

Institute for Data Science & Artificial Intelligence

Course: DSAI 642 - Advanced Reinforcement Learning

List of contents for this lecture

The need for the DDPG method

The intuition behind the DDPG method

The math behind the DDPG method

Pseudocode of the DDPG method

Advantages/disadvantages of the DDPG method

Relevant readings/videos for this lecture

https://download.cmutschler.de/lectures/FAU_RL_2021/slides/7-04_DDPG.pdf

(Some slides are modified/improved versions from here)

https://www.youtube.com/watch?v=K7iB88bC19o

(very good and detailed lecture on DDPG!)

Chapter 12 of Miguel Morales, “Grokking Deep Reinforcement Learning”, Manning, 2020

What is the DDPG method?

The intuition behind the DDPG method

Generalizing DQN to continuous actions (1)

Generalizing DQN to continuous actions (2)

From DQN to DDPG

The Q-Learning side of the DDPG method (1)

The Q-Learning side of the DDPG method (2)
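
As a rough illustration of the Q-learning side, the sketch below computes the bootstrapped critic target y = r + gamma * (1 - d) * Q_targ(s', mu_targ(s')) and the squared TD error for a single transition. The toy linear `target_actor` and `target_critic`, their coefficients, and the value of `GAMMA` are assumptions chosen for illustration; they are not taken from the slides.

```python
GAMMA = 0.99  # discount factor (assumed value)


def target_actor(state):
    """Hypothetical target policy mu_targ(s): a fixed linear map."""
    return 0.5 * state


def target_critic(state, action):
    """Hypothetical target critic Q_targ(s, a): a fixed linear map."""
    return 2.0 * state + 1.0 * action


def td_target(reward, next_state, done):
    """Critic target: y = r + gamma * (1 - done) * Q_targ(s', mu_targ(s'))."""
    next_action = target_actor(next_state)      # action chosen by the target policy
    bootstrap = target_critic(next_state, next_action)
    return reward + GAMMA * (1.0 - float(done)) * bootstrap


def critic_loss(q_value, target):
    """Squared TD error for one transition."""
    return (q_value - target) ** 2
```

Note that the maximization over actions used in DQN is replaced here by evaluating the target critic at the action proposed by the target policy, which is what makes the target computable for continuous actions.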

The policy learning side of DDPG (1)

The policy learning side of DDPG (2)

The policy learning side of DDPG (3)
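
For reference, the policy update in DDPG is usually written as a sampled form of the deterministic policy gradient. The notation below (policy parameters \theta, critic parameters \phi, replay buffer \mathcal{D}) is assumed here and may differ from the notation used in the lecture.

```latex
\[
\nabla_\theta J(\theta) \approx
\mathbb{E}_{s \sim \mathcal{D}}
\left[
  \nabla_a Q_\phi(s, a)\big|_{a = \mu_\theta(s)} \,
  \nabla_\theta \mu_\theta(s)
\right]
\]
```

In words: the actor parameters are moved in the direction that increases the critic's value of the action the actor currently outputs, with the critic held fixed during this step.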

Exploration-Exploitation in the DDPG method
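
Because the DDPG policy is deterministic, exploration is typically obtained by perturbing its output with noise during training. The sketch below assumes zero-mean Gaussian noise followed by clipping to the action bounds; the original DDPG paper used an Ornstein-Uhlenbeck process instead. The function name `noisy_action` and all of its parameters are hypothetical.

```python
import random


def noisy_action(mu_action, noise_std, low, high, rng=random):
    """Add zero-mean Gaussian noise to a deterministic action, then clip.

    mu_action : action proposed by the deterministic policy mu(s)
    noise_std : standard deviation of the exploration noise (assumed hyperparameter)
    low, high : bounds of the valid action interval
    """
    noise = rng.gauss(0.0, noise_std)
    return max(low, min(high, mu_action + noise))
```

At evaluation time the noise is usually switched off (for instance by setting `noise_std` to 0), so that the agent acts purely according to the learned policy.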

DDPG algorithm
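
One step of the algorithm that is easy to show in isolation is the soft (Polyak) update of the target networks, performed after each update of the online actor and critic. The sketch below treats the parameters as plain lists of floats; the name `soft_update` and the example value of `tau` are assumptions for illustration.

```python
def soft_update(target_params, online_params, tau):
    """Soft target update: theta_targ <- tau * theta + (1 - tau) * theta_targ.

    A small tau (0.001 was used in the original DDPG paper) makes the target
    network track the online network slowly, which stabilizes the TD targets.
    """
    return [
        (1.0 - tau) * t + tau * o
        for t, o in zip(target_params, online_params)
    ]


# Usage: move the target parameters 10% of the way toward the online ones.
new_target = soft_update([0.0, 1.0], [1.0, 1.0], tau=0.1)
```

With `tau=1.0` this reduces to the hard copy used in DQN, whereas `tau=0.0` would freeze the target network entirely.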

DDPG algorithm explained visually

Summary

References (utilized for preparation of lecture notes or MATLAB code)

Chapter 12 of Miguel Morales, “Grokking Deep Reinforcement Learning”, Manning, 2020.
https://www.youtube.com/watch?v=dBZXv7yzG64
https://www.youtube.com/watch?v=K7iB88bC19o
https://download.cmutschler.de/lectures/FAU_RL_2021/slides/7-04_DDPG.pdf
https://spinningup.openai.com/en/latest/algorithms/ddpg.html
https://groups.uni-paderborn.de/lea/share/lehre/reinforcementlearning/lecture_slides/built/Lecture12.pdf
https://www.youtube.com/watch?v=0D6a0a1HTtc