Presenter (15 mins)
Seth Karten skarten@cs.cmu.edu
Dynamics-Aware Unsupervised Discovery of Skills
Archit Sharma∗, Shixiang Gu, Sergey Levine, Vikash Kumar, Karol Hausman
ICLR 2020
This paper had so many typos ☹
Problem Statement & Motivation
Related Work – End-to-End Hierarchical RL
(Peng et al., 2017)
Related Work – Model-based RL
(Chua et al., 2018a)
Related Work – Diversity is all you need (DIAYN)
Mutual Information for Unsupervised Skill Discovery
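For reference, a sketch of the objective this slide title refers to, written in the notation commonly used for DADS: z is the skill, s and s' are the current and next state, q_φ is the learned skill-dynamics model, and L is the number of skills sampled from the prior p(z) to approximate the marginal. The symbol names are assumptions, since the slide bodies are not in this outline.

```latex
\begin{align}
I(s'; z \mid s) &= \mathcal{H}(s' \mid s) - \mathcal{H}(s' \mid s, z) \\
  &\geq \mathbb{E}_{z,\, s,\, s'}\!\left[ \log \frac{q_\phi(s' \mid s, z)}{p(s' \mid s)} \right] \\
r_z(s, a, s') &= \log \frac{q_\phi(s' \mid s, z)}{\frac{1}{L} \sum_{i=1}^{L} q_\phi(s' \mid s, z_i)},
  \qquad z_i \sim p(z)
\end{align}
```

The second line follows because replacing the true p(s' | s, z) with q_φ can only lower the expectation (the gap is a KL divergence); the third is the intrinsic reward obtained by approximating p(s' | s) with samples from the skill prior.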
Planning Using Skill Dynamics
Using MPPI (Model Predictive Path Integral)
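A minimal sketch of MPPI-style planning in skill space, assuming a skill-dynamics model is available as a function. Everything named here is hypothetical: `toy_skill_dynamics` is a hand-written stand-in for a learned q(s' | s, z), and the horizon, sample count, temperature, and noise scale are illustrative values, not the paper's settings. The paper's planner may differ in details such as holding each skill for several environment steps.

```python
import numpy as np


def toy_skill_dynamics(state, skill):
    """Hypothetical stand-in for a learned skill-dynamics model q(s' | s, z)."""
    return state + 0.5 * skill


def mppi_plan(state, dynamics, reward, horizon=5, num_samples=256,
              skill_dim=2, num_iters=5, temperature=1.0, sigma=0.5, seed=0):
    """Return a (horizon, skill_dim) sequence of skills via MPPI-style updates."""
    rng = np.random.default_rng(seed)
    mean = np.zeros((horizon, skill_dim))
    for _ in range(num_iters):
        # Sample candidate skill sequences around the current mean plan.
        noise = rng.normal(0.0, sigma, size=(num_samples, horizon, skill_dim))
        plans = np.clip(mean[None, :, :] + noise, -1.0, 1.0)

        # Roll each candidate out through the model and sum predicted rewards.
        returns = np.zeros(num_samples)
        for k in range(num_samples):
            s = state
            for t in range(horizon):
                s = dynamics(s, plans[k, t])
                returns[k] += reward(s)

        # Exponentiated-return weights, shifted by the max for numerical stability.
        weights = np.exp((returns - returns.max()) / temperature)
        weights /= weights.sum()

        # New mean plan is the weighted average of the sampled plans.
        mean = np.einsum("k,ktd->td", weights, plans)
    return mean


# Usage: plan toward a goal position, then replay the plan through the toy model.
goal = np.array([2.0, 0.0])
start = np.zeros(2)
plan = mppi_plan(start, toy_skill_dynamics, lambda s: -np.linalg.norm(s - goal))

final_state = start
for skill in plan:
    final_state = toy_skill_dynamics(final_state, skill)
```

In a receding-horizon controller only the first skill of `plan` would be executed before replanning from the new state.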
MuJoCo Environments: Humanoid, Ant, Half-Cheetah
Experiment 1: Continuous Skill Spaces Allow Interpolation
Experiment 2: Lower Variance Primitives
Experiment 3: DADS vs Model-based RL
Experiment 4: Meta-controller vs MPPI controller
Strengths
Strength 1
Strength 2
Strength 3
Weaknesses
Weakness 1
Weakness 2
Weakness 3
TL;DR/Summary: Key Insights
Q&A (1 min)
5 Discussion Points – send to TA, don’t include in slides