Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
Cheng Chi, Siyuan Feng, Yilun Du, Zhenjia Xu,
Eric Cousineau, Benjamin Burchfiel, Shuran Song
Presenter: Mingyo Seo
1. Multi-modal distribution
2. Sequential correlation
3. High precision
Policy learning from demonstration
Policy representations
- Multi-modal distribution
- High-dimensional output
- Stable training
- Energy-based: unstable to train (requires negative samples)
- Direct regression: no multi-modality
- Categorical: sensitive to hyperparameters
Conditional denoising diffusion process over receding-horizon action sequences
Key Ideas
Closed-loop action sequence
Visual conditioning
Stable training
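The closed-loop receding-horizon idea above can be sketched as a control loop: predict a full action sequence from the current observation, execute only its first few steps, then re-plan. This is a minimal sketch assuming hypothetical `policy` and `env_step` interfaces, not the paper's implementation.

```python
import numpy as np

def receding_horizon_control(policy, env_step, obs,
                             pred_horizon=16, act_horizon=8, n_steps=32):
    """Closed-loop receding-horizon execution (sketch).

    `policy(obs)` is a hypothetical planner returning a
    (pred_horizon, act_dim) action sequence; only the first
    `act_horizon` actions are executed before re-planning,
    which keeps the loop reactive while still committing to a
    temporally consistent chunk of actions.
    """
    executed = []
    t = 0
    while t < n_steps:
        actions = policy(obs)                 # predict the whole sequence
        for a in actions[:act_horizon]:       # execute only a prefix
            obs = env_step(a)                 # apply action, observe next state
            executed.append(a)
            t += 1
            if t >= n_steps:
                break
    return np.array(executed)
```

The gap between a long prediction horizon and a short execution horizon is what trades off temporal consistency against responsiveness.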
Denoising Diffusion Probabilistic Models (DDPM)
Problem formulation
Visual observation conditioning
[Figure: data with noise progressively denoised into samples]
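The DDPM formulation above, conditioned on the visual observation, can be sketched as an iterative denoising loop: start from Gaussian noise and repeatedly subtract the predicted noise. `eps_model` is a hypothetical noise-prediction network; the schedule and shapes are illustrative assumptions.

```python
import numpy as np

def ddpm_sample_actions(eps_model, obs, K, betas, act_shape, rng):
    """DDPM reverse process conditioned on an observation (sketch).

    Starting from pure noise a_K, apply K denoising steps:
        a_{k-1} = (a_k - beta_k / sqrt(1 - abar_k) * eps) / sqrt(alpha_k) + sigma_k * z
    where eps = eps_model(obs, a_k, k) predicts the injected noise.
    """
    alphas = 1.0 - betas
    abar = np.cumprod(alphas)                 # cumulative noise schedule
    a = rng.standard_normal(act_shape)        # a_K ~ N(0, I)
    for k in range(K - 1, -1, -1):
        eps = eps_model(obs, a, k)            # predicted noise at level k
        a = (a - betas[k] / np.sqrt(1.0 - abar[k]) * eps) / np.sqrt(alphas[k])
        if k > 0:                             # no noise added at the final step
            a += np.sqrt(betas[k]) * rng.standard_normal(act_shape)
    return a
```

Conditioning on the observation only through `eps_model` (rather than denoising the observation itself) is what keeps inference cheap enough for control.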
Implementation: Architecture
Implementation: Visual encoder
Implementation: Real-time control (DDIM)
Non-Markovian process for fast inference
Markovian
DDIM
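The DDIM shortcut above can be sketched as a single deterministic update: predict the clean sample from the noise estimate, then jump directly to an earlier (possibly non-adjacent) noise level, so inference needs far fewer steps than training. A minimal sketch with the deterministic (eta = 0) variant; the argument names are illustrative.

```python
import numpy as np

def ddim_step(a_k, eps, abar_k, abar_prev):
    """One deterministic DDIM update (eta = 0, sketch).

    a_k       : current noisy sample
    eps       : predicted noise at this level
    abar_k    : cumulative alpha at the current step
    abar_prev : cumulative alpha at the target (earlier) step
    """
    # Predict the clean sample implied by the noise estimate.
    a0_pred = (a_k - np.sqrt(1.0 - abar_k) * eps) / np.sqrt(abar_k)
    # Re-noise it directly to the target level (non-Markovian jump).
    return np.sqrt(abar_prev) * a0_pred + np.sqrt(1.0 - abar_prev) * eps
```

Because the update is deterministic, the same trained noise predictor can be evaluated on a much sparser step schedule at test time, which is what makes real-time control feasible.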
Properties
Evaluation in Simulation
Evaluation with a real robot
Discussion