1 of 18

Diffusion Policy: Visuomotor Policy Learning via Action Diffusion

Cheng Chi, Siyuan Feng, Yilun Du, Zhenjia Xu,

Eric Cousineau, Benjamin Burchfiel, Shuran Song

Presenter: Mingyo Seo

2 of 18

1. Multi-modal distribution
2. Sequential correlation
3. High precision

Policy learning from demonstration

3 of 18

Policy representations

- Multi-modal distribution
- High-dimensional output
- Stable training

- Unstable to train (needs negative samples in the loss)

- Direct regression: does not capture multi-modality

- Categorical: sensitive to hyperparameters

4 of 18

Conditional denoising diffusion process for actions of receding horizons

Key Ideas

5 of 18

Conditional denoising diffusion process for actions of receding horizons

Key Ideas

Closed-loop action sequence

Visual conditioning

Stable training

6 of 18

Denoising Diffusion Probabilistic Models (DDPM)
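
The equations on this slide did not survive extraction. As a hedged reference, the DDPM denoising iteration and training objective can be written as below, following the simplified notation of the Diffusion Policy paper. The symbols are assumptions about what the slide showed: x^k is the sample at denoising step k, ε_θ is the learned noise-prediction network, and α, γ, σ are step-dependent constants set by the noise schedule.

```latex
\begin{align}
  % Reverse (denoising) iteration, starting from x^K ~ N(0, I)
  \mathbf{x}^{k-1} &= \alpha \left( \mathbf{x}^{k}
      - \gamma\, \varepsilon_\theta(\mathbf{x}^{k}, k)
      + \mathcal{N}(0, \sigma^{2} I) \right) \\
  % Training objective: predict the noise added to a clean sample x^0
  \mathcal{L} &= \mathrm{MSE}\left( \varepsilon^{k},\;
      \varepsilon_\theta(\mathbf{x}^{0} + \varepsilon^{k}, k) \right)
\end{align}
```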

 

 

7 of 18

Problem formulation

8 of 18

Visual observation conditioning

  • Approximate the conditional p(A_t | O_t), not the joint p(A_t, O_t)

    • Accelerate inference speed
    • End-to-end training of the visual encoder
  • Training

No output

Data with noise

De-noised samples
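
The training step outlined above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: `eps_model`, `obs_features`, and `alpha_bars` are hypothetical names, the noise predictor is any callable, and the forward process uses the standard scaled DDPM form rather than the simplified notation of the paper. Observation features are passed only as a conditioning input and are never noised, which is what approximating the conditional rather than the joint distribution amounts to.

```python
import math
import random


def add_noise(actions, noise, alpha_bar_k):
    """Forward process: blend clean actions with Gaussian noise at level k."""
    a = math.sqrt(alpha_bar_k)
    b = math.sqrt(1.0 - alpha_bar_k)
    return [a * x + b * e for x, e in zip(actions, noise)]


def mse(u, v):
    """Mean squared error between two equal-length lists."""
    return sum((p - q) ** 2 for p, q in zip(u, v)) / len(u)


def training_loss(eps_model, obs_features, actions, alpha_bars, rng):
    """One noise-prediction loss evaluation, conditioned on the observation.

    eps_model(obs_features, noisy_actions, k) is a hypothetical callable
    that returns the predicted noise as a list.
    """
    k = rng.randrange(len(alpha_bars))               # random denoising step
    noise = [rng.gauss(0.0, 1.0) for _ in actions]   # noise to be predicted
    noisy = add_noise(actions, noise, alpha_bars[k])  # data with noise
    predicted = eps_model(obs_features, noisy, k)     # network's noise output
    return mse(noise, predicted)
```

In a real system the loss would be backpropagated through both the noise predictor and the visual encoder that produces `obs_features`, which is what makes end-to-end training of the encoder possible.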

9 of 18

Implementation: Architecture

  • CNN
  • Time series transformer

10 of 18

Implementation: Visual encoder

  • ResNet-18 without pre-training
  • End-to-end training
    • Using an ImageNet pre-trained model showed less promising results
  • Modifications
    • Global average pooling => Spatial softmax pooling
    • Batch norm => Group norm
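
To make the pooling change concrete, here is a minimal pure-Python sketch of spatial softmax pooling for a single feature channel. It is illustrative only: the function name and the [-1, 1] coordinate convention are assumptions, and a real encoder would apply this per channel on tensors. Unlike global average pooling, which returns one scalar per channel, spatial softmax returns an expected (x, y) location, so spatial information is preserved.

```python
import math


def spatial_softmax(feature_map):
    """Return the expected (x, y) position of one H x W feature channel.

    Coordinates are normalized to [-1, 1]; x runs along columns and
    y runs along rows.
    """
    h, w = len(feature_map), len(feature_map[0])
    peak = max(max(row) for row in feature_map)  # subtract for stability
    weights = [[math.exp(v - peak) for v in row] for row in feature_map]
    total = sum(sum(row) for row in weights)

    expected_x = 0.0
    expected_y = 0.0
    for i, row in enumerate(weights):
        y = -1.0 + 2.0 * i / (h - 1) if h > 1 else 0.0
        for j, weight in enumerate(row):
            x = -1.0 + 2.0 * j / (w - 1) if w > 1 else 0.0
            prob = weight / total  # softmax probability of this pixel
            expected_x += x * prob
            expected_y += y * prob
    return expected_x, expected_y
```

A strongly peaked activation map yields coordinates close to the peak location, while a flat map yields the image center.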

11 of 18

Implementation: Real-time control (DDIM)

  • Denoising Diffusion Implicit Models (DDIM): non-Markovian process for fast inference

  • DDIM shares the same training objective as DDPM

[Figure: Markovian (DDPM) vs. non-Markovian (DDIM) processes]
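
A minimal sketch of the deterministic DDIM update (the variant with no added noise) shows why inference can skip steps. The names `ddim_step`, `ddim_sample`, and `alpha_bars` are illustrative assumptions, and `eps_model` stands in for any trained noise predictor. Each step first estimates the clean sample from the predicted noise, then re-noises that estimate to an earlier, possibly non-adjacent, noise level.

```python
import math


def ddim_step(x_k, eps_pred, alpha_bar_k, alpha_bar_prev):
    """One deterministic DDIM update from noise level k to an earlier level."""
    # Estimate the clean sample implied by the predicted noise.
    x0_pred = [
        (x - math.sqrt(1.0 - alpha_bar_k) * e) / math.sqrt(alpha_bar_k)
        for x, e in zip(x_k, eps_pred)
    ]
    # Re-noise the estimate to the earlier level using the same noise.
    return [
        math.sqrt(alpha_bar_prev) * x0 + math.sqrt(1.0 - alpha_bar_prev) * e
        for x0, e in zip(x0_pred, eps_pred)
    ]


def ddim_sample(eps_model, x_start, alpha_bars, steps):
    """Run DDIM over `steps`, a descending subset of indices into alpha_bars.

    eps_model(x, k) is a hypothetical callable returning predicted noise.
    After the last listed step the sample is taken to the noise-free level.
    """
    x = x_start
    for i, k in enumerate(steps):
        if i + 1 < len(steps):
            alpha_bar_prev = alpha_bars[steps[i + 1]]
        else:
            alpha_bar_prev = 1.0  # final step: fully denoised
        x = ddim_step(x, eps_model(x, k), alpha_bars[k], alpha_bar_prev)
    return x
```

Because `steps` can be any descending subset of the training schedule, the same trained network can be run with far fewer denoising iterations at inference time than were used in training.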

12 of 18

Properties

  • Multi-modal action distributions (short/long-horizon)

13 of 18

Properties

  • Action sequence prediction
    • Temporal action consistency
    • Robustness to idle actions
    • Reaction time (horizon length)
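
The trade-off between temporal consistency and reaction time comes from receding-horizon execution: the policy predicts a sequence of actions, only the first few are executed, and then the policy re-plans from fresh observations. The loop below is a minimal sketch under assumed names (`policy`, `env_step`, `obs_horizon`, `action_horizon` are all hypothetical), not the authors' controller.

```python
from collections import deque


def run_receding_horizon(policy, env_step, first_obs,
                         obs_horizon, action_horizon, total_steps):
    """Execute a sequence-predicting policy in closed loop.

    policy(obs_list) returns a predicted action sequence at least
    action_horizon long; env_step(action) applies one action and
    returns the next observation.
    """
    # Keep only the most recent obs_horizon observations.
    history = deque([first_obs] * obs_horizon, maxlen=obs_horizon)
    executed = []
    while len(executed) < total_steps:
        plan = policy(list(history))           # predict a full action sequence
        for action in plan[:action_horizon]:   # execute only the first part
            history.append(env_step(action))
            executed.append(action)
            if len(executed) == total_steps:
                break
        # The rest of the plan is discarded; the next iteration re-plans.
    return executed
```

A longer `action_horizon` gives smoother, more consistent motion but slower reactions to disturbances; a shorter one re-plans more often at higher compute cost.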

14 of 18

Properties

  • Works well with position control
  • Training stability

15 of 18

Evaluation in Simulation

16 of 18

Evaluation with a real robot

17 of 18

Evaluation with a real robot

18 of 18

Discussion