Implicit Kinematic Policies: Unifying Joint and Cartesian Action Spaces in End-to-End Robot Learning

Presented by: Mohan Kumar S

September 1, 2022

Aditya Ganapathi, Pete Florence, Jake Varley, Kaylee Burns, Ken Goldberg, Andy Zeng

AGI Labs Reading Group

Motivation & Main Problem


End-Effector Actions (Task Space)

Joint Actions (Configuration Space)


Action space matters

  • Cartesian actions perform favorably when learning policies for tabletop manipulation.
  • Joint actions fare better for whole-body motion control.


Key Insights

  • How can models discover for themselves the optimal combination of action spaces to use?
  • A new formulation that integrates forward kinematics as a differentiable module within the deep network of an autoregressive implicit policy (sketched below).
  • Exposes both joint and Cartesian action spaces as inputs to the implicit model in a kinematically consistent manner.
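
To make the second bullet concrete, here is a minimal PyTorch sketch of forward kinematics as a differentiable layer, using a planar two-link arm (link lengths and function names are illustrative, not the paper's code):

    import torch

    def forward_kinematics(q: torch.Tensor, lengths=(0.3, 0.25)) -> torch.Tensor:
        """Planar 2-link FK: joint angles q = (q1, q2) -> end-effector (x, y).

        Built only from differentiable ops, so a loss on the Cartesian pose
        back-propagates into the joint-space prediction.
        """
        l1, l2 = lengths
        x = l1 * torch.cos(q[..., 0]) + l2 * torch.cos(q[..., 0] + q[..., 1])
        y = l1 * torch.sin(q[..., 0]) + l2 * torch.sin(q[..., 0] + q[..., 1])
        return torch.stack([x, y], dim=-1)

    q = torch.tensor([0.4, -0.2], requires_grad=True)  # joint action
    ee = forward_kinematics(q)                         # implied Cartesian action
    ee.sum().backward()                                # gradients reach q

Because both q and forward_kinematics(q) are available downstream, the implicit model can score candidate actions in either space consistently.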


Problem Setting


Proposed Approach


Problem Setup


Algorithm
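
The policy is an autoregressive implicit model: inference selects the action one dimension at a time with a derivative-free optimizer. A rough sketch under simplified assumptions (a single greedy sampling pass rather than an iterated optimizer; energy_nets is a placeholder interface with one energy model per action dimension, not the authors' code):

    import torch

    def autoregressive_inference(energy_nets, obs, num_samples=1024, low=-1.0, high=1.0):
        """Pick the action one dimension at a time by energy minimization.

        energy_nets[j](obs, partial_action) -> (num_samples,) energies.
        obs: (1, obs_dim). Returns the selected action, shape (1, m).
        """
        chosen = []
        for E_j in energy_nets:
            cand = torch.empty(num_samples, 1).uniform_(low, high)  # candidate y_j
            prefix = (torch.cat(chosen, dim=-1).expand(num_samples, -1)
                      if chosen else cand.new_zeros(num_samples, 0))
            energies = E_j(obs.expand(num_samples, -1),
                           torch.cat([prefix, cand], dim=-1))
            best = energies.argmin()            # greedy: lowest-energy candidate
            chosen.append(cand[best:best + 1])
        return torch.cat(chosen, dim=-1)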


Loss function
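
Training follows Implicit BC (Florence et al. '21): an InfoNCE-style contrastive loss in which the demonstrated action is the positive and sampled counter-examples are negatives. A minimal sketch, assuming uniform negative sampling and a generic energy_net(obs, actions) callable:

    import torch
    import torch.nn.functional as F

    def info_nce_loss(energy_net, obs, expert_action, num_neg=256, low=-1.0, high=1.0):
        """Expert action = positive (index 0); uniform samples = negatives."""
        b, m = expert_action.shape
        negatives = torch.empty(b, num_neg, m).uniform_(low, high)
        actions = torch.cat([expert_action.unsqueeze(1), negatives], dim=1)  # (b, 1+N, m)
        obs_tiled = obs.unsqueeze(1).expand(-1, 1 + num_neg, -1)             # (b, 1+N, d)
        energies = energy_net(obs_tiled, actions)                            # (b, 1+N)
        # Cross-entropy over negated energies pushes the expert's energy down
        # and the negatives' energies up.
        return F.cross_entropy(-energies, energies.new_zeros(b, dtype=torch.long))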


Networks

Florence et al. '21
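
A sketch of this style of energy model, an MLP scoring (observation, action) pairs (layer sizes and names are placeholders, not the paper's exact architecture):

    import torch
    import torch.nn as nn

    class EnergyMLP(nn.Module):
        """E(o, y): concatenate observation and candidate action -> scalar energy."""

        def __init__(self, obs_dim: int, act_dim: int, hidden: int = 256):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 1),
            )

        def forward(self, obs: torch.Tensor, act: torch.Tensor) -> torch.Tensor:
            return self.net(torch.cat([obs, act], dim=-1)).squeeze(-1)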


Experimental Setup & Results


Hypotheses

  • Across tasks where one action space substantially outperforms the other, can IKP achieve the best of both and perform consistently well across tasks?
  • Given a miscalibrated robot with unknown offsets in the joint encoders (representing low-cost encoders), can IKP compensate for these offsets while still succeeding at the tasks? (One possible compensation mechanism is sketched below.)
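
An illustrative guess at how such compensation could work, a learnable per-joint correction in front of the differentiable FK layer (reusing the forward_kinematics sketch from the Key Insights slide; not the paper's exact residual module):

    import torch
    import torch.nn as nn

    class ResidualFK(nn.Module):
        """Differentiable FK with a learnable per-joint encoder correction."""

        def __init__(self, num_joints: int = 2):
            super().__init__()
            # Initialized to zero (no correction); trained end-to-end, it can
            # absorb a constant, unknown offset in the joint encoders.
            self.offset = nn.Parameter(torch.zeros(num_joints))

        def forward(self, q_measured: torch.Tensor) -> torch.Tensor:
            # forward_kinematics as defined in the earlier sketch
            return forward_kinematics(q_measured + self.offset)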


Tasks

Bimanual Sweeping, Flipping, Miscalibrated Insertion, Sweeping


Results


Discussion


Discussion of Results

  • Strengths
    • The choice of action space can have an outsized effect on the robustness and quality of learned policies. IKP removes the need to commit to a single action space before training.

  • Weaknesses
    • Training and inference are more involved: m separate models must be trained and queried, one per dimension of the action y ∈ ℝ^m.


Key Limitations

  • Only qualitative results on real robots: the paper does not report how many rollouts actually succeeded, nor at what frequency rollouts were run.


Future Work

  • Extend the multi-action-space formulation to incorporate forward and/or inverse robot dynamics into the network, exposing joint torques and velocities to the model in a fully differentiable way.
  • Use the residual forward kinematics module to learn the link parameters themselves, a promising direction for controlling soft and/or continuum robots.


References

  • Florence et al. (2021). Implicit Behavioral Cloning.
    • Introduces the implicit behavior cloning formulation.
