Implicit Kinematic Policies: Unifying Joint and Cartesian Action Spaces in End-to-End Robot Learning

Presented by: Mohan Kumar S

September 1, 2022

Aditya Ganapathi, Pete Florence, Jake Varley, Kaylee Burns, Ken Goldberg, Andy Zeng

AGI Labs Reading Group

Motivation & Main Problem


End-Effector Actions (Task Space)

Joint Actions (Configuration Space)


Action space matters

  • Cartesian actions perform favorably when learning policies for tabletop manipulation.
  • Joint actions fare better for whole-body motion control.


Key Insights

  • How can models discover for themselves the optimal combination of action spaces to use?
  • A new formulation that integrates forward kinematics as a differentiable module within the deep network of an autoregressive implicit policy (sketched below).
  • Exposes both joint and Cartesian action spaces as inputs to the implicit model in a kinematically consistent manner.
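
To make the second bullet concrete, here is a minimal PyTorch sketch of forward kinematics as a differentiable layer, using a planar two-link arm (link lengths and function names are illustrative, not the paper's code):

    import torch

    def forward_kinematics(q: torch.Tensor, lengths=(0.3, 0.25)) -> torch.Tensor:
        """Planar 2-link FK: joint angles q = (q1, q2) -> end-effector (x, y).

        Built only from differentiable ops, so a loss on the Cartesian pose
        back-propagates into the joint-space prediction.
        """
        l1, l2 = lengths
        x = l1 * torch.cos(q[..., 0]) + l2 * torch.cos(q[..., 0] + q[..., 1])
        y = l1 * torch.sin(q[..., 0]) + l2 * torch.sin(q[..., 0] + q[..., 1])
        return torch.stack([x, y], dim=-1)

    q = torch.tensor([0.4, -0.2], requires_grad=True)  # joint action
    ee = forward_kinematics(q)                         # implied Cartesian action
    ee.sum().backward()                                # gradients reach q

Because both q and forward_kinematics(q) are available downstream, the implicit model can score candidate actions in either space consistently.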


Problem Setting


Proposed Approach


Problem Setup


Algorithm
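
The policy is an autoregressive implicit model: inference selects the action one dimension at a time with a derivative-free optimizer. A rough sketch under simplified assumptions (a single greedy sampling pass rather than an iterated optimizer; energy_nets is a placeholder interface with one energy model per action dimension, not the authors' code):

    import torch

    def autoregressive_inference(energy_nets, obs, num_samples=1024, low=-1.0, high=1.0):
        """Pick the action one dimension at a time by energy minimization.

        energy_nets[j](obs, partial_action) -> (num_samples,) energies.
        obs: (1, obs_dim). Returns the selected action, shape (1, m).
        """
        chosen = []
        for E_j in energy_nets:
            cand = torch.empty(num_samples, 1).uniform_(low, high)  # candidate y_j
            prefix = (torch.cat(chosen, dim=-1).expand(num_samples, -1)
                      if chosen else cand.new_zeros(num_samples, 0))
            energies = E_j(obs.expand(num_samples, -1),
                           torch.cat([prefix, cand], dim=-1))
            best = energies.argmin()            # greedy: lowest-energy candidate
            chosen.append(cand[best:best + 1])
        return torch.cat(chosen, dim=-1)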


Loss function
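
Training follows Implicit BC (Florence et al. '21): an InfoNCE-style contrastive loss in which the demonstrated action is the positive and sampled counter-examples are negatives. A minimal sketch, assuming uniform negative sampling and a generic energy_net(obs, actions) callable:

    import torch
    import torch.nn.functional as F

    def info_nce_loss(energy_net, obs, expert_action, num_neg=256, low=-1.0, high=1.0):
        """Expert action = positive (index 0); uniform samples = negatives."""
        b, m = expert_action.shape
        negatives = torch.empty(b, num_neg, m).uniform_(low, high)
        actions = torch.cat([expert_action.unsqueeze(1), negatives], dim=1)  # (b, 1+N, m)
        obs_tiled = obs.unsqueeze(1).expand(-1, 1 + num_neg, -1)             # (b, 1+N, d)
        energies = energy_net(obs_tiled, actions)                            # (b, 1+N)
        # Cross-entropy over negated energies pushes the expert's energy down
        # and the negatives' energies up.
        return F.cross_entropy(-energies, energies.new_zeros(b, dtype=torch.long))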


Networks

Florence et al. '21
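
A sketch of this style of energy model, an MLP scoring (observation, action) pairs (layer sizes and names are placeholders, not the paper's exact architecture):

    import torch
    import torch.nn as nn

    class EnergyMLP(nn.Module):
        """E(o, y): concatenate observation and candidate action -> scalar energy."""

        def __init__(self, obs_dim: int, act_dim: int, hidden: int = 256):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 1),
            )

        def forward(self, obs: torch.Tensor, act: torch.Tensor) -> torch.Tensor:
            return self.net(torch.cat([obs, act], dim=-1)).squeeze(-1)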


Experimental Setup & Results


Hypotheses

  • Across tasks where one action space substantially outperforms the other, can IKP achieve the best of both and perform consistently well across tasks?
  • Given a miscalibrated robot with unknown offsets in the joint encoders (representing low-cost encoders), can IKP compensate for these offsets while still succeeding at the tasks? (One possible compensation mechanism is sketched below.)
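
An illustrative guess at how such compensation could work, a learnable per-joint correction in front of the differentiable FK layer (reusing the forward_kinematics sketch from the Key Insights slide; not the paper's exact residual module):

    import torch
    import torch.nn as nn

    class ResidualFK(nn.Module):
        """Differentiable FK with a learnable per-joint encoder correction."""

        def __init__(self, num_joints: int = 2):
            super().__init__()
            # Initialized to zero (no correction); trained end-to-end, it can
            # absorb a constant, unknown offset in the joint encoders.
            self.offset = nn.Parameter(torch.zeros(num_joints))

        def forward(self, q_measured: torch.Tensor) -> torch.Tensor:
            # forward_kinematics as defined in the earlier sketch
            return forward_kinematics(q_measured + self.offset)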


Tasks

Bimanual Sweeping, Flipping, Miscalibrated Insertion, Sweeping


Results


Discussion


Discussion of Results

  • Strengths
    • The choice of action space can have an outsized effect on the robustness and quality of learned policies. IKP removes the need to commit to a single action space before training.

  • Weaknesses
    • Training and inference are more involved: m separate models must be trained and queried, one per dimension of the action y ∈ ℝ^m.


Key Limitations

  • Only qualitative results on real robots: the paper does not report how many rollouts actually succeeded, nor at what frequency rollouts were run.


Future Work

  • Extend the multi-action-space formulation to incorporate forward and/or inverse robot dynamics into the network, exposing joint torques and velocities to the model in a fully differentiable way.
  • Use the residual forward kinematics module to learn the link parameters themselves, a promising direction for controlling soft and/or continuum robots.


References

  • Florence et al. (2021). Implicit Behavioral Cloning.
    • Introduces the implicit behavior cloning formulation.
