1 of 34

Assistive Gym

Spring 2020

Lecture 4

Zackory Erickson

2 of 34

3 of 34

Assistive Gym

An open source physics simulation framework for assistive robotics

4 of 34

Assistive Gym

Pybullet physics engine

OpenAI Gym interface

Real-time simulation: train in the cloud, test on a laptop

5 of 34

Pybullet

Open source Python bindings for the Bullet physics engine

Rigid body dynamics, plus position-based dynamics for soft bodies (cloth, rope, etc.)

CPU and GPU rendering support

Quickstart Guide: https://bit.ly/36HKWyd
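A minimal PyBullet sketch in the spirit of the quickstart guide; the assets loaded here come from the pybullet_data package that ships with PyBullet:

import pybullet as p
import pybullet_data

# Connect to the physics server: p.GUI opens a window, p.DIRECT is headless
physics_client = p.connect(p.GUI)
p.setAdditionalSearchPath(pybullet_data.getDataPath())  # bundled example assets
p.setGravity(0, 0, -9.81)

plane_id = p.loadURDF("plane.urdf")
robot_id = p.loadURDF("r2d2.urdf", basePosition=[0, 0, 0.5])

# Step the simulation for one second at the default 240 Hz timestep
for _ in range(240):
    p.stepSimulation()

p.disconnect()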

6 of 34

Where is Bullet Physics Used?

Films

Games

7 of 34

8 of 34

Pybullet in Robotics

Erwin Coumans - Google Robotics (2014-present)

Several Pybullet + robotics references on: https://pybullet.org

S. James et al. “Sim-to-Real via Sim-to-Sim: Data-efficient Robotic Grasping via Randomized-to-Canonical Adaptation Networks,” CVPR, 2019.

A. Singla et al. “Realizing Learned Quadruped Locomotion Behaviors through Kinematic Motion Primitives,” ICRA, 2019.

J. Tan et al. “Sim-to-Real: Learning Agile Locomotion For Quadruped Robots,” RSS, 2018.

9 of 34

OpenAI Gym

  1. An open source toolkit that provides a collection of tasks / environments.
  2. A common interface for developing and comparing intelligent control algorithms.

Example environments: Atari Breakout, Pendulum, Google DeepMind control tasks, the Fetch robot, and Pybullet Roboschool
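Every Gym environment exposes the same reset/step/render interface. A minimal sketch using the classic (pre-0.26) Gym API current in 2020:

import gym

env = gym.make('Pendulum-v0')
observation = env.reset()
for _ in range(200):
    env.render()
    action = env.action_space.sample()  # random policy, for illustration
    observation, reward, done, info = env.step(action)
    if done:
        observation = env.reset()
env.close()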

10 of 34

Why Physics Simulation?

Learn safely

Parallelize data collection

Simulate many humans

11 of 34

Key Features of Assistive Gym

12 of 34

Environments

13 of 34

Supported robots: PR2, Jaco, Baxter, Sawyer

14 of 34

Human Model

Male and female models

40 controllable joints

Joint groups: head, torso, waist, arms / hands, legs / feet

15 of 34

16 of 34

Robot Base Positioning

Base positions are selected by maximizing joint-limit-weighted kinematic isotropy (JLWKI)

This keeps the robot's manipulability high near task goals
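A rough sketch of the idea: score a candidate base pose by the kinematic isotropy of the arm's Jacobian at the goal, down-weighting joints near their limits. The isotropy measure follows Kim and Khosla's definition; the specific joint-limit weighting below is an illustrative assumption, not the exact form from the Assistive Gym paper.

import numpy as np

def kinematic_isotropy(J):
    # Ratio of the geometric to arithmetic mean of the eigenvalues of J J^T:
    # 1 means perfectly isotropic manipulability, 0 means singular
    JJt = J @ J.T
    m = JJt.shape[0]
    return np.linalg.det(JJt) ** (1.0 / m) / (np.trace(JJt) / m)

def jlwki(J, q, q_min, q_max):
    # Illustrative joint-limit weighting: scale each Jacobian column by a
    # weight that is 1 at the joint's range center and 0 at a limit
    mid = (q_min + q_max) / 2.0
    w = 1.0 - np.abs((q - mid) / ((q_max - q_min) / 2.0))
    return kinematic_isotropy(J * w)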

17 of 34

Human Preferences

18 of 34

Human Preferences

The environment state yields a robot task reward rR(s) and a human preference reward rH(s), covering both physical and mental preferences. The total reward is their sum:

r(s) = rR(s) + rH(s)
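An illustrative sketch of what rH(s) might penalize, such as high contact forces on the person and fast end effector motion near them; the terms and weights here are assumptions for illustration, not Assistive Gym's exact reward:

import numpy as np

def human_preference_reward(total_force_on_human, end_effector_velocity,
                            force_weight=0.01, velocity_weight=0.25):
    # Negative cost: larger forces on the person and faster motion
    # near them make rH(s) more negative
    return -(force_weight * total_force_on_human
             + velocity_weight * np.linalg.norm(end_effector_velocity))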

19 of 34

What can a robot learn with a static human?

20 of 34

21 of 34

How?

22 of 34

Assistive Gym Examples

23 of 34

Random Actions

import gym, assistive_gym

env = gym.make('FeedingJacoHuman-v0')
env.render()
observation = env.reset()

while True:
    env.render()
    action = env.action_space.sample() # Get a random action
    observation, reward, done, info = env.step(action)

24 of 34

Teleoperation

Open teleop_example.py

import numpy as np
import pybullet as p

# Inverse kinematics gives target joint positions that place link 8
# (the end effector here) at the commanded position and orientation
target_joint_positions = p.calculateInverseKinematics(env.robot, 8, position, orientation)
joint_positions, joint_velocities, joint_torques = env.get_motor_joint_states(env.robot)

# Proportional control toward the IK solution (gain of 10); convert to
# arrays so the elementwise difference is well defined
joint_action = (np.array(target_joint_positions) - np.array(joint_positions)) * 10
observation, reward, done, info = env.step(joint_action)

25 of 34

Reinforcement Learning

PPO - Proximal Policy Optimization

The standard agent-environment loop: the environment sends an observation and reward to the agent, which responds with an action.
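For reference, PPO maximizes the clipped surrogate objective from Schulman et al. (2017), where r_t is the probability ratio between the new and old policies and \hat{A}_t is the advantage estimate:

L^{CLIP}(\theta) = \hat{\mathbb{E}}_t\!\left[\min\!\left(r_t(\theta)\,\hat{A}_t,\ \mathrm{clip}\!\left(r_t(\theta),\,1-\epsilon,\,1+\epsilon\right)\hat{A}_t\right)\right], \qquad r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)}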

26 of 34

What about active human motion?

27 of 34

Co-optimization

The same loop with two agents: the robot and the human each receive observations and a shared reward from the environment, and each takes its own action.
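A minimal sketch of a rollout in one of the *Human-v0 environments; the assumption here is that a single action vector drives both agents, with random actions standing in for the two learned policies:

import gym, assistive_gym

env = gym.make('DrinkingSawyerHuman-v0')
observation = env.reset()
done = False
while not done:
    # One action vector covers both robot and human joints (assumed split);
    # during co-optimization, separate robot and human PPO policies would
    # each produce their portion of this vector
    action = env.action_space.sample()
    observation, reward, done, info = env.step(action)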

28 of 34

29 of 34

Can Assistive Gym help train real robots?

30 of 34

31 of 34

32 of 34

Let’s Run Pretrained Policies

https://github.com/Healthcare-Robotics/assistive-gym/wiki

  1. Install the PyTorch RL library and download pretrained policies
  2. Static Human: python3 -m ppo.enjoy --env-name "FeedingPR2-v0"
  3. Active Human: python3 -m ppo.enjoy_coop --env-name "DrinkingSawyerHuman-v0"

33 of 34

Let’s Train a New Control Policy

Training Command:

python3 -m ppo.train --env-name "ScratchItchJaco-v0" --num-env-steps 100000 --save-dir ./trained_models_new/

Now, evaluate the trained policy:

python3 -m ppo.enjoy --env-name "ScratchItchJaco-v0" --load-dir trained_models_new/ppo

We can use the --load-policy argument to continue training an existing policy.

34 of 34

Basic Code Structure

class NewTaskEnv(AssistiveEnv):
    def __init__(self, robot_type='pr2', human_control=False):
        # Create the robot, human model, and task-specific objects
        ...

    def step(self, action):
        # Apply the action, step the simulation, and return
        # (observation, reward, done, info) per the Gym interface
        ...

    def _get_obs(self, forces, forces_human):
        # Assemble observation vectors for the robot (and for the
        # human, when human_control=True)
        ...

    def reset(self):
        # Reset the simulation state and return the initial observation
        ...
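To use the new task through gym.make, the class would also be registered with Gym. A minimal sketch, with an illustrative environment id and entry point:

from gym.envs.registration import register

register(
    id='NewTaskPR2-v0',                      # illustrative id
    entry_point='assistive_gym:NewTaskEnv',  # assumes NewTaskEnv is importable from the package
    max_episode_steps=200,
)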