Foundation Models for Robot Learning
Soroush Nasiriany
UT Robot Learning Reading Group
2022.04.28
What is a foundation model?
Figure from “On the Opportunities and Risks of Foundation Models”, Bommasani et al. 2021
Why do we want foundation models?
The biggest challenge ahead…
Applying foundation models to niche, safety-critical tasks
Obtaining useful training data
Adaptation to downstream tasks
Safety, interpretability, and privacy
This tutorial: foundation models for robot learning
Vision: image representations for downstream tasks
Language: high-level reasoning
Pre-trained vision models for robot learning
R3M: A Universal Visual Representation for Robot Manipulation
What training objectives are used?
Experiments
Leveraging language models for robot learning
Prior work: prompt engineering
Ungrounded: the language model does not take the current observation of the environment into account!
Do As I Can, Not As I Say: Grounding Language in Robotic Affordances
Ground instructions from the language model with a learned value function
How feasible is this instruction, given the current environment context?
Where does the value function come from?
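A minimal sketch of the grounding idea on this slide, assuming each candidate skill is scored by the product of two terms: the language model's probability that the skill is a useful next step for the instruction, and a value-function estimate of how likely the skill is to succeed in the current state. The function name `select_skill`, the skill strings, and all numbers are hypothetical and chosen only for illustration; they are not taken from the paper or the talk.

```python
import math


def select_skill(llm_log_probs, affordances):
    """Return the skill with the highest combined score.

    llm_log_probs: dict mapping skill -> log-probability from the language
        model (how relevant the skill is to the instruction).
    affordances: dict mapping skill -> value-function estimate in [0, 1]
        (how feasible the skill is in the current state).
    """
    best_skill, best_score = None, -math.inf
    for skill, log_p in llm_log_probs.items():
        # Skills with no value estimate are treated as infeasible.
        feasibility = affordances.get(skill, 0.0)
        score = math.exp(log_p) * feasibility
        if score > best_score:
            best_skill, best_score = skill, score
    return best_skill, best_score


# Hypothetical scores for an instruction such as "clean up the spill".
llm_log_probs = {
    "pick up the sponge": math.log(0.5),
    "pick up the apple": math.log(0.3),
    "go to the table": math.log(0.2),
}
# Hypothetical value estimates: no sponge is within reach right now.
affordances = {
    "pick up the sponge": 0.1,
    "pick up the apple": 0.2,
    "go to the table": 0.9,
}

# Choice made by the language model alone (ungrounded).
ungrounded = max(llm_log_probs, key=llm_log_probs.get)
# Choice after weighting by feasibility (grounded).
skill, score = select_skill(llm_log_probs, affordances)
print(ungrounded, "|", skill)
```

With these made-up numbers, the language model alone prefers picking up the sponge, while the grounded score prefers moving to the table first, because the value estimate marks the sponge as currently out of reach.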
Experiment setup
Evaluation
Case study
Ablations
Discussion