Improving Robot Success Detection using
Static Object Data
Rosario Scalise, Jesse Thomason, Yonatan Bisk, Siddhartha Srinivasa
sensor stream → classification of outcome
Is apple in bowl? (frames at t=0 and t=15): yes
however... sensors are noisy
pre-manipulation / post-manipulation (example image pairs)
sensor stream + static object information → classification of outcome
static object information: size, shape, object relationships
Grasped Object: OG
Target Object: OT
What is the observed outcome?
OG ON OT? Y / N
OG IN OT? Y / N
Classify this outcome using egocentric RGBD sensor modalities.
OG ON OT? Y / N
OG IN OT? Y / N
Our Domain: The YCB Objects
Dataset format:
Input: object pair (OG, OT)
Output: ground-truth (GT) labels for OG ON OT? and OG IN OT?
Example labels for one pair: OG ON OT? YES, OG IN OT? NO
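The dataset format above can be sketched as a small record type. This is only an illustration: the class name, field names, and object names below are hypothetical and are not taken from the released data, and the labels are simplified to booleans.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairExample:
    """One labelled object pair (hypothetical layout, not the released schema)."""
    grasped: str   # name of the grasped object OG
    target: str    # name of the target object OT
    on: bool       # ground-truth answer to "OG ON OT?"
    in_: bool      # ground-truth answer to "OG IN OT?"


# Made-up object names; labels mirror the example slide (ON: YES, IN: NO).
example = PairExample(grasped="object_a", target="object_b", on=True, in_=False)
```

Keeping both labels on the same record makes it easy to train or evaluate the ON and IN questions separately over the same pairs.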
Robot Pairs
195 object pairs × 5 trials each = 955 examples
> 50 operator hours for this dataset!
Auxiliary Data from Human Judgement
Object views: Front, Back, Topdown, Left, Right
on? / in? judgements (yes / no) for each object pair
>3 annotations per object pair
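With several annotations per pair, the judgements need to be reduced to one label. The slides do not say how this is done; the sketch below assumes a simple majority vote, and the function name is hypothetical.

```python
from collections import Counter


def majority_label(annotations):
    """Return the most frequent label; ties go to the label seen first."""
    if not annotations:
        raise ValueError("need at least one annotation")
    return Counter(annotations).most_common(1)[0][0]


# Three made-up annotations for one object pair and one question.
label = majority_label(["yes", "yes", "no"])
```

An odd number of annotators avoids ties for a two-way question; with an even number, a tie-breaking rule like the one above has to be chosen explicitly.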
All Pairs vs. Robot Pairs
9 referring expressions per object, e.g.:
“long yellow food”
“curved fruit”
“portable tasty snack”
Models
Accuracy on Test Fold
                                IN           ON
Baseline (majority class):      .32 ± .00    .36 ± .00
Baseline (random):              .49 ± .06    .50 ± .06
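The two baselines can be sketched as follows. This is an assumption about how such baselines are typically computed, not the authors' code; the function names and toy labels are made up.

```python
import random
from collections import Counter


def majority_class_accuracy(train_labels, test_labels):
    """Always predict the most common training label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return sum(y == majority for y in test_labels) / len(test_labels)


def random_accuracy(test_labels, classes, seed=0):
    """Predict a uniformly random class for every test example."""
    rng = random.Random(seed)
    return sum(rng.choice(classes) == y for y in test_labels) / len(test_labels)


# Toy labels: the majority training label is "no".
acc_majority = majority_class_accuracy(["no", "no", "yes"], ["yes", "no", "no", "no"])
acc_random = random_accuracy(["yes", "no", "no", "no"], ["yes", "no"])
```

Because the majority class is chosen from the training labels, its accuracy on a test fold depends on how the labels are distributed in that fold.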
Egocentric RGBD sensor stream baseline
Accuracy on Test Fold
Baseline (majority class) :
Baseline (random) :
Egocentric RGBD :
IN
.32 ± .00
.49 ± .06
.77 ± .05
ON
.36 ± .00
.50 ± .06
.53 ± .10
RGBD + Static Object Data
Ego Classification: On? NO
Ego + Obj Data Classification: On? YES
Accuracy on Test Fold
                                IN           ON
Baseline (majority class):      .32 ± .00    .36 ± .00
Baseline (random):              .49 ± .06    .50 ± .06
Ego RGBD:                       .77 ± .05    .53 ± .10
Ego RGBD + Object Data:         .74 ± .07    .59 ± .08
RGBD + Static Object Data
1. Pre-trained on ‘All Pairs’
2. Then trained on ‘Robot Pairs’
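The two-stage schedule above amounts to warm-starting: weights learned on the larger annotated set initialize training on the smaller robot set. The sketch below shows the idea with a tiny logistic-regression stand-in; the model, features, and data are all hypothetical and far simpler than the network in the talk.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(weights, data, epochs=200, lr=0.5):
    """Plain SGD on logistic loss, starting from the given weights."""
    w = list(weights)
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            for i, xi in enumerate(x):
                w[i] -= lr * (p - y) * xi
    return w


def predict(w, x):
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x))) >= 0.5 else 0


# Toy examples: x = [bias, made-up feature], label 1 = "yes".
all_pairs = [([1.0, 0.0], 0), ([1.0, 0.2], 0), ([1.0, 0.8], 1), ([1.0, 1.0], 1)]
robot_pairs = [([1.0, 0.1], 0), ([1.0, 0.9], 1)]

w_pre = train([0.0, 0.0], all_pairs)            # stage 1: pre-train on "All Pairs"
w_fine = train(w_pre, robot_pairs, epochs=50)   # stage 2: continue on "Robot Pairs"
```

The only difference between the stages is the starting point: stage 2 begins from `w_pre` rather than from zeros.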
Ego Classification: In? NO
Ego + Pretrained Obj: In? YES
Accuracy on Test Fold
                                IN           ON
Baseline (majority class):      .32 ± .00    .36 ± .00
Baseline (random):              .49 ± .06    .50 ± .06
Ego RGBD:                       .77 ± .05    .53 ± .10
Ego RGBD + Object Data:         .74 ± .07    .59 ± .08
Ego RGBD + Pre-trained Obj:     .77 ± .05    .59 ± .06
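Entries of the form "mean ± spread" are usually aggregated over repeated runs or folds. The slides do not state which spread statistic is used; the sketch below assumes a population standard deviation, and the accuracies are made up.

```python
from statistics import mean, pstdev


def summarize(accuracies):
    """Format a list of accuracies as 'mean ± standard deviation'."""
    return f"{mean(accuracies):.2f} ± {pstdev(accuracies):.2f}"


# Made-up accuracies from three runs.
summary = summarize([0.6, 0.5, 0.7])
```

A spread of zero, as in the majority-class rows, would mean every run produced the same accuracy.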
In summary...
+ object data
Data + Code Repository: https://github.com/thomason-jesse/YCBLanguage