1 of 26

Robotic Grasp Detection

2 of 26

The Cornell Grasping Dataset

4 of 26

How To Grasp...

(x, y)

height

width

cos(2θ)

sin(2θ)
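
The six values above can be packed and unpacked as in the minimal sketch below. The function names and sample numbers are hypothetical. The slide does not say why the angle is doubled; the usual reason is that a gripper rotated by 180 degrees gives the same grasp, and encoding 2θ makes both orientations map to the same target.

```python
import math

def encode_grasp(x, y, height, width, theta):
    """Pack a grasp rectangle into the 6-value form (theta in radians)."""
    return (x, y, height, width, math.cos(2 * theta), math.sin(2 * theta))

def decode_angle(cos2t, sin2t):
    """Recover an orientation in (-pi/2, pi/2] from the doubled-angle pair."""
    return 0.5 * math.atan2(sin2t, cos2t)

# Two orientations that differ by 180 degrees describe the same grasp
# (sample values are made up for illustration).
g1 = encode_grasp(120.0, 80.0, 20.0, 60.0, math.radians(30))
g2 = encode_grasp(120.0, 80.0, 20.0, 60.0, math.radians(210))

angle_deg = math.degrees(decode_angle(g1[4], g1[5]))
```

Both `g1` and `g2` end up with the same six numbers (up to floating-point error), so a regressor is never asked to choose between two equivalent angles.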

5 of 26

Convolutional, 64 filters, 5x5 size

Convolutional, 128 filters, 3x3 size

Convolutional, 128 filters, 3x3 size

Convolutional, 128 filters, 3x3 size

Convolutional, 256 filters, 3x3 size

Fully Connected, 512 Outputs, Dropout = .5

Fully Connected, 512 Outputs, Dropout = .5

Fully Connected, 6 Outputs

Architecture: Direct Regression to Grasps

(x, y)

height

width

cos(2θ)

sin(2θ)

Big Assumption:

1 grasp per image
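
As a rough aid for reading the layer list, the sketch below writes the stack down as data and counts weights for the layers whose input size is known from the slide. The input depth of 3 channels and one bias per filter are assumptions. Strides, pooling and image size are not given, so the first fully connected layer is left out.

```python
# (name, filters, kernel size) for the five conv layers on the slide.
CONV_LAYERS = [
    ("conv1", 64, 5),
    ("conv2", 128, 3),
    ("conv3", 128, 3),
    ("conv4", 128, 3),
    ("conv5", 256, 3),
]
INPUT_CHANNELS = 3  # ASSUMPTION: the slide does not state the input depth

def conv_param_counts(layers, in_channels):
    """Weights plus one bias per filter for each conv layer, in order."""
    counts = []
    for _name, filters, kernel in layers:
        counts.append(filters * (kernel * kernel * in_channels + 1))
        in_channels = filters  # the next layer sees this layer's feature maps
    return counts

counts = conv_param_counts(CONV_LAYERS, INPUT_CHANNELS)

# Fully connected layers whose input width is stated on the slide.
fc2_params = 512 * 512 + 512   # second 512-unit layer
out_params = 512 * 6 + 6       # 6 outputs: x, y, height, width, cos(2θ), sin(2θ)
```

The final layer has exactly one output per grasp parameter, which is what the "1 grasp per image" assumption amounts to.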

6 of 26

Algorithm	Image-wise split accuracy	Object-wise split accuracy	Time per image
2-stage sliding window SVM, static features	60.5%	58.3%	Unknown
2-stage sliding window, deep features	73.9%	75.6%	13.5 sec
Deep CNN Regression	85.1%	84.5%	76 ms

Direct Regression to Grasps Works!
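
A quick back-of-the-envelope check of the table, using only the numbers reported above (the variable names are illustrative):

```python
# Values copied from the results table.
sliding_window_s = 13.5    # 2-stage sliding window, deep features
cnn_regression_s = 0.076   # Deep CNN Regression (76 ms)

speedup = sliding_window_s / cnn_regression_s   # how many times faster
frames_per_second = 1.0 / cnn_regression_s      # throughput at 76 ms per image

# Accuracy gain over the deep-feature sliding window, in percentage points.
image_wise_gain = round(85.1 - 73.9, 1)
object_wise_gain = round(84.5 - 75.6, 1)
```

Under these figures the regression network is well over a hundred times faster per image while also being more accurate on both splits.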

7 of 26

Average Grasps Are Awesome!!!

8 of 26

Average Grasps Are Awesome!!! (Right up until they’re not…)
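
One way to see the problem, as a sketch with made-up objects: a regressor trained with squared error on several valid grasps for the same image is pulled toward their mean. For a long object that mean is still a good grasp; for a ring-shaped object it can land in the hole. The shapes and coordinates below are hypothetical and not taken from the dataset.

```python
import math

def mean_point(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

# Long object lying along y = 100: every grasp centre sits on the object.
marker_grasps = [(40.0, 100.0), (60.0, 100.0), (80.0, 100.0)]
marker_mean = mean_point(marker_grasps)

# Ring of radius 50 centred at (100, 100): valid grasps sit on the rim.
radius, cx, cy = 50.0, 100.0, 100.0
rim_angles = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]
ring_grasps = [(cx + radius * math.cos(a), cy + radius * math.sin(a)) for a in rim_angles]
ring_mean = mean_point(ring_grasps)
dist_from_centre = math.hypot(ring_mean[0] - cx, ring_mean[1] - cy)

# Orientations average badly too: 0 and 90 degrees cancel in (cos 2θ, sin 2θ).
angle_pairs = [(math.cos(2 * t), math.sin(2 * t)) for t in (0.0, math.pi / 2)]
angle_mean = mean_point(angle_pairs)
```

Here `marker_mean` stays on the marker, while `ring_mean` falls at the centre of the ring, a full radius away from any graspable material.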

10 of 26

New System, Predict Local Bounding Boxes.

15 of 26

Convolutional, 64 filters, 5x5 size

Convolutional, 128 filters, 3x3 size

Convolutional, 128 filters, 3x3 size

Convolutional, 128 filters, 3x3 size

Convolutional, 256 filters, 3x3 size

Fully Connected, 512 Outputs, Dropout = .5

Fully Connected, Output NxNx7 Grid

Architecture: Direct Regression to Grasps

New Assumption:

1 grasp per NxN patch

16 of 26

Output = Grasps + Weights

Grasp Coordinates

Heatmap Of Grasp Probability
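
A minimal sketch of how an NxNx7 output could be read at test time: each cell holds one heatmap value and six grasp coordinates, and the grasp from the most confident cell is returned. The channel order (probability first) and the nested-list layout are assumptions, not something the slides specify.

```python
def best_grasp(grid):
    """Return (probability, grasp, (row, col)) for the most confident cell.

    grid[row][col] is a 7-value cell: [p, x, y, height, width, cos2t, sin2t]
    (ASSUMED ordering).
    """
    best = None
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            prob, grasp = cell[0], tuple(cell[1:7])
            if best is None or prob > best[0]:
                best = (prob, grasp, (r, c))
    return best

# Toy 2x2 grid with made-up numbers.
grid = [
    [[0.10, 12.0, 15.0, 8.0, 20.0, 1.0, 0.0], [0.85, 48.0, 14.0, 9.0, 22.0, 0.0, 1.0]],
    [[0.30, 10.0, 50.0, 7.0, 18.0, 1.0, 0.0], [0.05, 52.0, 49.0, 6.0, 19.0, 0.0, 1.0]],
]
prob, grasp, cell = best_grasp(grid)
```

Because every cell carries its own prediction, lower-ranked cells can still be kept as alternative grasps instead of being averaged away.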

17 of 26

Learning

Only back-propagate error for the ground-truth grasps.

Back-propagate error for the full heatmap.
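
The two rules above can be sketched as a masked loss: the heatmap term is summed over every cell, while the coordinate term is only counted in cells that contain a ground-truth grasp. Squared error, equal weighting of the two terms, and the cell layout are assumptions made for illustration.

```python
def grid_loss(pred, target):
    """Loss over N x N grids of 7-value cells [p, x, y, h, w, cos2t, sin2t]."""
    heat, coord = 0.0, 0.0
    for pred_row, tgt_row in zip(pred, target):
        for p, t in zip(pred_row, tgt_row):
            heat += (p[0] - t[0]) ** 2      # heatmap error: every cell counts
            if t[0] == 1.0:                 # coordinate error: labelled cells only
                coord += sum((a - b) ** 2 for a, b in zip(p[1:], t[1:]))
    return heat + coord

empty = [0.0] * 7
target = [
    [[1.0, 10.0, 10.0, 2.0, 4.0, 1.0, 0.0], list(empty)],
    [list(empty), list(empty)],
]
pred = [
    # Labelled cell: probability off by 0.5, x off by 1.
    [[0.5, 11.0, 10.0, 2.0, 4.0, 1.0, 0.0],
     # Unlabelled cell: wild coordinates, but they are never penalised.
     [0.0, 99.0, 99.0, 99.0, 99.0, 99.0, 99.0]],
    [list(empty), list(empty)],
]
loss = grid_loss(pred, target)
```

In this toy case the coordinates predicted in the empty cell contribute nothing, so the network is free to output anything there as long as its heatmap value stays low.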

18 of 26

Examples

26 of 26

Algorithm	Image-wise split accuracy	Object-wise split accuracy	Time per image
2-stage sliding window SVM, static features	60.5%	58.3%	Unknown
2-stage sliding window, deep features	73.9%	75.6%	13.5 sec
Deep CNN Regression	85.1%	84.5%	76 ms
Grassroots Detection	88.2%	88.6%	76 ms

Grassroots Works Better!!!