Robotic Grasp Detection
The Cornell Grasping Dataset
How To Grasp...
Grasp representation: (x, y), height, width, cos(2θ), sin(2θ)
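A minimal sketch of the six-value grasp encoding, assuming θ is the rectangle's orientation in radians. A likely reason for using 2θ is that a gripper rotated by 180° describes the same grasp, and the doubled angle maps both orientations to the same values. The function names and the sample numbers are hypothetical.

```python
import math

def encode_grasp(x, y, height, width, theta):
    """Pack a grasp rectangle into (x, y, height, width, cos(2θ), sin(2θ))."""
    return (x, y, height, width, math.cos(2 * theta), math.sin(2 * theta))

def decode_angle(cos2t, sin2t):
    """Recover an orientation in (-pi/2, pi/2] from the doubled-angle pair."""
    return 0.5 * math.atan2(sin2t, cos2t)

# The same rectangle described at 30 degrees and at 30 + 180 degrees.
g = encode_grasp(120.0, 85.0, 20.0, 60.0, math.radians(30))
g_flipped = encode_grasp(120.0, 85.0, 20.0, 60.0, math.radians(210))

# Both encodings agree up to floating-point error.
same = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(g, g_flipped))
angle = decode_angle(g[4], g[5])
```

Here `same` is true because the two orientations differ by exactly half a turn, and `angle` recovers the 30° orientation.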
Architecture: Direct Regression to Grasps
Convolutional, 64 filters, 5x5 size
Convolutional, 128 filters, 3x3 size
Convolutional, 128 filters, 3x3 size
Convolutional, 128 filters, 3x3 size
Convolutional, 256 filters, 3x3 size
Fully Connected, 512 Outputs, Dropout = .5
Fully Connected, 512 Outputs, Dropout = .5
Fully Connected, 6 Outputs
Outputs: (x, y), height, width, cos(2θ), sin(2θ)
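To get a feel for the size of the convolutional stack listed above, the sketch below counts weights and biases per layer. It assumes a 3-channel input and one bias per filter; the slides state neither, nor the strides and pooling that would be needed to size the fully connected layers, so those are left out.

```python
# (filters, kernel_size) for the five convolutional layers listed above.
CONV_LAYERS = [(64, 5), (128, 3), (128, 3), (128, 3), (256, 3)]

def conv_param_counts(layers, in_channels=3):
    """Weights plus one bias per filter for each layer.

    in_channels=3 is an assumption; the slides do not give the input depth.
    """
    counts = []
    for filters, k in layers:
        counts.append(filters * (k * k * in_channels + 1))
        in_channels = filters  # the next layer sees this layer's filters
    return counts

counts = conv_param_counts(CONV_LAYERS)
total = sum(counts)
print(counts, total)
```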
Big Assumption:
1 grasp per image
Algorithm | Image-wise split accuracy | Object-wise split accuracy | Time per image |
2-stage sliding window SVM, static features | 60.5% | 58.3% | Unknown |
2-stage sliding window, deep features | 73.9% | 75.6% | 13.5 sec |
Deep CNN Regression | 85.1% | 84.5% | 76 ms |
Direct Regression to Grasps Works!
Average Grasps Are Awesome!!! (Right up until they’re not...)
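One way to see where an average grasp breaks down: when an object has several valid grasps, a regressor trained with squared error tends toward their mean. For a ring-shaped object the mean of two rim grasps lands in the hole. The sketch below uses a hypothetical ring and hypothetical coordinates, and only looks at grasp centers.

```python
import math

def mean_center(centers):
    """Average a list of (x, y) grasp centers."""
    n = len(centers)
    return (sum(x for x, _ in centers) / n, sum(y for _, y in centers) / n)

def on_rim(point, ring_center, radius, tol=5.0):
    """True if the point lies within tol pixels of the ring's rim."""
    dist = math.hypot(point[0] - ring_center[0], point[1] - ring_center[1])
    return abs(dist - radius) <= tol

# Hypothetical ring-shaped object: center (100, 100), radius 40 pixels.
ring_center, radius = (100.0, 100.0), 40.0
rim_grasps = [(60.0, 100.0), (140.0, 100.0)]  # valid grasps on opposite sides

avg = mean_center(rim_grasps)
print(avg, on_rim(avg, ring_center, radius))
```

Each rim grasp is graspable on its own, but their average sits at the empty center of the ring.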
New System, Predict Local Bounding Boxes.
Architecture: Direct Regression to Grasps
Convolutional, 64 filters, 5x5 size
Convolutional, 128 filters, 3x3 size
Convolutional, 128 filters, 3x3 size
Convolutional, 128 filters, 3x3 size
Convolutional, 256 filters, 3x3 size
Fully Connected, 512 Outputs, Dropout = .5
Fully Connected, Output NxNx7 Grid
New Assumption:
1 grasp per NxN patch
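A minimal sketch of how a ground-truth grasp could be assigned to a grid cell, assuming the cell is chosen by the grasp center and the image is split evenly into N rows and N columns. The function name, the 224-pixel image size, and N = 7 are hypothetical.

```python
def cell_for_grasp(x, y, image_width, image_height, n):
    """Return (row, col) of the grid cell containing the grasp center (x, y)."""
    col = min(int(x * n / image_width), n - 1)   # clamp points on the right edge
    row = min(int(y * n / image_height), n - 1)  # clamp points on the bottom edge
    return row, col

# Hypothetical 224x224 image with a 7x7 grid: each cell covers 32x32 pixels.
cell = cell_for_grasp(120.0, 85.0, 224, 224, 7)
print(cell)
```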
Output = Grasps + Weights
Grasp Coordinates
Heatmap Of Grasp Probability
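A sketch of reading a single grasp out of the NxNx7 output, assuming each cell holds one probability followed by the six grasp values. That channel order is an assumption, as are the function name and the toy 2x2 grid.

```python
def best_grasp(grid):
    """Pick the cell with the highest grasp probability.

    grid[row][col] is assumed to be [p, x, y, height, width, cos2t, sin2t].
    Returns (probability, grasp_tuple).
    """
    best = max((cell for row in grid for cell in row), key=lambda c: c[0])
    return best[0], tuple(best[1:])

# Toy 2x2 grid with made-up values.
grid = [
    [[0.10, 10, 10, 5, 20, 1.0, 0.0], [0.80, 50, 12, 6, 22, 0.0, 1.0]],
    [[0.30, 11, 48, 5, 18, 1.0, 0.0], [0.05, 52, 50, 4, 19, 0.0, -1.0]],
]

prob, grasp = best_grasp(grid)
print(prob, grasp)
```

Taking only the top cell is one possible decoding; keeping every cell above a threshold would return several candidate grasps instead.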
Learning
Grasp coordinates: back-propagate error only for cells that contain a ground-truth grasp.
Heatmap: back-propagate error for the full heatmap.
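The two learning rules can be sketched as a masked loss. This assumes squared error for both terms and a per-cell layout of one probability followed by six grasp values; the slides specify neither, and the function name is hypothetical.

```python
def grid_loss(pred, target):
    """Return (heatmap_error, coordinate_error) over an N x N grid.

    Each cell is assumed to be [p, x, y, height, width, cos2t, sin2t].
    In target, p is 1.0 where a ground-truth grasp exists and 0.0 elsewhere.
    """
    heat, coord = 0.0, 0.0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            heat += (p[0] - t[0]) ** 2  # every cell contributes to the heatmap term
            if t[0] == 1.0:             # coordinates count only where a grasp exists
                coord += sum((a - b) ** 2 for a, b in zip(p[1:], t[1:]))
    return heat, coord

# Toy 1x2 grid: the first cell has a ground-truth grasp, the second does not.
pred = [[[0.5, 1, 1, 1, 1, 1, 1], [0.5, 9, 9, 9, 9, 9, 9]]]
target = [[[1.0, 1, 1, 1, 1, 1, 0], [0.0, 0, 0, 0, 0, 0, 0]]]

heat, coord = grid_loss(pred, target)
print(heat, coord)
```

The second cell's coordinates are far from its (empty) target, yet they add nothing to the coordinate term because no ground-truth grasp lives there.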
Examples
Algorithm | Image-wise split accuracy | Object-wise split accuracy | Time per image |
2-stage sliding window SVM, static features | 60.5% | 58.3% | Unknown |
2-stage sliding window, deep features | 73.9% | 75.6% | 13.5 sec |
Deep CNN Regression | 85.1% | 84.5% | 76 ms |
Grassroots Detection | 88.2% | 88.6% | 76 ms |
Grassroots Works Better!!!