Clear-Splatting: Learning Residual Gaussian Splats for Transparent Object Manipulation
Aviral Agrawal, Ritaban Roy, Bardienus P. Duisterhof, Keerthan Bhat Hekkadka, Hongyi Chen, Jeffrey Ichnowski
Motivation
Proposed Approach
Conclusion and Future Work
Figure 1: Depth Anything [4] (left two) and the Intel RealSense™ camera (right) perform poorly on transparent objects
Results
Robotics Implications
Clear-Splatting leverages mostly-static scenes to improve depth perception.
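The residual idea can be sketched as follows: a background scene is reconstructed once, and only a small residual set of Gaussians is optimized when new (e.g. transparent) objects appear. This is a minimal illustrative sketch assuming a dict-of-arrays scene representation; the field names and `compose` helper are hypothetical, not the paper's API.

```python
import numpy as np

def make_scene(means, opacities):
    """Toy splat scene: Gaussian centers plus per-Gaussian opacities."""
    return {"means": np.asarray(means, dtype=float),
            "opacities": np.asarray(opacities, dtype=float)}

def compose(background, residual):
    """Render-time composition: background splats stay frozen,
    residual splats model only the newly added objects."""
    return {k: np.concatenate([background[k], residual[k]])
            for k in background}

# Background reconstructed once from the mostly-static scene.
bg = make_scene([[0.0, 0.0, 5.0], [1.0, 0.0, 5.0]], [1.0, 1.0])
# Small residual set optimized after a transparent object is placed.
res = make_scene([[0.5, 0.0, 4.0]], [0.3])
scene = compose(bg, res)  # composed scene holds all three Gaussians
```

Only the residual parameters need gradient updates, which is what makes re-fitting fast when the scene changes slightly.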
Our two main goals for experimentation are addressed as follows:
Figure 2: Clear-Splatting approach
References
Grasping and manipulating transparent objects poses a significant challenge for robots, since both depth sensors and off-the-shelf monocular depth estimators fail on them.
Premise: Leverage multi-view 3D reconstruction methods to model depth for transparent objects in a short time span
Problem statement: Neural Radiance Field (NeRF) methods [1,2] have shown the ability to estimate depth for transparent objects given multi-view images. However, they still struggle with challenging objects and lighting conditions. With this work, we aim to show that novel scene-prior learning combined with novel depth-based Gaussian pruning for 3D Gaussian Splatting [3] can outperform existing methods.
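One way depth-based Gaussian pruning could work is to project each Gaussian center into a view and drop Gaussians whose camera-space depth disagrees with a reference depth map. The sketch below is a hypothetical illustration of that idea, not the authors' exact pruning rule; the function name, tolerance, and pinhole-projection setup are assumptions.

```python
import numpy as np

def prune_gaussians_by_depth(means, K, depth_map, tol=0.05):
    """Depth-based pruning sketch (illustrative, not the paper's rule).

    means:     (N, 3) Gaussian centers in the camera frame (z forward).
    K:         (3, 3) pinhole intrinsics.
    depth_map: (H, W) reference depth for this view.
    tol:       relative depth tolerance.

    Returns a boolean keep-mask; Gaussians whose depth disagrees with
    the reference depth at their projected pixel are pruned.
    """
    H, W = depth_map.shape
    z = means[:, 2]
    keep = z > 0  # Gaussians behind the camera are dropped outright

    # Project centers to pixel coordinates (perspective division).
    uvw = (K @ means.T).T
    w = np.clip(uvw[:, 2], 1e-8, None)
    u = np.round(uvw[:, 0] / w).astype(int)
    v = np.round(uvw[:, 1] / w).astype(int)

    in_img = (u >= 0) & (u < W) & (v >= 0) & (v < H)
    idx = np.where(keep & in_img)[0]
    ref = depth_map[v[idx], u[idx]]
    # Keep only Gaussians whose depth is consistent with the reference.
    consistent = np.abs(z[idx] - ref) <= tol * ref
    mask = np.zeros(len(means), dtype=bool)
    mask[idx[consistent]] = True
    return mask

# Toy usage: a flat reference depth at 2 m; the Gaussian floating at
# 1 m in front of it (and the one behind the camera) get pruned.
K = np.array([[100.0, 0.0, 32.0], [0.0, 100.0, 32.0], [0.0, 0.0, 1.0]])
depth = np.full((64, 64), 2.0)
means = np.array([[0.0, 0.0, 2.0], [0.0, 0.0, 1.0], [0.0, 0.0, -1.0]])
mask = prune_gaussians_by_depth(means, K, depth)
```

The same check can be repeated across views so that only Gaussians consistent with all reference depth maps survive.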
[1] J. Ichnowski*, Y. Avigal*, J. Kerr, and K. Goldberg, “Dex-NeRF: Using a neural radiance field to grasp transparent objects,” in Conference on Robot Learning (CoRL), 2021.
[2] B. P. Duisterhof, Y. Mao, S. H. Teng, and J. Ichnowski, “Residual-NeRF: Learning residual NeRFs for transparent object manipulation,” in IEEE International Conference on Robotics and Automation (ICRA), 2024.
[3] B. Kerbl, G. Kopanas, T. Leimkühler, and G. Drettakis, “3D Gaussian splatting for real-time radiance field rendering,” ACM Transactions on Graphics, vol. 42, no. 4, July 2023.
[4] L. Yang, B. Kang, Z. Huang, X. Xu, J. Feng, and H. Zhao, “Depth Anything: Unleashing the power of large-scale unlabeled data,” arXiv preprint arXiv:2401.10891, 2024.
Figure 5: Convergence vs. time for top view
Figure 6: Convergence vs. time for all views
All-view vs. top-view performance comparison:
Figure 3: Depth map objective metric comparison
Figure 4: Depth map qualitative comparison
{avirala, ritabanr, bduister, kbhathek, hongyic, jichnows}@andrew.cmu.edu