Latest update: March 20, 2026
🤖 = latest additions
Note: drop-down filters only work when open in Google Spreadsheets
Number of entries: 21
Columns: Manipulation Datasets, Image, Description, Data, Data Types, Camera Views, Robot Hardware, Relevant Applications, Relevant Tasks, Relevant Physical Objects and Artifacts (see repository linked above), Samples, Tasks, Notes, Link(s), License, Citation, Year (Initial Release)
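Since the drop-down filters noted above only work when the sheet is opened in Google Spreadsheets, the same column-based filtering can be reproduced offline. A minimal Python sketch, assuming a simple in-memory representation: the `CatalogEntry` class and `filter_catalog` helper are illustrative and not part of the sheet, and only a subset of the columns is modeled; the three sample rows are transcribed from entries below.

```python
from dataclasses import dataclass

@dataclass
class CatalogEntry:
    """One row of the manipulation-dataset catalog (subset of its columns)."""
    name: str
    data: str            # "Real", "Sim", or "Real, Sim"
    relevant_tasks: list # values from the "Relevant Tasks" column
    year: int            # "Year (Initial Release)" column

# Three rows transcribed from the table entries below.
CATALOG = [
    CatalogEntry("Dex1B", "Sim",
                 ["Articulated Object Manipulation", "Grasping"], 2025),
    CatalogEntry("Robo360", "Real",
                 ["Articulated Object Manipulation", "Pick-and-Place"], 2023),
    CatalogEntry("RH20T", "Real",
                 ["General Home/Service Tasks"], 2024),
]

def filter_catalog(entries, task=None, data=None):
    """Emulate the sheet's drop-down filters: keep rows matching every given criterion."""
    out = []
    for e in entries:
        if task is not None and task not in e.relevant_tasks:
            continue
        if data is not None and data not in e.data:
            continue
        out.append(e)
    return out

real_pick = filter_catalog(CATALOG, task="Pick-and-Place", data="Real")
print([e.name for e in real_pick])  # → ['Robo360']
```

The same pattern extends to any other column (license, robot hardware, camera views) by adding a field and a matching criterion.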
🤖 Dex1B: Articulation
Dex1B is a large-scale, diverse, and high-quality demonstration dataset generated using generative models. It contains one billion demonstrations for two fundamental dexterous manipulation tasks: grasping and articulation. The dataset was constructed using DexSimple, a generative model that integrates geometric constraints to improve feasibility and incorporates additional conditions to enhance diversity. For grasping, 1 million scenes were constructed using object assets from Objaverse. For articulation, scenes were constructed using object assets from PartNet-Mobility.
Data: Sim
Data Types: Point clouds, Robot pose, Action sequences, Depth maps
Camera Views: Single-view
Robot Hardware: UR5, Unitree G1
Relevant Applications: Application Agnostic
Relevant Tasks: Articulated Object Manipulation, Grasping
Samples: 1,000,000,000
Tasks: 2 fundamental tasks (Grasping and Articulation) across 6,000+ objects
Link(s): https://jianglongye.com/dex1b/
Citation: Ye, Jianglong, Keyi Wang, Chengjing Yuan, Ruihan Yang, Yiquan Li, Jiyue Zhu, Yuzhe Qin, Xueyan Zou, and Xiaolong Wang. "Dex1b: Learning with 1b demonstrations for dexterous manipulation." arXiv preprint arXiv:2506.17198 (2025).
Year (Initial Release): 2025
🤖 Flat'n'Fold
Comprising 1,212 human and 887 robot demonstrations of flattening and folding 44 unique garments across 8 categories, Flat'n'Fold surpasses prior datasets in size, scope, and diversity. The dataset uniquely captures the entire manipulation process from crumpled to folded states, providing synchronized multi-view RGB-D images, point clouds, and action data, including hand or gripper positions and rotations. It includes both human demonstrations (20 participants) and human-controlled robot demonstrations.
Data: Real
Data Types: RGB images, Depth images, Point clouds, Action sequences, Robot pose, Robot joint states, Tracker data
Camera Views: Multi-view
Robot Hardware: Rethink Robotics Baxter
Relevant Applications: Service/Domestic
Relevant Tasks: Deformable Object Manipulation
Samples: 2,099
Tasks: 2 main tasks (Flattening and Folding) across 44 unique garments in 8 categories
Notes: Includes 6,329 human and 5,574 robot annotated point clouds for a grasping point prediction benchmark; ~20,000 annotated sub-task boundaries for task decomposition
Link(s): https://cvas-ug.github.io/flat-n-fold
License: CC BY 4.0
Citation: Zhuang, Lipeng, Shiyu Fan, Yingdong Ru, Florent P. Audonnet, Paul Henderson, and Gerardo Aragon-Camarasa. "Flat'n'Fold: A Diverse Multi-Modal Dataset for Garment Perception and Manipulation." In 2025 IEEE International Conference on Robotics and Automation (ICRA), pp. 7937-7944. IEEE, 2025.
Year (Initial Release): 2025
🤖 Galaxea Open-World Dataset
Galaxea Open-World Dataset is a large-scale, diverse collection of robot behaviors recorded in authentic human living and working environments, comprising 500+ hours of real-world mobile manipulation data. It contains 100,000 demonstration trajectories spanning 150+ task categories across 50 distinct real-world scenes at 11 physical sites, covering residential, catering, retail, and office spaces. All demonstrations are gathered using a single consistent robotic embodiment (Galaxea R1 Lite) and enriched with precise subtask-level language annotations to facilitate both training and evaluation.
Data: Real
Data Types: RGB images, Depth images, Robot joint states, Action sequences
Camera Views: External
Robot Hardware: Galaxea R1 Lite, Dual Galaxea A1X
Relevant Applications: Service/Domestic, Commercial/Retail
Relevant Tasks: Pick-and-Place, General Home/Service Tasks
Samples: 100,000
Tasks: 150+ task categories, 58 operational skills
Notes: Data collected at 11 physical sites yielding 50 unique scenes; 1,600+ unique real-world objects sourced from retail suppliers across residential, kitchen, retail, and office environments
Link(s):
https://opengalaxea.github.io/G0/
https://huggingface.co/datasets/OpenGalaxea/Galaxea-Open-World-Dataset
License: CC BY-NC-SA 4.0
Citation: Jiang, Tao, Tianyuan Yuan, Yicheng Liu, Chenhao Lu, Jianning Cui, Xiao Liu, Shuiqi Cheng, Jiyang Gao, Huazhe Xu, and Hang Zhao. "Galaxea open-world dataset and g0 dual-system vla model." arXiv preprint arXiv:2509.00576 (2025).
Year (Initial Release): 2025
🤖 Hand–Object to Robot Action Dataset (HORA)
HORA is a large-scale multimodal dataset for cross-embodiment robotic learning from human hand-object interactions (HOI). Built on the RoboWheel pipeline, it unifies heterogeneous HOI sources through physically plausible reconstruction, canonical action-space alignment, cross-embodiment retargeting, and simulation-based augmentation. It provides both HOI and robot-oriented modalities, including MANO hand parameters, 6-DoF object poses, contact annotations, robot observations, end-effector trajectories, and dense tactile signals for the mocap subset.
Data: Real
Data Types: RGB images, Depth images, 6D poses, Robot joint states
Camera Views: External
Robot Hardware: UR5, Franka Emika Panda, KUKA LBR iiwa 7, Kinova Gen3
Relevant Applications: Application Agnostic
Relevant Tasks: Pick-and-Place, Human-Robot Handovers, General Home/Service Tasks
Samples: 150,000
Tasks: Covers diverse manipulation tasks derived from multiple public HOI datasets plus custom recordings
Link(s):
https://zhangyuhong01.github.io/Robowheel/
https://huggingface.co/datasets/HORA-DB/HORA
https://github.com/zhangyuhong01/Robowheel-Toolkits
Citation: Zhang, Yuhong, Zihan Gao, Shengpeng Li, Ling-Hao Chen, Kaisheng Liu, Runqing Cheng, Xiao Lin, Junjia Liu, Zhuoheng Li, Jingyi Feng, Ziyan He, Jintian Lin, Zheyan Huang, Zhifang Liu, and Haoqian Wang. "RoboWheel: A Data Engine from Real-World Human Demonstrations for Cross-Embodiment Robotic Learning." arXiv preprint arXiv:2512.02729 (2025).
Year (Initial Release): 2025
🤖 Purpose-driven Robotic Interaction in Scene Manipulation (PRISM)
PRISM is a large-scale synthetic dataset created to overcome the limitations of prior small-scale, simplistic task-oriented grasping datasets. It is constructed by composing 10,000 procedurally-generated cluttered scenes using 2,356 object instances from ShapeNet-Sem paired with stable grasps from the ACRONYM dataset, resulting in 378,844 task-grasp samples. Each scene is rendered from 10 camera views, and every sample contains a rendered RGB-D scene, a natural language task description, a calibrated 6-DoF grasp pose, a spatial grasp description, and a pixel-level grasp location.
Data: Sim
Data Types: RGB images, Depth images, 6D poses
Camera Views: Multi-view
Robot Hardware: Franka Emika Panda
Relevant Applications: Service/Domestic
Relevant Tasks: Pick-and-Place, Tool Use
Samples: 378,844
Tasks: 568 unique task categories
Notes: 2,356 diverse object instances from ShapeNet-Sem covering household, kitchen, office, and tool categories
Link(s):
https://abhaybd.github.io/GraspMolmo/
https://huggingface.co/datasets/allenai/PRISM
License: MIT
Citation: Deshpande, Abhay, Yuquan Deng, Arijit Ray, Jordi Salvador, Winson Han, Jiafei Duan, Kuo-Hao Zeng, Yuke Zhu, Ranjay Krishna, and Rose Hendrix. "Graspmolmo: Generalizable task-oriented grasping via large-scale synthetic data generation." arXiv preprint arXiv:2505.13441 (2025).
Year (Initial Release): 2025
🤖 RefSpatial
RefSpatial is a large-scale, multi-source dataset of 2.5 million high-quality examples totaling 20 million QA pairs, designed to train and fine-tune Vision-Language Models (VLMs) on spatial referring tasks for robotics. It integrates data from three complementary source types: 2D web images (OpenImages) for broad spatial concepts and depth perception, 3D embodied videos (CA-1M) for fine-grained indoor scene understanding, and procedurally generated simulated data with ground-truth reasoning chains to support multi-step spatial referring (up to 5 steps).
Data: Real, Sim
Data Types: RGB images, Depth images, 3D skeleton
Camera Views: Single-view
Robot Hardware: UR5, Unitree G1
Relevant Applications: Application Agnostic
Relevant Tasks: Pick-and-Place
Samples: 2,500,000
Tasks: 31 spatial relations (left/right, above/below, front/back, near/far, metric distance, orientation, etc.); single-step and multi-step spatial reasoning (up to 5 steps)
Link(s):
https://huggingface.co/datasets/JingkunAn/RefSpatial
https://zhoues.github.io/RoboRefer/
Citation: Zhou, Enshen, Jingkun An, Cheng Chi, Yi Han, Shanyu Rong, Chi Zhang, Pengwei Wang, Zhongyuan Wang, Tiejun Huang, Lu Sheng, and Shanghang Zhang. "Roborefer: Towards spatial referring with reasoning in vision-language models for robotics." arXiv preprint arXiv:2506.04308 (2025).
Year (Initial Release): 2025
🤖 Robo360
Robo360 is the first real-world omnispective multi-view and multi-material robotic manipulation dataset, designed to bridge 3D scene understanding with robot control. It features robotic manipulation captured with dense 360° surrounding view coverage, enabling high-quality 3D neural representation learning (particularly dynamic NeRF) and multi-view policy learning. The dataset contains over 2,000 demonstration trajectories of more than 100 distinct objects with diverse material variations including rigid, deformable, transparent, and reflective objects.
Data: Real
Data Types: Video, Audio, Robot joint states
Camera Views: Multi-view
Robot Hardware: Single arm
Relevant Applications: Application Agnostic
Relevant Tasks: Articulated Object Manipulation, Pick-and-Place
Samples: 2,000
Tasks: Diverse object manipulation tasks across 100+ objects with varying material properties
Link(s): https://huggingface.co/datasets/liuyubian/Robo360
Citation: Liang, Litian, Liuyu Bian, Caiwei Xiao, Jialin Zhang, Linghao Chen, Isabella Liu, Fanbo Xiang, Zhiao Huang, and Hao Su. "Robo360: a 3D omnispective multi-material robotic manipulation dataset." arXiv preprint arXiv:2312.06686 (2023).
Year (Initial Release): 2023
🤖 RoboCerebra
RoboCerebra is a large-scale benchmark for evaluating high-level System 2 reasoning in long-horizon robotic manipulation, targeting capabilities such as planning, reflection, and episodic memory that are underexplored by existing reactive (System 1) benchmarks. It features 1,000 human-annotated simulation trajectories across 100 task variants, each spanning up to 3,000 simulation steps, constructed via a top-down pipeline where GPT generates task instructions and decomposes them into subtask sequences which human operators then execute in simulation.
Data: Sim
Data Types: RGB images, Video
Camera Views: Single-view, External
Robot Hardware: Single arm
Relevant Applications: Service/Domestic
Relevant Tasks: General Home/Service Tasks
Samples: 1,000
Tasks: 100 task variants across 6 subtask-type categories; 4-15 subtask steps per trajectory
Notes: Common household objects including cream cheese, popcorn, butter, cookies, wine bottles, tomato sauce, BBQ sauce, chocolate pudding, alphabet soup, milk, frying pans
Link(s):
https://github.com/qiuboxiang/RoboCerebra
https://huggingface.co/datasets/qiukingballball/RoboCerebra
https://robocerebra.github.io/
License: MIT
Citation: Han, Songhao, Boxiang Qiu, Yue Liao, Siyuan Huang, Chen Gao, Shuicheng Yan, and Si Liu. "Robocerebra: A large-scale benchmark for long-horizon robotic manipulation evaluation." arXiv preprint arXiv:2506.06677 (2025).
Year (Initial Release): 2025
🤖 RoboMIND
RoboMIND is the largest multi-embodiment teleoperation dataset collected on a unified standardized platform, comprising 107K real-world demonstration trajectories spanning 479 distinct tasks across 96 unique object classes and amounting to 305.5 hours of interaction data. It covers four distinct robot embodiments (Franka Emika Panda, UR5e, AgileX Cobot Magic V2.0, and the Tien Kung humanoid) and uniquely includes 5,000 real-world failure demonstrations, each annotated with the cause of failure to enable failure reflection and correction during policy learning.
Data: Real
Data Types: RGB images, Depth images, Robot joint states, Action sequences
Camera Views: External, Multi-view
Robot Hardware: Franka Emika Panda, UR5, Tien Kung Humanoid, AgileX Cobot Magic V2.0
Relevant Applications: Service/Domestic
Relevant Tasks: Articulated Object Manipulation
Samples: 107,000
Tasks: 479 distinct tasks (v1.2); 279 tasks in initial v1.0 release
Notes: 96 object classes across domestic, kitchen, industrial, office, and retail categories
Link(s):
https://huggingface.co/datasets/x-humanoid-robomind/RoboMIND
https://github.com/x-humanoid-robomind/x-humanoid-robomind.github.io
https://x-humanoid-robomind.github.io/
License: Apache 2.0
Citation: Wu, Kun, Chengkai Hou, Jiaming Liu, Zhengping Che, Xiaozhu Ju, Zhuqin Yang, Meng Li, Yinuo Zhao, Zhiyuan Xu, Guang Yang, Shichao Fan, Xinhua Wang, Fei Liao, Zhen Zhao, Guangyu Li, Zhao Jin, Lecheng Wang, Jilei Mao, Ning Liu, Pei Ren, Qiang Zhang, Yaoxu Lyu, Mengzhen Liu, Jingyang He, Yulin Luo, Zeyu Gao, Chenxuan Li, Chenyang Gu, Yankai Fu, Di Wu, Xingyu Wang, Sixiang Chen, Zhenyu Wang, Pengju An, Siyuan Qian, Shanghang Zhang, and Jian Tang. "Robomind: Benchmark on multi-embodiment intelligence normative data for robot manipulation." arXiv preprint arXiv:2412.13877 (2024).
Year (Initial Release): 2024
🤖 RoboVerse
RoboVerse is a comprehensive unified framework comprising a simulation platform (MetaSim), a large-scale synthetic dataset, and standardized benchmarks for both imitation learning and reinforcement learning, designed to overcome the data-scaling and evaluation-standardization bottlenecks in robot learning. The dataset contains ~500K unique high-fidelity trajectories covering 276 task categories with ~5.5K assets and over 10 million transitions.
Data: Sim
Data Types: RGB images, Depth images, Robot joint states, 6D poses, Action sequences
Camera Views: External, Multi-view
Robot Hardware: Franka Emika Panda, Unitree G1
Relevant Applications: Application Agnostic
Relevant Tasks: Pick-and-Place, Articulated Object Manipulation
Samples: 500,000
Tasks: 276 task categories; 1,000+ distinct task variants; Open6DOR subset alone has 5,000+ tasks across position, rotation, and 6-DoF tracks
Link(s): https://roboverseorg.github.io/
License: Apache 2.0
Citation: Geng, Haoran, Feishi Wang, Songlin Wei, Yuyang Li, Bangjun Wang, Boshi An, Charlie Tianyue Cheng, Haozhe Lou, Peihao Li, Yen-Jen Wang, Yutong Liang, Dylan Goetting, Chaoyi Xu, Haozhe Chen, Yuxi Qian, Yiran Geng, Jiageng Mao, Weikang Wan, Mingtong Zhang, Jiangran Lyu, Siheng Zhao, Jiazhao Zhang, Jialiang Zhang, Chengyang Zhao, Haoran Lu, Yufei Ding, Ran Gong, Yuran Wang, Yuxuan Kuang, Ruihai Wu, Baoxiong Jia, Carlo Sferrazza, Hao Dong, Siyuan Huang, Yue Wang, Jitendra Malik, and Pieter Abbeel. "Roboverse: Towards a unified platform, dataset and benchmark for scalable and generalizable robot learning." arXiv preprint arXiv:2504.18904 (2025).
Year (Initial Release): 2025
AgiBot World
AgiBot World is a large-scale platform comprising over 1 million trajectories across 217 tasks in five deployment scenarios, an order-of-magnitude increase in data scale compared to existing datasets. Accelerated by a standardized collection pipeline with human-in-the-loop verification, AgiBot World guarantees high-quality and diverse data distribution. It is extensible from grippers to dexterous hands and visuo-tactile sensors for fine-grained skill acquisition. AgiBot World Beta is the complete dataset featuring over 1M trajectories; Alpha is a subset containing over 92K trajectories.
Data: Real
Data Types: RGB images, Depth images, Robot pose, Robot velocity, Robot force, Robot torque, Video
Camera Views: External, Wrist
Robot Hardware: Single arm, Bi-manual, Mobile manipulator, Two-finger, Multi-finger, AgiBot G1
Relevant Applications: Commercial/Retail, Logistics/Warehousing, Manufacturing, Service/Domestic
Relevant Tasks: Pick-and-Place, Cloth Folding, Deformable Object Manipulation, Shelf Picking, General Home/Service Tasks
Samples: 1,000,041
Tasks: 217
Notes: 100 robots, 100+ real-world scenarios across 5 target domains, 87 types of atomic skills
Link(s):
https://huggingface.co/datasets/agibot-world/AgiBotWorld-Beta
https://github.com/OpenDriveLab/Agibot-World
License: CC BY-NC-SA 4.0
Citation: Bu, Qingwen, Jisong Cai, Li Chen, Xiuqi Cui, Yan Ding, Siyuan Feng, Shenyuan Gao et al. "Agibot world colosseo: A large-scale manipulation platform for scalable and intelligent embodied systems." arXiv preprint arXiv:2503.06669 (2025).
Year (Initial Release): 2025
BridgeData V2
BridgeData V2 is a large and diverse dataset of robotic manipulation behaviors designed to facilitate research in scalable robot learning. The dataset is compatible with open-vocabulary, multi-task learning methods conditioned on goal images or natural language instructions. Skills learned from the data generalize to novel objects and environments, as well as across institutions.
Data: Real
Data Types: RGB images, RGB-D images
Camera Views: External, Wrist
Robot Hardware: Single arm, Two-finger, WidowX 250
Relevant Applications: Assistive Robotics, Service/Domestic
Relevant Tasks: Pick-and-Place, Deformable Object Manipulation
Samples: 60,096
Tasks: 8
Link(s):
https://rail-berkeley.github.io/bridgedata/
https://github.com/rail-berkeley/bridge_data_v2
License: MIT
Citation: Walke, Homer Rich, Kevin Black, Tony Z. Zhao, Quan Vuong, Chongyi Zheng, Philippe Hansen-Estruch, Andre Wang He et al. "Bridgedata v2: A dataset for robot learning at scale." In Conference on Robot Learning, pp. 1723-1736. PMLR, 2023.
Year (Initial Release): 2023
DROID (Distributed Robot Interaction Dataset)
DROID (Distributed Robot Interaction Dataset) is a diverse robot manipulation dataset with 76k demonstration trajectories (350 hours of interaction data), collected across 564 scenes and 86 tasks by 50 data collectors in North America, Asia, and Europe over the course of 12 months. The authors demonstrate that training with DROID leads to policies with higher performance, greater robustness, and improved generalization ability, and they open-source the full dataset, policy-training code, and a detailed guide for reproducing the robot hardware setup.
Data: Real
Data Types: RGB images, Robot pose, Robot velocity
Camera Views: External, Wrist
Robot Hardware: Single arm, Two-finger, Franka Emika Panda, Robotiq 2F-85
Relevant Applications: Assistive Robotics, Commercial/Retail, Service/Domestic
Relevant Tasks: General Home/Service Tasks
Samples: 76,000
Tasks: 86
Link(s):
https://colab.research.google.com/drive/1b4PPH4XGht4Jve2xPKMCh-AXXAQziNQa
https://droid-dataset.github.io/
License: CC BY 4.0
Citation: Khazatsky, Alexander, Karl Pertsch, Suraj Nair, Ashwin Balakrishna, Sudeep Dasari, Siddharth Karamcheti, Soroush Nasiriany et al. "Droid: A large-scale in-the-wild robot manipulation dataset." arXiv preprint arXiv:2403.12945 (2024).
Year (Initial Release): 2024
Functional Manipulation Benchmark (FMB)
The FMB dataset consists of objects with diverse appearance and geometry. It requires multi-stage, multi-modal fine motor skills to successfully assemble pegs onto an unfixed board in a randomized scene. A total of 22,550 trajectories were collected across two different tasks on a Franka Panda arm, recorded from 2 global views and 2 wrist views, each containing both RGB and a depth map. Two datasets are included: the Single-Object Multi-Stage Manipulation Task Full Dataset and the Multi-Object Multi-Stage Manipulation Task with Assembly 1, 2, and 3.
Data: Real
Data Types: RGB images, Depth images, Robot pose, Robot velocity, Robot force, Robot torque
Camera Views: External, Wrist
Robot Hardware: Single arm, Two-finger, Franka Emika Panda
Relevant Applications: Manufacturing
Relevant Tasks: Assembly
Relevant Physical Objects and Artifacts: Functional Manipulation Benchmark (FMB)
Samples: 22,550
Tasks: 2
Link(s): https://functional-manipulation-benchmark.github.io/dataset/index.html
License: CC BY 4.0
Citation: Luo, Jianlan, Charles Xu, Fangchen Liu, Liam Tan, Zipeng Lin, Jeffrey Wu, Pieter Abbeel, and Sergey Levine. "Fmb: a functional manipulation benchmark for generalizable robotic learning." The International Journal of Robotics Research (2023): 02783649241276017.
Year (Initial Release): 2023
FurnitureBench
FurnitureBench is a real-world furniture-assembly benchmark that aims to provide a reproducible and easy-to-use platform for long-horizon, complex robotic manipulation. Furniture assembly poses integral robotic manipulation challenges that autonomous robots must be capable of: long-horizon planning, dexterous control, and robust visual perception. By presenting a well-defined suite of tasks with a lower barrier of entry (large-scale human teleoperation data and standardized configurations), the benchmark encourages the research community to push the boundaries of current robotic systems.
Data: Real
Data Types: RGB-D images, Robot pose, Robot velocity, AprilTag poses, Metadata
Camera Views: External, Wrist
Robot Hardware: Single arm, Two-finger, Franka Emika Panda
Relevant Applications: Commercial/Retail, Manufacturing, Service/Domestic
Relevant Tasks: Assembly
Relevant Physical Objects and Artifacts: FurnitureBench
Samples: 5,100
Tasks: 9
Link(s):
https://clvrai.github.io/furniture-bench/docs/tutorials/dataset.html
https://clvrai.github.io/furniture-bench/
License: MIT
Citation: Heo, Minho, Youngwoon Lee, Doohyun Lee, and Joseph J. Lim. "Furniturebench: Reproducible real-world benchmark for long-horizon complex manipulation." The International Journal of Robotics Research (2023): 02783649241304789.
Year (Initial Release): 2023
Kaiwu
Kaiwu provides an integrated human, environment, and robot data-collection framework with 20 subjects and 30 interaction objects, resulting in 11,664 instances of integrated actions in total. For each demonstration, hand motions, operation pressures, sounds of the assembly process, multi-view videos, high-precision motion-capture information, eye gaze with first-person videos, and electromyography signals are all recorded. Fine-grained multi-level annotation based on absolute timestamps and semantic segmentation labelling are performed.
Data: Real
Data Types: Video, 3D skeleton, Audio, Haptic, Eye gaze, IMU, EMG
Camera Views: External
Robot Hardware: Human hand
Relevant Applications: Manufacturing
Relevant Tasks: Assembly
Samples: 11,664
Tasks: 30
Notes: 20 human subjects
Link(s): https://www.scidb.cn/en/detail?dataSetId=33060cd729604d2ca7d41189a9fc492b
Citation: Jiang, Shuo, Haonan Li, Ruochen Ren, Yanmin Zhou, Zhipeng Wang, and Bin He. "Kaiwu: A Multimodal Manipulation Dataset and Framework for Robot Learning and Human-Robot Interaction." IEEE Robotics and Automation Letters, vol. 10, no. 11, pp. 11482-11489, Nov. 2025, doi: 10.1109/LRA.2025.3609615.
Year (Initial Release): 2025
LIBERO
LIBERO is designed for studying knowledge transfer in multitask and lifelong robot learning problems. Successfully resolving these problems requires both declarative knowledge about objects/spatial relationships and procedural knowledge about motion/behaviors. LIBERO provides 130 tasks grouped into 4 task suites: LIBERO-Spatial, LIBERO-Object, LIBERO-Goal, and LIBERO-100.
Data: Sim
Data Types: RGB images
Camera Views: External, Wrist
Robot Hardware: Single arm, Two-finger, Franka Emika Panda
Relevant Applications: Assistive Robotics, Commercial/Retail, Service/Domestic
Relevant Tasks: Pick-and-Place, Cloth Folding, Deformable Object Manipulation, Shelf Picking, General Home/Service Tasks
Samples: LIBERO-Spatial: 62,250 frames; LIBERO-Object: 74,507 frames; LIBERO-Goal: 63,728 frames; LIBERO-100: 807,133 frames
Tasks: LIBERO-Spatial: 10 tasks; LIBERO-Object: 10 tasks; LIBERO-Goal: 10 tasks; LIBERO-100: 100 tasks
Link(s):
https://libero-project.github.io/datasets
https://github.com/Lifelong-Robot-Learning/LIBERO
License: MIT
Citation: Liu, Bo, Yifeng Zhu, Chongkai Gao, Yihao Feng, Qiang Liu, Yuke Zhu, and Peter Stone. "Libero: Benchmarking knowledge transfer for lifelong robot learning." Advances in Neural Information Processing Systems 36 (2023): 44776-44791.
Year (Initial Release): 2023
Open X-Embodiment
Open X-Embodiment provides datasets in standardized data formats and models for exploring cross-robot learning in the context of robotic manipulation, alongside experimental results demonstrating effective X-robot policies. The dataset was assembled from 22 different robots through a collaboration between 21 institutions, demonstrating 527 skills (160,266 tasks). A high-capacity model trained on this data, called RT-X, exhibits positive transfer and improves the capabilities of multiple robots by leveraging experience from other platforms.
Data: Real, Sim
Data Types: RGB images, Depth images, Robot pose, Robot velocity
Camera Views: External, Wrist
Robot Hardware: Single arm, Bi-manual, Mobile manipulator, Two-finger, Suction, Robotiq 2F-85, WSG-50
Relevant Applications: Assistive Robotics, Commercial/Retail, Service/Domestic
Relevant Tasks: General Home/Service Tasks
Samples: 1,000,000
Tasks: 160,266
Notes: 22 robot embodiments across 21 institutions
Link(s):
https://robotics-transformer-x.github.io/
https://github.com/google-deepmind/open_x_embodiment
https://docs.google.com/spreadsheets/d/1rPBD77tk60AEIGZrGSODwyyzs5FgCU9Uz3h-3_t2A9g/
License: Apache 2.0
Citation: O’Neill, Abby, Abdul Rehman, Abhiram Maddukuri, Abhishek Gupta, Abhishek Padalkar, Abraham Lee, Acorn Pooley et al. "Open x-embodiment: Robotic learning datasets and rt-x models: Open x-embodiment collaboration 0." In 2024 IEEE International Conference on Robotics and Automation (ICRA), pp. 6892-6903. IEEE, 2024.
Year (Initial Release): 2024
PartInstruct
PartInstruct is the first benchmark for training and evaluating models for part-level instruction following in fine-grained robot manipulation. It features 513 object instances across 14 categories, 1,302 manipulation tasks in 16 classes, and over 10,000 expert demonstrations synthesized in a 3D simulator. Each demonstration includes a high-level task instruction, a sequence of basic part-based skills, and ground-truth 3D object data. Additionally, a comprehensive test suite evaluates the generalizability of learned policies across new states, objects, and tasks.
Data: Sim
Data Types: RGB images, Depth images, Point clouds, Segmentation masks, 3D object model meshes
Camera Views: External
Robot Hardware: Single arm, Two-finger, Franka Emika Panda
Relevant Applications: Assistive Robotics, Commercial/Retail, Service/Domestic
Relevant Tasks: Pick-and-Place, Shelf Picking, General Home/Service Tasks, Grasping
Samples: 10,000
Tasks: 1,302
Notes: 513 object instances across 14 categories; 16 task classes
Link(s):
https://huggingface.co/datasets/SCAI-JHU/PartInstruct
https://github.com/SCAI-JHU/PartInstruct
https://partinstruct.github.io/
License: MIT
Citation: Yin, Yifan, Zhengtao Han, Shivam Aarya, Jianxin Wang, Shuhang Xu, Jiawei Peng, Angtian Wang, Alan Yuille, and Tianmin Shu. "PartInstruct: Part-level Instruction Following for Fine-grained Robot Manipulation." arXiv preprint arXiv:2505.21652 (2025).
Year (Initial Release): 2025
REASSEMBLE (Robotic assEmbly disASSEMBLy datasEt)
REASSEMBLE (Robotic assEmbly disASSEMBLy datasEt) is a new dataset designed specifically for contact-rich manipulation tasks. Built around the NIST Assembly Task Board 1 benchmark, REASSEMBLE includes four actions (pick, insert, remove, and place) involving 17 objects. The dataset contains 4,551 demonstrations, of which 4,035 were successful, spanning a total of 781 minutes. It features multi-modal sensor data including event cameras, force-torque sensors, microphones, and multi-view RGB cameras.
Data: Real
Data Types: RGB images, Robot pose, Robot velocity, Robot force, Robot torque, Audio, Event camera
Camera Views: External, Wrist
Robot Hardware: Single arm, Two-finger, Franka Emika Panda
Relevant Applications: Manufacturing
Relevant Tasks: Assembly
Relevant Physical Objects and Artifacts: NIST Assembly Task Boards (ATB)
Samples: 4,551
Tasks: 2
Notes: Tasks: Assemble, Disassemble
Link(s):
https://tuwien-asl.github.io/REASSEMBLE_page/
https://researchdata.tuwien.ac.at/records/0ewrv-8cb44
License: MIT
Citation: Sliwowski, Daniel, Shail Jadav, Sergej Stanovcic, Jedrzej Orbik, Johannes Heidersberger, and Dongheui Lee. "Reassemble: A multimodal dataset for contact-rich robotic assembly and disassembly." In Proceedings of Robotics: Science and Systems (RSS), 2025.
Year (Initial Release): 2025
RH20T
RH20T is a dataset comprising over 110,000 contact-rich robot manipulation sequences across diverse skills, contexts, robots, and camera viewpoints, all collected in the real world. Each sequence in the dataset includes visual, force, audio, and action information, along with a corresponding human demonstration video. The authors invested significant effort in calibrating all sensors and ensuring a high-quality dataset.
Data: Real
Data Types: RGB images, Depth images, Robot pose, Robot force, Robot torque, IR images, Audio, Tactile
Robot Hardware: Single arm, Two-finger, Franka Emika Panda, UR5, Flexiv, DH Robotics AG-95, Robotiq 2F-85, WSG-50
Relevant Applications: Assistive Robotics, Commercial/Retail, Service/Domestic
Relevant Tasks: General Home/Service Tasks
Samples: 110,000
Tasks: 147
Notes: Tasks: 48 from RLBench, 29 from MetaWorld, 70 self-proposed
Link(s):
https://rh20t.github.io/
https://github.com/rh20t/rh20t_api
License: CC BY-NC 4.0, CC BY-SA 4.0, MIT
Citation: Fang, Hao-Shu, Hongjie Fang, Zhenyu Tang, Jirong Liu, Chenxi Wang, Junbo Wang, Haoyi Zhu, and Cewu Lu. "Rh20t: A comprehensive robotic dataset for learning diverse skills in one-shot." In 2024 IEEE International Conference on Robotics and Automation (ICRA), pp. 653-660. IEEE, 2024.
Year (Initial Release): 2024