Open-Source Robotics Learning Datasets
A curated catalog of open-source datasets for robot manipulation, imitation learning, and reinforcement learning — with links to official sources.
Real-World Manipulation Data
Datasets with in-the-wild robot interactions and long-horizon tasks.
Benchmark-Centric Datasets
Suites designed for reproducible evaluation and cross-paper comparison.
Cross-Robot Ecosystems
Shared formats and multi-embodiment data for foundation model training.
Datasets for Robot Learning

DROID
76K trajectories, 350 hours, 86 tasks. In-the-wild manipulation from 50 collectors across 564 scenes. TensorFlow Datasets, Hugging Face.
BridgeData V2
60K trajectories, 24 environments, 13 manipulation skills. Low-cost WidowX robot. Natural language labels, multi-task learning.
Open X-Embodiment
1M+ episodes, 22 robot types, 500+ skills. Unified RLDS format. RT-X models. 33 institutions.
ALOHA
Bimanual teleoperation. ALOHA-Cosmos-Policy, baseline datasets. HDF5, Hugging Face. Open hardware.
LIBERO
130 tasks, 6.5K demos. Lifelong learning benchmark. Spatial, object, goal suites. robosuite simulation.
RoboNet
15M frames, 7 robot platforms. Multi-robot transfer. Sawyer, Franka, Baxter, Fetch, WidowX.
RoboMimic & MimicGen
Framework + datasets. MimicGen: 50K demos from 200 human demos. Simulation + real. MIT license.
LeRobot
Standardized format + hub. DROID-100, ALOHA, SO-100. PyTorch, streaming. "ImageNet of robotics."
Models & Tools You Can Pair
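Open X-Embodiment is listed above as using the unified RLDS format, in which an episode is a sequence of steps carrying fields such as `observation`, `action`, `reward`, `is_first`, `is_last`, and `is_terminal`. The sketch below mimics that layout with plain Python dictionaries so the structure is easy to see before loading real data. It does not call any dataset library; `make_toy_episode`, `iter_transitions`, and `episode_return` are hypothetical helpers, and the observation and action values are placeholders.

```python
from typing import Any, Dict, Iterator, List, Tuple

Step = Dict[str, Any]
Episode = Dict[str, Any]


def make_toy_episode(num_steps: int) -> Episode:
    """Build a placeholder episode whose steps use RLDS-style field names."""
    steps: List[Step] = []
    for t in range(num_steps):
        last = t == num_steps - 1
        steps.append({
            "observation": {"state": [float(t), 0.0]},  # placeholder values
            "action": [0.1 * t],                        # placeholder values
            "reward": 1.0 if last else 0.0,             # sparse toy reward
            "is_first": t == 0,
            "is_last": last,
            "is_terminal": last,
        })
    return {"steps": steps}


def iter_transitions(episode: Episode) -> Iterator[Tuple[Any, Any, Any]]:
    """Yield (observation, action, next_observation) for consecutive steps."""
    steps = episode["steps"]
    for current, following in zip(steps, steps[1:]):
        yield current["observation"], current["action"], following["observation"]


def episode_return(episode: Episode) -> float:
    """Sum the per-step rewards of one episode (undiscounted)."""
    return sum(step["reward"] for step in episode["steps"])


if __name__ == "__main__":
    episode = make_toy_episode(4)
    for obs, action, next_obs in iter_transitions(episode):
        print(obs["state"], action, next_obs["state"])
    print("return:", episode_return(episode))
```

Real RLDS datasets store these fields as tensors and usually add image observations and language instructions, so treat this only as a map of the field names.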
Research-Ready Curation
We highlight scale, format, and access details needed for quick evaluation.
Cross-Stack Compatibility
Datasets are mapped to practical model and tool ecosystems.
Deployment Context
Dataset choices are linked with real robot execution constraints.
Scale-up Path
When open data is not enough, we support custom collection pipelines.
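As a rough illustration of how the scale and format details in this catalog can support quick evaluation, the sketch below stores a few entries as structured records and filters them by distribution format. Only details stated in the listing are included. The `DatasetEntry` class, the `CATALOG` list, and the `by_format` helper are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class DatasetEntry:
    """One catalog row; fields mirror the details given in the listing."""
    name: str
    scale: Optional[str]        # None when the listing gives no size
    formats: Tuple[str, ...]    # distribution formats or hubs named in the listing


# Hypothetical subset of the catalog, copied from the entries above.
CATALOG: List[DatasetEntry] = [
    DatasetEntry("DROID", "76K trajectories", ("TensorFlow Datasets", "Hugging Face")),
    DatasetEntry("Open X-Embodiment", "1M+ episodes", ("RLDS",)),
    DatasetEntry("ALOHA", None, ("HDF5", "Hugging Face")),
]


def by_format(catalog: List[DatasetEntry], fmt: str) -> List[DatasetEntry]:
    """Return entries offering the given format (case-insensitive match)."""
    wanted = fmt.lower()
    return [e for e in catalog if any(f.lower() == wanted for f in e.formats)]


if __name__ == "__main__":
    for entry in by_format(CATALOG, "hugging face"):
        print(entry.name, "-", entry.scale or "size not listed")
```

Keeping formats as a tuple lets one dataset appear under several access paths, which matches entries such as DROID that list more than one.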