LeRobot adapter
Reads Hugging Face LeRobot
datasets (format v2.1) and creates one RoboTrace episode per
trajectory. Tiny install footprint - the adapter does not depend
on the heavy lerobot PyPI package (which would pull torch,
torchvision, pyav, and several CUDA wheels). It reads the on-disk
format directly with pyarrow + huggingface_hub, so the install
adds about 20 MB on top of the base SDK.
from robotrace.adapters import lerobot
# Upload every trajectory in a Hub dataset as its own RoboTrace episode.
lerobot.upload_dataset(
"lerobot/aloha_static_cups_open",
policy_version="aloha-v1",
env_version="aloha-cell-1",
)
That's the whole 95% case. Read on for the four explicit verbs, column auto-classification, multi-camera handling, and how this maps to LeRobot's data model.
Install
# Without tiling: datasets with no cameras or a single camera, or any upload that passes canonical_camera.
pip install 'robotrace-dev[lerobot]==0.1.0a6'
# With multi-camera horizontal tiling (most LeRobot datasets have 2+ cams).
pip install 'robotrace-dev[lerobot,video]==0.1.0a6'
Pinning the exact version is the most reliable install during alpha; the pin
drops once we cut 1.0.
[lerobot] pulls in huggingface_hub, pyarrow, and numpy. The
[video] extra adds opencv-python for tiling multiple cameras
into one video.mp4. Single-camera uploads (or any
canonical_camera="..."-pinned upload) don't need [video]
because the source mp4 is copied byte-for-byte without re-encoding.
LeRobot dataset format v3.0 (multi-episode parquet shards,
introduced late 2025) is not yet supported - the adapter raises a
clear ConfigurationError pointing at v2.1 fallbacks. v3.0 lands
in a follow-up release.
The four verbs
| Verb | What it does |
|---|---|
| lerobot.scan_dataset(repo) | Read-only introspection. Pulls only meta/* from the Hub. Returns a DatasetSummary with fps, episode count, frame count, camera list, and per-episode lengths and tasks. |
| lerobot.encode_episode(repo, idx, out) | Fetch one episode's parquet + per-camera mp4s, write video.mp4 / sensors.npz / actions.npz into out. Returns an EncodedEpisode with the file paths and provenance metadata. No upload. |
| lerobot.upload_episode(repo, idx, ...) | One-shot for a single trajectory: scan → encode to a tempdir → start_episode + upload_* + finalize. Returns the finalized Episode. |
| lerobot.upload_dataset(repo, ...) | Bulk: walk every trajectory (or a subset) and call upload_episode for each. Sequential - one episode at a time, each in a fresh tempdir, so disk usage stays at one trajectory's worth at any moment. |
scan_dataset is the dry-run - most users start there to see how
many episodes the adapter would upload before paying the network
cost.
summary = lerobot.scan_dataset("lerobot/aloha_static_cups_open")
print(summary.report())
# lerobot/aloha_static_cups_open (hub, v2.1, 50 fps)
# episodes: 50, frames: 12500
# cameras: observation.images.cam_high, observation.images.cam_low_left, observation.images.cam_low_right
# features: action, observation.state, next.reward, next.done
If it looks right, swap scan_dataset for upload_dataset and
you're done.
Local datasets vs. Hub datasets
The first argument can be either a Hub repo id (namespace/dataset-name)
or a local directory containing the meta/, data/, videos/
layout. Resolution is automatic - anything that exists on disk wins,
otherwise we hit the Hub.
# Hub dataset (downloads files lazily, caches in ~/.cache/huggingface).
lerobot.upload_dataset("lerobot/pusht", policy_version="pusht-v1")
# Local dataset on a workstation.
lerobot.upload_dataset("/data/robot_runs/2026-05-10/", policy_version="pusht-v1")
# Pin the Hub revision to a specific commit / tag / branch.
lerobot.upload_dataset(
"lerobot/aloha_static_cups_open",
revision="v2.1",
policy_version="aloha-v1",
)
For private or gated datasets, set HF_TOKEN in your environment -
huggingface_hub reads it automatically.
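The resolution rule described above ("anything that exists on disk wins") can be illustrated with a minimal sketch. The function name resolve_source is hypothetical and is not part of the adapter's API; it only mirrors the stated rule.

```python
from pathlib import Path


def resolve_source(repo_or_path: str) -> str:
    """Hypothetical helper: mirror the documented rule, not the adapter's code.

    Returns "local" when the argument exists on disk, otherwise "hub".
    """
    return "local" if Path(repo_or_path).exists() else "hub"


# The current directory always exists, so it resolves as a local dataset root.
print(resolve_source("."))
# A string that matches nothing on disk falls through to the Hub.
print(resolve_source("lerobot/no-such-local-directory-xyz"))
```

One consequence of this rule, if the adapter behaves as described: a relative directory whose path happens to look like namespace/dataset-name would shadow the Hub dataset of the same name, so running from a clean working directory avoids surprises.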
Column auto-classification
LeRobot datasets use a strong dotted-column convention, so the classifier is mechanical. The mapping (first match wins):
| Column pattern | → Slot |
|---|---|
| observation.images.<camera_key> | video (mp4 source) |
| action or action.<x> | actions |
| next.reward, next.done, next.success, next.<x> | episode_meta (rolled into per-episode metadata) |
| timestamp, frame_index, episode_index, index, task_index | internal (skipped) |
| observation.state | sensors |
| Any other observation.<x> | sensors |
| Anything else | sensors (safe default) |
Camera keys are read from info.json["features"], not from the
parquet - LeRobot v2.1 stores image data in videos/.../<key>/...mp4
and references it by feature name only. The classifier is a pure
function - you can call lerobot.classify_column("...") to
sanity-check what the encoder will do without writing anything to disk.
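To make the first-match-wins mapping concrete, here is a minimal re-implementation of the table above in plain Python. It is a sketch for illustration only: classify_column_sketch is a hypothetical name, and the real lerobot.classify_column may differ in signature and return type.

```python
# Columns the table lists as internal bookkeeping (skipped by the encoder).
INTERNAL_COLUMNS = {"timestamp", "frame_index", "episode_index", "index", "task_index"}


def classify_column_sketch(name: str) -> str:
    """Hypothetical mirror of the mapping table; first matching rule wins."""
    if name.startswith("observation.images."):
        return "video"
    if name == "action" or name.startswith("action."):
        return "actions"
    if name.startswith("next."):
        return "episode_meta"
    if name in INTERNAL_COLUMNS:
        return "internal"
    # observation.state, any other observation.<x>, and everything else.
    return "sensors"


for column in ("observation.images.cam_high", "action.gripper", "next.done",
               "frame_index", "observation.state", "battery_voltage"):
    print(f"{column:32s} -> {classify_column_sketch(column)}")
```

Because the function has no side effects, it can be run over a dataset's feature names up front to preview where each column would land.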
Multi-camera datasets
When a dataset has more than one observation.images.<key> feature,
the adapter tiles the per-camera mp4s horizontally into a single
video.mp4. Heights are black-padded so cameras with different
resolutions still align. Cameras are emitted in the order they
appear in info.json["features"], so the same dataset always
produces the same mosaic.
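The geometry of the horizontal mosaic can be sketched without any video library. The helper below is hypothetical (mosaic_layout is not an adapter function) and only computes the canvas size, each camera's x offset, and how many rows of black padding each camera would need; where the adapter actually places the padding (top, bottom, or split) is not stated here, so the sketch reports only the amount.

```python
from typing import List, Tuple


def mosaic_layout(sizes: List[Tuple[int, int]]) -> Tuple[int, int, List[int], List[int]]:
    """Hypothetical layout helper for a left-to-right tiling.

    sizes: (width, height) per camera, in info.json["features"] order.
    Returns (canvas_width, canvas_height, x_offsets, pad_rows).
    Assumes at least one camera.
    """
    canvas_height = max(height for _, height in sizes)
    x_offsets: List[int] = []
    pad_rows: List[int] = []
    x = 0
    for width, height in sizes:
        x_offsets.append(x)                      # left edge of this camera's tile
        pad_rows.append(canvas_height - height)  # black rows needed to reach canvas height
        x += width
    return x, canvas_height, x_offsets, pad_rows


# Three cameras, the middle one at a lower resolution.
print(mosaic_layout([(640, 480), (320, 240), (640, 480)]))
```

Since the tile order follows the feature order, the offsets are stable across runs for the same dataset.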
If you only want one camera, pass canonical_camera:
lerobot.upload_episode(
"lerobot/aloha_static_cups_open",
episode_index=0,
canonical_camera="observation.images.cam_high",
policy_version="aloha-v1",
)
Single-camera uploads skip the opencv code path entirely - no tile, no re-encode. The source mp4 is copied byte-for-byte and pushed to R2 as-is.
How sensors / actions get packed
Each non-image column contributes a set of arrays into a single NPZ file per slot. Layout uses the column name as a namespace and preserves per-frame timestamps:
sensors.npz
observation.state/_t_ns int64[N] # nanosecond timestamps
observation.state/value float32[N, K] # per-frame state vector
observation.environment_state/_t_ns int64[N]
observation.environment_state/value float32[N, M]
actions.npz
action/_t_ns int64[N]
action/value float32[N, A]
action.gripper/_t_ns int64[N]
action.gripper/value float32[N]
_t_ns is recovered from the parquet's timestamp column (LeRobot
stores it in seconds; we convert to nanoseconds for symmetry with
the ROS 2 adapter and the SDK ingest schema). Columns that aren't
1-D scalars or fixed-length lists of floats - e.g. structs, ragged
lists, strings - are skipped and recorded in
metadata.skipped_columns so you can spot them in the portal.
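The key layout and the seconds-to-nanoseconds conversion can be sketched as follows. This is an illustration under assumptions: pack_column is a hypothetical helper, plain Python lists stand in for the int64 / float32 arrays, and the rounding strategy is a guess rather than the adapter's documented behaviour.

```python
from typing import Dict, List, Sequence


def seconds_to_ns(timestamps_s: Sequence[float]) -> List[int]:
    """Convert per-frame timestamps from seconds to integer nanoseconds.

    ASSUMPTION: round to the nearest nanosecond to absorb float error.
    """
    return [round(t * 1_000_000_000) for t in timestamps_s]


def pack_column(column: str, timestamps_s: Sequence[float], values: Sequence) -> Dict[str, list]:
    """Hypothetical helper: build the '<column>/_t_ns' and '<column>/value' entries."""
    if len(timestamps_s) != len(values):
        raise ValueError("timestamps and values must have one entry per frame")
    return {
        f"{column}/_t_ns": seconds_to_ns(timestamps_s),
        f"{column}/value": list(values),
    }


# Three frames at 50 fps with a 2-dimensional action vector.
packed = pack_column("action", [0.0, 0.02, 0.04], [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])
for key, array in packed.items():
    print(key, array)
```

Namespacing by column name keeps every signal paired with its own timestamp array, which is what lets one NPZ file hold columns of different widths.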
Episode outcome
next.reward, next.done, next.success and any other next.*
column don't go into actions.npz. They describe the episode's
outcome, so they're rolled up into the episode-level metadata
instead:
{
"adapter": "lerobot",
"lerobot_repo_id": "lerobot/aloha_static_cups_open",
"lerobot_codebase_version": "v2.1",
"lerobot_episode_index": 0,
"lerobot_episode_length": 250,
"lerobot_tasks": ["pick up the cup"],
"lerobot_episode_outcome": {
"next.done": true,
"next.reward": 0.42,
"next.reward_sum": 87.5
}
}
next.reward_sum is the trajectory's cumulative reward (LeRobot
stores per-step reward, so we sum once during encoding) - what
training pipelines usually want as a single quality signal per run.
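A rough sketch of the roll-up, under a loudly stated assumption: the text only specifies that next.reward_sum is the sum of per-step rewards, so taking the final step's value for each next.* column is a guess made for illustration. roll_up_outcome is a hypothetical name, not an adapter function.

```python
from typing import Any, Dict, List


def roll_up_outcome(columns: Dict[str, List[Any]]) -> Dict[str, Any]:
    """Hypothetical roll-up of per-step next.* columns into episode metadata."""
    outcome: Dict[str, Any] = {}
    for name, values in columns.items():
        if name.startswith("next.") and values:
            # ASSUMPTION: the episode-level value is the last step's value.
            outcome[name] = values[-1]
    rewards = columns.get("next.reward")
    if rewards:
        # Documented behaviour: cumulative reward over the trajectory.
        outcome["next.reward_sum"] = float(sum(rewards))
    return outcome


episode_columns = {
    "action": [0.1, 0.2, 0.3],            # ignored: not a next.* column
    "next.reward": [0.5, 0.25, 1.0],
    "next.done": [False, False, True],
}
print(roll_up_outcome(episode_columns))
```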
Bulk uploads with progress
upload_dataset walks every trajectory by default. Pass
episode_indices= to upload a slice, and on_progress= to surface
per-episode progress in your own UI:
def progress(done, total, episode, error):
if error is not None:
print(f" [{done}/{total}] FAILED: {error}")
else:
print(f" [{done}/{total}] {episode.id}")
lerobot.upload_dataset(
"lerobot/aloha_static_cups_open",
policy_version="aloha-v1",
env_version="aloha-cell-1",
episode_indices=range(0, 10),
on_progress=progress,
)
Errors don't abort the loop by default - a single corrupted parquet
shouldn't kill a 50-episode upload. Pass stop_on_error=True to
fail fast.
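The continue-on-error behaviour can be pictured as a plain loop. The sketch below is not the adapter's implementation: upload_all and fake_upload are hypothetical, and it assumes done counts episodes attempted so far (including failures), which the callback signature above suggests but does not confirm.

```python
from typing import Any, Callable, Iterable, List, Optional, Tuple


def upload_all(
    indices: Iterable[int],
    upload_one: Callable[[int], Any],
    on_progress: Optional[Callable[[int, int, Any, Optional[Exception]], None]] = None,
    stop_on_error: bool = False,
) -> List[Any]:
    """Hypothetical sequential bulk loop with per-episode error isolation."""
    index_list = list(indices)
    total = len(index_list)
    uploaded: List[Any] = []
    for done, idx in enumerate(index_list, start=1):
        try:
            episode = upload_one(idx)
        except Exception as exc:
            if on_progress is not None:
                on_progress(done, total, None, exc)
            if stop_on_error:
                raise  # fail fast when requested
            continue   # otherwise move on to the next episode
        uploaded.append(episode)
        if on_progress is not None:
            on_progress(done, total, episode, None)
    return uploaded


def fake_upload(idx: int) -> str:
    """Stand-in uploader that fails on episode 1."""
    if idx == 1:
        raise RuntimeError("corrupted parquet")
    return f"ep-{idx}"


events: List[Tuple[int, int, bool]] = []
uploaded = upload_all(
    range(3),
    fake_upload,
    on_progress=lambda done, total, ep, err: events.append((done, total, err is None)),
)
print(uploaded, events)
```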
Encode-then-handle-it-yourself
encode_episode exposes the artifacts as files so you can inspect
or post-process before uploading:
encoded = lerobot.encode_episode(
"lerobot/aloha_static_cups_open",
episode_index=0,
output_dir="/tmp/encoded/",
)
print(encoded.duration_s, encoded.fps)
# 5.0 50.0
print([a.path for a in (encoded.video, encoded.sensors, encoded.actions) if a])
# [PosixPath('/tmp/encoded/video.mp4'),
# PosixPath('/tmp/encoded/sensors.npz'),
# PosixPath('/tmp/encoded/actions.npz')]
print(encoded.metadata["lerobot_episode_outcome"])
# {'next.done': True, 'next.reward': 0.95, 'next.reward_sum': 23.4}
Then drive start_episode / upload_* directly. This is the same plumbing
upload_episode uses internally.
Format compatibility
| LeRobot version | Status | Notes |
|---|---|---|
| v2.0 / v2.1 | ✅ supported | Used by virtually every public lerobot/* Hub dataset as of May 2026. One parquet per episode, one mp4 per episode per camera. |
| v3.0 | ❌ not yet | Multi-episode parquet shards (introduced late 2025). The adapter raises a clear ConfigurationError pointing at the v2.1 revision fallback. v3.0 support tracked for a follow-up release. |
If you hit a v3.0 dataset and need it now: pin the Hub revision to a
v2.1 tag if one exists (revision="v2.1"), or convert locally with
the lerobot CLI's downgrade script. Otherwise open an
issue - we're
prioritising v3.0 by demand.
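If you want to check a local dataset before calling the adapter, a pre-flight version check might look like the sketch below. It assumes meta/info.json carries a codebase_version field (consistent with the lerobot_codebase_version metadata shown earlier); check_codebase_version and UnsupportedFormatError are hypothetical stand-ins, not the adapter's ConfigurationError.

```python
import json

SUPPORTED_VERSIONS = {"v2.0", "v2.1"}


class UnsupportedFormatError(ValueError):
    """Hypothetical stand-in for the adapter's ConfigurationError."""


def check_codebase_version(info: dict) -> str:
    """Return the dataset's format version, or raise if it is not supported.

    ASSUMPTION: the version lives under the 'codebase_version' key of info.json.
    """
    version = info.get("codebase_version")
    if version not in SUPPORTED_VERSIONS:
        raise UnsupportedFormatError(
            f"LeRobot format {version!r} is not supported; "
            "try pinning a v2.1 revision if one exists."
        )
    return version


# In practice `info` would come from json.load() on <dataset>/meta/info.json.
info = json.loads('{"codebase_version": "v2.1", "fps": 50}')
print(check_codebase_version(info))
```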
Errors
| Exception | When |
|---|---|
| ConfigurationError | Repo / path doesn't exist, format is v3.0, parquet/mp4 missing, or pyarrow / huggingface_hub aren't installed |
| AuthError | API key bad / revoked (raised by the underlying start_episode) |
| ValidationError | Server rejected the create payload |
| TransportError | Network / DNS / timeout during the create or upload |
If an upload fails partway through, the adapter (via
Client.start_episode's standard handling) flips the run to
status="failed" with the failure reason in
metadata.failure_reason before re-raising - so you don't end up
with ghostly "recording" runs in the portal.