LeRobot adapter

Reads Hugging Face LeRobot datasets (format v2.1) and creates one RoboTrace episode per trajectory. Tiny install footprint - the adapter does not depend on the heavy lerobot PyPI package (which would pull torch, torchvision, pyav, and several CUDA wheels). It reads the on-disk format directly with pyarrow + huggingface_hub, so the install adds about 20 MB on top of the base SDK.

from robotrace.adapters import lerobot
 
# Upload every trajectory in a Hub dataset as its own RoboTrace episode.
lerobot.upload_dataset(
    "lerobot/aloha_static_cups_open",
    policy_version="aloha-v1",
    env_version="aloha-cell-1",
)

That's the whole 95% case. Read on for the four explicit verbs, column auto-classification, multi-camera handling, and how this maps to LeRobot's data model.

Install

# Sensors / actions only, or a single camera pinned via canonical_camera.
pip install 'robotrace-dev[lerobot]==0.1.0a6'
 
# With multi-camera horizontal tiling (most LeRobot datasets have 2+ cams).
pip install 'robotrace-dev[lerobot,video]==0.1.0a6'

Pinning the exact version is the most reliable way to install during alpha. The pin goes away once we cut 1.0.

[lerobot] pulls in huggingface_hub, pyarrow, and numpy. The [video] extra adds opencv-python for tiling multiple cameras into one video.mp4. Single-camera uploads (or any canonical_camera="..."-pinned upload) don't need [video] because the source mp4 is copied byte-for-byte without re-encoding.

LeRobot dataset format v3.0 (multi-episode parquet shards, introduced late 2025) is not yet supported - the adapter raises a clear ConfigurationError pointing at v2.1 fallbacks. v3.0 lands in a follow-up release.

The four verbs

Verb	What it does
`lerobot.scan_dataset(repo)`	Read-only introspection. Pulls only `meta/*` from the Hub. Returns a `DatasetSummary` with fps, episode count, frame count, camera list, and per-episode lengths and tasks.
`lerobot.encode_episode(repo, idx, out)`	Fetch one episode's parquet + per-camera mp4s, write `video.mp4` / `sensors.npz` / `actions.npz` into `out`. Returns an `EncodedEpisode` with the file paths and provenance metadata. No upload.
`lerobot.upload_episode(repo, idx, ...)`	One-shot for a single trajectory: scan → encode to a tempdir → `start_episode` + `upload_*` + `finalize`. Returns the finalized `Episode`.
`lerobot.upload_dataset(repo, ...)`	Bulk: walk every trajectory (or a subset) and call `upload_episode` for each. Sequential - one episode at a time, each in a fresh tempdir, so disk usage stays at one trajectory's worth at any moment.

scan_dataset is the dry-run - most users start there to see how many episodes the adapter would upload before paying the network cost.

summary = lerobot.scan_dataset("lerobot/aloha_static_cups_open")
print(summary.report())
# lerobot/aloha_static_cups_open  (hub, v2.1, 50 fps)
#   episodes: 50, frames: 12500
#   cameras: observation.images.cam_high, observation.images.cam_low_left, observation.images.cam_low_right
#   features: action, observation.state, next.reward, next.done

If it looks right, swap scan_dataset for upload_dataset and you're done.

Local datasets vs. Hub datasets

The first argument can be either a Hub repo id (namespace/dataset-name) or a local directory containing the meta/, data/, videos/ layout. Resolution is automatic - anything that exists on disk wins, otherwise we hit the Hub.

# Hub dataset (downloads files lazily, caches in ~/.cache/huggingface).
lerobot.upload_dataset("lerobot/pusht", policy_version="pusht-v1")
 
# Local dataset on a workstation.
lerobot.upload_dataset("/data/robot_runs/2026-05-10/", policy_version="pusht-v1")
 
# Pin the Hub revision to a specific commit / tag / branch.
lerobot.upload_dataset(
    "lerobot/aloha_static_cups_open",
    revision="v2.1",
    policy_version="aloha-v1",
)
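The "anything that exists on disk wins" rule can be pictured with a few lines of standard-library Python. This is a minimal sketch, not the adapter's actual code: resolve_source is a hypothetical helper name, and the exact check the adapter performs (for example whether it expands ~) is an assumption.

```python
from pathlib import Path


def resolve_source(ref: str) -> str:
    """Return "local" if `ref` exists on disk, otherwise "hub".

    Hypothetical helper for illustration only; not part of the adapter's API.
    """
    # Assumption: the on-disk check comes first, so a matching path always wins.
    return "local" if Path(ref).expanduser().exists() else "hub"


print(resolve_source("."))  # local
```

One consequence of this rule, if the adapter behaves like the sketch: a relative directory such as ./lerobot/pusht in your working directory would shadow the Hub repo of the same name, so run bulk uploads from a directory where that can't happen.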

For private or gated datasets, set HF_TOKEN in your environment - huggingface_hub reads it automatically.

Column auto-classification

LeRobot datasets use a strong dotted-column convention, so the classifier is mechanical. The mapping (first match wins):

Column pattern	→ Slot
`observation.images.<camera_key>`	`video` (mp4 source)
`action` or `action.<x>`	`actions`
`next.reward`, `next.done`, `next.success`, `next.<x>`	`episode_meta` (rolled into per-episode metadata)
`timestamp`, `frame_index`, `episode_index`, `index`, `task_index`	`internal` (skipped)
`observation.state`	`sensors`
Any other `observation.<x>`	`sensors`
Anything else	`sensors` (safe default)

Camera keys are read from info.json["features"], not from the parquet - LeRobot v2.1 stores image data in videos/.../<key>/...mp4 and references them by feature name only. The classifier is a pure function - you can call lerobot.classify_column("...") to sanity-check what the encoder will do without writing anything to disk.
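To make the "first match wins" ordering concrete, here is a minimal sketch of the mapping table as plain Python. It mirrors the rules listed above but is not the adapter's implementation; classify_column_sketch and INTERNAL_COLUMNS are illustrative names, and the slot strings are taken from the table.

```python
INTERNAL_COLUMNS = {"timestamp", "frame_index", "episode_index", "index", "task_index"}


def classify_column_sketch(name: str) -> str:
    """Map a LeRobot column name to a slot; the first matching rule wins."""
    if name.startswith("observation.images."):
        return "video"
    if name == "action" or name.startswith("action."):
        return "actions"
    if name.startswith("next."):
        return "episode_meta"
    if name in INTERNAL_COLUMNS:
        return "internal"
    # observation.state, any other observation.<x>, and anything
    # unrecognised all fall through to the safe default.
    return "sensors"


for column in ("observation.images.cam_high", "action.gripper", "next.done", "index", "observation.state"):
    print(column, "->", classify_column_sketch(column))
```

Because the function has no side effects, it is cheap to run over every column name in a dataset before encoding anything.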

Multi-camera datasets

When a dataset has more than one observation.images.<key> feature, the adapter tiles the per-camera mp4s horizontally into a single video.mp4. Heights are black-padded so cameras with different resolutions still align. Cameras are emitted in the order they appear in info.json["features"], so the same dataset always produces the same mosaic.
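The geometry of the horizontal tiling can be sketched without any video library. The snippet below only computes the canvas size and each camera's x offset. It assumes the canvas is as tall as the tallest camera and says nothing about where the black padding is placed, which the adapter decides. mosaic_layout is a hypothetical name.

```python
from typing import List, Tuple


def mosaic_layout(sizes: List[Tuple[int, int]]) -> Tuple[int, int, List[int]]:
    """Given (width, height) per camera, in info.json order,
    return (canvas_width, canvas_height, x_offsets)."""
    if not sizes:
        raise ValueError("need at least one camera")
    # Assumption: canvas height is the tallest camera; shorter frames
    # are padded with black rows to reach it.
    canvas_h = max(h for _, h in sizes)
    x_offsets = []
    x = 0
    for w, _ in sizes:
        x_offsets.append(x)  # this camera starts where the previous one ended
        x += w
    return x, canvas_h, x_offsets


print(mosaic_layout([(640, 480), (320, 240), (640, 480)]))
# (1600, 480, [0, 640, 960])
```

Since the input order is fixed by info.json["features"], the offsets are deterministic for a given dataset, which is what makes the mosaic reproducible.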

If you only want one camera, pass canonical_camera:

lerobot.upload_episode(
    "lerobot/aloha_static_cups_open",
    episode_index=0,
    canonical_camera="observation.images.cam_high",
    policy_version="aloha-v1",
)

Single-camera uploads skip the opencv code path entirely - no tile, no re-encode. The source mp4 is copied byte-for-byte and pushed to R2 as-is.

How sensors / actions get packed

Each non-image column contributes a set of arrays into a single NPZ file per slot. Layout uses the column name as a namespace and preserves per-frame timestamps:

sensors.npz
  observation.state/_t_ns       int64[N]            # nanosecond timestamps
  observation.state/value       float32[N, K]       # per-frame state vector
  observation.environment_state/_t_ns  int64[N]
  observation.environment_state/value  float32[N, M]
 
actions.npz
  action/_t_ns                  int64[N]
  action/value                  float32[N, A]
  action.gripper/_t_ns          int64[N]
  action.gripper/value          float32[N]

_t_ns is recovered from the parquet's timestamp column (LeRobot stores it in seconds; we convert to nanoseconds for symmetry with the ROS 2 adapter and the SDK ingest schema). Columns that aren't per-frame scalars or fixed-length lists of floats - e.g. structs, ragged lists, strings - are skipped and recorded in metadata.skipped_columns so you can spot them in the portal.
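As a rough illustration of the key layout and the seconds-to-nanoseconds conversion, the sketch below builds the two entries for one column using plain lists. The real files hold NumPy arrays, and pack_column is a hypothetical helper, not an adapter function. Rounding to the nearest nanosecond is also an assumption.

```python
from typing import Dict, List, Sequence


def pack_column(name: str, timestamps_s: Sequence[float], values: Sequence) -> Dict[str, List]:
    """Build the '<column>/_t_ns' and '<column>/value' entries for one column."""
    if len(timestamps_s) != len(values):
        raise ValueError("one timestamp per frame is required")
    # Seconds (float) -> nanoseconds (int); rounding avoids truncation
    # errors such as 0.29 s becoming 289999999 ns.
    t_ns = [round(t * 1_000_000_000) for t in timestamps_s]
    return {f"{name}/_t_ns": t_ns, f"{name}/value": list(values)}


packed = pack_column("action.gripper", [0.0, 0.02, 0.04], [0.1, 0.3, 0.5])
print(packed["action.gripper/_t_ns"])
# [0, 20000000, 40000000]
```

At 50 fps the frame spacing is 0.02 s, so consecutive _t_ns values differ by 20,000,000.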

Episode outcome

next.reward, next.done, next.success and any other next.* column don't go into sensors.npz or actions.npz. They describe the episode's outcome, so they're rolled up into the episode-level metadata instead:

{
  "adapter": "lerobot",
  "lerobot_repo_id": "lerobot/aloha_static_cups_open",
  "lerobot_codebase_version": "v2.1",
  "lerobot_episode_index": 0,
  "lerobot_episode_length": 250,
  "lerobot_tasks": ["pick up the cup"],
  "lerobot_episode_outcome": {
    "next.done": true,
    "next.reward": 0.42,
    "next.reward_sum": 87.5
  }
}

next.reward_sum is the trajectory's cumulative reward: LeRobot stores per-step reward, so we sum once during encoding. It's what training pipelines usually want as a single quality signal per run.
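The roll-up can be sketched as follows. Only the summing of next.reward is stated above. That the other scalar entries hold the final step's value is an assumption made for this illustration, and roll_up_outcome is a hypothetical name.

```python
from typing import Dict, List, Union

Scalar = Union[bool, int, float]


def roll_up_outcome(next_columns: Dict[str, List[Scalar]]) -> Dict[str, Scalar]:
    """Collapse per-step next.* columns into one episode-level dict."""
    outcome: Dict[str, Scalar] = {}
    for name, values in next_columns.items():
        if values:
            # ASSUMPTION: keep the last step's value for each column.
            outcome[name] = values[-1]
    rewards = next_columns.get("next.reward")
    if rewards:
        # Per-step rewards summed once into a cumulative reward.
        outcome["next.reward_sum"] = float(sum(rewards))
    return outcome


print(roll_up_outcome({
    "next.reward": [0.0, 0.5, 1.0],
    "next.done": [False, False, True],
}))
# {'next.reward': 1.0, 'next.done': True, 'next.reward_sum': 1.5}
```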

Bulk uploads with progress

upload_dataset walks every trajectory by default. Pass episode_indices= to upload a slice, and on_progress= to surface per-episode progress in your own UI:

def progress(done, total, episode, error):
    if error is not None:
        print(f"  [{done}/{total}] FAILED: {error}")
    else:
        print(f"  [{done}/{total}] {episode.id}")
 
lerobot.upload_dataset(
    "lerobot/aloha_static_cups_open",
    policy_version="aloha-v1",
    env_version="aloha-cell-1",
    episode_indices=range(0, 10),
    on_progress=progress,
)

Errors don't abort the loop by default - a single corrupted parquet shouldn't kill a 50-episode upload. Pass stop_on_error=True to fail fast.
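The continue-on-error behaviour amounts to a loop like the one below. This is a simplified sketch under stated assumptions: upload_all and upload_one are hypothetical stand-ins, the callback signature is copied from the example above, and whether the real adapter invokes on_progress before re-raising under stop_on_error=True is not confirmed here.

```python
from typing import Callable, Iterable, List, Optional

ProgressFn = Callable[[int, int, object, Optional[Exception]], None]


def upload_all(
    indices: Iterable[int],
    upload_one: Callable[[int], object],
    on_progress: Optional[ProgressFn] = None,
    stop_on_error: bool = False,
) -> List[object]:
    """Upload episodes one at a time, collecting the ones that succeed."""
    todo = list(indices)
    finished: List[object] = []
    for done, idx in enumerate(todo, start=1):
        episode, error = None, None
        try:
            episode = upload_one(idx)
            finished.append(episode)
        except Exception as exc:  # one bad episode should not kill the batch
            error = exc
        if on_progress is not None:
            on_progress(done, len(todo), episode, error)
        if error is not None and stop_on_error:
            raise error  # fail fast when asked to
    return finished
```

With the default settings a failure shows up only through the callback's error argument, so a callback that ignores it will hide failed episodes.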

Encode-then-handle-it-yourself

encode_episode exposes the artifacts as files so you can inspect or post-process before uploading:

encoded = lerobot.encode_episode(
    "lerobot/aloha_static_cups_open",
    episode_index=0,
    output_dir="/tmp/encoded/",
)
 
print(encoded.duration_s, encoded.fps)
# 5.0 50.0
print([a.path for a in (encoded.video, encoded.sensors, encoded.actions) if a])
# [PosixPath('/tmp/encoded/video.mp4'),
#  PosixPath('/tmp/encoded/sensors.npz'),
#  PosixPath('/tmp/encoded/actions.npz')]
print(encoded.metadata["lerobot_episode_outcome"])
# {'next.done': True, 'next.reward': 0.95, 'next.reward_sum': 23.4}

Then drive start_episode / upload_* directly. Same plumbing upload_episode uses internally.

Format compatibility

LeRobot version	Status	Notes
v2.0 / v2.1	✅ supported	Used by virtually every public `lerobot/*` Hub dataset as of May 2026. One parquet per episode, one mp4 per episode per camera.
v3.0	❌ not yet	Multi-episode parquet shards (introduced late 2025). The adapter raises a clear `ConfigurationError` pointing at the v2.1 revision fallback. v3.0 support tracked for a follow-up release.

If you hit a v3.0 dataset and need it now: pin the Hub revision to a v2.1 tag if one exists (revision="v2.1"), or convert locally with the lerobot CLI's downgrade script. Otherwise open an issue - we're prioritising v3.0 by demand.
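For a local dataset you can check the format version yourself before calling the adapter. The sketch below assumes the version lives under a "codebase_version" key in meta/info.json, which matches the lerobot_codebase_version value shown in the metadata example but should be confirmed against your dataset. Both function names are hypothetical.

```python
import json
from pathlib import Path

SUPPORTED_VERSIONS = {"v2.0", "v2.1"}


def detect_codebase_version(dataset_dir: str) -> str:
    """Read the format version from <dataset_dir>/meta/info.json."""
    info_path = Path(dataset_dir) / "meta" / "info.json"
    info = json.loads(info_path.read_text(encoding="utf-8"))
    # Assumption: the key is named "codebase_version".
    return info["codebase_version"]


def is_supported(version: str) -> bool:
    """True for the versions listed as supported in the table above."""
    return version in SUPPORTED_VERSIONS
```

For example, is_supported(detect_codebase_version("/data/robot_runs/2026-05-10/")) tells you up front whether the upload would run into the v3.0 ConfigurationError.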

Errors

Exception	When
`ConfigurationError`	Repo / path doesn't exist, format is v3.0, parquet/mp4 missing, or `pyarrow` / `huggingface_hub` aren't installed
`AuthError`	API key bad / revoked (raised by the underlying `start_episode`)
`ValidationError`	Server rejected the create payload
`TransportError`	Network / DNS / timeout during the create or upload

If an upload fails partway through, the adapter (via Client.start_episode's standard handling) flips the run to status="failed" with the failure reason in metadata.failure_reason before re-raising - so you don't end up with ghostly "recording" runs in the portal.