robotrace.log_episode
The single one-shot entry point for ingesting an episode. Equivalent
to start_episode(...) → upload all
artifacts → finalize. Use this for the 95% case of "I have files on
disk, log them and move on."
The contract is sacred
This signature is the sacred SDK contract. Once we cut 1.0.0,
breaking it requires:
- A major version bump (1.x → 2.0)
- At least one minor release that emits DeprecationWarning before the break ships, so existing training scripts get an early warning instead of a TypeError
Until 1.0.0 (we're at 0.1.0a6 today) we may still iterate on
the shape - every change lands in the SDK changelog.
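If you would rather catch a deprecation in CI than discover it in a long training run, one option is to promote DeprecationWarning to an error while your tests run. The sketch below uses only the standard library; `old_style_call` is a hypothetical stand-in for any call that triggers a deprecation and is not part of the SDK.

```python
import warnings


def old_style_call() -> str:
    """Hypothetical stand-in for a call that uses a deprecated argument."""
    warnings.warn("`foo` is deprecated; use `bar`", DeprecationWarning, stacklevel=2)
    return "ok"


def run_strict(fn):
    """Run fn() with DeprecationWarning promoted to an exception."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", DeprecationWarning)
        return fn()


try:
    run_strict(old_style_call)
    caught = None
except DeprecationWarning as exc:
    caught = exc  # in CI this would fail the job instead
```

Outside `run_strict` the filter is restored, so production runs keep the default behaviour.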
Already on OpenTelemetry? Calling log_episode(...) inside an active OTel span attaches trace_id / span_id / traceparent to the episode automatically - no new kwargs. See OpenTelemetry trace correlation for the install step (pip install 'robotrace-dev[otel]==0.1.0a6') and how the portal deep-links into your APM.
Signature
def log_episode(
    *,
    # Identification
    name: str | None = None,
    source: Literal["real", "sim", "replay"] = "real",
    robot: str | None = None,
    # Reproducibility - load-bearing
    policy_version: str | None = None,
    env_version: str | None = None,
    git_sha: str | None = None,
    seed: int | None = None,
    # Artifact paths (uploaded inline via signed PUT URLs)
    video: str | Path | None = None,
    sensors: str | Path | None = None,
    actions: str | Path | None = None,
    # Run details
    duration_s: float | None = None,
    fps: float | None = None,
    metadata: Mapping[str, Any] | None = None,
    # Final state - defaults to "ready". Pass "failed" when the run
    # errored before producing usable data.
    status: Literal["ready", "failed"] = "ready",
) -> Episode
All arguments are keyword-only - positional calls raise TypeError.
This is intentional: it lets us add new params without breaking
older call sites.
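To see what the bare `*` buys, here is a minimal sketch with a stand-in function. `log_episode_stub` is hypothetical: it copies only three of the parameters above and never touches the SDK or the network.

```python
from __future__ import annotations


def log_episode_stub(*, name: str | None = None, source: str = "real",
                     seed: int | None = None) -> dict:
    """Hypothetical stand-in with the same keyword-only shape as log_episode."""
    return {"name": name, "source": source, "seed": seed}


# Keyword call: works, and keeps working if new parameters are added later.
ok = log_episode_stub(name="pick_place_001", seed=42)

# Positional call: rejected by Python itself before the function body runs.
try:
    log_episode_stub("pick_place_001")
    positional_error = None
except TypeError as exc:
    positional_error = exc
```

Because no call site can depend on argument order, inserting a new parameter anywhere in the list cannot silently shift an existing argument into the wrong slot.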
Identification
name: str | None
Human-readable label for the run, shown in the episodes list. Falls
back to episode_<short_id> when omitted. Use the same naming
scheme across runs of the same task - it makes the list filterable.
source: "real" | "sim" | "replay"
Where the episode came from:
- real - physical robot. The default.
- sim - simulator (MuJoCo, Genesis, Isaac, Drake, etc.).
- replay - generated by re-rolling a policy against a previously recorded observation stream. The eval engine sets this for you; you generally don't pass it manually.
robot: str | None
Stable identifier for the physical robot or sim configuration that
produced the episode. Recommend a short slug
(halcyon-bimanual-01, franka-right, ur5-cell-3) so the portal
can group runs across days.
Reproducibility (load-bearing)
These four fields exist so future-you can re-roll a new policy against this episode and know what changed. Don't drop them to "simplify" - the eval engine literally can't run without them.
policy_version: str | None
A stable identifier for the policy / model checkpoint that produced this episode. Conventions we recommend:
| Style | Example |
|---|---|
| SL/IL | ckpt_2026-05-01_step_180k |
| RL | ppo_2026-05-01_seed42 |
| Frozen baseline | baseline_v1 |
| VLA | pap-v3.2.1 (semver against the policy) |
Whatever you pick - make it resolvable. The re-roll feature can only re-run a policy version it can locate, so don't put random hashes here unless your registry can map them back to weights.
env_version: str | None
The environment / world version. For sim, the build hash or config
tag (mujoco_warehouse_v3, genesis-rev412). For real-world, the
workcell setup version (cell_a_2026-04-12). Required so re-rolls
know whether comparing across policy_versions is fair.
git_sha: str | None
The git SHA of your training/inference code at the time the episode was produced. We don't validate that the SHA exists in any specific repo - that's between you and your CI. Seven characters minimum is the convention.
seed: int | None
The seed used by the policy / env. If your stack uses multiple
seeds, pass the highest-level one and stash the rest in metadata.
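One way to keep the four reproducibility fields from being forgotten is to build them in a single helper. This is a sketch under stated assumptions: `current_git_sha` and `repro_kwargs` are hypothetical helper names, the SHA is read with `git rev-parse HEAD`, and secondary seeds are stored under a `seeds` key in metadata, which is a convention chosen here and not one the SDK prescribes.

```python
from __future__ import annotations

import subprocess
from typing import Any


def current_git_sha(cwd: str = ".") -> str | None:
    """Return the full HEAD SHA, or None when git or the repo is unavailable."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=cwd, capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return out.stdout.strip() or None


def repro_kwargs(policy_version: str, env_version: str, seed: int,
                 extra_seeds: dict[str, int] | None = None) -> dict[str, Any]:
    """Build the four reproducibility fields; stash secondary seeds in metadata."""
    kwargs: dict[str, Any] = {
        "policy_version": policy_version,
        "env_version": env_version,
        "git_sha": current_git_sha(),
        "seed": seed,
    }
    if extra_seeds:
        kwargs["metadata"] = {"seeds": dict(extra_seeds)}
    return kwargs


fields = repro_kwargs("ckpt_2026-05-01_step_180k", "mujoco_warehouse_v3",
                      seed=42, extra_seeds={"env": 7, "sampler": 11})
# The real call would then look like: rt.log_episode(**fields, video="run.mp4")
```

If you also pass your own metadata, merge it with the `seeds` entry before the call, since a keyword can only be supplied once.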
Artifacts
Local file paths. Each is uploaded to Cloudflare R2 via a short-lived signed PUT URL - bytes never touch the RoboTrace origin server. The SDK streams from disk so memory stays flat regardless of file size.
video: str | Path | None
A video file (.mp4, .webm, .mov). The signed URL is minted
with Content-Type: video/mp4, so the file's actual content type
needs to match. Files up to 8 GB are supported in Phase 1; split
longer episodes.
sensors: str | Path | None
A serialized sensor blob - typically a .npy, .npz, .h5, or
.bin file containing per-step sensor arrays ((T, ...) shaped, time
axis first). Format is opaque to the server: we store the bytes and
let your replay tooling deserialize them.
actions: str | Path | None
A serialized actions blob - typically a .parquet, .feather, or
.npy file containing the (T, action_dim) action vector. Required
if you want to re-roll a different policy on this episode later.
The SDK sanity-checks file extensions against slot names. Passing
actions="run.mp4" raises ConfigurationError - likely the kwargs
got swapped.
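If you want the same kind of check before the SDK is even invoked (for example in a dataset-preparation script), a pre-flight sketch could look like the following. The allow-list is an assumption assembled from the extensions named in this section, `check_slots` is a hypothetical helper, and it raises a plain ValueError where the SDK raises ConfigurationError; the SDK's real rules may differ.

```python
from __future__ import annotations

from pathlib import Path

# Assumed allow-list, taken from the extensions named in this section;
# the SDK's real table may differ.
ALLOWED = {
    "video": {".mp4", ".webm", ".mov"},
    "sensors": {".npy", ".npz", ".h5", ".bin"},
    "actions": {".parquet", ".feather", ".npy"},
}


def check_slots(**paths: str | Path | None) -> None:
    """Raise ValueError when a path's extension does not fit its slot."""
    for slot, path in paths.items():
        if path is None:
            continue  # slot not used for this episode
        if slot not in ALLOWED:
            raise ValueError(f"unknown artifact slot: {slot!r}")
        ext = Path(path).suffix.lower()
        if ext not in ALLOWED[slot]:
            raise ValueError(f"{slot}={str(path)!r} has unexpected extension {ext!r}")


check_slots(video="run.mp4", actions="run.parquet")  # passes silently

try:
    check_slots(actions="run.mp4")  # the swapped-kwargs case
    swap_error = None
except ValueError as exc:
    swap_error = str(exc)
```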
Run details
duration_s: float | None
Wall-clock duration of the run in seconds. Shown on the detail page and used by the dashboard heatmap to weight cells.
fps: float | None
Sampling rate for the recorded sensors / actions. Used by the replay viewer (when it ships) to align video and sensor tracks.
metadata: Mapping[str, Any] | None
Free-form JSON metadata stored as metadata jsonb on the episode
row. Use it for anything that doesn't fit the standard fields:
operator, lighting, shift, hardware revision, task outcome, etc.
Don't put bytes or raw sensor values here - that's what the artifact slots are for. The column is indexed for JSON search but not designed for multi-MB blobs.
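A small guard can catch both mistakes (non-JSON values and oversized payloads) before the request is sent. This is a sketch only: `validate_metadata` is a hypothetical helper and the 64 KiB budget is an arbitrary number picked for illustration, not a limit documented by the SDK or the server.

```python
import json
from typing import Any, Mapping

MAX_METADATA_BYTES = 64 * 1024  # hypothetical budget, not an SDK limit


def validate_metadata(metadata: Mapping[str, Any]) -> int:
    """Return the encoded size in bytes; raise if not JSON or too large."""
    try:
        encoded = json.dumps(dict(metadata)).encode("utf-8")
    except (TypeError, ValueError) as exc:
        raise ValueError(f"metadata is not JSON-serializable: {exc}") from exc
    if len(encoded) > MAX_METADATA_BYTES:
        raise ValueError(
            f"metadata is {len(encoded)} bytes; move bulk data to an artifact slot"
        )
    return len(encoded)


size = validate_metadata({"operator": "alice", "lighting": "dim", "task_success": True})

try:
    validate_metadata({"frame": b"\x00\x01"})  # raw bytes are rejected by json.dumps
    bytes_error = None
except ValueError as exc:
    bytes_error = str(exc)
```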
status: "ready" | "failed"
Final state to flip the episode into. Defaults to "ready". Pass
"failed" when you know the run errored before producing usable
data - the episode still appears in the list but is filtered out of
"recent successful runs" cards.
Return value
@dataclass
class Episode:
    id: str  # uuid, as str
    status: str  # "ready" or "failed"
    storage: Literal["r2", "unconfigured"]
    upload_urls: dict[ArtifactKind, UploadUrl]
You rarely need the return value from log_episode - by the time it
returns, everything's already uploaded and finalized. Useful when
you want to capture the episode id for your own logs:
ep = rt.log_episode(...)
my_logger.info("logged episode", episode_id=ep.id)
Errors
log_episode raises typed exceptions on every failure path. See
Errors for the full hierarchy and recovery
patterns. The most common ones in this call:
| Exception | When |
|---|---|
| ConfigurationError | api_key / base_url missing, or a file path doesn't exist |
| AuthError | API key bad / revoked |
| ValidationError | Payload didn't pass server-side validation |
| ConflictError | (rare) Episode is somehow already archived |
| TransportError | Network / DNS / timeout |
| ServerError | 5xx - flag for retries |
If an upload fails partway through, the SDK auto-flips the run to
status="failed" with the failure reason in metadata.failure_reason
before re-raising - so you don't end up with ghostly "recording"
runs in the portal.
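For the transient failures in the table (TransportError, ServerError), a retry wrapper with exponential backoff is one possible recovery pattern. The sketch below is self-contained, so it defines local stand-in exception classes with the same names; in real code you would import the SDK's own exceptions instead. `with_retries` and `flaky_log` are hypothetical names. Whether a retried call creates a second episode is not specified here, so check that before retrying blindly.

```python
import time


class TransportError(Exception):
    """Local stand-in; in real code import the SDK's exception instead."""


class ServerError(Exception):
    """Local stand-in; in real code import the SDK's exception instead."""


def with_retries(fn, *, attempts=3, base_delay_s=1.0, sleep=time.sleep):
    """Call fn(), retrying transient failures with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except (TransportError, ServerError):
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay_s * 2 ** (attempt - 1))


# Demo with a fake call that fails twice, then succeeds.
calls = {"n": 0}


def flaky_log():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ServerError("503")
    return "episode-id"


delays = []  # record the requested delays instead of actually sleeping
result = with_retries(flaky_log, sleep=delays.append)
```

Non-transient errors such as AuthError or ValidationError are deliberately not caught: retrying them would only repeat the same failure.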
Don'ts
- Don't call log_episode from inside your training inner loop. Rate-limit at episode boundaries, not per step.
- Don't put episode bytes in metadata. The DB is for metadata, R2 is for bytes.
- Don't log the API key in your training script - use environment variables. The SDK never logs the key value.
- Don't pass positional arguments. The contract is keyword-only on purpose.