`robotrace.log_episode`

The single one-shot entry point for ingesting an episode. Equivalent to start_episode(...) → upload all artifacts → finalize. Use this for the 95% case of "I have files on disk, log them and move on."

The contract is sacred

This signature is the sacred SDK contract. Once we cut 1.0.0, breaking it requires:

A major version bump (1.x → 2.0)
At least one minor release that emits a DeprecationWarning before the break ships, so existing training scripts get an early warning instead of a TypeError
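As a rough illustration of that deprecation window, the sketch below shows a keyword-only function that still accepts an old kwarg for one release while warning about it. The function `log_episode_sketch` and the legacy kwarg `policy` are hypothetical names invented for this example; nothing here implies the SDK has renamed or will rename any parameter.

```python
import warnings


def log_episode_sketch(*, policy_version=None, **legacy):
    """Stand-in for a keyword-only API mid-rename (hypothetical, not the SDK)."""
    if "policy" in legacy:
        # Hypothetical old name: still accepted, but flagged so callers can
        # migrate before a major version removes it.
        warnings.warn(
            "`policy` is deprecated; use `policy_version`",
            DeprecationWarning,
            stacklevel=2,
        )
        old_value = legacy.pop("policy")
        if policy_version is None:
            policy_version = old_value
    if legacy:
        # Anything else unknown fails the same way a removed kwarg would.
        raise TypeError(f"unexpected keyword arguments: {sorted(legacy)}")
    return {"policy_version": policy_version}


# Demo: the old spelling still works, but a DeprecationWarning is recorded.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = log_episode_sketch(policy="baseline_v1")
```

Running training scripts with `python -W error::DeprecationWarning` is one way to surface such warnings in CI before the break lands.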

Until 1.0.0 (we're at 0.1.0a6 today) we may still iterate on the shape - every change lands in the SDK changelog.

Already on OpenTelemetry? Calling log_episode(...) inside an active OTel span attaches trace_id / span_id / traceparent to the episode automatically - no new kwargs. See OpenTelemetry trace correlation for the install step (pip install 'robotrace-dev[otel]==0.1.0a6') and how the portal deep-links into your APM.
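For orientation, a traceparent value follows the W3C Trace Context layout: a version, a 32-hex-digit trace id, a 16-hex-digit span id, and 2-hex-digit flags, joined by hyphens. The sketch below formats one from integer ids. `format_traceparent` is a hypothetical helper written for this illustration, not SDK code, and how the SDK reads the active span is not shown here.

```python
def format_traceparent(trace_id: int, span_id: int, sampled: bool = True) -> str:
    """Build a W3C Trace Context `traceparent` value (version 00)."""
    flags = 0x01 if sampled else 0x00
    # Zero-padded lowercase hex: 32 digits for the trace id, 16 for the span id.
    return f"00-{trace_id:032x}-{span_id:016x}-{flags:02x}"


# Ids taken from the example in the W3C Trace Context specification.
example = format_traceparent(
    trace_id=0x4BF92F3577B34DA6A3CE929D0E0E4736,
    span_id=0x00F067AA0BA902B7,
)
```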

Signature

def log_episode(
    *,
    # Identification
    name: str | None = None,
    source: Literal["real", "sim", "replay"] = "real",
    robot: str | None = None,
 
    # Reproducibility - load-bearing
    policy_version: str | None = None,
    env_version: str | None = None,
    git_sha: str | None = None,
    seed: int | None = None,
 
    # Artifact paths (uploaded inline via signed PUT URLs)
    video: str | Path | None = None,
    sensors: str | Path | None = None,
    actions: str | Path | None = None,
 
    # Run details
    duration_s: float | None = None,
    fps: float | None = None,
    metadata: Mapping[str, Any] | None = None,
 
    # Final state - defaults to "ready". Pass "failed" when the run
    # errored before producing usable data.
    status: Literal["ready", "failed"] = "ready",
) -> Episode

All arguments are keyword-only - positional calls raise TypeError. This is intentional: it lets us add new params without breaking older call sites.
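The keyword-only behaviour comes from the bare `*` in the signature and is enforced by Python itself. A minimal sketch using a local stand-in, `log_episode_stub`, which is not the real SDK function and only mirrors its shape:

```python
def log_episode_stub(*, name=None, source="real"):
    """Stand-in with the same keyword-only shape (not the real SDK function)."""
    return {"name": name, "source": source}


# Keyword call: accepted.
ok = log_episode_stub(name="pick_place_001", source="sim")

# Positional call: Python rejects it before the function body runs.
try:
    log_episode_stub("pick_place_001")
    positional_rejected = False
except TypeError:
    positional_rejected = True
```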

Identification

`name: str | None`

Human-readable label for the run, shown in the episodes list. Falls back to episode_<short_id> when omitted. Use the same naming scheme across runs of the same task - it makes the list filterable.

`source: "real" | "sim" | "replay"`

Where the episode came from:

real - physical robot. The default.
sim - simulator (MuJoCo, Genesis, Isaac, Drake, etc.).
replay - generated by re-rolling a policy against a previously recorded observation stream. The eval engine sets this for you; you generally don't pass it manually.

`robot: str | None`

Stable identifier for the physical robot or sim configuration that produced the episode. Recommend a short slug (halcyon-bimanual-01, franka-right, ur5-cell-3) so the portal can group runs across days.

Reproducibility (load-bearing)

These four fields exist so future-you can re-roll a new policy against this episode and know what changed. Don't drop them to "simplify" - the eval engine literally can't run without them.

`policy_version: str | None`

A stable identifier for the policy / model checkpoint that produced this episode. Conventions we recommend:

Style	Example
SL/IL	`ckpt_2026-05-01_step_180k`
RL	`ppo_2026-05-01_seed42`
Frozen baseline	`baseline_v1`
VLA	`pap-v3.2.1` (semver against the policy)

Whatever you pick - make it resolvable. The re-roll feature can only re-run a policy version it can locate, so don't put random hashes here unless your registry can map them back to weights.

`env_version: str | None`

The environment / world version. For sim, the build hash or config tag (mujoco_warehouse_v3, genesis-rev412). For real-world, the workcell setup version (cell_a_2026-04-12). Re-rolls need it to judge whether a comparison across policy_version values is fair.

`git_sha: str | None`

The git SHA of your training/inference code at the time the episode was produced. We don't validate that the SHA exists in any specific repo - that's between you and your CI. Seven characters minimum is the convention.
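One way to fill git_sha is to ask git directly at launch time. The helper below is a sketch, not part of the SDK: `current_git_sha` is a hypothetical name, and it returns None when git is unavailable or the working directory is not a repository.

```python
import subprocess
from typing import Optional


def current_git_sha() -> Optional[str]:
    """Return the full SHA of HEAD, or None if it cannot be determined."""
    try:
        proc = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # git is not installed, or we are not inside a repository.
        return None
    return proc.stdout.strip() or None


sha = current_git_sha()
```

The result can be passed straight through as `git_sha=sha`; the full SHA comfortably meets the seven-character convention.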

`seed: int | None`

The seed used by the policy / env. If your stack uses multiple seeds, pass the highest-level one and stash the rest in metadata.
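A small sketch of the multi-seed case. The seed names (`global`, `env`, `policy`, `numpy`) and the `seeds` metadata key are assumptions chosen for illustration; use whatever breakdown your stack actually has.

```python
# Hypothetical breakdown of the seeds a training stack might use.
seeds = {"global": 42, "env": 1042, "policy": 2042, "numpy": 3042}

episode_kwargs = {
    # The highest-level seed goes in the dedicated field...
    "seed": seeds["global"],
    # ...and the derived ones are stashed in metadata.
    "metadata": {
        "seeds": {name: value for name, value in seeds.items() if name != "global"}
    },
}

# These would then be splatted into the call alongside the other fields,
# e.g. rt.log_episode(**episode_kwargs, policy_version=..., ...).
```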

Artifacts

Local file paths. Each is uploaded to Cloudflare R2 via a short-lived signed PUT URL - bytes never touch the RoboTrace origin server. The SDK streams from disk so memory stays flat regardless of file size.
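To illustrate why streaming keeps memory flat, here is a generic chunked reader. It is a sketch of the general technique only; the SDK's actual upload code is not shown on this page, and `iter_file_chunks` and its 8 MiB default are assumptions.

```python
import os
import tempfile
from pathlib import Path
from typing import Iterator, Union


def iter_file_chunks(
    path: Union[str, Path], chunk_size: int = 8 * 1024 * 1024
) -> Iterator[bytes]:
    """Yield a file's bytes in fixed-size chunks so memory use stays bounded."""
    with Path(path).open("rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            yield chunk


# Demo on a small temporary file: 2500 bytes read 1000 at a time.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "blob.bin")
    with open(demo_path, "wb") as out:
        out.write(b"x" * 2500)
    chunk_sizes = [len(c) for c in iter_file_chunks(demo_path, chunk_size=1000)]
```

At most one chunk is held in memory at a time, so peak usage depends on `chunk_size` rather than on the size of the file.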

`video: str | Path | None`

A video file (.mp4, .webm, .mov). The signed URL is minted with Content-Type: video/mp4, so the file's actual content type needs to match. Files up to 8 GB are supported in Phase 1; split any episode whose video would exceed that.

`sensors: str | Path | None`

A serialized sensor blob - typically a .npy, .npz, .h5, or .bin file containing per-step sensor arrays shaped (T, ...), with the time axis first. Format is opaque to the server: we store the bytes and let your replay tooling deserialize them.

`actions: str | Path | None`

A serialized actions blob - typically a .parquet, .feather, or .npy file containing the action array of shape (T, action_dim). Required if you want to re-roll a different policy on this episode later.

The SDK sanity-checks file extensions against slot names. Passing actions="run.mp4" raises ConfigurationError - likely the kwargs got swapped.
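A sketch of what such an extension check could look like. The allow-list below is assembled from the extensions mentioned in the slot descriptions above and is an assumption; the SDK's real list may differ. `ConfigurationError` here is a local stand-in class, and `check_slot` is a hypothetical helper.

```python
from pathlib import Path


class ConfigurationError(Exception):
    """Local stand-in for the SDK exception of the same name."""


# Assumed allow-list, built from the extensions named in this section.
ALLOWED_EXTENSIONS = {
    "video": {".mp4", ".webm", ".mov"},
    "sensors": {".npy", ".npz", ".h5", ".bin"},
    "actions": {".parquet", ".feather", ".npy"},
}


def check_slot(slot: str, path: str) -> None:
    """Raise ConfigurationError if the file extension does not fit the slot."""
    suffix = Path(path).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS[slot]:
        raise ConfigurationError(
            f"{slot}={path!r}: unexpected extension {suffix!r}; were kwargs swapped?"
        )


check_slot("video", "run.mp4")  # fine

try:
    check_slot("actions", "run.mp4")  # a video passed to the actions slot
    swapped_detected = False
except ConfigurationError:
    swapped_detected = True
```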

Run details

`duration_s: float | None`

Wall-clock duration of the run in seconds. Shown on the detail page and used by the dashboard heatmap to weight cells.

`fps: float | None`

Sampling rate for the recorded sensors / actions. Used by the replay viewer (when it ships) to align video and sensor tracks.
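If the recording is sampled at a fixed rate, duration_s and fps are tied together by the step count. The helper below is a sketch under that uniform-sampling assumption; `duration_from_steps` is a hypothetical name, and a measured wall-clock duration should take precedence when you have one.

```python
def duration_from_steps(num_steps: int, fps: float) -> float:
    """Approximate run duration in seconds, assuming uniform sampling."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_steps / fps


# 900 recorded steps at 30 samples per second.
duration_s = duration_from_steps(num_steps=900, fps=30.0)
```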

`metadata: Mapping[str, Any] | None`

Free-form JSON metadata stored as metadata jsonb on the episode row. Use it for anything that doesn't fit the standard fields: operator, lighting, shift, hardware revision, task outcome, etc.

Don't put bytes or raw sensor values here - that's what the artifact slots are for. The column is indexed for JSON search but not designed for multi-MB blobs.
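A pre-flight check along these lines can catch accidental bytes or oversized payloads before the call. It is a sketch only: `check_metadata` is a hypothetical helper, and the 64 KiB cap is an arbitrary illustrative threshold, not a documented SDK or server limit.

```python
import json

# Arbitrary illustrative cap; not a documented limit.
MAX_METADATA_BYTES = 64 * 1024


def check_metadata(metadata: dict) -> int:
    """Return the encoded size in bytes, or raise if the payload is unsuitable."""
    # json.dumps raises TypeError for bytes and other non-JSON values.
    encoded = json.dumps(metadata).encode("utf-8")
    if len(encoded) > MAX_METADATA_BYTES:
        raise ValueError(
            f"metadata is {len(encoded)} bytes; move bulk data to an artifact slot"
        )
    return len(encoded)


size = check_metadata({"operator": "alice", "lighting": "dim", "shift": 2})

try:
    check_metadata({"frame": b"\x00\x01"})  # raw bytes do not belong here
    bytes_rejected = False
except TypeError:
    bytes_rejected = True
```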

`status: "ready" | "failed"`

Final state to flip the episode into. Defaults to "ready". Pass "failed" when you know the run errored before producing usable data - the episode still appears in the list but is filtered out of "recent successful runs" cards.
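One way to decide between the two values is to wrap the rollout and map any crash to "failed". This is a sketch: `run_and_pick_status` and the `error` metadata key are hypothetical names chosen for the example.

```python
def run_and_pick_status(run_fn):
    """Run `run_fn` and map the outcome to (status, extra_metadata)."""
    try:
        run_fn()
    except Exception as exc:  # broad on purpose: any crash means no usable data
        return "failed", {"error": f"{type(exc).__name__}: {exc}"}
    return "ready", {}


def crashing_rollout():
    raise RuntimeError("gripper timeout")


status, extra = run_and_pick_status(crashing_rollout)
```

The resulting `status` can be passed as the status argument, with `extra` merged into metadata so the failure cause is visible in the portal.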

Return value

@dataclass
class Episode:
    id: str                               # uuid, as str
    status: str                           # "ready" or "failed"
    storage: Literal["r2", "unconfigured"]
    upload_urls: dict[ArtifactKind, UploadUrl]

You rarely need the return value from log_episode - by the time it returns, everything's already uploaded and finalized. It is useful when you want to capture the episode id for your own logs:

ep = rt.log_episode(...)
my_logger.info("logged episode", episode_id=ep.id)

Errors

log_episode raises typed exceptions on every failure path. See Errors for the full hierarchy and recovery patterns. The most common ones in this call:

Exception	When
`ConfigurationError`	`api_key` / `base_url` missing, or a file path doesn't exist
`AuthError`	API key bad / revoked
`ValidationError`	Payload didn't pass server-side validation
`ConflictError`	(rare) Episode is somehow already archived
`TransportError`	Network / DNS / timeout
`ServerError`	5xx - flag for retries

If an upload fails partway through, the SDK auto-flips the run to status="failed" with the failure reason in metadata.failure_reason before re-raising - so you don't end up with ghostly "recording" runs in the portal.
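For the transient failures in the table (TransportError, ServerError), a retry wrapper with exponential backoff is a common pattern. The sketch below uses local stand-in exception classes so it runs on its own; `with_retries` is a hypothetical helper, not an SDK feature. Since a failed attempt is already flipped to "failed", it is reasonable to assume each retry produces a fresh episode rather than resuming the old one.

```python
import time


class TransportError(Exception):
    """Local stand-in for the SDK exception of the same name."""


class ServerError(Exception):
    """Local stand-in for the SDK exception of the same name."""


def with_retries(call, attempts=3, base_delay_s=1.0, sleep=time.sleep):
    """Retry `call` on transient errors, doubling the delay each time."""
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except (TransportError, ServerError):
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay_s * 2 ** (attempt - 1))


# Demo with a fake call that fails twice, then succeeds.
outcomes = [ServerError("503"), TransportError("timeout"), "ep_123"]
delays = []


def flaky_call():
    result = outcomes.pop(0)
    if isinstance(result, Exception):
        raise result
    return result


# `sleep` is injected so the demo records delays instead of waiting.
episode_id = with_retries(flaky_call, sleep=delays.append)
```

Non-transient errors such as AuthError or ValidationError are deliberately not caught here, since retrying them would not help.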

Don'ts

Don't call log_episode from inside your training inner loop. Rate-limit at episode boundaries, not per step.
Don't put episode bytes in metadata. The DB is for metadata, R2 is for bytes.
Don't log the API key in your training script - use environment variables. The SDK never logs the key value.
Don't pass positional arguments. The contract is keyword-only on purpose.
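For the API key item above, a minimal sketch of reading the key from the environment and producing a log-safe form. The variable name `ROBOTRACE_API_KEY` is an assumption for this example (check the SDK configuration docs for the name it actually reads), and both helpers are hypothetical.

```python
import os


def load_api_key(env_var: str = "ROBOTRACE_API_KEY") -> str:
    """Read the key from the environment; the variable name is an assumption."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set")
    return key


def redact(secret: str) -> str:
    """Log-safe form of a secret: mask everything except the last 4 characters."""
    return "*" * max(len(secret) - 4, 0) + secret[-4:]
```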