
a record NOBODY can rewrite

An agent works all night and leaves a log of every move. But a log is just text — the agent could edit it, a second agent could, you could. The ledger seals that log: each raw line folded into a hash chain whose single head is signed by the tenant's identity. Two properties fall out — tamper-evident and attributable — and neither the agent nor you can quietly rewrite the past.

the ledger · 11 min read

a log anyone can REWRITE

You hired an agent for the long run — that was the whole pitch of the parent lesson. It worked all night. This morning there's a file, _steps.jsonl, with a line for every tool call it made: the command, the arguments, the output, the timestamp. It reads like a perfect record of the night.

Now the uncomfortable question. Did it actually do what the log says? Because _steps.jsonl is plain text. The agent could rewrite it. A second agent sharing the workspace could rewrite it. You could rewrite it. And after the one moment you'd ever need an honest record — a prompt injection, a bad deploy, a step that shouldn't have run — the thing you reach for is exactly the thing anyone with file access can forge.

Logs answer what happened only if nobody touched them, and nothing about a text file enforces that. Post-incident forensics on an editable record is theater. The ledger is the part that makes the record worth trusting.

the DEFINITION

led·ger /ˈle·jər/ noun

1. a seal over the run log, not a second log: the raw bytes of _steps.jsonl folded into a SHA-256 hash chain whose single head is signed by the tenant's did:key — yielding two checkable properties, tamper_evident (the log is intact) and attributable (it's intact and bound to a named identity).

Those two words are not marketing — they're the exact field names the verifier returns. A ledger doesn't store anything new about the run; it stores one number and one signature that, together, make the existing log impossible to alter without anyone noticing. The rest of this lesson is how ~30 lines of code buy you that.

what gets logged — the CHOKEPOINT

A record you can trust starts before any sealing: it starts with where the writing happens. In this system the log is written by the agent loop, not by the agent. Every tool call passes through one place — the chokepoint — and that place appends the line, always, regardless of whether any caller wired up a callback. The agent cannot opt a step out of the record, because it never holds the pen. Nothing escapes by construction.

Every step is also wall-clock bounded — a tool that wedges is killed at 150 seconds and recorded as a logged error, never a silent stall. The event shape is the same everywhere it's written:

{ step, agent, tool, args, output, exit_code, error, dur_ms, ts }
   step    — the call's index in this run
   agent   — nil for a solo agent; the member's name in a fleet
   tool    — what was invoked  ·  args / output — the call + its result
   dur_ms  — monotonic-clock duration  ·  ts — unix seconds
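
The chokepoint idea can be sketched in a few lines of Python. This is an illustration under assumptions, not the system's actual loop: the function name run_step, the argv layout inside args, and the error strings are made up for the example. The event fields, the 150-second bound, and the 200-character output cap come from this lesson.

```python
import json
import subprocess
import sys
import time

STEP_TIMEOUT_S = 150  # wall-clock bound named in the lesson


def run_step(log_path, step, tool, argv, agent=None, timeout=STEP_TIMEOUT_S):
    """Run one command and append its event line. Hypothetical helper:
    the caller gets the event back but never writes the log itself."""
    start = time.monotonic()
    output, exit_code, error = "", None, None
    try:
        # subprocess.run kills the child if the timeout expires
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
        output, exit_code = proc.stdout, proc.returncode
    except subprocess.TimeoutExpired:
        error = f"timeout after {timeout}s"  # a logged error, not a silent stall
    except OSError as exc:
        error = str(exc)
    event = {
        "step": step,
        "agent": agent,            # None serializes to null for a solo agent
        "tool": tool,
        "args": {"argv": argv},    # assumed layout, for illustration only
        "output": output[:200],    # the lesson says only the first 200 characters are kept
        "exit_code": exit_code,
        "error": error,
        "dur_ms": int((time.monotonic() - start) * 1000),
        "ts": int(time.time()),
    }
    # The append happens here, unconditionally, whatever the caller does next.
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")
    return event
```

The design point the sketch tries to show: the write sits inside the only function that can run a tool, so there is no code path that runs a step and skips the line.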

Three different writers produce that one shape into that one file: the native agent loop, command calls made from inside a WebAssembly component through the Dock, and instrumentation spans bridged out of WASM. They use different prefixes on the tool field — a bare name, a command: name, a wasm: name — but every line lands in the same _steps.jsonl, so the seal covers all of them at once.

flowchart LR
  loop["the agent loop<br/>tool = git, fetch, …"]
  dock["WASM components<br/>via the Dock<br/>tool = command:<name>"]
  span["WASM spans<br/>instrument-enter/exit<br/>tool = wasm:<name>"]
  file[("_steps.jsonl<br/>one file · one shape<br/>appended lock-free")]
  loop -- "appends" --> file
  dock -- "appends" --> file
  span -- "appends" --> file
  style file fill:#9fc4e8,stroke:#121316,stroke-width:2.5px
  style loop fill:#ffffff,stroke:#121316
  style dock fill:#ffffff,stroke:#121316
  style span fill:#ffffff,stroke:#121316

Read the picture as three lanes draining into one basin. The loop writes the agent's own moves; the Dock lane writes what sandboxed components do on the agent's behalf; the span lane writes timing from inside WASM. None of the three can choose to stay dry. One honest note: the span bridge is built and unit-tested, but the guest-side wiring that feeds it is still an open to-do — so today the first two lanes carry the run, and the third is plumbed ahead of its source.
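
Because the three writers differ only in the prefix on the tool field, you can tell the lanes apart after the fact. A small sketch, assuming the prefixes are exactly command: and wasm: as described above; the function names and the sample tool names (command:fetch, wasm:render) are hypothetical.

```python
import json
from collections import Counter


def writer_lane(tool: str) -> str:
    """Map a tool field to the lane that wrote it, by prefix."""
    if tool.startswith("command:"):
        return "dock"   # command calls made from inside a WASM component
    if tool.startswith("wasm:"):
        return "span"   # instrumentation spans bridged out of WASM
    return "loop"       # a bare name: the native agent loop


def lane_counts(raw_lines):
    """Count how many log lines each lane contributed."""
    return Counter(writer_lane(json.loads(line)["tool"]) for line in raw_lines)
```

Run over a night's _steps.jsonl, a count of zero in the span lane would match the note above that the guest-side wiring is still open.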

the chain — one head for the NIGHT

Here's the first half of the trick, and it's transferable to anything you ever build. You don't need a blockchain or an audit service. You fold every raw log line into a running SHA-256 and the entire history collapses into one 64-character number — the head.

h(0) = sha256("workbooks-ledger-v1")          the genesis constant
h(i) = sha256( h(i-1)  ++  raw_line_i )        fold each line in
head = h(n)                                     one number for the whole run

Three details make it strong. First, the bytes hashed are the raw line, the file exactly as written rather than a re-parse and re-encode, so there's no canonicalization step to disagree about and no re-serialization bug to hide a change in. Second, because each hash feeds the next, altering, inserting, dropping, or reordering any line changes that line's hash and every hash after it, all the way to the head. Third, the sealed count is checked separately from the head, so a log with lines added or dropped is caught by the count even if the head somehow still matched.
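
The fold above is short enough to write out. A minimal sketch in Python, with two assumptions the formulas don't settle: that h(i-1) is concatenated as its raw 32 digest bytes (not hex text), and that each raw line is passed in as bytes without its trailing newline. The name chain_head is made up for the example.

```python
import hashlib

GENESIS = b"workbooks-ledger-v1"  # the genesis constant from the formulas above


def chain_head(raw_lines):
    """Fold raw log lines (bytes) into one head; returns (hex head, line count)."""
    h = hashlib.sha256(GENESIS).digest()        # h(0)
    count = 0
    for line in raw_lines:
        h = hashlib.sha256(h + line).digest()   # h(i) = sha256(h(i-1) ++ raw_line_i)
        count += 1
    return h.hex(), count


lines = [b'{"step":1,"tool":"fetch"}', b'{"step":2,"tool":"git"}', b'{"step":3,"tool":"git"}']
head, count = chain_head(lines)

# Each kind of tampering yields a different head (and some a different count).
edited = chain_head([lines[0], b'{"step":2,"tool":"rm"}', lines[2]])
dropped = chain_head(lines[:2])
reordered = chain_head([lines[1], lines[0], lines[2]])
```

Comparing head against edited[0], dropped[0], and reordered[0] shows all four differ; only dropped also changes the count.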

Walk one real fold. Suppose step three of the night was a commit:

{"step":3,"agent":"waldo","tool":"git","args":{"cmd":"commit"},
 "output":"[main 4f2c1d9] post: morning links (pushed)",
 "exit_code":0,"error":null,"dur_ms":812,"ts":1765540210}

   h(3) = sha256( h(2)  ++  the raw bytes of that line )

Nothing is parsed. The line is hashed as the literal text on disk, byte for byte. Do that for all 47 lines and you have one head. Picture the chain — and picture what an edit does to it:

flowchart LR
  g["genesis<br/>workbooks-ledger-v1"] --> h1["h1"] --> h2["h2"] --> h3["h3"] --> hn["head<br/>9b3c…"]
  e2["edit line 2"] -. "changes h2" .-> x2["h2′"]
  x2 -. "and h3" .-> x3["h3′"]
  x3 -. "and the head" .-> xn["head′ ≠ 9b3c…"]
  style g fill:#aee5c2,stroke:#121316
  style hn fill:#9fc4e8,stroke:#121316,stroke-width:2.5px
  style e2 fill:#f3c5a3,stroke:#121316
  style x2 fill:#f3c5a3,stroke:#121316
  style x3 fill:#f3c5a3,stroke:#121316
  style xn fill:#f3c5a3,stroke:#121316

The top row is the honest chain: genesis seeds h1, each link folds the next line in, and the run ends on a head of 9b3c…. The bottom branch is a tamperer touching line two — its hash becomes h2-prime, which forces h3-prime, which forces a different head entirely. There is no edit that moves a line without moving the head. That's tamper-evidence, and it cost about thirty lines of code.

the signature — whose run was IT

A hash proves the log is intact. It does not prove whose it is — anyone could compute a head over any file. The second half of the trick is a signature: sign the head with the tenant's Ed25519 did:key and the run is bound to an identity that's self-certifying.

A did:key looks like did:key:z6Mk…. Everything after did:key: is the Ed25519 public key with a two-byte codec prefix in front, multibase-encoded: the leading z marks base58btc, and 6Mk is what the Ed25519 prefix always encodes to. This is the W3C-interoperable form, not a bespoke hex string. Self-certifying means the verifier reads the public key straight out of the DID and checks the signature with it: no certificate authority, no registry lookup, no third party to ask whether this identity is real. The string is the credential.
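
To make "the string is the credential" concrete, here is a sketch of pulling the public key out of a did:key using only the standard library. It assumes the usual did:key layout for Ed25519 (base58btc multibase, codec bytes 0xed 0x01, then 32 key bytes); the function names are made up, and the key used in the round trip is a dummy, not a real identity.

```python
B58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
ED25519_CODEC = bytes([0xED, 0x01])  # multicodec prefix for an Ed25519 public key


def b58encode(data: bytes) -> str:
    n = int.from_bytes(data, "big")
    out = ""
    while n:
        n, r = divmod(n, 58)
        out = B58[r] + out
    pad = len(data) - len(data.lstrip(b"\x00"))  # leading zero bytes become "1"s
    return "1" * pad + out


def b58decode(text: str) -> bytes:
    n = 0
    for ch in text:
        n = n * 58 + B58.index(ch)  # raises ValueError on a non-base58 character
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    pad = len(text) - len(text.lstrip("1"))
    return b"\x00" * pad + body


def did_from_pubkey(pub: bytes) -> str:
    return "did:key:z" + b58encode(ED25519_CODEC + pub)


def pubkey_from_did(did: str) -> bytes:
    """Return the 32-byte Ed25519 public key carried inside a did:key."""
    prefix = "did:key:z"  # "z" is the multibase tag for base58btc
    if not did.startswith(prefix):
        raise ValueError("not a base58btc did:key")
    raw = b58decode(did[len(prefix):])
    if len(raw) != 34 or raw[:2] != ED25519_CODEC:
        raise ValueError("not an Ed25519 did:key")
    return raw[2:]


dummy_pub = bytes(range(32))          # placeholder key bytes, for illustration
did = did_from_pubkey(dummy_pub)      # starts with did:key:z6Mk
recovered = pubkey_from_did(did)
```

No network call appears anywhere in the decode path, which is the whole point of a self-certifying identifier.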

Seal the run and you get one small file, _ledger.json:

{"v":1,
 "did":"did:key:z6MkpTHzv6frd3xkS9hvSDXYNS6JLkPGfDDM1jx6cAYy3eUR",
 "count":47,
 "head":"9b3c2e51… 64 hex chars …",
 "sig":"a41f… 128 hex chars …",
 "ts":1765540233}

| field | what it is | who checks it |
| --- | --- | --- |
| v | ledger format version (1) | the reader, for compatibility |
| did | the tenant identity that sealed it | anyone: it carries its own pubkey |
| count | how many lines were sealed | verify: caught even if the head matched |
| head | the chain's final hash | verify: recompute and compare |
| sig | Ed25519 signature over the head | verify: check under the DID's pubkey |
| ts | when the seal happened, unix seconds | the reader, for ordering |

The verdict of that table in one sentence: did and head say who and what, while verification recomputes head and count from the log and checks sig under the DID's public key. And the two properties stack in a deliberate order: attributable is defined as tamper_evident AND the signature verifies. A broken chain can never be attributable, because there's no point asking whose a log is once you know it's been altered. Intact first, then named.

verifying a RUN

Verification recomputes the chain over the current _steps.jsonl and reports two booleans against the sealed file. There are two surfaces. From the CLI:

$ wb ledger morning-links
tamper-evident=ok attributable=ok count=47 did=did:key:z6Mkp…

Now edit one byte of step twelve's output in _steps.jsonl and run it again:

$ wb ledger morning-links
tamper-evident=FAIL attributable=FAIL count=47 did=did:key:z6Mkp…

Both fall together, and that's the design: attributable is tamper_evident and the signature — so the instant the chain breaks, attributability goes with it. The HTTP twin returns the same verdict as JSON:

$ curl https://<nexus>/api/ledger/morning-links
{"tamper_evident":true,"attributable":true,
 "did":"did:key:z6Mkp…","head":"9b3c…","count":47}

And if a run crashed before its end-of-run seal, or someone deleted both files, there is simply nothing to check:

{"error":"no ledger for this run"}

Read that last one carefully: it is not the same as a passing check. It means absence — and absence is the tell. There are four outcomes worth knowing by sight:

| outcome | tamper-evident | attributable | what happened |
| --- | --- | --- | --- |
| both ok | ok | ok | log intact and bound to the sealing identity |
| chain ok, sig fails | ok | FAIL | log intact, but the key doesn't match the DID: wrong / rotated identity |
| chain fails | FAIL | FAIL | a line changed, was inserted, dropped, or reordered |
| no ledger | error | error | never sealed (crash) or the seal was deleted: absence, not proof |

The honest read of that matrix: only the top row is a clean bill of health. Everything below it is a question you now know to ask, which is the entire value — the record stopped being something you have to take on faith.
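
The four outcomes fall out of a very small verifier. This sketch is not the wb implementation: verify_run, the workdir file layout, and the byte-level chain encoding are assumptions, and the Ed25519 check is passed in as a check_sig callable so the example stays in the standard library (a real verifier would use an Ed25519 library there).

```python
import hashlib
import json
import os

GENESIS = b"workbooks-ledger-v1"


def fold(raw_lines):
    """Hash chain over raw lines; returns (hex head, count)."""
    h = hashlib.sha256(GENESIS).digest()
    n = 0
    for line in raw_lines:
        h = hashlib.sha256(h + line).digest()
        n += 1
    return h.hex(), n


def read_raw_lines(path):
    """Read the log as bytes; assumes newline-terminated lines."""
    with open(path, "rb") as f:
        return [ln for ln in f.read().split(b"\n") if ln]


def verify_run(workdir, check_sig):
    """check_sig(did, head_hex, sig_hex) -> bool is supplied by the caller."""
    steps = os.path.join(workdir, "_steps.jsonl")
    ledger = os.path.join(workdir, "_ledger.json")
    if not os.path.exists(ledger):
        return {"error": "no ledger for this run"}  # absence, not a pass
    with open(ledger, encoding="utf-8") as f:
        sealed = json.load(f)
    lines = read_raw_lines(steps) if os.path.exists(steps) else []
    head, count = fold(lines)
    tamper_evident = head == sealed["head"] and count == sealed["count"]
    # Intact first, then named: the signature is only consulted if the chain holds.
    attributable = tamper_evident and bool(
        check_sig(sealed["did"], sealed["head"], sealed["sig"])
    )
    return {
        "tamper_evident": tamper_evident,
        "attributable": attributable,
        "did": sealed["did"],
        "head": sealed["head"],
        "count": sealed["count"],
    }
```

Because attributable is computed as tamper_evident and the signature check, the "chain fails" row can never show attributable as ok, which matches the matrix above.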

where identity LIVES

depth rung · skippable — the key, and how it survives a redeploy

The signature is only as durable as the key behind it. Each tenant gets an Ed25519 keypair, minted once at repo init and stored under .workbooks/<tenant>.ed25519. The private half never enters version control — .workbooks/ is in the auto-generated .gitignore, owned by one boundary so the rule can't drift between the git, bundle, and library egress paths. A bulk git add -A is safe precisely because of this: the sweep physically cannot pick up the signing key or the telemetry database.

Now the failure mode the deploy model forces you to confront. Containers are wiped on every redeploy. Without a persisted seed, a redeploy would mint a brand-new DID — and every prior signature, every sealed run, would suddenly be attributed to a key nobody holds. So the primary tenant's keypair is restored deterministically from a 32-byte seed kept as a deployment secret:

$ fly secrets set WB_SIGNING_KEY=$(head -c 32 /dev/urandom | base64)
  # WB_PRIMARY_TENANT defaults to "dev"
sequenceDiagram
  participant B as boot
  participant E as env
  participant K as keystore
  B->>E: WB_SIGNING_KEY present?
  alt seed persisted
    E-->>B: yes — 32-byte seed
    B->>K: restore the same keypair
    Note over K: same DID as last month,<br/>old ledgers still verify
  else no seed
    B->>K: mint a fresh keypair
    Note over K: new DID, prior signatures<br/>orphaned (chain still checks)
  end

Read the sequence as a fork at boot. Boot asks the environment whether the seed is set. If it is, the keystore restores the exact keypair from before — same DID this month as last, so a run sealed weeks ago still verifies under today's key. If the seed is missing, boot mints a fresh keypair: the old ledgers stay tamper-evident — the chain is self-contained and still checks — but attributable now binds them to a key no one holds. Set the secret and the whole history stays signed by one continuous identity. One caveat stated plainly: only the primary tenant is restored this way; other tenants regenerate on redeploy until bring-your-own-key storage lands, which is fine in the single-tenant case but worth knowing.
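
The fork at boot can be sketched as a seed loader. Only the WB_SIGNING_KEY variable name and the 32-byte length come from this lesson; the function name, the base64 handling, and the return shape are assumptions, and turning the seed into an Ed25519 keypair is left to whatever crypto library the runtime uses.

```python
import base64
import os

SEED_LEN = 32  # bytes, per the lesson


def load_signing_seed(env=os.environ):
    """Return (seed, restored). restored is False when a fresh seed was minted,
    which means a new DID and orphaned prior signatures."""
    value = env.get("WB_SIGNING_KEY", "").strip()
    if not value:
        return os.urandom(SEED_LEN), False      # no seed persisted: mint fresh
    seed = base64.b64decode(value, validate=True)
    if len(seed) != SEED_LEN:
        raise ValueError(f"WB_SIGNING_KEY must decode to {SEED_LEN} bytes, got {len(seed)}")
    return seed, True                           # same seed, so the same keypair and DID
```

Failing loudly on a wrong-length seed is a deliberate choice in this sketch: silently minting a new identity on a typo would orphan old signatures without anyone noticing.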

one key, two RAILS

depth rung · skippable — the same identity, signing two different things

The did:key that signs runs is not single-purpose. The very same key signs published artifacts: a manifest over the canonical JSON of a workbook, embedded in the published HTML, verifiable with wb verify <file>. One identity, two rails — run provenance and artifact provenance — and a third power waiting in the wings.

| rail | what gets signed | travels with the bytes? |
| --- | --- | --- |
| ledger | the head over a run's _steps.jsonl | no: verify needs the workdir |
| manifest | a canonical-JSON manifest in published HTML | yes: embedded in the file |
| x25519 | (receives) wrapped content keys for sealed bundles | the wrap travels; release is escrow-only |

The verdict of that table: the manifest rail is the portable one — a signed workbook carries its proof inside itself and wb verify checks it anywhere. The ledger rail is not yet portable; verifying a run still needs filesystem access to the workdir. And the same identity has a third power — mapped to an X25519 key so encrypted content keys can be wrapped to a DID — but that map is not yet checked against the reference vectors, so key release stays escrow-only for now. One keypair, three powers: sign runs, sign artifacts, receive encrypted keys.

private by default, public by ANCHOR

By default the ledger never leaves home. _steps.jsonl, the status and trace files, the telemetry database, and _ledger.json itself are session-private — never committed, bundled, or shipped. The rule the whole system holds: sharing exposes the work, never the session that produced it. So the seal exists, and proves the run, without anyone outside ever seeing it.

Making the head public is a separate, explicit act: anchor. It commits _ledger.json into the tenant's git repo, authored by a fixed ledger identity, so the head is witnessed by git's own content-addressing — outside the workdir, where the agent can't reach it.

flowchart LR
  subgraph wd["the workdir — private by default"]
    steps["_steps.jsonl"]
    led["_ledger.json"]
  end
  subgraph repo["the tenant git repo — witnessed"]
    commit["ledger-<slug> commit<br/>head content-addressed"]
  end
  led -- "anchor (explicit opt-in)" --> commit
  style wd fill:#fbfaf6,stroke:#121316
  style repo fill:#fbfaf6,stroke:#121316
  style led fill:#9fc4e8,stroke:#121316,stroke-width:2.5px
  style commit fill:#aee5c2,stroke:#121316

Read the flow as a deliberate door. On the left, the workdir holds the steps and the sealed ledger, both private. The single arrow is anchoring — an opt-in act that copies the head into the repo on the right, where git's hashing witnesses it from outside. One honest caveat: anchoring beyond git — to an ATproto record or an external op-log — is the planned next leg, and the anchor step is not yet wired to run automatically on every run. The seal is automatic; making it externally witnessed is still a choice you make.

what the seal does NOT do

Honesty section, because the failure modes are part of understanding the guarantee.

  • Evidence, not prevention. The chain catches a rewrite; it does not stop one. Anyone with file access can still edit _steps.jsonl — they just can't do it and keep the head matching the signature. The value is that tampering becomes impossible to hide, not impossible to attempt.
  • Seal is end-of-run and best-effort. It happens once, at the close of a workflow run, and never blocks the result. A run that crashes before it seals has no ledger — verify returns no ledger for this run, which is honest about the gap rather than papering over it.
  • Anchoring is the unfinished leg. External verifiability is opt-in and partially built, as the last section said. Don't read the ledger as automatically witnessed off-host today.
  • Verification needs the workdir. Checking a run is a runtime or CLI operation against the files, not yet a portable offline artifact check. Contrast wb verify on a signed workbook, which is portable — that gap is real.
  • Provenance of actions, not a transcript. The chained line carries only the first 200 characters of each tool's output. The ledger proves what was done; it is not a full recording of every byte produced.
  • The trust root is the chokepoint. The chain proves the log wasn't changed since sealing. It cannot prove the steps were logged honestly during the run — that guarantee comes from logging living at the loop, where nothing escapes by construction, which is exactly why the writing was never put in the agent's hands.

questions people actually ASK

Can the agent just delete the log and the ledger?

It can — and then verify returns no ledger for this run, which is not a pass. Absence is the tell: a missing ledger is itself a signal something is wrong. And anchoring exists for exactly this — once the head is committed into git, deleting the workdir copy doesn't unwitness it.

Why not a blockchain?

Because you don't need consensus to get the property you want. A hash chain gives you tamper-evidence and one signature gives you attribution — that's the whole guarantee, in about thirty lines. Consensus is for agreeing with strangers who don't trust each other; this is one tenant signing their own runs.

What if I edit my own steps file?

Verify FAILs — and that's the point, not a bug. The design goal is that neither the agent nor the user can rewrite the record. You holding the key lets you sign runs; it doesn't let you alter a sealed one without the head and signature disagreeing.

Does verifying cost anything?

One file read and N hashes — recompute the chain over the current _steps.jsonl and compare to the sealed head and count. It's an index-speed operation, not a service call.

Is args in the chain?

Yes — the whole raw line is hashed, byte for byte, and args is part of that line. Nothing in a step is folded in selectively; if it's in the log, it's in the chain.
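
A quick way to see why hashing raw bytes matters: parse a line and re-encode it, and the JSON is the same but the bytes are not. The sample line below is invented for the demonstration, and the re-encoding uses Python's default json.dumps spacing.

```python
import hashlib
import json


def digest(line: bytes) -> str:
    return hashlib.sha256(line).hexdigest()


raw = b'{"step":3,"tool":"git","args":{"cmd":"commit"}}'

# Same data after a parse and re-encode, but json.dumps adds spaces after ":" and ",".
reencoded = json.dumps(json.loads(raw)).encode("utf-8")

# A change inside args alone is enough to change the line's hash.
edited = raw.replace(b'"commit"', b'"push"')

same_data = json.loads(raw) == json.loads(reencoded)   # equal as parsed JSON
same_bytes = raw == reencoded                          # not equal on disk
```

A verifier that re-serialized before hashing would have to agree with the writer on every spacing and key-order choice; hashing the bytes as written removes that question.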

Whose identity signs it — and does it survive a redeploy?

The tenant's did:key, restored from a persisted 32-byte seed for the primary tenant so the DID is stable across redeploys. Skip the seed and the next boot mints a new identity — old ledgers stay tamper-evident but lose attribution to a key nobody holds.
