a log anyone can REWRITE
You hired an agent for the long run — that was the
whole pitch of the parent lesson. It worked all night. This morning there's
a file, _steps.jsonl, with a line for every tool call it made:
the command, the arguments, the output, the timestamp. It reads like a
perfect record of the night.
Now the uncomfortable question. Did it actually do what the log says?
Because _steps.jsonl is plain text. The agent could rewrite it.
A second agent sharing the workspace could rewrite it. You could
rewrite it. And after the one moment you'd ever need an honest record — a
prompt injection, a bad deploy, a step that shouldn't have run — the thing
you reach for is exactly the thing anyone with file access can forge.
Logs answer what happened only if nobody touched them, and nothing about a text file enforces that. Post-incident forensics on an editable record is theater. The ledger is the part that makes the record worth trusting.
the DEFINITION
1. a seal over the run log, not a second
log: the raw bytes of _steps.jsonl folded into a SHA-256
hash chain whose single head is signed by the tenant's
did:key — yielding two checkable properties,
tamper_evident (the log is intact) and
attributable (it's intact and bound to a named
identity).
Those two words are not marketing — they're the exact field names the verifier returns. A ledger doesn't store anything new about the run; it stores one number and one signature that, together, make the existing log impossible to alter without anyone noticing. The rest of this lesson is how ~30 lines of code buy you that.
what gets logged — the CHOKEPOINT
A record you can trust starts before any sealing: it starts with where the writing happens. In this system the log is written by the agent loop, not by the agent. Every tool call passes through one place — the chokepoint — and that place appends the line, always, regardless of whether any caller wired up a callback. The agent cannot opt a step out of the record, because it never holds the pen. Nothing escapes by construction.
Every step is also wall-clock bounded — a tool that wedges is killed at 150 seconds and recorded as a logged error, never a silent stall. The event shape is the same everywhere it's written:
{ step, agent, tool, args, output, exit_code, error, dur_ms, ts }
step — the call's index in this run
agent — nil for a solo agent; the member's name in a fleet
tool — what was invoked · args / output — the call + its result
dur_ms — monotonic-clock duration · ts — unix seconds
Three different writers produce that one shape into that one file: the
native agent loop, command calls made from inside a WebAssembly component
through the Dock, and instrumentation spans bridged out of WASM. They use
different prefixes on the tool field — a bare name, a
command: name, a wasm: name — but every line lands
in the same _steps.jsonl, so the seal covers all of them at
once.
flowchart LR
loop["the agent loop
tool = git, fetch, …"]
dock["WASM components
via the Dock
tool = command:<name>"]
span["WASM spans
instrument-enter/exit
tool = wasm:<name>"]
file[("_steps.jsonl
one file · one shape
appended lock-free")]
loop -- "appends" --> file
dock -- "appends" --> file
span -- "appends" --> file
style file fill:#9fc4e8,stroke:#121316,stroke-width:2.5px
style loop fill:#ffffff,stroke:#121316
style dock fill:#ffffff,stroke:#121316
style span fill:#ffffff,stroke:#121316
Read the picture as three lanes draining into one basin. The loop writes the agent's own moves; the Dock lane writes what sandboxed components do on the agent's behalf; the span lane writes timing from inside WASM. None of the three can choose to stay dry. One honest note: the span bridge is built and unit-tested, but the guest-side wiring that feeds it is still an open to-do — so today the first two lanes carry the run, and the third is plumbed ahead of its source.
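To make the chokepoint idea concrete, here is a minimal sketch in Python. It is not the system's actual loop: the `Loop` class, its `call` method, and the use of an argv list for `args` are all hypothetical. It only illustrates the properties described above: the loop owns the file, every call is bounded at 150 seconds, a failure still produces a line, and the logged output is cut to its first 200 characters.

```python
import json
import subprocess
import time

STEP_TIMEOUT_S = 150   # wall-clock bound from the text
OUTPUT_LIMIT = 200     # the log keeps the first 200 characters of output


class Loop:
    """Hypothetical agent loop that owns the log file; tools never see it."""

    def __init__(self, log_path, agent=None):
        self.log_path = log_path
        self.agent = agent      # None for a solo agent
        self.step = 0

    def call(self, tool, argv):
        """Run one tool call and append its event line, success or failure."""
        self.step += 1
        started = time.monotonic()
        output, exit_code, error = "", None, None
        try:
            proc = subprocess.run(
                argv, capture_output=True, text=True, timeout=STEP_TIMEOUT_S
            )
            output, exit_code = proc.stdout, proc.returncode
        except subprocess.TimeoutExpired:
            error = f"timeout after {STEP_TIMEOUT_S}s"   # wedged tool, killed
        except OSError as exc:
            error = str(exc)                             # tool could not start
        event = {
            "step": self.step,
            "agent": self.agent,
            "tool": tool,
            "args": argv,
            "output": output[:OUTPUT_LIMIT],
            "exit_code": exit_code,
            "error": error,
            "dur_ms": int((time.monotonic() - started) * 1000),
            "ts": int(time.time()),
        }
        # The append is unconditional: no caller can skip it.
        with open(self.log_path, "a", encoding="utf-8") as f:
            f.write(json.dumps(event) + "\n")
        return event
```

The design point is that `call` is the only way to run a tool, and the write sits inside it rather than in a callback the caller may or may not register.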
the chain — one head for the NIGHT
Here's the first half of the trick, and it's transferable to anything you ever build. You don't need a blockchain or an audit service. You fold every raw log line into a running SHA-256 and the entire history collapses into one 64-character number — the head.
h(0) = sha256("workbooks-ledger-v1") the genesis constant
h(i) = sha256( h(i-1) ++ raw_line_i ) fold each line in
head = h(n) one number for the whole run
Three details make it strong. First, the bytes hashed are the raw line (the file exactly as written, not a re-parse and re-encode), so there's no canonicalization step to disagree about and no re-serialization bug to hide a change in. Second, because each hash feeds the next, altering, inserting, dropping, or reordering any line changes that line's hash and every hash after it, all the way to the head. Third, the sealed count is checked separately, so a log that gained or lost lines is caught by the count alone, whatever the head comparison says.
Walk one real fold. Suppose step three of the night was a commit:
{"step":3,"agent":"waldo","tool":"git","args":{"cmd":"commit"},
"output":"[main 4f2c1d9] post: morning links (pushed)",
"exit_code":0,"error":null,"dur_ms":812,"ts":1765540210}
h(3) = sha256( h(2) ++ the raw bytes of that line )
Nothing is parsed. The line is hashed as the literal text on disk, byte for byte. Do that for all 47 lines and you have one head. Picture the chain — and picture what an edit does to it:
flowchart LR
g["genesis
workbooks-ledger-v1"] --> h1["h1"] --> h2["h2"] --> h3["h3"] --> hn["head
9b3c…"]
e2["edit line 2"] -. "changes h2" .-> x2["h2′"]
x2 -. "and h3" .-> x3["h3′"]
x3 -. "and the head" .-> xn["head′ ≠ 9b3c…"]
style g fill:#aee5c2,stroke:#121316
style hn fill:#9fc4e8,stroke:#121316,stroke-width:2.5px
style e2 fill:#f3c5a3,stroke:#121316
style x2 fill:#f3c5a3,stroke:#121316
style x3 fill:#f3c5a3,stroke:#121316
style xn fill:#f3c5a3,stroke:#121316
The top row is the honest chain: genesis seeds h1, each link folds the
next line in, and the run ends on a head of 9b3c…. The bottom
branch is a tamperer touching line two — its hash becomes h2-prime, which
forces h3-prime, which forces a different head entirely. There is no edit
that moves a line without moving the head. That's tamper-evidence,
and it cost about thirty lines of code.
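The fold is small enough to sketch in full. This is an illustration in Python, not the project's code, and it rests on two assumptions the text doesn't settle: that h(i-1) is fed forward as the 32 raw digest bytes rather than as hex text, and that each line is hashed without its trailing newline. The sample lines are made up.

```python
import hashlib

GENESIS = b"workbooks-ledger-v1"


def chain_head(raw_lines):
    """Fold raw log lines into one head: h(i) = sha256(h(i-1) ++ line_i)."""
    h = hashlib.sha256(GENESIS).digest()
    for line in raw_lines:
        # The line is hashed as bytes, exactly as given; nothing is parsed.
        h = hashlib.sha256(h + line).digest()
    return h.hex()


# Three made-up log lines standing in for a real _steps.jsonl.
lines = [
    b'{"step":1,"tool":"fetch"}',
    b'{"step":2,"tool":"git"}',
    b'{"step":3,"tool":"git"}',
]
head = chain_head(lines)

# Each kind of tampering produces a different head.
edited = chain_head([lines[0], lines[1].replace(b"git", b"gut"), lines[2]])
reordered = chain_head([lines[1], lines[0], lines[2]])
dropped = chain_head(lines[:2])
inserted = chain_head(lines + [b'{"step":4,"tool":"rm"}'])
```

Comparing `head` with any of the four tampered values is the whole tamper-evidence check; everything else in the ledger is about who vouches for that head.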
the signature — whose run was IT
A hash proves the log is intact. It does not prove whose
it is — anyone could compute a head over any file. The second half of the
trick is a signature: sign the head with the tenant's Ed25519
did:key and the run is bound to an identity that's
self-certifying.
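A did:key looks like
did:key:z6Mk…. Everything after did:key: is the Ed25519 public key behind a two-byte
codec prefix, multibase-encoded in base58btc: the leading z names the encoding, and the prefix is why Ed25519 keys always begin z6Mk. This is the
W3C-interoperable form, not a bespoke hex string. Self-certifying means the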
verifier reads the public key straight out of the DID and checks the
signature with it: no certificate authority, no registry lookup, no
third party to ask whether this identity is real. The string is the
credential.
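Here is what "reads the public key straight out of the DID" can look like, as a sketch in plain Python with a hand-rolled base58btc codec. The function names are hypothetical; the encoding itself (multibase `z` for base58btc, multicodec prefix `0xed 0x01` for an Ed25519 public key) follows the did:key method. The 32 bytes used at the end are a dummy value, not a real key.

```python
B58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
ED25519_PREFIX = bytes([0xED, 0x01])  # multicodec code for an Ed25519 public key


def b58encode(data: bytes) -> str:
    n = int.from_bytes(data, "big")
    out = ""
    while n:
        n, r = divmod(n, 58)
        out = B58[r] + out
    # Each leading zero byte is written as a leading "1".
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def b58decode(text: str) -> bytes:
    n = 0
    for ch in text:
        n = n * 58 + B58.index(ch)  # raises ValueError on a non-base58 character
    pad = len(text) - len(text.lstrip("1"))
    return b"\x00" * pad + n.to_bytes((n.bit_length() + 7) // 8, "big")


def did_from_pubkey(pub: bytes) -> str:
    if len(pub) != 32:
        raise ValueError("Ed25519 public keys are 32 bytes")
    return "did:key:z" + b58encode(ED25519_PREFIX + pub)


def pubkey_from_did(did: str) -> bytes:
    if not did.startswith("did:key:z"):
        raise ValueError("expected a base58btc did:key")
    raw = b58decode(did[len("did:key:z"):])
    if len(raw) != 34 or raw[:2] != ED25519_PREFIX:
        raise ValueError("not an Ed25519 did:key")
    return raw[2:]


dummy_pub = bytes(range(32))          # placeholder bytes, not a real key
dummy_did = did_from_pubkey(dummy_pub)
```

No network call appears anywhere in `pubkey_from_did`, which is the practical meaning of self-certifying: the verifier needs only the string.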
Seal the run and you get one small file, _ledger.json:
{"v":1,
"did":"did:key:z6MkpTHzv6frd3xkS9hvSDXYNS6JLkPGfDDM1jx6cAYy3eUR",
"count":47,
"head":"9b3c2e51… 64 hex chars …",
"sig":"a41f… 128 hex chars …",
"ts":1765540233}
| field | what it is | who checks it |
|---|---|---|
| v | ledger format version (1) | the reader, for compatibility |
| did | the tenant identity that sealed it | anyone — it carries its own pubkey |
| count | how many lines were sealed | verify — caught even if the head matched |
| head | the chain's final hash | verify — recompute and compare |
| sig | Ed25519 signature over the head | verify — check under the DID's pubkey |
| ts | when the seal happened, unix seconds | the reader, for ordering |
The verdict of that table in one sentence: did says who, head and
count are what verification recomputes from the log, and sig
is what it checks against the DID's public key. And the two
properties stack in a deliberate order — attributable is
defined as tamper_evident AND the signature verifies. A broken
chain can never be attributable, because there's no point asking whose a log
is once you know it's been altered. Intact first, then named.
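A sketch of the sealing step, in Python, shows how little the record holds. Two things here are assumptions rather than facts about the system: that the signature covers the 32 raw head bytes (it could equally be the hex text), and the `seal` function itself, which is hypothetical. Python's standard library has no Ed25519, so the signer is passed in; the demo uses an HMAC stand-in that is not a public-key signature and produces a shorter sig than the 128 hex characters of a real one.

```python
import hashlib
import hmac
import json
import time


def seal(head_hex, count, did, sign):
    """Assemble a ledger record. `sign` maps head bytes to signature bytes."""
    return {
        "v": 1,
        "did": did,
        "count": count,
        "head": head_hex,
        # Assumption: the signature covers the raw head bytes, not the hex text.
        "sig": sign(bytes.fromhex(head_hex)).hex(),
        "ts": int(time.time()),
    }


# Demo-only stand-in signer. HMAC is a shared-secret MAC, NOT Ed25519;
# a real seal would call an Ed25519 library with the tenant's private key.
def demo_sign(message):
    return hmac.new(b"demo-only-secret", message, hashlib.sha256).digest()


head = hashlib.sha256(b"stand-in for a real chain head").hexdigest()
ledger = seal(head, 47, "did:key:z6Mk…", demo_sign)
print(json.dumps(ledger, indent=1))
```

Keeping the signer injectable also mirrors the ordering in the text: the head is computed first and independently, and the identity is layered on top of it.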
verifying a RUN
Verification recomputes the chain over the current
_steps.jsonl and reports two booleans against the sealed file.
There are two surfaces. From the CLI:
$ wb ledger morning-links
tamper-evident=ok attributable=ok count=47 did=did:key:z6Mkp…
Now edit one byte of step twelve's output in _steps.jsonl and
run it again:
$ wb ledger morning-links
tamper-evident=FAIL attributable=FAIL count=47 did=did:key:z6Mkp…
Both fall together, and that's the design: attributable is
tamper_evident and the signature — so the instant the
chain breaks, attributability goes with it. The HTTP twin returns the same
verdict as JSON:
$ curl https://<nexus>/api/ledger/morning-links
{"tamper_evident":true,"attributable":true,
"did":"did:key:z6Mkp…","head":"9b3c…","count":47}
And if a run crashed before its end-of-run seal, or someone deleted both files, there is simply nothing to check:
{"error":"no ledger for this run"}
Read that last one carefully: it is not the same as a passing check. It means absence — and absence is the tell. There are four outcomes worth knowing by sight:
| outcome | tamper-evident | attributable | what happened |
|---|---|---|---|
| both ok | ok | ok | log intact and bound to the sealing identity |
| chain ok, sig fails | ok | FAIL | log intact, but the key doesn't match the DID — wrong / rotated identity |
| chain fails | FAIL | FAIL | a line changed, was inserted, dropped, or reordered |
| no ledger | error | error | never sealed (crash) or the seal was deleted: absence, not proof |
The honest read of that matrix: only the top row is a clean bill of health. Everything below it is a question you now know to ask, which is the entire value — the record stopped being something you have to take on faith.
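The four outcomes fall out of a very short function. The sketch below is in Python and is an assumption-laden illustration, not the verifier itself: `verify` and its `check_sig` parameter are hypothetical names, the signature check is injected because Ed25519 is not in the standard library, and the chain is assumed to feed raw digest bytes forward.

```python
import hashlib

GENESIS = b"workbooks-ledger-v1"


def chain_head(raw_lines):
    h = hashlib.sha256(GENESIS).digest()
    for line in raw_lines:
        h = hashlib.sha256(h + line).digest()
    return h.hex()


def verify(raw_lines, ledger, check_sig):
    """Return the two-boolean verdict, or an error dict when nothing was sealed.

    check_sig(did, head_bytes, sig_bytes) -> bool is supplied by the caller.
    """
    if ledger is None:
        return {"error": "no ledger for this run"}
    # Intact means both the line count and the recomputed head match the seal.
    intact = (
        len(raw_lines) == ledger["count"]
        and chain_head(raw_lines) == ledger["head"]
    )
    # Attributable is only asked once the chain holds: intact first, then named.
    signed = intact and bool(
        check_sig(
            ledger["did"],
            bytes.fromhex(ledger["head"]),
            bytes.fromhex(ledger["sig"]),
        )
    )
    return {"tamper_evident": intact, "attributable": signed}
```

Because `signed` is computed as `intact and ...`, there is no code path that reports a broken chain as attributable, which matches the middle rows of the table.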
where identity LIVES
depth rung · skippable — the key, and how it survives a redeploy
The signature is only as durable as the key behind it. Each tenant gets an
Ed25519 keypair, minted once at repo init and stored under
.workbooks/<tenant>.ed25519. The private half never enters
version control — .workbooks/ is in the auto-generated
.gitignore, owned by one boundary so the rule can't drift
between the git, bundle, and library egress paths. A bulk
git add -A is safe precisely because of this: the sweep
physically cannot pick up the signing key or the telemetry database.
Now the failure mode the deploy model forces you to confront. Containers are wiped on every redeploy. Without a persisted seed, a redeploy would mint a brand-new DID — and every prior signature, every sealed run, would suddenly be attributed to a key nobody holds. So the primary tenant's keypair is restored deterministically from a 32-byte seed kept as a deployment secret:
$ fly secrets set WB_SIGNING_KEY=$(head -c 32 /dev/urandom | base64) # WB_PRIMARY_TENANT defaults to "dev"
sequenceDiagram
participant B as boot
participant E as env
participant K as keystore
B->>E: WB_SIGNING_KEY present?
alt seed persisted
E-->>B: yes — 32-byte seed
B->>K: restore the same keypair
Note over K: same DID as last month —
old ledgers still verify
else no seed
B->>K: mint a fresh keypair
Note over K: new DID — prior signatures
orphaned (chain still checks)
end
Read the sequence as a fork at boot. Boot asks the environment whether the
seed is set. If it is, the keystore restores the exact keypair from before —
same DID this month as last, so a run sealed weeks ago still verifies under
today's key. If the seed is missing, boot mints a fresh keypair: the old
ledgers stay tamper-evident — the chain is self-contained and still
checks — but attributable now binds them to a key no one holds.
Set the secret and the whole history stays signed by one continuous
identity. One caveat stated plainly: only the primary tenant is restored
this way; other tenants regenerate on redeploy until bring-your-own-key
storage lands, which is fine in the single-tenant case but worth knowing.
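The fork at boot can be sketched in a few lines of Python. This is a hypothetical helper, not the system's boot code: it assumes WB_SIGNING_KEY holds the base64 text produced by the command above, and it stops at the seed, since deriving the actual Ed25519 keypair needs a library outside the standard one.

```python
import base64
import os


def load_seed(env=os.environ):
    """Return (seed, restored): the persisted seed if set, else a fresh one."""
    encoded = env.get("WB_SIGNING_KEY")
    if encoded:
        # Persisted secret: decode it and insist on exactly 32 bytes.
        seed = base64.b64decode(encoded, validate=True)
        if len(seed) != 32:
            raise ValueError("WB_SIGNING_KEY must decode to exactly 32 bytes")
        return seed, True
    # No secret set: a fresh random seed, and therefore a new identity.
    return os.urandom(32), False
```

An Ed25519 keypair is a deterministic function of its 32-byte seed, so the same seed yields the same keypair and the same DID on every boot; the `restored` flag is only there so a caller can log which branch was taken.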
one key, two RAILS
depth rung · skippable — the same identity, signing two different things
The did:key that signs runs is not single-purpose. The very
same key signs published artifacts: a manifest over the canonical
JSON of a workbook, embedded in the published HTML, verifiable with
wb verify <file>. One identity, two rails — run provenance
and artifact provenance — and a third power waiting in the wings.
| rail | what gets signed | travels with the bytes? |
|---|---|---|
| ledger | the head over a run's _steps.jsonl | no — verify needs the workdir |
| manifest | a canonical-JSON manifest in published HTML | yes — embedded in the file |
| x25519 | (receives) wrapped content keys for sealed bundles | the wrap travels; release is escrow-only |
The verdict of that table: the manifest rail is the portable one — a
signed workbook carries its proof inside itself and wb verify
checks it anywhere. The ledger rail is not yet portable; verifying a run
still needs filesystem access to the workdir. And the same identity has a
third power — mapped to an X25519 key so encrypted content keys can be
wrapped to a DID — but that map is not yet checked against the reference
vectors, so key release stays escrow-only for now. One keypair, three powers:
sign runs, sign artifacts, receive encrypted keys.
private by default, public by ANCHOR
By default the ledger never leaves home. _steps.jsonl,
the status and trace files, the telemetry database, and
_ledger.json itself are session-private — never committed,
bundled, or shipped. The rule the whole system holds: sharing exposes the
work, never the session that produced it. So the seal exists, and
proves the run, without anyone outside ever seeing it.
Making the head public is a separate, explicit act: anchor. It
commits _ledger.json into the tenant's git repo, authored by a
fixed ledger identity, so the head is witnessed by git's own
content-addressing — outside the workdir, where the agent can't reach it.
flowchart LR
subgraph wd["the workdir — private by default"]
steps["_steps.jsonl"]
led["_ledger.json"]
end
subgraph repo["the tenant git repo — witnessed"]
commit["ledger-<slug> commit
head content-addressed"]
end
led -- "anchor (explicit opt-in)" --> commit
style wd fill:#fbfaf6,stroke:#121316
style repo fill:#fbfaf6,stroke:#121316
style led fill:#9fc4e8,stroke:#121316,stroke-width:2.5px
style commit fill:#aee5c2,stroke:#121316
Read the flow as a deliberate door. On the left, the workdir holds the steps and the sealed ledger, both private. The single arrow is anchoring — an opt-in act that copies the head into the repo on the right, where git's hashing witnesses it from outside. One honest caveat: anchoring beyond git — to an ATproto record or an external op-log — is the planned next leg, and the anchor step is not yet wired to run automatically on every run. The seal is automatic; making it externally witnessed is still a choice you make.
what the seal does NOT do
Honesty section, because the failure modes are part of understanding the guarantee.
- Evidence, not prevention. The chain catches a rewrite; it does not stop one. Anyone with file access can still edit _steps.jsonl; they just can't do it and keep the head matching the signature. The value is that tampering becomes impossible to hide, not impossible to attempt.
- Seal is end-of-run and best-effort. It happens once, at the close of a workflow run, and never blocks the result. A run that crashes before it seals has no ledger: verify returns no ledger for this run, which is honest about the gap rather than papering over it.
- Anchoring is the unfinished leg. External verifiability is opt-in and partially built, as the last section said. Don't read the ledger as automatically witnessed off-host today.
- Verification needs the workdir. Checking a run is a runtime or CLI operation against the files, not yet a portable offline artifact check. Contrast wb verify on a signed workbook, which is portable; that gap is real.
- Provenance of actions, not a transcript. The chained line carries only the first 200 characters of each tool's output. The ledger proves what was done; it is not a full recording of every byte produced.
- The trust root is the chokepoint. The chain proves the log wasn't changed since sealing. It cannot prove the steps were logged honestly during the run. That guarantee comes from logging living at the loop, where nothing escapes by construction, which is exactly why the writing was never put in the agent's hands.
questions people actually ASK
Can the agent just delete the log and the ledger?
It can — and then verify returns no ledger for this run, which is not a pass. Absence is the tell: a missing ledger is itself a signal something is wrong. And anchoring exists for exactly this — once the head is committed into git, deleting the workdir copy doesn't unwitness it.
Why not a blockchain?
Because you don't need consensus to get the property you want. A hash chain gives you tamper-evidence and one signature gives you attribution — that's the whole guarantee, in about thirty lines. Consensus is for agreeing with strangers who don't trust each other; this is one tenant signing their own runs.
What if I edit my own steps file?
Verify FAILs, and that's the point, not a bug. The design goal is that neither the agent nor the user can rewrite the record undetected. Holding the key lets you sign runs; it doesn't let you alter a sealed one without the head and signature disagreeing.
Does verifying cost anything?
One file read and N hashes: recompute the chain over the current
_steps.jsonl and compare to the sealed head and count. It's a local
computation, linear in the number of lines, not a service call.
Is args in the chain?
Yes — the whole raw line is hashed, byte for byte, and
args is part of that line. Nothing in a step is folded in
selectively; if it's in the log, it's in the chain.
Whose identity signs it — and does it survive a redeploy?
The tenant's did:key, restored from a persisted 32-byte
seed for the primary tenant so the DID is stable across redeploys. Skip the
seed and the next boot mints a new identity: old ledgers stay
tamper-evident, but their attribution now points at a key nobody holds.
keep GOING
The ledger is one of the real edges around an agent — follow the others.