deployed software you can't DIFF
An agent is editing a live system while it runs — adding a blog post, reshaping a page, shipping a build. Three questions follow immediately, and the usual tools answer none of them. What changed? Can I trust that record? And how do I push my own change into the running system without stepping on the agent — without standing up a deploy pipeline for every edit?
The classic answers fail in opposite directions. Redeploying flattens the agent's in-flight work: the tar-over-ssh that copies a fresh tree over the box erases whatever the agent wrote since the last build. That clobber is not hypothetical — it's the origin story for this whole subsystem. And the other answer, "the agent says it updated the page," is not an audit trail. A claim in a chat log is not verifiable history.
So the requirement is sharp: a system where every change has a diff, the record is the kind a stranger can check, and edits flow in and out without either side erasing the other. There's a forty-year-old tool shaped exactly like that.
the DEFINITION
1. the discipline where the engine's per-tenant data root is a git repo: agent commits flow out as the public changelog, your pushes flow in as live updates, and access to the repo is the authorization — no deploy pipeline in between.
It is not "we integrate with git." The runtime's files — org sources,
page content, built output — literally live in a repo at
<WB_DATA>/<tenant>, one per tenant. So history, diff,
and rollback are free, and the loop has exactly two directions:
| direction | the call | what it means |
|---|---|---|
| outbound | commit_and_push | the agent commits → the host publishes the site → pushes to origin, in one step |
| inbound | pull | a human/CI push is fetched and merged in on the keeper tick — live within one tick |
The repo is the versioned source of truth; a SQLite control plane stays
alongside it as the fast index. The git layer itself is a thin wrapper that
shells to the git CLI — the same discipline by which the
package manager shells to cargo and bun.
every deploy is a COMMIT
Deploying a workbook does two things in one call: it writes the SQLite
index and it commits to the tenant repo. The commit message is
deploy <name>, and — this is the part that earns its
keep — the commit's author is the authenticated identity. The host
builds <tenant>@workbooks.local from the connection's
auth assigns and exports it as the git author and committer. Git's
attribution becomes auth's who.
```mermaid
sequenceDiagram
    participant C as client (authed)
    participant H as control plane
    participant DB as SQLite index
    participant G as tenant repo (git)
    C->>H: deploy "lander"
    H->>DB: write the fast index
    H->>G: commit "deploy lander" (authored as the tenant identity)
    Note over G: history · diff · rollback,<br/>all free, all attributed to WHO deployed
```
Walk that picture as a sequence. An authenticated client asks the control
plane to deploy a workbook called lander. The plane updates the
SQLite index — the fast lookup — and in the same operation writes a git
commit, deploy lander, authored as the tenant identity. Two
records, one act: a queryable index for speed, and a content-addressed
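commit for history. Because the author came from auth, the question "who
shipped this?" is answered inside the commit itself, and the author field is
part of what the sha covers, so rewriting the attribution later changes the
hash. You don't bolt an audit log on later.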
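A minimal sketch of the attribution step, assuming the host hands git its standard identity environment variables. The helper names and the choice to reuse the tenant name as the display name are illustrative, not taken from the engine.

```python
def git_identity_env(tenant: str, domain: str = "workbooks.local") -> dict:
    """Build the environment that makes git attribute a commit to the tenant.

    Assumption: the tenant name doubles as the display name.
    """
    if not tenant or any(c.isspace() or c in "<>@" for c in tenant):
        raise ValueError(f"refusing odd tenant name: {tenant!r}")
    email = f"{tenant}@{domain}"
    return {
        # These four are git's own identity variables.
        "GIT_AUTHOR_NAME": tenant,
        "GIT_AUTHOR_EMAIL": email,
        "GIT_COMMITTER_NAME": tenant,
        "GIT_COMMITTER_EMAIL": email,
    }


def deploy_message(name: str) -> str:
    """Commit message for a deploy, in the 'deploy <name>' shape."""
    return f"deploy {name}"


env = git_identity_env("dev")
print(env["GIT_AUTHOR_EMAIL"], "|", deploy_message("lander"))
```

Passing this dict as the environment of the `git commit` subprocess is what would make the auth identity and the git author the same string.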
One operational scar lives here too. Every git operation runs with
-c safe.directory=*, because on a deployed box the data
volume's uid mismatch made git silently refuse every command — agents wrote
stories to disk for days that never shipped, because the commit quietly
failed. The flag is the fix; the lesson is that a silent git is worse than a
loud one.
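The "loud git" lesson can be sketched as a wrapper that always passes the flag and refuses to swallow a failure. This is an illustration only: `GitError`, `git_argv`, and `run_git` are hypothetical names, and the real wrapper lives in the engine, not in Python.

```python
import subprocess


class GitError(RuntimeError):
    """Raised whenever git exits non-zero, so a failed commit is never silent."""


def git_argv(args):
    # safe.directory=* tells git to trust a repo owned by a different uid.
    return ["git", "-c", "safe.directory=*", *args]


def run_git(repo, args, runner=subprocess.run):
    """Run git in `repo`; raise with stderr attached instead of failing quietly.

    `runner` is injectable so the failure path can be exercised without git.
    """
    proc = runner(git_argv(args), cwd=repo, capture_output=True, text=True)
    if proc.returncode != 0:
        detail = proc.stderr.strip()
        raise GitError(f"git {' '.join(args)} failed ({proc.returncode}): {detail}")
    return proc.stdout
```

The design choice is the exception: a caller that wants best-effort behavior has to catch it on purpose, which leaves a trace in the code where the silence was chosen.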
commits are the CHANGELOG
The keeper — the agent on the tick — does its work inside the tenant repo. Its working directory is the repo. So the keeper's commits are not a side effect of the changelog; they are the changelog. There is no separate "publish a release note" step that can drift from what actually happened.
That history is readable without an account. An anonymous
GET /_changes on the public plane returns the app's real git
log — capped at thirty entries — plus the keeper's status. Each entry is
exactly the shape git already knows:
| sha | ts | author | msg |
|---|---|---|---|
| 3fa9c12 | 1749686400 | dev | deploy lander |
| 9b1e07a | 1749683100 | dev | blog: ship growth post |
| c4d8f55 | 1749679500 | dev | agent: refresh pricing copy |
Four fields — sha, timestamp, author, message — and that is the whole
public record: a verifiable history, not marketing. The authenticated twin,
GET /rcp/changes, feeds the inspector's agent tab. The feed's
own mechanics — the planes it sits on, how the player renders it — belong to
the changelogs sibling lesson; here it's enough to
know the feed is the literal log, and the log is the literal truth.
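To make the four-field shape concrete, here is a sketch of how a feed like this could be read out of `git log`. The format string and function names are assumptions for illustration; the source only states the fields and the cap of thirty.

```python
# Tab-separated: abbreviated sha, author time (unix), author name, subject.
LOG_FORMAT = "%h%x09%at%x09%an%x09%s"
MAX_ENTRIES = 30  # the public feed's cap


def log_argv(limit=MAX_ENTRIES):
    """The git invocation that would produce input for parse_log."""
    return ["git", "log", f"--max-count={limit}", f"--format={LOG_FORMAT}"]


def parse_log(text, limit=MAX_ENTRIES):
    """Turn raw log output into a list of {sha, ts, author, msg} entries."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # maxsplit=3 keeps any tab inside the subject intact.
        sha, ts, author, msg = line.split("\t", 3)
        entries.append({"sha": sha, "ts": int(ts), "author": author, "msg": msg})
    return entries[:limit]


sample = "3fa9c12\t1749686400\tdev\tdeploy lander\n"
print(parse_log(sample))
```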
commit ⇒ publish ⇒ push, ATOMICALLY
An agent does not run git. The git tool is
host-brokered: the agent supplies only a commit message, and
the host does the rest — ensures the ignore file, stages everything, commits
as the tenant identity, publishes the site, and pushes. An agent with no
exec capability is told plainly, git not permitted.
The agent never holds the keys to the repo; it asks the host to record a
change.
```mermaid
flowchart TD
    call["agent tool call<br/>{ message: blog: ship growth post }"]
    call --> ig[".gitignore ensured<br/>(scratch · keys · memory excluded)"]
    ig --> add["git add -A — safe, because the ignore<br/>can't sweep in the signing key"]
    add --> commit["commit (authored as the tenant identity)"]
    commit --> pub["SitePublish — mirror content/** + blog/**<br/>to the served root, SAME call"]
    pub --> push["push origin HEAD — best-effort"]
    push --> ret["returns: 3fa9c12 (pushed) (published 4)"]
    style commit fill:#aee5c2,stroke:#121316,stroke-width:2.5px
    style pub fill:#13d943,stroke:#121316
```
Read the flow top to bottom. A tool call arrives carrying one string, the
message. The host first ensures the auto-written .gitignore, so
the bulk git add -A that follows can sweep the whole tree
without ever catching the signing key or the agent's scratch. It commits as
the tenant. Then — and this is the load-bearing edge — it publishes
content/** and blog/** to the served site root
in the same call, before it even tries to push. Finally it pushes,
best-effort, and hands back the sha.
Why is publish welded to commit? Because the lander once shipped blog
posts that were committed but returned 404 — the run died between the commit
and a separate publish step, so the post existed in history but
never on the page. Folding publish into the commit closes that gap: if it's
committed, it's served. The push is the only soft part; the returned sha
carries its honesty as a suffix — (pushed),
(push failed: …), or (committed; no origin remote to
push). Publishing itself is pure Elixir file operations — no shell,
path-contained, with a junk filter that drops ._*,
.DS_Store, and Thumbs.db on the way in.
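The publish filter is small enough to sketch. What follows is an illustration in Python rather than the engine's Elixir, and `publishable` is a hypothetical name; the two published roots and the three junk patterns are the ones named above.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

JUNK = ("._*", ".DS_Store", "Thumbs.db")
PUBLISHED_ROOTS = ("content", "blog")


def is_junk(name):
    """True for OS droppings that should never reach the served root."""
    return any(fnmatchcase(name, pattern) for pattern in JUNK)


def publishable(rel_path):
    """True if a repo-relative path should be mirrored to the served root."""
    path = PurePosixPath(rel_path)
    # Path containment: nothing absolute, nothing that climbs out of the tree.
    if path.is_absolute() or ".." in path.parts:
        return False
    # Only files under content/ or blog/ are published.
    if len(path.parts) < 2 or path.parts[0] not in PUBLISHED_ROOTS:
        return False
    return not any(is_junk(part) for part in path.parts)


for p in ["content/post.org", "blog/.DS_Store", "content/../.workbooks/key"]:
    print(p, publishable(p))
```

Checking every path component, not only the file name, means a junk directory cannot smuggle its contents through.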
push-to-live, the INBOUND half
Outbound is the agent talking. Inbound is you talking back. Set
WB_GITOPS=1 and the keeper's tick begins with a reconcile: it
fetches origin and merges — it integrates upstream, it never
overwrites. A push to your repo lands live within one tick.
```mermaid
sequenceDiagram
    participant H as you (laptop)
    participant O as GitHub (origin)
    participant K as keeper tick
    participant R as tenant repo
    participant S as the live site
    H->>O: git push (design.org, or a built dist/)
    Note over K: WB_GITOPS=1 — top of the next tick
    K->>R: snapshot dirty agent work (wip)
    K->>O: fetch + merge (integrate, never clobber)
    K->>S: SitePublish content/** + ship pulled dist/
    Note over S: live — within one tick
```
Trace the steps. You push from a laptop — maybe an edited
design.org, maybe a freshly built dist/. On the
next tick, with gitops enabled, the keeper first snapshots any pending agent
work into a wip: snapshot before reconcile commit so nothing
in-flight is at risk. It fetches and merges. Then it republishes
content/** and, if your push carried a built front-end, ships
that dist/ to the site root — clearing stale assets first,
doing nothing when there's no dist. The site is current within that single
tick. The reconcile is best-effort by design: a failure is logged and the
run continues, never blocked.
why MERGE, not overwrite
Depth rung — skippable, but it's the answer to the question everyone actually has: will this eat my agent's work? No. And the reason is a convention, not luck. The repo has two lanes that, in normal use, write disjoint paths:
| lane | examples | who writes it | pushed by |
|---|---|---|---|
| code | app src/, agent def, design.org, skills/ | humans / CI | you, from a laptop or pipeline |
| data | content/, blog/, plan.org, rem/ | the agent, live | the host, on commit |
Because the lanes don't touch the same files, a merge replays cleanly and
both survive — your design.org and the agent's
content/post.org land side by side. Only when the same
file diverges does git stop: the reconcile returns
{:conflict, files}, runs merge --abort, and leaves
it for a human. Never a silent overwrite. The possible outcomes are a closed
set — up-to-date, applied, conflict, no-remote, or error — so the caller
always knows exactly what happened.
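The closed outcome set lends itself to a small model. The sketch below predicts an outcome from the paths each side changed since the merge base. It is deliberately more conservative than git, which can often auto-merge non-overlapping edits to the same file, and the `classify` function is a hypothetical stand-in for the real reconcile.

```python
from enum import Enum


class Outcome(Enum):
    UP_TO_DATE = "up-to-date"
    APPLIED = "applied"
    CONFLICT = "conflict"
    NO_REMOTE = "no-remote"
    ERROR = "error"  # reserved for fetch or merge failures; never returned here


def classify(has_remote, upstream_files, local_files):
    """Return (outcome, files) for one reconcile.

    Assumption: any path touched by both sides counts as a conflict.
    """
    if not has_remote:
        return Outcome.NO_REMOTE, []
    if not upstream_files:
        return Outcome.UP_TO_DATE, []
    clash = sorted(set(upstream_files) & set(local_files))
    if clash:
        return Outcome.CONFLICT, clash
    return Outcome.APPLIED, sorted(upstream_files)


print(classify(True, ["design.org"], ["content/post.org"]))
print(classify(True, ["design.org"], ["design.org"]))
```

With the lanes kept disjoint, the intersection is empty and the answer is always applied, which is the whole argument of the table in one set operation.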
This isn't asserted on faith; it's a test.
git_reconcile_test.exs stands up the two-actor timeline and
proves both files survive:
```
# HUMAN (laptop)                      # AGENT (live runtime, same repo)
git clone <tenant origin> && cd …
edit design.org   "canon v2"          writes content/post.org   "post v2"
git push origin main                  host commits: "agent: new post"

# next keeper tick (WB_GITOPS=1):
Keeper: GitOps — pulled 1 upstream commit(s); 1 file(s): design.org
# both survive: design.org = human's v2, content/post.org = agent's v2

# same-file divergence instead →
# "GitOps CONFLICT (design.org) — merge aborted, left for a human"
```
The human clones, edits design.org to "canon v2", and pushes.
Meanwhile the live agent writes content/post.org and the host
commits it. On the next tick the keeper pulls one upstream commit touching
one file — and afterward both files are present, each at its author's
version. Swap in a same-file edit and the last line is what you'd get
instead: the merge aborts and waits for a person. The canon doc records the
status as live-proven — an external GitHub push, pulled live.
what NEVER enters the repo
Depth rung. A repo that an agent runs add -A against needs a
bright line around what must never be committed. There is exactly one such
line: Workbooks.Private, the single source of truth for
session and personal data across every egress — git, bundle, and
library alike. It exists because of a real leak: beads task data was once
pushed to GitHub. Now its ignore set is written into every tenant's
.gitignore automatically, which is precisely what makes the
bulk stage safe.
| ignored | what it is |
|---|---|
| .workbooks/ | the per-tenant signing key — never in version control, by construction |
| scratch/ · tmp/ · memory/ | agent working space and memory |
| .beads/ · .claude/ | task data and tool state (the leak that started this) |
| _*.jsonl · _*.json · _*.db | sidecars — _steps.jsonl, _telemetry.db, _ledger.json |
The verdict of that table: the agent's private life — its keys, its scratch, its memory, its telemetry — stays out of the public history by default, so the changelog is the work and only the work. This is the same boundary the vfs lesson draws in its own form; one module enforces it everywhere a byte could escape.
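"Ensuring" the ignore file can be sketched as an idempotent append: keep whatever the tenant already wrote, add only the private patterns that are missing. The pattern list mirrors the table; `ensure_gitignore` is a hypothetical name, not the module's real function.

```python
PRIVATE_PATTERNS = [
    ".workbooks/", "scratch/", "tmp/", "memory/",
    ".beads/", ".claude/",
    "_*.jsonl", "_*.json", "_*.db",
]


def ensure_gitignore(existing, patterns=PRIVATE_PATTERNS):
    """Return .gitignore text containing every private pattern.

    Existing lines are preserved; running it twice changes nothing.
    """
    present = {line.strip() for line in existing.splitlines()}
    missing = [p for p in patterns if p not in present]
    if not missing:
        return existing
    # Make sure the appended block starts on its own line.
    body = existing if (existing.endswith("\n") or not existing) else existing + "\n"
    return body + "\n".join(missing) + "\n"


print(ensure_gitignore("node_modules/\n"))
```

Idempotence matters here because the file is checked on every commit call; a version that appended blindly would grow the file on each one.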
the repo carries a DID
Depth rung. The signing key the ignore file protects is not decoration.
Each tenant gets an Ed25519 keypair under .workbooks/, which
becomes a real did:key:z6Mk… — W3C and Radicle-interoperable,
the genuine multicodec-and-base58 article. Because a redeploy wipes the
container filesystem, the primary tenant's keypair restores deterministically
from WB_SIGNING_KEY, a base64 seed kept as a deployment secret —
so identity survives the box being rebuilt.
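For readers who want to see why every such identifier starts with `z6Mk`, here is a sketch of the encoding half only: given a 32-byte Ed25519 public key, prepend the multicodec prefix, base58-encode, and add the multibase `z`. Deriving the public key from the seed is left out because it needs a crypto library; the function names are illustrative.

```python
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
ED25519_PUB = bytes([0xED, 0x01])  # multicodec code for ed25519-pub, varint-encoded


def b58encode(data: bytes) -> str:
    """Base58 (bitcoin alphabet) encoding of a byte string."""
    n = int.from_bytes(data, "big")
    out = ""
    while n:
        n, rem = divmod(n, 58)
        out = B58_ALPHABET[rem] + out
    # Each leading zero byte is represented by a leading '1'.
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def did_key(public_key: bytes) -> str:
    """Encode a raw Ed25519 public key as a did:key identifier."""
    if len(public_key) != 32:
        raise ValueError("Ed25519 public keys are 32 bytes")
    # 'z' is the multibase marker for base58btc.
    return "did:key:z" + b58encode(ED25519_PUB + public_key)


print(did_key(bytes(32))[:12])
```

The fixed two-byte prefix dominates the high-order digits, which is why the first characters are the same for every key.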
```mermaid
flowchart LR
    steps["_steps.jsonl<br/>(append-only run log)"]
    steps --> chain["hash-chain<br/>h_i = sha256(h_prev ‖ line_i)"]
    chain --> sign["sign the head<br/>with the tenant did:key"]
    sign --> anchor["commit _ledger.json<br/>into the tenant repo"]
    anchor --> wit["witnessed by git's<br/>own content-addressing"]
    style anchor fill:#aee5c2,stroke:#121316,stroke-width:2.5px
    style wit fill:#13d943,stroke:#121316
```
Follow the chain. Every step the agent takes appends a line to
_steps.jsonl. The ledger seals those lines into a hash chain —
each link is the sha256 of the previous hash concatenated with the raw line,
from a fixed genesis — then signs the head with the tenant's did:key. The
anchor step commits the sealed _ledger.json into the repo, so
the head of the ledger is witnessed by git's own content-addressing: tamper
with a step and the chain breaks, and the commit history shows it. A second
native record corroborates this — JJ colocates over the same repo purely for
its operation log, and is a no-op when jj isn't present. The
ledger lesson takes this all the way down.
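The chain rule, h_i = sha256(h_prev ‖ line_i), fits in a few lines. This sketch makes two assumptions the text does not settle: the genesis is 64 zero characters, and the previous hash is fed in as its hex string. Signing the head is omitted.

```python
import hashlib

GENESIS = "0" * 64  # assumed genesis; the real constant is not given here


def chain(lines, genesis=GENESIS):
    """Return every link hash, where h_i = sha256(h_prev || line_i)."""
    hashes, prev = [], genesis
    for line in lines:
        prev = hashlib.sha256(prev.encode() + line.encode()).hexdigest()
        hashes.append(prev)
    return hashes


def head(lines, genesis=GENESIS):
    """The value that would be signed and anchored: the last link."""
    links = chain(lines, genesis)
    return links[-1] if links else genesis


def first_broken_link(lines, recorded):
    """Index of the first line that no longer matches the record, else None."""
    for i, (got, want) in enumerate(zip(chain(lines), recorded)):
        if got != want:
            return i
    if len(lines) != len(recorded):
        return min(len(lines), len(recorded))  # a line was added or dropped
    return None


steps = ['{"step":1}', '{"step":2}', '{"step":3}']
sealed = chain(steps)
tampered = [steps[0], '{"step":"x"}', steps[2]]
print(first_broken_link(steps, sealed), first_broken_link(tampered, sealed))
```

Because each link folds in the one before it, editing any line changes every later hash, so comparing the head alone is enough to detect tampering and the per-link walk is only needed to locate it.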
any forge, or NONE
Depth rung. Your tenant is plain git, so it can be mirrored anywhere. The
wbx CLI exposes this as two verbs, both calling the engine
rather than your local git binary — the dependency audit even
flags a local git as convertible: git exists engine-side, so call
through the engine.
| you run | what happens |
|---|---|
| wbx mirror git@… | plain git push to any remote URL — host-agnostic |
| wbx mirror github | a bare word is a forge — gh auto-provisions wb-<tenant>, private by default, and pushes |
| wbx federate | Radicle — rad init --public mints a rad:… RID for P2P |
The forge lane picks whichever CLI is on PATH — gh for
GitHub, glab for GitLab, tea for Gitea, in that
priority. No CLI on PATH, no magic: you're told to create the remote
yourself and use mirror with a URL. Radicle is honest about its
ceiling too — rad keys are per-device, and using the tenant keypair as a
delegate needs a rad feature that doesn't exist yet; the limitation is
documented rather than papered over. These same operations are reachable
from the engine CLI (wb mirror, wb radicle) and
over RCP.
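The forge lane's "whichever CLI is on PATH" rule can be sketched as a priority scan. Function names are hypothetical, and how a bare word like `github` maps to a specific CLI is not spelled out above, so this sketch simply takes the first CLI found.

```python
import shutil

# Priority order as stated: gh, then glab, then tea.
FORGE_CLIS = [("gh", "GitHub"), ("glab", "GitLab"), ("tea", "Gitea")]


def pick_forge_cli(which=shutil.which):
    """Return (cli, forge) for the first forge CLI on PATH, or None."""
    for cli, forge in FORGE_CLIS:
        if which(cli):
            return cli, forge
    return None


def mirror_plan(target, which=shutil.which):
    """Decide what a mirror call would do for a URL or a bare forge word.

    Assumption: anything that looks like a remote (has '@', '://' or '/')
    is pushed to directly; anything else is treated as a forge word.
    """
    if "@" in target or "://" in target or "/" in target:
        return ("push", target)
    picked = pick_forge_cli(which)
    if picked is None:
        return ("manual", "create the remote yourself, then mirror with a URL")
    return ("provision", picked[0])


print(mirror_plan("git@example.com:me/site.git"))
```

Injecting `which` keeps the decision testable on a machine with none of the three CLIs installed.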
where it BITES
Honesty section. gitops is real and live-proven, but it is not finished, and pretending otherwise would betray the whole point of a verifiable changelog.
The inbound flow is opt-in and keeper-coupled. Reconcile only runs
when WB_GITOPS=1 and only on a keeper tick — no keeper, no
push-to-live. A dedicated reconcile controller with its own
POST /reconcile is roadmap, not shipped; today the keeper is the
trigger.
The honest gap is the build seam. The runtime can't run
vite build — there's no bun on the BEAM box — so app-source
pushes ride a CI bridge: a workflow builds, commits dist/ back
with [skip ci], and the runtime pulls source and serves the
built output. Push source, get a live page, no redeploy — but a CI hop in
the middle. Retiring it means building in the sandbox on pull, and that's
blocked on a concrete wall: QuickJS has no JIT and re-parses the multi-megabyte
svelte compiler every build, roughly seventeen CPU-minutes. The fixes —
bytecode precompile, a warm Wasmex instance, a content-addressed output
cache — are mapped but not done.
And the soft edges: a same-file conflict halts and waits for a human rather than guessing; forge provisioning needs gh/glab/tea on PATH; the Radicle delegate limitation is upstream, not ours to fix. None of these is hidden in the code, and none should be hidden from you.
questions people actually ASK
Can I clone my tenant repo?
Yes — it's plain git. Mirror it to any remote you control with
wbx mirror git@…, or let a forge CLI provision one with
wbx mirror github, then clone it like any other repo. There's
no proprietary format underneath; it's the same files the engine serves.
Will a push overwrite what the agent is doing?
No. Inbound is a merge, never an overwrite. Code paths and data paths are disjoint, so both sides' work survives; only a same-file divergence stops, and that one aborts the merge and waits for a human rather than picking a winner. A checked-in test proves both files survive.
Why isn't my push live yet?
Three usual causes. WB_GITOPS is unset, so reconcile never
runs. Or there's no keeper ticking, so nothing triggers the pull. Or the
tenant has no origin remote, so there was nothing to fetch. Check those in
that order.
Does my agent's memory get pushed?
No — the auto-written .gitignore keeps scratch, memory,
keys, task data, and telemetry sidecars out of version control by default.
One module, Workbooks.Private, enforces that on every egress.
The changelog is the work, not the agent's private life.
Is this how the platform itself releases?
No — that's CI, a separate concern. gitops is tenant-level: it's how your running system records and accepts changes. The platform's own runtime image ships through its own pipeline. Don't conflate the two.
What authorizes a push to take effect?
Access to the repo. There's no separate deploy credential — if you can push to the tenant's origin, the next reconcile integrates it. That's why the repo's keys and the auth identity are treated as carefully as they are.
keep GOING
gitops is the version-control half of owning your system — it sits under the nexus, beside the pieces that make the loop turn.