where did the HISTORY go?
Your project's history is already scattered, and you've stopped noticing.
Code lives in git. Content lives in a CMS. Application data lives in a SaaS.
And the record of what got deployed when lives in a dashboard's audit
log — a list you can read but can't clone, can't git log, can't
take with you. Four histories, four homes, none of them the same shape.
Now make the whole app one HTML file, and the worry sharpens: a single-file
artifact smells like a binary. If the app is one .html,
did you just lose version control entirely? Where do diffs live? Is your
history now locked inside someone's platform? That fear is the right fear to
have — and the answer is better than "don't worry about it." The answer is
that nothing about this format ever stopped being text, and the project
running it was a repo the whole time.
the DEFINITION
1. the workbook's history layer: the source rides inside the artifact as plain text, and the running project's data root is a real git repo — so history, diff, and rollback come for free, and you can clone any of it.
That phrase isn't ours by accident. The engine's own git module says it
plainly — a git-backed Workbook store, the versioned source of truth;
making the per-tenant data root a git repo buys history, diff, and rollback
for free. It's a thin wrapper over the git CLI you already
run. Not "integrates with git." Is a git repo — one per tenant, at
$WB_DATA/<tenant>, git init'd the moment the
project starts living.
what actually DIFFS
Here's the honest framing the whole page hangs on: git the source, ship the artifact, and the artifact carries the source back out. A workbook is three forms at once, and they diff differently — so it's worth being exact about which layer you read when you read history.
| layer | form | git diff verdict | the move when it doesn't |
|---|---|---|---|
| the org spec | plain text, in a script type=text/org block | clean line diffs — the designed diff surface | — |
| the .html artifact | text, end to end | diffable in principle: it's text, not a blob | read the org source for the legible diff |
| the vfs.sqlite disk | binary database file | stored, but git can't line-diff it | wbx unbundle recovers a source tree; the legible-SQL projection is design intent (see honesty) |
Two things follow. First, the source is never trapped: a workbook ships as
a .wbundle — a plain zip carrying workbook.html and
workbook.org side by side, on purpose, so a recipient can
wbx unbundle it back into a diffable tree and re-author. The
comment in the bundler says it outright: source travels with the artifact —
recipients can unbundle and re-author. Second, the one binary layer —
SQLite — is the one place to be careful, and we come back to it honestly
rather than pretend it diffs.
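Because the bundle is described above as a plain zip, recovering the source needs nothing more than a zip reader. The sketch below is a minimal illustration in Python, assuming only the two member names the text mentions (workbook.html and workbook.org). The helper names make_bundle and read_source are hypothetical, and this is not the wbx unbundle implementation.

```python
import io
import zipfile


def make_bundle(html: str, org: str) -> bytes:
    """Build an in-memory zip holding the two members named in the text (illustrative)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("workbook.html", html)
        zf.writestr("workbook.org", org)
    return buf.getvalue()


def read_source(bundle: bytes) -> str:
    """Return the org source carried inside a bundle, decoded as UTF-8 text."""
    with zipfile.ZipFile(io.BytesIO(bundle)) as zf:
        return zf.read("workbook.org").decode("utf-8")


# Round trip: whatever org text went in comes back out, ready to diff.
bundle = make_bundle("<html>...</html>", "* lander\nhello\n")
source = read_source(bundle)
```

The point of the round trip is that the org text comes back byte for byte, so a recipient can put it straight under git.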
every deploy is a COMMIT
This is the load-bearing fact. When you deploy, the runtime does two things
in one breath: it writes the fast SQLite index your app reads from, and
it commits your .org source into the tenant repo. One deploy, one
commit, with a real message and a real author.
$ wbx workbook deploy lander ./lander.org
→ runtime runs Workbooks.deploy("lander", org, "dev")
→ writes the SQLite index + Git.save commits <name>.org
→ commit "deploy lander" authored dev <dev@workbooks.local>
→ returns the sha, or :nochange if nothing moved
Picture it as a sequence. You hand the CLI a name and an org file; the runtime fans it into two stores — the index it serves from, and the git repo that remembers — and a short sha comes back to you as the receipt:
sequenceDiagram
    participant U as you (wbx)
    participant R as runtime (Workbooks.deploy)
    participant I as SQLite index
    participant G as git repo
    U->>R: deploy lander ./lander.org
    R->>I: write fast index
    R->>G: Git.save commit deploy lander
    G-->>R: 9f3c2ab
    R-->>U: sha (your receipt)
    Note over G: history answers<br/>what changed last deploy
And because it's a real repo, the history comes straight back out with the commands you already know:
Workbooks.Git.log("dev") → ["9f3c2ab deploy lander", "4d1e0c7 deploy report", …]
Workbooks.Git.diff("dev") → literally `git diff HEAD~1 HEAD` — the last deploy, as a unified diff
So "what did that deploy change?" stops being a dashboard question with a
vendor-shaped answer. It's git diff HEAD~1 HEAD — the same answer
you'd get for any repo on your machine.
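The deploy contract described in this section (write the index, commit the source, return a sha or a no-change marker) can be modeled in a few lines. This is a toy in-memory sketch, not the runtime: TenantStore is a hypothetical name, the "sha" is just a truncated SHA-1 of the content rather than a real git object id, and the string "nochange" stands in for the :nochange value mentioned above.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class TenantStore:
    """Toy stand-in for one tenant: a served index plus a list of commits."""
    index: dict = field(default_factory=dict)    # name -> org text (the "fast index")
    commits: list = field(default_factory=list)  # (sha, message), oldest first

    def deploy(self, name: str, org: str) -> str:
        """Write the index and record a commit, or report that nothing moved."""
        if self.index.get(name) == org:
            return "nochange"
        self.index[name] = org
        parent = self.commits[-1][0] if self.commits else ""
        # Illustrative id only: chained hash of parent, name, and content.
        sha = hashlib.sha1(f"{parent}\n{name}\n{org}".encode()).hexdigest()[:7]
        self.commits.append((sha, f"deploy {name}"))
        return sha

    def log(self) -> list:
        """Newest first, in the 'sha message' shape shown above."""
        return [f"{sha} {msg}" for sha, msg in reversed(self.commits)]


store = TenantStore()
first = store.deploy("lander", "* v1\n")
repeat = store.deploy("lander", "* v1\n")   # same content: no new commit
second = store.deploy("lander", "* v2\n")
```

The design choice worth noticing is that an unchanged deploy produces no commit, so the log only ever records deploys that moved something.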
the agent's git HAND
Agents commit too — but they don't get a shell. The hand is host-brokered: the agent supplies only a commit message, and the host decides every word of the command line. There is no native-bash hatch here; the one that used to exist was deleted on purpose. The agent says what; the engine does how.
sequenceDiagram
    participant A as agent
    participant H as host (git tool)
    A->>H: commit_and_push add tuesday post
    H->>H: write .gitignore from Workbooks.Private
    H->>H: git add -A
    H->>H: commit (hooks off, author email set)
    H->>H: push origin if a remote exists
    H->>H: SitePublish.publish content and blog
    H-->>A: 9f3c2ab (pushed) (published 3)
Read the return string — those shapes are verbatim. You get
"9f3c2ab (pushed)" when a remote took it,
"9f3c2ab (committed; no origin remote to push)" when there's
nowhere to push yet, and a trailing (push failed: …) capped at
120 characters when the remote refuses. Honest receipts, not silent success.
The last step is the one that earns its place: commit implies publish,
atomically. After the commit lands, SitePublish.publish
mirrors content/** and blog/** to the live site —
so committed always implies live. This wasn't a flourish; it was a fix.
A lander once shipped blog posts that were committed to the repo but served
404, because the run died before a separate publish step ever ran. Now there
is no separate step to die before. Commit, and it's live.
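The three receipt shapes quoted earlier in this section are regular enough to sketch as one formatting function. This is a hedged illustration in Python: the function name receipt and its parameters are hypothetical, and applying the 120-character cap to the failure reason (rather than to the whole string) is an assumption about what the cap covers.

```python
from typing import Optional

MAX_REASON = 120  # cap quoted in the text; applying it to the reason is an assumption


def receipt(sha: str, has_origin: bool, push_error: Optional[str] = None) -> str:
    """Format a commit receipt in one of the three shapes quoted above."""
    if not has_origin:
        return f"{sha} (committed; no origin remote to push)"
    if push_error is not None:
        # Truncate long remote errors so the receipt stays short and readable.
        return f"{sha} (push failed: {push_error[:MAX_REASON]})"
    return f"{sha} (pushed)"


ok = receipt("9f3c2ab", has_origin=True)
local_only = receipt("9f3c2ab", has_origin=False)
refused = receipt("9f3c2ab", has_origin=True, push_error="remote rejected " * 50)
```

Every branch still leads with the sha, so the caller always learns that the commit itself landed, whatever happened to the push.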
the .gitignore nobody WRITES
A brokered git add -A is a sharp tool: a bulk add can sweep in
anything in the tree — including a signing key. So the danger is defused
before it can happen. Every tenant repo gets a .gitignore written
from one boundary module, Workbooks.Private, and the same
module is consulted by Git, by the bundler, and by the Library — so the
line can't drift. One definition of "private," three enforcers.
| ignored | what it protects |
|---|---|
| .workbooks/ | the Ed25519 signing key, the one that must never leave |
| memory/ | an agent's working memory: the session, not the work |
| .beads/ .claude/ | task data and assistant scratch, born from a real leak |
| scratch/ tmp/ | transient working space, nobody's history |
| _*.jsonl _*.json _*.db | run sidecars: _steps.jsonl, _status.json, _trace.jsonl, _telemetry.db, _ledger.json |
The doctrine behind that list is one sentence, and it's worth keeping:
sharing exposes WORK, never the session that produced it. It's a
scarred rule — it exists because beads task data once got pushed to GitHub,
and the boundary module is the repair. The most elegant part: the Library
reuses git's own matcher, git check-ignore, to decide
what ships when you share — so the agent's native git instinct IS the
don't-share marker. The thing that keeps bytes out of a push is the same
thing that keeps bytes out of a share. You only have to be right once.
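The "one definition, several enforcers" idea can be sketched as a single pattern list that both writes the .gitignore and answers "is this path private?". The Python below is an approximation under stated assumptions: the names PRIVATE_DIRS, PRIVATE_GLOBS, gitignore_text, and is_private are hypothetical, the patterns are copied from the table above, and the matcher only imitates the two gitignore rules those patterns need. The real system, per the text, asks git check-ignore instead.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns taken from the table above; the split into two lists is illustrative.
PRIVATE_DIRS = [".workbooks", "memory", ".beads", ".claude", "scratch", "tmp"]
PRIVATE_GLOBS = ["_*.jsonl", "_*.json", "_*.db"]


def gitignore_text() -> str:
    """Render the single pattern list as .gitignore content."""
    lines = [f"{d}/" for d in PRIVATE_DIRS] + PRIVATE_GLOBS
    return "\n".join(lines) + "\n"


def is_private(path: str) -> bool:
    """Rough imitation of gitignore matching for these patterns only."""
    parts = PurePosixPath(path).parts
    if not parts:
        return False
    # "name/" ignores anything beneath a directory called name, at any depth.
    if any(part in PRIVATE_DIRS for part in parts[:-1]):
        return True
    # A slash-free glob is matched against the file name at any depth.
    return any(fnmatchcase(parts[-1], glob) for glob in PRIVATE_GLOBS)


shareable = [p for p in ["content/post.org", ".workbooks/key", "_steps.jsonl"]
             if not is_private(p)]
```

Because both functions read the same two lists, adding a pattern changes what is committed and what is shared in one move, which is the property the section describes.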
a second ledger: JJ
depth rung · skippable — the op-log layer, for the curious
On top of the same repo sits an optional second record. Workbooks.JJ
colocates Jujutsu directly onto the git repo —
jj git init --colocate — which means the same commits,
plus an operation log: every change to the repo state is itself a
recorded, addressable, undoable operation. Git records the commits; jj
records the moves between them.
flowchart TD
    repo["one repo, same commits"]
    repo --> g["git commits<br/>content-addressed history"]
    repo --> j["jj op-log<br/>replayable state changes"]
    g --> ev["tamper-evident<br/>from two directions"]
    j --> ev
    style ev fill:#a8d4f0,stroke:#121316,stroke-width:2.5px
Two reasons it's there. It corroborates the signed ledger from the
author section — the run history becomes tamper-evident from two independent
directions at once. And it's the substrate for concurrent sub-agents: jj
auto-rebases text changes, while org validations arbitrate meaning. It's a
no-op if jj isn't on PATH, so it costs nothing where it isn't
wanted; jj's own identity is workbooks-agent at the same tenant
email, the same WHO, kept consistent.
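The "no-op if jj isn't on PATH" behavior is a small pattern worth seeing on its own. A minimal Python sketch, assuming a hypothetical helper named colocate_argv: it returns the command quoted above when jj is installed and None otherwise. The lookup function is injectable only so the example can be exercised without jj present.

```python
import shutil
from typing import Callable, List, Optional


def colocate_argv(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[List[str]]:
    """Return the jj colocate command, or None when jj is not installed."""
    if which("jj") is None:
        return None  # the no-op case: nothing to run, nothing to fail
    return ["jj", "git", "init", "--colocate"]


# With a lookup that finds nothing, the caller simply skips the jj layer.
skipped = colocate_argv(lambda name: None)
planned = colocate_argv(lambda name: "/usr/bin/jj")
```

Returning None instead of raising keeps the git layer fully usable on machines that never installed jj.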
the repo you already HAVE
Here's the antidote to "locked in your platform": one verb pushes your whole
history to your forge. It's plain git push underneath, so
it works with GitHub, GitLab, Gitea, or anything self-hosted — the forge CLIs
only exist to provision a remote the first time.
flowchart LR
    repo["tenant repo"]
    url["explicit URL"]
    forge["forge name"]
    rad["radicle"]
    repo --> url --> push1["plain git push<br/>no forge CLI needed"]
    repo --> forge --> push2["gh / glab / tea<br/>provision and push"]
    repo --> rad --> push3["P2P by DID<br/>RID rad:…"]
    style repo fill:#a8d4f0,stroke:#121316,stroke-width:2.5px
The common case is one line:
$ wbx mirror github
→ POST /api/mirror/dev {"forge":"github"}
→ no origin yet, so: gh repo create wb-dev --source . --private --push --remote origin
→ {"ok":true, "url":"https://github.com/you/wb-dev.git"}
If you already have the remote, skip the forge CLI entirely — hand
mirror a URL (anything containing :// or starting
git@) and it's a bare git push, zero tooling
required: wbx mirror git@<host>:you/site.git. And when nothing
is installed, the engine doesn't fail quietly — it tells you the truth:
"no forge CLI (gh/glab/tea) on PATH — create the remote yourself and use
mirror/3." For the peer-to-peer crowd, wbx federate runs
rad init --public and hands back a Radicle RID
(rad:…), with delegates addressed by DID — the same
did:key from the author section, doing real work.
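The rule for telling a URL from a forge name is stated precisely above (anything containing :// or starting with git@), so it can be sketched directly. In this Python illustration, classify_target and KNOWN_FORGES are hypothetical names, and the forge names "gitlab" and "gitea" are assumptions: only "github" appears as an argument in the text.

```python
from typing import Tuple

# Forge name -> CLI that provisions the remote. Only "github" is confirmed
# by the text as an accepted name; the other two keys are assumptions.
KNOWN_FORGES = {"github": "gh", "gitlab": "glab", "gitea": "tea"}


def classify_target(target: str) -> Tuple[str, str]:
    """Decide whether a mirror target is an explicit URL or a forge name."""
    if "://" in target or target.startswith("git@"):
        return ("url", target)        # plain git push, no forge CLI needed
    cli = KNOWN_FORGES.get(target)
    if cli is None:
        raise ValueError(f"unknown mirror target: {target!r}")
    return ("forge", cli)             # provision with the forge CLI, then push


kind, value = classify_target("https://example.com/you/site.git")
```

Checking for a URL first means a bring-your-own remote never touches the forge CLIs at all, which matches the plain-push rail described above.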
history as a public SURFACE
Because the project is a repo, its changelog can be its actual log
— not a marketing timeline somebody curated. A deployed app serves
GET /_changes on its public plane: anonymous, read-only, the real
git log, newest-first, capped at thirty entries, paired with the keeper's
status. The JSON is the diagram:
GET /_changes
{
  "changes": [
    { "sha": "9f3c2ab", "ts": 1749600000, "author": "dev", "msg": "deploy lander" },
    { "sha": "4d1e0c7", "ts": 1749513600, "author": "dev", "msg": "deploy report" }
  ],
  "agent": { … keeper status … }
}
The authed sibling GET /rcp/changes carries the same idea, and
its comment states the point: the changelog is verifiable history, not
marketing. Anyone can read it; anyone can clone the repo and check it. The
page narrates its own construction from facts it can't fake.
That's the outbound direction. The inbound direction — a push to
your forge reconciling back into the live project, the push-to-update loop —
is real but lives next door: Git.pull snapshots a dirty tree as
wip: snapshot before reconcile first, aborts cleanly on conflict rather
than clobbering agent data, and rides a keeper trigger. That whole reconcile
loop is its own lesson; this page owns the artifact-and-history side. See
gitops for when the loop reverses.
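The /_changes payload shown earlier in this section maps closely onto one git log invocation. The Python sketch below parses tab-separated log output into that shape and applies the thirty-entry cap. It is an illustration under assumptions: parse_changes and LOG_FORMAT are hypothetical names, the format string uses standard git log placeholders, and using committer time (%ct) for ts is a guess about which timestamp the endpoint reports.

```python
import json

# Short sha, unix committer time, author name, subject, separated by tabs.
# Intended for: git log --format=<LOG_FORMAT>  (git lists newest first by default)
LOG_FORMAT = "%h%x09%ct%x09%an%x09%s"
MAX_CHANGES = 30  # cap quoted in the text


def parse_changes(log_output: str, limit: int = MAX_CHANGES) -> list:
    """Turn tab-separated git log lines into the entries shown in /_changes."""
    changes = []
    for line in log_output.split("\n"):
        if not line.strip():
            continue
        # maxsplit=3 keeps any tabs inside the subject intact.
        sha, ts, author, msg = line.split("\t", 3)
        changes.append({"sha": sha, "ts": int(ts), "author": author, "msg": msg})
        if len(changes) == limit:
            break
    return changes


sample = (
    "9f3c2ab\t1749600000\tdev\tdeploy lander\n"
    "4d1e0c7\t1749513600\tdev\tdeploy report\n"
)
payload = json.dumps({"changes": parse_changes(sample)})
```

Since the entries are derived mechanically from the log, anyone who clones the repo can regenerate the same list and compare it with what the endpoint serves.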
where this is HONEST
None of the hard edges are hidden — they're in the code's own comments, so here they are plainly.
- SQLite is binary. Git stores vfs.sqlite but can't line-diff it. The plan is a legible projection: an unpacked tree where the SQLite shows up as readable SQL dumps GitHub can render. That's design direction, not shipped behavior: there's no SQL-dump emitter today. Treat the legible-diff story as a roadmap, not a feature.
- Radicle needs peers. A clone requires at least two nodes online, and rad's keys are per-device with no external-key import. That is an upstream limitation, not missing code on our side. It's a deployment-topology fact.
- Inbound app builds still ride CI. The reconcile loop integrates upstream changes, but rebuilding the app from them leans on a CI bridge for now. Multi-tenant GitHub-App integration is plan-status.
- Auto-provision needs a forge CLI. No gh/glab/tea on PATH means you create the remote yourself and mirror to an explicit URL; the plain-push rail always works regardless.
- Running git in a container is real ops. Every git op passes -c safe.directory=* for a reason: on deployed boxes the /data volume's uid differs from the runtime's, and git's dubious-ownership guard refuses every command, silently blocking commit and publish. That bit a set of agents who wrote stories to disk for days that never shipped. The flag is the scar.
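The safe.directory point is easiest to honor when every git call goes through one argv builder, so the flag cannot be forgotten on a single command. A minimal Python sketch, assuming a hypothetical helper named git_argv and assuming the wrapper selects the repo with git's -C option (the text does not say how the real wrapper does it).

```python
from typing import List


def git_argv(repo: str, *args: str) -> List[str]:
    """Build a git command line that always carries the safe.directory override."""
    # -c sets a config value for this one invocation; -C runs git inside repo.
    return ["git", "-c", "safe.directory=*", "-C", repo, *args]


# The two history commands from earlier sections, built through the same helper.
log_cmd = git_argv("/data/dev", "log", "--oneline")
diff_cmd = git_argv("/data/dev", "diff", "HEAD~1", "HEAD")
```

Whether a given git release honors safe.directory when it is passed with -c is worth checking against that release's documentation before relying on it.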
questions people actually ASK
Can I just git init my workbook source directory?
Yes — that's the native mode, not a workaround. The source is org text; nothing about the format fights git. The running project simply does the same thing for you automatically, one repo per tenant.
Is my history locked inside the runtime?
No. It's plain git. wbx mirror <url> pushes the whole
history to your forge with a bare git push — no platform CLI
required when you bring the URL. Clone it anywhere and it's yours.
What diffs badly?
vfs.sqlite — it's a binary database. Git stores it but won't
line-diff it. Your source (org) and the unpacked tree are the
legible diff surface; the readable-SQL projection of the disk is design
intent, not shipped yet.
Do agent commits pollute my history?
They're authored — every agent commit carries a real
<tenant>@workbooks.local author and a message — and the
reconcile convention separates code paths from data paths, so agent activity
lands in its own lane rather than smearing across your source.
Does this replace GitHub?
No — it pushes to it. The runtime hosts a git repo; GitHub (or
GitLab, or Gitea, or your own server) stays your forge. wbx mirror
github wires them together; you keep your existing workflow.
How do I get a diffable tree out of a workbook someone sent me?
wbx unbundle <file>. A .wbundle carries
workbook.org alongside the artifact specifically so you can
extract the source and re-author it. The source travels with the artifact.
keep GOING
This deep dive lives under the workbook — it assumes that page's anatomy and shows what version control looks like when the unit is a file that is also a repo.