
the project that is a REPO

A single file looks like a binary — so where did your version control go? It didn't go anywhere. The source rides inside the artifact as plain text, and the running project is literally a git repo: every deploy is a commit, the author is who you are, and the whole history clones to the forge you already have.

the repo · 11 min read
[Illustration: a small archivist standing before a towering wall of identical luminous green ledger-spines that recede into the distance, each spine stamped with a short glowing hash, a single thread of light linking one volume to the next. 1970s sci-fi style, bright and monumental.]

where did the HISTORY go?

Your project's history is already scattered, and you've stopped noticing. Code lives in git. Content lives in a CMS. Application data lives in a SaaS. And the record of what got deployed when lives in a dashboard's audit log — a list you can read but can't clone, can't git log, can't take with you. Four histories, four homes, none of them the same shape.

Now make the whole app one HTML file, and the worry sharpens: a single-file artifact smells like a binary. If the app is one .html, did you just lose version control entirely? Where do diffs live? Is your history now locked inside someone's platform? That fear is the right fear to have — and the answer is better than "don't worry about it." The answer is that nothing about this format ever stopped being text, and the project running it was a repo the whole time.

the DEFINITION

git /ɡɪt/ here, a property of the project — not a tool you bolt on

1. the workbook's history layer: the source rides inside the artifact as plain text, and the running project's data root is a real git repo — so history, diff, and rollback come for free, and you can clone any of it.

That phrase isn't ours by accident. The engine's own git module says it plainly — a git-backed Workbook store, the versioned source of truth; making the per-tenant data root a git repo buys history, diff, and rollback for free. It's a thin wrapper over the git CLI you already run. Not "integrates with git." Is a git repo — one per tenant, at $WB_DATA/<tenant>, git init'd the moment the project starts living.

what actually DIFFS

Here's the honest framing the whole page hangs on: git the source, ship the artifact, and the artifact carries the source back out. A workbook is three forms at once, and they diff differently — so it's worth being exact about which layer you read when you read history.

| layer | form | git diff verdict | the move when it doesn't |
|---|---|---|---|
| the org spec | plain text, in a script type=text/org block | clean line diffs, the designed diff surface | |
| the .html artifact | text, end to end | diffable in principle: it's text, not a blob | read the org source for the legible diff |
| the vfs.sqlite disk | binary database file | stored, but git can't line-diff it | wbx unbundle recovers a source tree; the legible-SQL projection is design intent (see honesty) |

Two things follow. First, the source is never trapped: a workbook ships as a .wbundle — a plain zip carrying workbook.html and workbook.org side by side, on purpose, so a recipient can wbx unbundle it back into a diffable tree and re-author. The comment in the bundler says it outright: source travels with the artifact — recipients can unbundle and re-author. Second, the one binary layer — SQLite — is the one place to be careful, and we come back to it honestly rather than pretend it diffs.
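
To make "the source travels with the artifact" concrete, here is a minimal Python sketch that reads the org source straight out of a bundle. It assumes only what the text states: a .wbundle is a plain zip carrying workbook.html and workbook.org. That both members sit at the top level of the archive is an assumption, and make_demo_bundle is a hypothetical helper that fabricates a stand-in bundle in memory so the sketch runs without wbx.

```python
import io
import zipfile


def read_org_source(bundle_bytes: bytes, member: str = "workbook.org") -> str:
    """Return the org source carried inside a bundle (assumed: a plain zip)."""
    with zipfile.ZipFile(io.BytesIO(bundle_bytes)) as bundle:
        return bundle.read(member).decode("utf-8")


def make_demo_bundle(org_text: str, html_text: str) -> bytes:
    """Hypothetical helper: build a stand-in bundle in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as bundle:
        # Member names come from the text; top-level placement is assumed.
        bundle.writestr("workbook.html", html_text)
        bundle.writestr("workbook.org", org_text)
    return buf.getvalue()


demo = make_demo_bundle("* Lander\nhello\n", "<html>...</html>")
print(read_org_source(demo))
```

In practice wbx unbundle is the supported path; the point of the sketch is only that nothing more exotic than a zip reader stands between a recipient and the diffable source.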

every deploy is a COMMIT

This is the load-bearing fact. When you deploy, the runtime does two things in one breath: it writes the fast SQLite index your app reads from, and it commits your .org source into the tenant repo. One deploy, one commit, with a real message and a real author.

$ wbx workbook deploy lander ./lander.org

   → runtime runs Workbooks.deploy("lander", org, "dev")
   → writes the SQLite index  +  Git.save commits <name>.org
   → commit "deploy lander"  authored  dev <dev@workbooks.local>
   → returns the sha, or :nochange if nothing moved

Picture it as a sequence. You hand the CLI a name and an org file; the runtime fans it into two stores — the index it serves from, and the git repo that remembers — and a short sha comes back to you as the receipt:

sequenceDiagram
  participant U as you (wbx)
  participant R as runtime (Workbooks.deploy)
  participant I as SQLite index
  participant G as git repo
  U->>R: deploy lander ./lander.org
  R->>I: write fast index
  R->>G: Git.save commit deploy lander
  G-->>R: 9f3c2ab
  R-->>U: sha (your receipt)
  Note over G: history answers<br/>what changed last deploy
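
The same sequence can be modelled in a few lines of Python. This is a toy sketch, not the engine's code: TenantStore is a hypothetical class, a dict stands in for the SQLite index, a list stands in for the git history, and the short sha is a stand-in hash rather than a real git object id. Whether the real runtime refreshes the index when nothing changed is also an assumption here.

```python
import hashlib


class TenantStore:
    """Toy model of the two stores a deploy touches."""

    def __init__(self):
        self.index = {}    # stands in for the fast SQLite index
        self.tree = {}     # last committed content per file
        self.commits = []  # stands in for git history: (sha, message)

    def deploy(self, name: str, org: str) -> str:
        self.index[name] = org  # assumed: the index is always refreshed
        path = f"{name}.org"
        if self.tree.get(path) == org:
            return "nochange"   # nothing moved, so no commit is made
        self.tree[path] = org
        message = f"deploy {name}"
        parent = self.commits[-1][0] if self.commits else ""
        digest = hashlib.sha1(f"{parent}\n{message}\n{org}".encode("utf-8"))
        sha = digest.hexdigest()[:7]  # stand-in for a short git sha
        self.commits.append((sha, message))
        return sha


store = TenantStore()
print(store.deploy("lander", "* Lander\n"))  # a 7-character receipt
print(store.deploy("lander", "* Lander\n"))  # nochange
```

The design point it illustrates: the receipt is either a new commit id or an explicit "nothing moved", never a silent success.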

And because it's a real repo, the history comes straight back out with the commands you already know:

Workbooks.Git.log("dev")    → ["9f3c2ab deploy lander", "4d1e0c7 deploy report", …]
Workbooks.Git.diff("dev")   → literally `git diff HEAD~1 HEAD` — the last deploy, as a unified diff

So "what did that deploy change?" stops being a dashboard question with a vendor-shaped answer. It's git diff HEAD~1 HEAD — the same answer you'd get for any repo on your machine.
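
For a feel of what that answer looks like on org source, the sketch below produces a unified diff between two versions of a file with Python's standard difflib. The file contents are invented for illustration, and difflib's output is close to, but not identical to, what git prints (there is no "diff --git" header or index line).

```python
import difflib

# Two hypothetical versions of lander.org, one per deploy.
before = "* Lander\nWelcome to the site.\n".splitlines(keepends=True)
after = "* Lander\nWelcome to the new site.\n".splitlines(keepends=True)

diff = list(
    difflib.unified_diff(
        before, after, fromfile="a/lander.org", tofile="b/lander.org"
    )
)
print("".join(diff))
```

Because the org layer is plain text, the changed sentence shows up as one removed line and one added line, which is exactly the legible diff surface the table above describes.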

who is the AUTHOR

depth rung · skippable — the identity bridge, for the curious

A commit needs an author, and here the author isn't a config field you forgot to set — it's who you are. The engine bridges auth's WHO directly into git's attribution: the author name defaults to the tenant, the email is <tenant>@workbooks.local, and every commit sets GIT_AUTHOR_NAME/EMAIL and GIT_COMMITTER_NAME/EMAIL from that one identity. The module's own words: git's attribution is auth's WHO.
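
A minimal Python sketch of that bridge, under the rules stated above: the name defaults to the tenant, the email is the tenant at workbooks.local, and all four git identity variables come from the same identity. The function name commit_identity_env is hypothetical; the environment variable names are git's own.

```python
def commit_identity_env(tenant, author_name=None):
    """Build the git identity environment for one tenant."""
    name = author_name or tenant          # author name defaults to the tenant
    email = f"{tenant}@workbooks.local"   # email shape as described in the text
    return {
        "GIT_AUTHOR_NAME": name,
        "GIT_AUTHOR_EMAIL": email,
        "GIT_COMMITTER_NAME": name,
        "GIT_COMMITTER_EMAIL": email,
    }


print(commit_identity_env("dev"))
```

Passing a dict like this as the environment of a git subprocess is enough for git to attribute the commit, with no per-repo user.name or user.email configuration.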

Underneath that name sits real cryptographic identity. Each tenant gets a per-tenant Ed25519 keypair, kept under .workbooks/<tenant>.ed25519 and never committed. From its public key the engine derives a did:key:z6Mk… — a genuine W3C decentralized identifier, the same shape Radicle uses, so the author is portable beyond this platform. And the identity is stable across redeploys: the primary tenant's key is restored deterministically from WB_SIGNING_KEY, a 32-byte seed held as a secret — so your DID survives a fresh box.

flowchart LR
  auth["auth identity — the WHO"]
  env["GIT_AUTHOR + COMMITTER<br/>name and tenant email"]
  did["Ed25519 keypair<br/>did:key:z6Mk…"]
  commit["the commit<br/>authored, attributable"]
  ledger["run ledger<br/>anchored in git"]
  auth --> env --> commit
  auth --> did --> commit
  commit --> ledger
  style commit fill:#a8d4f0,stroke:#121316,stroke-width:2.5px
  style ledger fill:#aee5c2,stroke:#121316

The last arrow matters. An agent run leaves a hash-chained ledger — a sha256 fold over its steps from a genesis hash — and its head gets committed to git as a ledger-<run> commit, so the head is witnessed by git's own content-addressing. The chain proves the steps; git proves when the chain's head existed. Two independent witnesses, one repo.
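
The "sha256 fold from a genesis hash" can be sketched in Python as follows. The genesis value and the way each step is serialized are assumptions made for illustration; the engine's actual encoding is not specified on this page.

```python
import hashlib

# Hypothetical genesis value; the real one is not given in the text.
GENESIS = hashlib.sha256(b"genesis").hexdigest()


def fold_ledger(steps, genesis=GENESIS):
    """Fold sha256 over the steps, each hash chaining onto the previous head."""
    head = genesis
    for step in steps:
        head = hashlib.sha256((head + step).encode("utf-8")).hexdigest()
    return head


steps = ["read brief", "draft post", "commit"]
print(fold_ledger(steps))
```

Changing, dropping, or reordering any step changes the head, which is why committing only the head to git is enough to witness the whole chain.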

the agent's git HAND

Agents commit too — but they don't get a shell. The hand is host-brokered: the agent supplies only a commit message, and the host decides every word of the command line. There is no native-bash hatch here; the one that used to exist was deleted on purpose. The agent says what; the engine does how.

sequenceDiagram
  participant A as agent
  participant H as host (git tool)
  A->>H: commit_and_push add tuesday post
  H->>H: write .gitignore from Workbooks.Private
  H->>H: git add -A
  H->>H: commit (hooks off, author email set)
  H->>H: push origin if a remote exists
  H->>H: SitePublish.publish content and blog
  H-->>A: 9f3c2ab (pushed) (published 3)
  

Read the return string — those shapes are verbatim. You get "9f3c2ab (pushed)" when a remote took it, "9f3c2ab (committed; no origin remote to push)" when there's nowhere to push yet, and a trailing (push failed: …) capped at 120 characters when the remote refuses. Honest receipts, not silent success.
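
Those three shapes fit in one small function. The Python sketch below reproduces the strings quoted above; commit_receipt is a hypothetical name, and applying the 120-character cap to the error text (rather than to the whole string) is an assumption.

```python
def commit_receipt(sha, has_remote, push_error=None, cap=120):
    """Format the receipt an agent gets back after commit_and_push."""
    if not has_remote:
        return f"{sha} (committed; no origin remote to push)"
    if push_error is None:
        return f"{sha} (pushed)"
    # Assumed: the cap trims the remote's error text, not the whole receipt.
    return f"{sha} (push failed: {push_error[:cap]})"


print(commit_receipt("9f3c2ab", has_remote=True))
print(commit_receipt("9f3c2ab", has_remote=False))
print(commit_receipt("9f3c2ab", has_remote=True, push_error="rejected"))
```

Every branch returns the sha first, so a caller can always recover the commit id even when the push did not land.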

The last step is the one that earns its place: commit implies publish, atomically. After the commit lands, SitePublish.publish mirrors content/** and blog/** to the live site — so committed always implies live. This wasn't a flourish; it was a fix. A lander once shipped blog posts that were committed to the repo but served 404, because the run died before a separate publish step ever ran. Now there is no separate step to die before. Commit, and it's live.

the .gitignore nobody WRITES

A brokered git add -A is a sharp tool: a bulk add can sweep in anything in the tree — including a signing key. So the danger is defused before it can happen. Every tenant repo gets a .gitignore written from one boundary module, Workbooks.Private, and the same module is consulted by Git, by the bundler, and by the Library — so the line can't drift. One definition of "private," three enforcers.

| ignored | what it protects |
|---|---|
| .workbooks/ | the Ed25519 signing key, the one that must never leave |
| memory/ | an agent's working memory: the session, not the work |
| .beads/ .claude/ | task data and assistant scratch, born from a real leak |
| scratch/ tmp/ | transient working space, nobody's history |
| _*.jsonl _*.json _*.db | run sidecars: _steps.jsonl, _status.json, _trace.jsonl, _telemetry.db, _ledger.json |

The doctrine behind that list is one sentence, and it's worth keeping: sharing exposes WORK, never the session that produced it. It's a scarred rule — it exists because beads task data once got pushed to GitHub, and the boundary module is the repair. The most elegant part: the Library reuses git's own matcher, git check-ignore, to decide what ships when you share — so the agent's native git instinct IS the don't-share marker. The thing that keeps bytes out of a push is the same thing that keeps bytes out of a share. You only have to be right once.
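
The "one definition, several enforcers" idea can be sketched in Python: a single list of patterns generates the .gitignore text and also answers "is this path private?". This is a deliberate simplification. The engine reuses git check-ignore, whose matching rules are richer than the fnmatch approximation below, and the names PRIVATE_DIRS, PRIVATE_GLOBS, gitignore_text, and is_private are hypothetical.

```python
import fnmatch

# The ignore list from the table above, kept in exactly one place.
PRIVATE_DIRS = (".workbooks", "memory", ".beads", ".claude", "scratch", "tmp")
PRIVATE_GLOBS = ("_*.jsonl", "_*.json", "_*.db")


def gitignore_text():
    """Render the single definition as .gitignore content."""
    lines = [f"{d}/" for d in PRIVATE_DIRS] + list(PRIVATE_GLOBS)
    return "\n".join(lines) + "\n"


def is_private(path):
    """Approximate check: private directory anywhere, or a sidecar filename."""
    parts = path.strip("/").split("/")
    if any(part in PRIVATE_DIRS for part in parts[:-1]):
        return True
    return any(fnmatch.fnmatchcase(parts[-1], g) for g in PRIVATE_GLOBS)


print(gitignore_text())
print(is_private(".workbooks/dev.ed25519"), is_private("content/post.org"))
```

Because the push path and the share path both read the same two tuples, there is no second list that could fall out of date.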

a second ledger: JJ

depth rung · skippable — the op-log layer, for the curious

On top of the same repo sits an optional second record. Workbooks.JJ colocates Jujutsu directly onto the git repo — jj git init --colocate — which means the same commits, plus an operation log: every change to the repo state is itself a recorded, addressable, undoable operation. Git records the commits; jj records the moves between them.

flowchart TD
  repo["one repo — same commits"]
  repo --> g["git commits<br/>content-addressed history"]
  repo --> j["jj op-log<br/>replayable state changes"]
  g --> ev["tamper-evident<br/>from two directions"]
  j --> ev
  style ev fill:#a8d4f0,stroke:#121316,stroke-width:2.5px

Two reasons it's there. It corroborates the signed ledger from the author section — the run history becomes tamper-evident from two independent directions at once. And it's the substrate for concurrent sub-agents: jj auto-rebases text changes, while org validations arbitrate meaning. It's a no-op if jj isn't on PATH, so it costs nothing where it isn't wanted; jj's own identity is workbooks-agent at the same tenant email, the same WHO, kept consistent.
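
The "no-op if jj isn't on PATH" behavior is easy to picture as a guard. In this Python sketch, colocate_argv is a hypothetical helper that returns the command to run, or None when jj is absent; the lookup function is injectable so the guard can be exercised without jj installed.

```python
import shutil


def colocate_argv(which=shutil.which):
    """Return the jj colocate command, or None when jj is not on PATH."""
    if which("jj") is None:
        return None  # nothing to do; git alone carries the history
    return ["jj", "git", "init", "--colocate"]


print(colocate_argv())
```

A caller that gets None simply skips the op-log layer, so hosts without jj behave exactly as the git-only sections describe.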

the repo you already HAVE

Here's the antidote to "locked in your platform": one verb pushes your whole history to your forge. It's plain git push underneath, so it works with GitHub, GitLab, Gitea, or anything self-hosted — the forge CLIs only exist to provision a remote the first time.

flowchart LR
  repo["tenant repo"]
  url["explicit URL"]
  forge["forge name"]
  rad["radicle"]
  repo --> url --> push1["plain git push<br/>no forge CLI needed"]
  repo --> forge --> push2["gh / glab / tea<br/>provision and push"]
  repo --> rad --> push3["P2P by DID<br/>RID rad:…"]
  style repo fill:#a8d4f0,stroke:#121316,stroke-width:2.5px

The common case is one line:

$ wbx mirror github
   → POST /api/mirror/dev {"forge":"github"}
   → no origin yet, so: gh repo create wb-dev --source . --private --push --remote origin
   → {"ok":true, "url":"https://github.com/you/wb-dev.git"}

If you already have the remote, skip the forge CLI entirely. Hand mirror a URL (anything containing :// or starting git@) and it's a bare git push, zero tooling required: wbx mirror git@github.com:you/site.git. And when nothing is installed, the engine doesn't fail quietly. It tells you the truth: "no forge CLI (gh/glab/tea) on PATH — create the remote yourself and use mirror/3." For the peer-to-peer crowd, wbx federate runs rad init --public and hands back a Radicle RID (rad:…), with delegates addressed by DID: the same did:key from the author section, doing real work.
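
The URL-versus-forge decision described above is a two-line rule, sketched here in Python. The rule itself (contains :// or starts with git@) is from the text; the function name, the return shape, and the forge names "gitlab" and "gitea" are assumptions, since only "github" appears in the examples.

```python
# Forge name -> the CLI used to provision a remote the first time.
# Only "github" is shown on this page; the other two keys are assumed.
FORGE_CLIS = {"github": "gh", "gitlab": "glab", "gitea": "tea"}


def classify_mirror_target(target):
    """Decide whether a mirror target is an explicit URL or a forge name."""
    if "://" in target or target.startswith("git@"):
        return ("url", target)  # plain git push, no forge CLI involved
    if target in FORGE_CLIS:
        return ("forge", FORGE_CLIS[target])
    raise ValueError(f"unknown mirror target: {target!r}")


print(classify_mirror_target("https://example.com/you/site.git"))
print(classify_mirror_target("github"))
```

The explicit-URL branch never touches a forge CLI, which is why that rail keeps working on a box with nothing but git installed.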

history as a public SURFACE

Because the project is a repo, its changelog can be its actual log — not a marketing timeline somebody curated. A deployed app serves GET /_changes on its public plane: anonymous, read-only, the real git log, newest-first, capped at thirty entries, paired with the keeper's status. The JSON is the diagram:

GET /_changes

{
  "changes": [
    { "sha": "9f3c2ab", "ts": 1749600000, "author": "dev", "msg": "deploy lander" },
    { "sha": "4d1e0c7", "ts": 1749513600, "author": "dev", "msg": "deploy report" }
  ],
  "agent": { … keeper status … }
}

The authed sibling GET /rcp/changes carries the same idea, and its comment states the point: the changelog is verifiable history, not marketing. Anyone can read it; anyone can clone the repo and check it. The page narrates its own construction from facts it can't fake.
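
As a sketch of what a consumer might do with that payload, the Python below turns it into changelog lines. The sample data mirrors the example response; the agent field is left empty because its contents are elided on this page, and render_changelog is a hypothetical helper that assumes the entries already arrive newest-first.

```python
import json
from datetime import datetime, timezone

payload = json.loads("""
{
  "changes": [
    {"sha": "9f3c2ab", "ts": 1749600000, "author": "dev", "msg": "deploy lander"},
    {"sha": "4d1e0c7", "ts": 1749513600, "author": "dev", "msg": "deploy report"}
  ],
  "agent": {}
}
""")


def render_changelog(data, limit=30):
    """Format each change as 'date sha author: message', keeping server order."""
    lines = []
    for change in data["changes"][:limit]:
        day = datetime.fromtimestamp(change["ts"], tz=timezone.utc)
        stamp = day.strftime("%Y-%m-%d")
        lines.append(f'{stamp} {change["sha"]} {change["author"]}: {change["msg"]}')
    return lines


for line in render_changelog(payload):
    print(line)
```

Each printed sha can be checked against a clone of the repo, which is what makes the changelog verifiable rather than curated.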

That's the outbound direction. The inbound direction (a push to your forge reconciling back into the live project, the push-to-update loop) is real but lives next door: Git.pull first snapshots a dirty tree as a "wip: snapshot before reconcile" commit, aborts cleanly on conflict rather than clobbering agent data, and rides a keeper trigger. That whole reconcile loop is its own lesson; this page owns the artifact-and-history side. See gitops for when the loop reverses.

where this is HONEST

None of the hard edges are hidden — they're in the code's own comments, so here they are plainly.

  • SQLite is binary. Git stores vfs.sqlite but can't line-diff it. The plan is a legible projection — an unpacked tree where the SQLite shows up as readable SQL dumps GitHub can render. That's design direction, not shipped behavior: there's no SQL-dump emitter today. Treat the legible-diff story as a roadmap, not a feature.
  • Radicle needs peers. A clone requires at least two nodes online, and rad's keys are per-device with no external-key import — an upstream limitation, not missing code on our side. It's a deployment-topology fact.
  • Inbound app builds still ride CI. The reconcile loop integrates upstream changes, but rebuilding the app from them leans on a CI bridge for now. Multi-tenant GitHub-App integration is plan-status.
  • Auto-provision needs a forge CLI. No gh/glab/tea on PATH means you create the remote yourself and mirror to an explicit URL — the plain-push rail always works regardless.
  • Running git in a container is real ops. Every git op passes -c safe.directory=* for a reason: on deployed boxes the /data volume's uid differs from the runtime's, and git's dubious-ownership guard refuses every command — silently blocking commit and publish. That bit a set of agents who wrote stories to disk for days that never shipped. The flag is the scar.
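
The safe.directory point from the last bullet comes down to how every git command line gets built. A minimal Python sketch, with git_argv as a hypothetical helper name; the -c safe.directory=* and -C options are standard git flags.

```python
def git_argv(repo_dir, *args):
    """Build a git command line that tolerates a volume owned by another uid."""
    # -c safe.directory=* disables the dubious-ownership refusal for this call;
    # -C runs git as if it had been started inside repo_dir.
    return ["git", "-c", "safe.directory=*", "-C", repo_dir, *args]


print(git_argv("/data/dev", "log", "--oneline", "-n", "30"))
```

Routing every call through one builder like this is what keeps the flag from being forgotten on the one command that happens to run on a deployed box.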

questions people actually ASK

Can I just git init my workbook source directory?

Yes — that's the native mode, not a workaround. The source is org text; nothing about the format fights git. The running project simply does the same thing for you automatically, one repo per tenant.

Is my history locked inside the runtime?

No. It's plain git. wbx mirror <url> pushes the whole history to your forge with a bare git push — no platform CLI required when you bring the URL. Clone it anywhere and it's yours.

What diffs badly?

vfs.sqlite — it's a binary database. Git stores it but won't line-diff it. Your source (org) and the unpacked tree are the legible diff surface; the readable-SQL projection of the disk is design intent, not shipped yet.

Do agent commits pollute my history?

They're authored — every agent commit carries a real <tenant>@workbooks.local author and a message — and the reconcile convention separates code paths from data paths, so agent activity lands in its own lane rather than smearing across your source.

Does this replace GitHub?

No — it pushes to it. The runtime hosts a git repo; GitHub (or GitLab, or Gitea, or your own server) stays your forge. wbx mirror github wires them together; you keep your existing workflow.

How do I get a diffable tree out of a workbook someone sent me?

wbx unbundle <file>. A .wbundle carries workbook.org alongside the artifact specifically so you can extract the source and re-author it. The source travels with the artifact.

keep GOING

This deep dive lives under the workbook — it assumes that page's anatomy and shows what version control looks like when the unit is a file that is also a repo.