the file that DEPLOYS you
You've been burned by a config file before. It started as ten honest
keys — a name, a port, a region — and then someone needed "just one hook,"
and a year later it was a Turing-complete YAML pipeline with an
exec: key, secret interpolation, and a templating language
nobody chose. The word declarative was on the box. The contents
were a program you couldn't read.
So when the parent lesson says "flip
local to cloud and your prototype is the
product," your reflex is right: that's a brochure line until proven
otherwise. Every platform claims declarative deploys. The claim is only
worth anything if you can audit it — read the whole reader in one
sitting and confirm there's no place a program could hide.
That's the bar this page sets, and then clears. A deployment description you can hold in your head. Inert by construction, not by promise. We'll read the parser, watch the validator refuse to let you lie to yourself, and bring the same engine up on a laptop and on a public host — to show the "one property" claim cashes out as a measurable fact, not a vibe.
the DEFINITION
1. an org file of five properties the engine converges to — engine place, tenancy, storage, database, and auth — where the place it runs is one of them. Inert by construction: read by a property scan, never executed.
Two words in that definition carry the whole lesson. Converges:
you describe the end state, the tooling makes reality match — no steps, no
order, idempotent. Never executed: the file is read by a hand-rolled
scan of the :PROPERTIES: drawer, not by the org engine that
tangles and runs other workbooks. A deployment file is
the one org file in the system that is pure config and nothing else — and
that's a property you can verify, not a posture.
five axes, one SCREEN
Here is the real local deployment, shipped in the kit, whole. One headline, one drawer, six properties. This is not a fragment:
    * local :deployment:
    :PROPERTIES:
    :ID: local
    :ENGINE_PLACE: local
    :TENANCY_MODE: single
    :STORAGE: local-fs
    :DATABASE: sqlite
    :AUTH: trusted
    :END:
Five of those properties are closed enums — the only axes that exist, each with a fixed set of legal values:
| axis | legal values | default | what it flips |
|---|---|---|---|
| ENGINE_PLACE | local · cloud | — required | microVM on your mac, or a provider recipe |
| TENANCY_MODE | single · multi | single | one user, or tenants needing identity + isolation |
| STORAGE | local-fs · s3 | local-fs | the working dir is the store, or an S3-compatible bucket |
| DATABASE | sqlite · postgres | sqlite | one writer, or many — concurrency under multi-tenancy |
| AUTH | trusted · betterauth · clerk · oidc | trusted | no identity, or a real token issuer |
Everything else — APP, PROVIDER (defaults to
fly), REGION, the storage endpoint and bucket,
the auth issuer — is an open property: a string the relevant axis reads
when it needs one. There is nothing else. No fifth verb hiding in a sixth
key.
Now the part that earns the word "inert." The reader is about twenty
lines. It walks the file top to bottom, flips a boolean when it sees
:PROPERTIES: and flips it back at :END:, and
inside the drawer it matches each line against one regular expression:
    ^:([A-Za-z_]+):\s+(.*\S)\s*$
      │               └─ the value, trimmed
      └─ the key, uppercased
A line that matches contributes a key and a value. A line that
doesn't match is silently ignored. That's the whole parser. No
interpolation, no ${}, no hooks, no template language, no
branch where a string becomes code. "Declarative" here is a measurable
property of a reader you could screenshot — not a marketing adjective. The
reason your old config files rotted is that their readers grew teeth. This
one can't, because there's nowhere for teeth to go.
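To make "you could screenshot it" concrete, here is a minimal Python sketch of such a reader. It is not the kit's code: the function name `scan_properties` is hypothetical, and stripping leading whitespace before matching is an assumption. The regular expression and the flip-a-boolean drawer logic are the ones described above.

```python
import re

# The one regular expression quoted above: a key between colons, then a value.
PROP_LINE = re.compile(r"^:([A-Za-z_]+):\s+(.*\S)\s*$")


def scan_properties(text):
    """Return {KEY: value} for every matching line inside a :PROPERTIES: drawer."""
    props = {}
    in_drawer = False
    for raw in text.splitlines():
        line = raw.strip()  # assumption: indentation is not significant
        if line == ":PROPERTIES:":
            in_drawer = True
        elif line == ":END:":
            in_drawer = False
        elif in_drawer:
            match = PROP_LINE.match(line)
            if match:  # a line that doesn't match is silently ignored
                props[match.group(1).upper()] = match.group(2)
    return props


LOCAL_ORG = """\
* local :deployment:
:PROPERTIES:
:ID: local
:ENGINE_PLACE: local
:TENANCY_MODE: single
:STORAGE: local-fs
:DATABASE: sqlite
:AUTH: trusted
:END:
"""
```

Feed it a value like `${HOME}/x` and you get the literal string `${HOME}/x` back. There is no code path that would expand it, which is the whole point.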
the honesty ENGINE
A linter checks syntax. validate checks truth. It
runs with zero deploy — read-only, no machine touched — and its job is to
refuse configurations that would lie to you. Its error messages aren't
style notes; they're distributed-systems tradeoffs encoded as
refusals. Three of them are worth quoting verbatim, because each one is a
lesson that would otherwise cost you a 2 a.m. page:
    TENANCY_MODE: multi + DATABASE: sqlite
      → rejected: "multi on DATABASE: sqlite serializes every tenant through one writer — set DATABASE: postgres"
    TENANCY_MODE: multi + AUTH: trusted
      → rejected: "`trusted` has no identity to isolate tenants by"
    STORAGE: s3 + missing secrets
      → rejected: "STORAGE: s3 needs WB_S3_KEY + WB_S3_SECRET in your deploy ENV (secrets, not the file)"
Read what those refusals know. Multi-tenant on one SQLite writer
doesn't fail — it works, slowly, funneling every tenant through a
single lock until the day it doesn't, and the validator would rather you
learn that now than in production. trusted auth has no token,
no subject, no identity — so there is literally nothing to scope a tenant
by, and "multi-tenant with trusted auth" isn't a configuration, it's a
contradiction. And s3 demands its credentials from the
environment, by name, because secrets in the file is the original
sin and the validator won't be complicit in it.
The full set of checks, as a flow — enum legality first, then the combinations that can't coexist, then the secrets the chosen axes require:
flowchart TD
cfg[["deployment.org"]] --> enums{"five axes
legal values?"}
enums -- no --> bad["invalid — unknown value on an axis"]
enums -- yes --> combo{"combinations
coherent?"}
combo -- "multi + sqlite" --> bad2["one writer for every tenant"]
combo -- "multi + trusted" --> bad3["no identity to isolate by"]
combo -- ok --> sec{"required secrets
present in ENV?"}
sec -- "s3, no WB_S3_KEY" --> bad4["secrets, not the file"]
sec -- "postgres, no WB_DATABASE_URL" --> bad5["the DSN is a secret"]
sec -- "real auth, no ISSUER" --> bad6["a token needs an issuer"]
sec -- all present --> ok["valid — local · single-tenant
storage=local-fs · db=sqlite"]
style cfg fill:#aee5c2,stroke:#121316,stroke-width:2.5px
style ok fill:#13d943,stroke:#121316,stroke-width:2.5px
style bad fill:#ffffff,stroke:#121316
style bad2 fill:#ffffff,stroke:#121316
style bad3 fill:#ffffff,stroke:#121316
style bad4 fill:#ffffff,stroke:#121316
style bad5 fill:#ffffff,stroke:#121316
style bad6 fill:#ffffff,stroke:#121316
On success it prints the shape it understood:

    valid — local · single-tenant · storage=local-fs · db=sqlite

That is the same summary every verb speaks. Notice the deliberate
split between what lives in the file and what lives in the env:
postgres needs WB_DATABASE_URL in the env ("the DSN is a secret"),
s3 needs its endpoint and bucket in the file but its key and secret
in the env, and any real auth needs an ISSUER in the file, the origin
a token is checked against. File settings and secrets get separate
checks and separate error strings, because they are separate kinds of
thing. The validator is the part of the system that refuses to let you
write a config that would quietly lie.
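The three stages of that flow (enum legality, then combinations, then required secrets) fit in a short function. What follows is a Python sketch under stated assumptions, not the kit's validator: the function name, the return shape, and the wording of the enum, postgres, and issuer errors are hypothetical, as is the "multi-tenant" form of the summary. The three quoted refusals and the success summary are taken from this page.

```python
# Closed enums from the axis table: axis -> (legal values, default or None if required).
AXES = {
    "ENGINE_PLACE": (("local", "cloud"), None),
    "TENANCY_MODE": (("single", "multi"), "single"),
    "STORAGE": (("local-fs", "s3"), "local-fs"),
    "DATABASE": (("sqlite", "postgres"), "sqlite"),
    "AUTH": (("trusted", "betterauth", "clerk", "oidc"), "trusted"),
}


def validate(props, env):
    """Return (errors, summary). Read-only: it only inspects the two mappings."""
    cfg, errors = {}, []

    # 1. Enum legality. Error wording here is hypothetical.
    for axis, (legal, default) in AXES.items():
        value = props.get(axis, default)
        if value not in legal:
            errors.append(f"{axis}: {value!r} is not one of {' | '.join(legal)}")
        cfg[axis] = value
    if errors:
        return errors, None

    # 2. Combinations that cannot coexist (messages quoted from the page).
    multi = cfg["TENANCY_MODE"] == "multi"
    if multi and cfg["DATABASE"] == "sqlite":
        errors.append("TENANCY_MODE: multi on DATABASE: sqlite serializes every "
                      "tenant through one writer — set DATABASE: postgres")
    if multi and cfg["AUTH"] == "trusted":
        errors.append("`trusted` has no identity to isolate tenants by")

    # 3. What the chosen axes require: secrets from env, the issuer from the file.
    if cfg["STORAGE"] == "s3" and not (env.get("WB_S3_KEY") and env.get("WB_S3_SECRET")):
        errors.append("STORAGE: s3 needs WB_S3_KEY + WB_S3_SECRET in your deploy ENV "
                      "(secrets, not the file)")
    if cfg["DATABASE"] == "postgres" and not env.get("WB_DATABASE_URL"):
        errors.append("DATABASE: postgres needs WB_DATABASE_URL")  # hypothetical wording
    if cfg["AUTH"] != "trusted" and not props.get("ISSUER"):
        errors.append(f"AUTH: {cfg['AUTH']} needs an ISSUER")  # hypothetical wording
    if errors:
        return errors, None

    summary = (f"valid — {cfg['ENGINE_PLACE']} · {cfg['TENANCY_MODE']}-tenant"
               f" · storage={cfg['STORAGE']} · db={cfg['DATABASE']}")
    return [], summary
```

Errors accumulate within the combination and secrets stages so one run reports every refusal at once. Whether the real validator stops after the enum stage, as this sketch does, is an assumption.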
a cloud on your DESK
Depth rung — skippable, but it's where "local isn't a simulator"
becomes a fact you can run. Local place is not a mock. It's the same
ghcr.io/workbooks-sh/runtime:latest OCI image cloud uses,
booted in a real Linux microVM on your mac via libkrun and
krunvm. The microVM is the outer
isolation boundary — distinct from, and wrapped around, the wasm
sandbox each workbook runs in. krunvm is the mac arm of a cross-platform
seam; podman, docker, and WSL2 slot in behind the same
ensure→create→run→status→down contract.
The bring-up, command by command, is shorter than the explanation:
    $ wbx deploy doctor
    created the case-sensitive APFS volume; prereqs OK now
    $ wbx deploy apply local
    runtime up — http://127.0.0.1:55123 (survives app quit; `wbx deploy down` to stop)
    $ wbx deploy verify
    runtime healthy — http://127.0.0.1:55123/health → 200
Under that calm output is a sequence with two real macOS war stories
baked into it. First the one-time prereq: krunvm's OCI store needs a
case-sensitive APFS volume, which doctor creates for
you with diskutil apfs addVolume <disk> "Case-sensitive APFS"
krunvm — no sudo, non-destructive, self-healing. Then the create,
with flags that each exist for a reason:
    krunvm create ghcr.io/workbooks-sh/runtime:latest \
      --name workbooks-runtime --cpus 2 --mem 2048 \
      --workdir /app \             ← krunvm won't inherit the image WORKDIR
      --port 55123:4000 \          ← free host port → guest's fixed 4000
      --volume …/data:/data \      ← your data, on the host
      --volume …/disco:/disco      ← the discovery handshake lands here
The boot command is the first war story. You'd expect
bin/workbooks start — but the release's start
runs a foreground console that needs a TTY, and under krunvm's no-TTY
guest it blocks at console init before the app ever boots:
silent, no discovery, nothing. So the boot is
/app/bin/workbooks eval '…ensure_all_started… Process.sleep(:infinity)'
— boot the app, then park forever; on failure System.halt(1)
so it surfaces in wbx deploy logs instead of hanging.
sequenceDiagram
participant U as wbx (your session)
participant K as krunvm
participant V as the microVM (the runtime image)
participant D as /disco (bind mount)
U->>K: doctor — APFS volume ready
U->>K: create + start --env WB_DESKTOP=1 …
K->>V: boot via `eval` (no TTY needed) → park
V->>D: write runtime.json (port, pid, token)
U->>D: poll up to 20s for runtime.json
D-->>U: found — port + bearer token
U->>V: GET http://127.0.0.1:55123/health (bearer)
V-->>U: 200 · body contains "ok"
The second war story is why the VM survives you closing the app.
libkrun needs the user's GUI session — run under a background LaunchAgent
it exits 78 (EX_CONFIG). So the default lifecycle is a direct
nohup-detached spawn into your Aqua session, with the pid in a pidfile: it
survives the app quitting (reparented), and dies on logout. A LaunchAgent
path exists for start-on-login, but it's the path that fights libkrun, so
it's opt-in.
And the handshake itself: the runtime inside the VM writes
/disco/runtime.json — port, pid, and a bearer token — into
the bind-mounted directory, and the host reads it straight off the mount.
That file is the local control-plane handshake. status
and verify read it; verify uses the token to hit
/health and expects a 200 with "ok" in the body. When you run
down, both lifecycles are removed and the VM is deleted — but
/data and the APFS volume are preserved. Your data
outlives the machine.
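The handshake is simple enough to model from the host side. This Python sketch polls for the discovery file, then makes the health check. It is an illustration, not the kit's code: the JSON key names `port` and `token`, the `Authorization: Bearer` header form, and every function name are assumptions. The 20-second poll, the `/health` path, and the "200 with ok in the body" rule come from the text above.

```python
import json
import time
import urllib.error
import urllib.request
from pathlib import Path


def wait_for_discovery(disco_dir, timeout_s=20.0, interval_s=0.25):
    """Poll for runtime.json until it parses or the timeout passes; None on timeout."""
    path = Path(disco_dir) / "runtime.json"
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            return json.loads(path.read_text())
        except (FileNotFoundError, json.JSONDecodeError):
            pass  # not written yet, or caught mid-write
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval_s)


def http_get(url, headers):
    """Plain stdlib GET returning (status, body_text); status 0 means unreachable."""
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            return response.status, response.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        return err.code, ""
    except OSError:
        return 0, ""


def verify_local(disco_dir, fetch=http_get, timeout_s=20.0):
    """True only if discovery is found and /health answers 200 with "ok" in the body."""
    info = wait_for_discovery(disco_dir, timeout_s)
    if info is None:
        return False
    url = f"http://127.0.0.1:{info['port']}/health"  # key name is an assumption
    status, body = fetch(url, {"Authorization": f"Bearer {info['token']}"})
    return status == 200 and "ok" in body
```

The `fetch` parameter exists so the check can be exercised without a running VM; pass nothing and it uses the stdlib client.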
five hooks and a TEMPFILE
Cloud isn't magic either. A provider is a directory:
cli/deploy-kit/providers/<place>/bootstrap.sh. The
place resolves to a backend — local resolves to the built-in
Machine module, anything else resolves to that bootstrap script. Adding a
provider is dropping a directory; the core never changes. Each recipe
fills the same five hooks plus a lifecycle trio:
    provider_ensure_app      ← create the app if missing
    provider_set_secrets     ← stage credentials, out of band
    provider_attach_volume   ← a persistent disk for /data
    provider_deploy_image    ← run the OCI image
    provider_public_url      ← print url: https://…
    + provider_down / provider_status / provider_logs
The kit drives the script with an action — up runs those
first five in order; down, status,
logs, url do the obvious thing. The provider's
exit code is honored all the way out to the CLI's exit code, so a
host failure is a failed command.
flowchart TD
apply["wbx deploy apply"] --> place{"ENGINE_PLACE?"}
place -- local --> machine["built-in Machine
(krunvm microVM)"]
place -- cloud --> recipe["resolve providers/<place>/bootstrap.sh"]
recipe --> up["WB_RECIPE_ACTION=up"]
up --> h1["ensure_app"] --> h2["set_secrets"] --> h3["attach_volume"] --> h4["deploy_image"] --> h5["public_url → url:"]
style apply fill:#aee5c2,stroke:#121316,stroke-width:2.5px
style machine fill:#ffffff,stroke:#121316
style h5 fill:#13d943,stroke:#121316
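The dispatch in that diagram is small enough to model directly. The real recipes are shell scripts; this Python sketch only mirrors the control flow, with hooks passed in as callables that return exit codes. The names `run_action` and `UP_ORDER`, the stop-at-first-failure behavior, and the exit code for an unknown action are assumptions for illustration.

```python
# The five hooks `up` runs, in the order the page lists them.
UP_ORDER = ("ensure_app", "set_secrets", "attach_volume", "deploy_image", "public_url")

# Single-hook actions; `url` maps onto the public_url hook.
SINGLE = {"down": "down", "status": "status", "logs": "logs", "url": "public_url"}


def run_action(action, hooks):
    """hooks maps a hook name to a zero-argument callable returning an exit code."""
    if action == "up":
        for name in UP_ORDER:
            code = hooks[name]()
            if code != 0:
                return code  # the provider's exit code goes straight back to the caller
        return 0
    if action in SINGLE:
        return hooks[SINGLE[action]]()
    return 2  # hypothetical exit code for an unknown action
```

Because the hook's exit code is returned unchanged, a failure three hooks deep still surfaces as the command's own exit code.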
Fly is the one bundled recipe — and the comment at the top of it says so
plainly: Fly is our personal preference, not a privileged target, just one
recipe behind the provider-neutral spine. Here's what makes it concrete.
On apply, the Fly recipe creates the app if missing
("converge, don't complain"), stages secrets, creates a 1 GB volume
wbdata if absent — then synthesizes a minimal
fly.toml in a tempdir, deploys it, and deletes it:
    app = "my-app"
    primary_region = "sjc"

    [build]
      image = "ghcr.io/workbooks-sh/runtime:latest"

    [mounts]
      source = "wbdata"
      destination = "/data"

    [http_service]
      internal_port = 4000
      force_https = true
      auto_stop_machines = "stop"
      auto_start_machines = true
      min_machines_running = 0
The public URL is https://my-app.fly.dev. And the env every
provider forwards is the quiet keystone: WB_WEB=1,
PORT=4000, WB_DATA=/data, and crucially
WB_REGISTRY=/data/registry.db. The registry defaults to
:memory: in the engine — pointing it at the volume is exactly
what makes deployed workbooks survive a restart or a scale-to-zero, with
litestream replicating that SQLite db. Those
auto_stop_machines / min_machines_running = 0
lines are scale-to-zero; the volume + registry pointer is what keeps it
safe.
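The synthesize, deploy, delete step is the "tempfile" in this section's title. Here is a hedged Python model of it. The bundled recipe is a shell script, so everything below is illustrative: `deploy_with_temp_config` is a hypothetical name, the `deploy` argument stands in for whatever command the recipe actually runs, and values are interpolated without escaping on the assumption that app and region names contain no quotes.

```python
import tempfile
from pathlib import Path

# The minimal fly.toml shown above, with the varying parts as placeholders.
FLY_TOML = """\
app = "{app}"
primary_region = "{region}"

[build]
  image = "{image}"

[mounts]
  source = "{volume}"
  destination = "/data"

[http_service]
  internal_port = 4000
  force_https = true
  auto_stop_machines = "stop"
  auto_start_machines = true
  min_machines_running = 0
"""


def deploy_with_temp_config(app, region, deploy,
                            image="ghcr.io/workbooks-sh/runtime:latest",
                            volume="wbdata"):
    """Write fly.toml into a temp dir, hand its path to `deploy`, then clean up."""
    with tempfile.TemporaryDirectory() as tmp:
        config = Path(tmp) / "fly.toml"
        config.write_text(FLY_TOML.format(app=app, region=region,
                                          image=image, volume=volume))
        return deploy(config)  # the directory is removed when the block exits
```

Nothing provider-shaped is left in your working tree afterwards, so deployment.org stays the only file you maintain.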
The punchline is the contract's smallness. A Hetzner, Render, or AWS recipe is the same ~90 lines with different commands, because the engine asks the host for only four things: run my container, attach a volume, deliver my secrets, expose a port. Everything a PaaS does beyond that is the PaaS's business, not the engine's.
the actual DIFF
Now the parent's claim, made literal. Moving from local to cloud is
editing ENGINE_PLACE — and then adding exactly the properties
the validator forces you to add. Here is the diff against the
shipped cloud SaaS deployment:
      * local                  →    * cloud-saas :deployment:
      :PROPERTIES:                  :PROPERTIES:
    - :ENGINE_PLACE: local        + :ENGINE_PLACE: cloud
                                  + :PROVIDER: fly
                                  + :APP: my-app
    - :TENANCY_MODE: single       + :TENANCY_MODE: multi
    - :STORAGE: local-fs          + :STORAGE: s3
                                  + :STORAGE_ENDPOINT: https://s3.us-east-1.amazonaws.com
                                  + :STORAGE_BUCKET: my-bucket
                                  + :STORAGE_REGION: us-east-1
    - :DATABASE: sqlite           + :DATABASE: postgres
    - :AUTH: trusted              + :AUTH: clerk
                                  + :ISSUER: https://my-tenant.clerk.accounts.dev
                                  + :REGION: iad
      :END:                         :END:
None of those additions is decoration: each one is the validator
refusing a half-edit. Try to go to multi tenancy while still
on sqlite, with s3 but
no credentials, and validate stops you cold:
$ wbx deploy validate cloud-saas.org
invalid deployment:
- TENANCY_MODE: multi on DATABASE: sqlite serializes every tenant
through one writer — set DATABASE: postgres
- STORAGE: s3 needs WB_S3_KEY + WB_S3_SECRET in your deploy ENV
(secrets, not the file)
So you don't add postgres because a guide told you to — you
add it because multi demanded it. The same translation runs in
both directions, and it's one function — Config.to_env/1 — that
turns the five axes into the engine's environment. Same keys, sourced
differently per place:
| engine env | sourced from | local | cloud (SaaS) |
|---|---|---|---|
| WB_STORAGE | the file's STORAGE axis | local | s3 |
| WB_TENANCY_MODE | the file's TENANCY_MODE | single | multi |
| WB_S3_BUCKET / _ENDPOINT | the file (not secret) | — | from the drawer |
| WB_S3_KEY / _SECRET | your deploy ENV (secret) | — | forwarded, never echoed |
| WB_DATABASE_URL | your deploy ENV (secret) | — | forwarded |
| WB_AUTH_ISSUER / WB_JWKS_URL | the file's ISSUER, JWKS derived | — | issuer + a provider-specific path |
| WB_IMAGE | generated — same image both ways | runtime:latest | runtime:latest |
The verdict of that table is the whole "one property" claim in one row:
WB_IMAGE is identical on both sides. Same image, same
env-assembly function, two delivery mechanisms. The JWKS path is the one
bit of cleverness — BetterAuth's is non-standard
(issuer + /api/auth/jwks), anything OIDC-ish is the
well-known path (issuer + /.well-known/jwks.json), and an
explicit JWKS_URL property overrides both. Beyond that, the
diff is the deploy.
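The real translation is the Elixir function Config.to_env/1. As a rough Python model of what the table says it does, consider the sketch below. It covers only the rows shown; the function names, the trailing-slash handling on the issuer, and the exact defaulting are assumptions, and the `local-fs` to `local` mapping is inferred from the table rather than confirmed.

```python
IMAGE = "ghcr.io/workbooks-sh/runtime:latest"


def jwks_url(props):
    """Explicit JWKS_URL wins; BetterAuth has its own path; otherwise the well-known one."""
    if props.get("JWKS_URL"):
        return props["JWKS_URL"]
    issuer = props["ISSUER"].rstrip("/")  # assumption: tolerate a trailing slash
    if props.get("AUTH") == "betterauth":
        return issuer + "/api/auth/jwks"
    return issuer + "/.well-known/jwks.json"


def to_env(props, deploy_env):
    """Build the engine env from the drawer (props) and the deploy environment."""
    storage = props.get("STORAGE", "local-fs")
    env = {
        "WB_IMAGE": IMAGE,  # the same image for every place
        "WB_TENANCY_MODE": props.get("TENANCY_MODE", "single"),
        "WB_STORAGE": "s3" if storage == "s3" else "local",  # inferred from the table
    }
    if storage == "s3":
        env["WB_S3_BUCKET"] = props.get("STORAGE_BUCKET", "")
        env["WB_S3_ENDPOINT"] = props.get("STORAGE_ENDPOINT", "")
        for key in ("WB_S3_KEY", "WB_S3_SECRET"):  # secrets come from env only
            if key in deploy_env:
                env[key] = deploy_env[key]
    if props.get("DATABASE") == "postgres" and "WB_DATABASE_URL" in deploy_env:
        env["WB_DATABASE_URL"] = deploy_env["WB_DATABASE_URL"]
    if props.get("AUTH", "trusted") != "trusted":
        env["WB_AUTH_ISSUER"] = props["ISSUER"]
        env["WB_JWKS_URL"] = jwks_url(props)
    return env
```

Run it on the local drawer and on the cloud-saas drawer and the two results share one `WB_IMAGE`, which is the claim the table makes.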
names in the file, values NOWHERE near it
Depth rung — skippable, but it's the discipline that lets the file stay inert and public. The doctrine is stated in the header of the file users read first: secrets are not in this file. Axes live in the drawer; credentials live in your deploy environment. The validator enforces it from the other side, and the translator forwards secret values from the env without ever echoing them.
The richer flow, in the Rust kit: you declare secrets by name
in the file with #+DEPLOY_SECRETS: KEY …, set values with
wbx deploy secrets set KEY=VALUE (stored mode-0600 in
secrets.env under the app dir), and list them by name only —
secrets list never prints a value. Locally they're delivered
as --env-file; in the cloud they're staged through the
provider's set_secrets hook. Values never enter
deployment.org or any image. And apply refuses to
deploy with a declared secret unset — it tells you which one and how to set
it.
flowchart LR
v["a secret's value"] --> store["env / secrets.env (0600)"]
store --> recipe["recipe env (in memory)"]
recipe --> stage["fly secrets set --stage"]
stage --> ctr["container env"]
org[["deployment.org
(declares the NAME only)"]] -. "never touched" .-> ctr
style org fill:#aee5c2,stroke:#121316,stroke-width:2.5px
style ctr fill:#13d943,stroke:#121316
The graph's point is the dotted line: the value travels env → recipe →
staged secret → container, and deployment.org sits off to the
side, naming the key and touching the value never.
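A small Python model of that discipline: names declared in the file, values kept in a mode-0600 secrets.env, listing by name only, and a check for declared-but-unset secrets. This is a sketch, not the Rust kit. The one-`KEY=VALUE`-per-line file format, the space-separated name list after `#+DEPLOY_SECRETS:`, and all function names are assumptions, and the permission handling assumes a POSIX host.

```python
import os
from pathlib import Path


def declared_secrets(org_text):
    """Names declared on #+DEPLOY_SECRETS: lines (assumed space-separated)."""
    names = []
    for line in org_text.splitlines():
        if line.startswith("#+DEPLOY_SECRETS:"):
            names.extend(line.split(":", 1)[1].split())
    return names


def read_secrets(app_dir):
    """Parse secrets.env into a dict; a missing file is an empty store."""
    path = Path(app_dir) / "secrets.env"
    if not path.exists():
        return {}
    lines = [line for line in path.read_text().splitlines() if "=" in line]
    return dict(line.split("=", 1) for line in lines)


def set_secret(app_dir, key, value):
    """Store or overwrite one value, keeping the file readable by the owner only."""
    values = read_secrets(app_dir)
    values[key] = value
    path = Path(app_dir) / "secrets.env"
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        for name, secret in values.items():
            handle.write(f"{name}={secret}\n")
    os.chmod(path, 0o600)  # also tighten a file that already existed


def list_secret_names(app_dir):
    """Names only: nothing here ever returns or prints a value."""
    return sorted(read_secrets(app_dir))


def unset_secrets(org_text, app_dir):
    """Declared names with no stored value; a non-empty result should block apply."""
    have = read_secrets(app_dir)
    return [name for name in declared_secrets(org_text) if name not in have]
```

The design choice worth copying is that the listing function has no way to leak: it returns keys, and the values never leave `read_secrets`.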
Two secrets earn special handling because they are your
engine's identity. On the first cloud apply, the kit generates
WB_PUBLIC_BEARER — 32 random bytes, hex — exactly once,
persists it, and never rotates it. Rotating it would 401 every live client
and break the tenant's stable identity; the kit prints
control-plane locked — export WB_ENGINE_TOKEN=<token>
and means it. And WB_SIGNING_KEY — a 32-byte Ed25519 seed —
is the engine's DID: set it once (generate with one line of Elixir) and
your engine's identity survives every redeploy. Skip it and the DID
regenerates on each deploy, and every prior signature and ledger entry
stops verifying. Tenant scoping, finally, is enforced above the
storage backend on purpose — so swapping S3 for R2 can't widen access.
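"Exactly once, persisted, never rotated" is a generate-if-absent pattern. A minimal Python sketch of it follows, assuming a POSIX host. Where the kit actually persists the token is not stated here, so the `public_bearer` file name and the function name are hypothetical; only the "32 random bytes, hex" shape comes from the text.

```python
import os
import secrets
from pathlib import Path


def ensure_public_bearer(state_dir):
    """Return (token, created). Generates on first call, then always reuses."""
    path = Path(state_dir) / "public_bearer"  # hypothetical location
    if path.exists():
        return path.read_text().strip(), False
    token = secrets.token_hex(32)  # 32 random bytes, rendered as 64 hex characters
    # O_EXCL makes creation fail rather than overwrite if another run got there first.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(token + "\n")
    return token, True
```

There is deliberately no rotate function in the sketch: once clients hold the token, replacing it is the failure mode described above.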
converge, prove, TEAR down
The verbs are built for agents as much as humans: every one is
non-interactive, idempotent, takes --json, and exits non-zero
on failure. The arc is init → validate → apply → status|verify|logs
→ down, with two shortcuts: wbx deploy local for the
zero-config bring-up (think docker run) and
wbx deploy doctor to check and self-heal prereqs.
| verb | what it does | what it reads | proves success by |
|---|---|---|---|
| init [local\|cloud] | writes a starter deployment.org | a template (won't clobber without --force) | file exists, validates |
| validate | checks truth, deploys nothing | the file + your env | prints the summary shape |
| apply | converges reality to the file | file + env → engine env | idempotent — re-run is a no-op |
| verify | proves the live runtime | local: discovery + bearer · cloud: provider url | GET /health → 200 |
| status | reports presence + endpoint | discovery / provider status | "runtime up — http://… (pid …)" |
| down | tears down, keeps your data | pidfile / provider down | "data + APFS volume preserved" |
| doctor | checks + self-heals prereqs | tools, recipes, the APFS volume | "prereqs OK now" |
Two rows carry the weight. verify doesn't trust the deploy
log — it proves the live engine: locally it reads the discovery
file and hits http://127.0.0.1:<port>/health with the
discovery bearer token, expecting a 200 with "ok"; in the cloud it asks
the provider recipe for its URL, then hits <url>/health.
And doctor doesn't just diagnose — if the APFS volume is
missing it creates it and reports the system is now ready. The
contract — non-zero exit on failure, JSON on request — is what lets a
workflow run the whole arc unattended and a human run it by hand, from the
same surface.
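Here is what "unattended" can look like from an agent's side: run the verbs in order and stop at the first non-zero exit. This Python sketch assumes the flag is passed as a trailing `--json` and that no file argument is needed; neither detail is confirmed above, and `run_arc` is a hypothetical name. It relies only on the exit-code contract and does not parse the JSON, whose shape is not described here.

```python
import subprocess

# The middle of the arc: validate, then converge, then prove.
ARC = ("validate", "apply", "verify")


def run_arc(run=None, verbs=ARC):
    """Run each verb in order. Returns (failed_verb, exit_code), or (None, 0) on success."""
    if run is None:
        run = lambda argv: subprocess.run(argv).returncode
    for verb in verbs:
        code = run(["wbx", "deploy", verb, "--json"])  # flag placement is an assumption
        if code != 0:
            return verb, code  # non-zero exit is the whole failure signal
    return None, 0
```

Because apply is idempotent, an agent that gets a failure from verify can fix the cause and rerun the whole arc instead of working out where to resume.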
where the seam ENDS
Honesty section. The macOS path carries real, un-pretty engineering.
The TTY boot hack and the EX_CONFIG/Aqua-session requirement
aren't elegance; they're scars. The direct-spawn VM dies on logout, and
the LaunchAgent path that would restart it at login conflicts with libkrun. The
default lifecycle is the one that works, not the one that's tidy.
The cloud side ships exactly one recipe: Fly. The seam is provider-neutral by design and any host fits the five hooks, but "any host" is a promise about the contract, not a directory of pre-written recipes — today you'd write the Hetzner one yourself. The Rust kit's krunvm lane also can't deliver the kit's secrets env yet, so if you stage kit secrets locally it tells you to use docker or podman instead.
There is no fleet story. One deployment.org is one machine.
No regions array, no blue-green, no rolling orchestration — convergence is
per-file, and scale beyond a single machine is out of this lesson's scope
and out of the tooling's current scope. We'd rather say that than imply a
control plane that isn't there.
And there are, in transition, two CLI dialects: the
:PROPERTIES: drawer with ENGINE_PLACE (taught
here, used by the shipped examples and the parent lesson) and an older
#+DEPLOY_* keyword form the Rust binary's templates still
emit. Teach yourself the drawer form — it's the canon — and treat the
other as drift to be normalized, not a second thing to learn.
questions people actually ASK
Do I need Docker?
On a mac, no — local runs in a krunvm microVM, prereqs installed by
wbx deploy doctor. The Rust kit also speaks docker and
podman if you'd rather, picked in automation-friendliness order. Cloud
needs nothing local but the provider's CLI.
Is local really identical to cloud?
It's the same OCI image — ghcr.io/workbooks-sh/runtime:latest
— booted from the same env-assembly function. What differs is the
isolation mechanism (a microVM on your desk, a provider's
machine in the cloud) and the storage/database axes you chose, not the
engine. The image is the constant; the place is one property.
Can I add AWS, Hetzner, or Render?
Yes — drop a directory with a bootstrap.sh that fills the
five hooks and the lifecycle trio. It's about ninety lines with different
commands; the engine asks the host for only run-my-container, a volume,
secrets, and a port. The core never changes.
Where do my secrets actually end up?
Never in the file or the image. Values live in your deploy env or a
mode-0600 secrets.env, are forwarded into the recipe's
memory, staged through the provider, and land in the container's
environment. deployment.org names the key and touches the
value never.
Where does my data live locally, and does down delete it?
It lives in a host directory bind-mounted to /data (under
Application Support by default). down removes the VM and the
lifecycles but preserves both your data and the APFS volume — it
says so when it finishes. Your data outlives the machine.
What if I just edit deployment.org by hand?
That's the intended interface. Edit it, validate,
apply — and apply is idempotent, so re-running
on an unchanged file is a no-op and re-running after an edit converges
the difference. The file is the source; the verbs make reality match it.
keep GOING
This sub-lesson is the "how" under the parent's "why" — and it leans on two formats and a CLI, each with its own lesson.