rotate everything — until it BREAKS you
You deployed an engine to your own cloud, and now two anxieties collide. One: it's on the public internet — what's the lock? Two: the hygiene reflex everyone's been trained on — rotate your credentials, regularly, on a schedule. Workbooks does something that looks heretical against that second instinct. It generates the engine's bearer token once, and then refuses to rotate it.
Here's what rotating a deployed engine's bearer actually does. Every live
client holding the old token — your CLI, your CI job, the script on the cron,
the teammate who wired it into their shell — gets a 401 at the
same instant. There is no overlap window. Rotation isn't hygiene here; it's an
outage you scheduled for yourself. The code says so in a comment, verbatim:
never rotate on subsequent applies — rotation would 401 all live
clients.
Rotating the signing key is worse. That one doesn't lock anyone out — it un-signs your history. Every artifact and ledger the engine ever produced was attributed to a cryptographic identity derived from that key; regenerate it and every prior signature stops verifying against the engine's new name. The chain survives; the attribution dies. That asymmetry is the most concrete sentence on this page, and we'll earn it below.
three tokens, three LIFETIMES
token (n.): a credential an engine mints, whose lifetime is the blast radius of rotating it. Minted constantly when rotation costs nothing, minted once when rotation would cost everything.
The Nexus mints three, and they are not the same kind of thing wearing three names. Each has different entropy, a different lifetime, and a different consequence if you rotate it. The table adds a fourth row for contrast: the JWT your identity provider mints. Read the last column first; it explains every other column:
| credential | minted by | entropy | lifetime | what rotation breaks |
|---|---|---|---|---|
| desktop per-boot token | the local runtime, each boot | 24 bytes, base64url | every boot | nothing — the shell re-reads a file on the same disk |
| WB_PUBLIC_BEARER | the deploy kit, on first apply | 32 bytes, hex (256-bit) | once per deployment | every live client holding it, at once |
| WB_SIGNING_KEY | you, once, ever | 32-byte Ed25519 seed | once per identity | every signature and ledger the engine ever produced |
| JWT (the OIDC rung) | your identity provider | per-user, signed claims | constantly, by design | nothing — short-lived, re-minted on every login |
Notice the symmetry at the two ends of that table. The desktop token and the JWT rotate maximally — and rotation breaks nothing for either, because neither has to be distributed to clients who'd be caught holding a stale copy. The two in the middle are the dangerous ones, and they're the two the system mints exactly once. Lifetime tracks blast radius. That's the whole lesson; everything below is mechanism.
the ONCE machine
The cloud bearer is born on your first wbx deploy apply to any
non-local target, before the provider recipe ever runs 'up'. The
function is ensure_cloud_bearer, and the word ensure is
load-bearing: it doesn't generate, it guarantees one exists. Run it a
hundred times and you get exactly one token.
The logic is a single fork. If secrets.env already holds a
WB_PUBLIC_BEARER key, it returns an empty string — no notice, no
change, silence. Otherwise it fills 32 bytes from OsRng,
hex-encodes them to a 64-character lowercase string, saves it to
secrets.env, and prints a one-time human notice. The "yes" branch
reuses the existing token and says nothing:
flowchart TD
apply["wbx deploy apply
(non-local target)"] --> check{"WB_PUBLIC_BEARER
already in secrets.env?"}
check -- "yes" --> reuse["return empty string
(reuse, say nothing)"]
check -- "no" --> mint["OsRng → 32 bytes → 64 hex chars"]
mint --> save["secrets_save → secrets.env
chmod 0600"]
save --> notice["print the one-time notice
export WB_ENGINE_TOKEN=…"]
reuse --> up["provider recipe runs 'up'"]
notice --> up
style apply fill:#aee5c2,stroke:#121316,stroke-width:2.5px
style mint fill:#13d943,stroke:#121316
style reuse fill:#d9dbd3,stroke:#121316
style notice fill:#ffffff,stroke:#121316
The 32 bytes come from OsRng.fill_bytes — 256 bits of operating
system entropy, the same source you'd seed a private key from. There's no
structure to it: no header, no claims, no expiry baked in. It's a pure shared
secret, and that simplicity is deliberate — we'll come back to why hex beats a
JWT here. The file it lands in lives in the wbx app dir, not your
project dir, and it's chmod'd 0600 the moment it's written.
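The fork above is small enough to sketch in a few lines. This is a minimal Python illustration of the ensure-once pattern, not the kit's actual code: the function name mirrors ensure_cloud_bearer, but the KEY=VALUE parsing, the load_secrets helper, and the file handling are assumptions made for the example.

```python
import os
import secrets
from pathlib import Path

KEY = "WB_PUBLIC_BEARER"


def load_secrets(path: Path) -> dict:
    """Parse plain KEY=VALUE lines; a missing file is an empty store."""
    if not path.exists():
        return {}
    pairs = {}
    for line in path.read_text().splitlines():
        if "=" in line and not line.lstrip().startswith("#"):
            name, value = line.split("=", 1)
            pairs[name.strip()] = value.strip()
    return pairs


def ensure_cloud_bearer(path: Path) -> str:
    """Return the new token on first mint, or "" when one already exists."""
    store = load_secrets(path)
    if KEY in store:
        return ""  # reuse branch: no notice, no change
    # 32 bytes of OS entropy, rendered as 64 lowercase hex characters.
    token = secrets.token_hex(32)
    store[KEY] = token
    body = "".join(f"{name}={value}\n" for name, value in store.items())
    # Create the file as 0600 up front, then tighten a pre-existing file too.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(body)
    os.chmod(path, 0o600)
    return token
```

Calling it twice against the same file returns a 64-character token the first time and an empty string the second, which is the idempotence the section describes.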
The notice text is the only time you'll ever see the token printed, and the writer of that text was careful to make a promise in it. Here's the first apply, and then the same command run again:
$ wbx deploy apply
==> [fly] deploying my-workbooks-engine (ghcr.io/workbooks-sh/runtime:latest)
url: https://my-workbooks-engine.fly.dev
control-plane locked — bearer token generated and persisted in secrets.env.
To talk to your engine locally, set:
export WB_ENGINE_TOKEN=9f2c61a8…e4b7 ← 64 hex chars, 256 bits, OsRng
(stored in ~/…/secrets.env — will NOT be regenerated on future applies)
$ wbx deploy apply # run it again
==> [fly] deploying my-workbooks-engine (…)
url: https://my-workbooks-engine.fly.dev
# no notice. same token. idempotent.
That second run is the whole point of the machine. Will NOT be regenerated is not a hope — it's the silent reuse branch of the fork above, executing. You cannot rotate this token by re-deploying, because the system was built so that the most natural thing you'd do — apply again — leaves it untouched.
one env var locks the PLANE
Persisting the token isn't enough on its own — it gets forwarded into the
engine's environment as WB_PUBLIC_BEARER (Fly stages it; the
deploy releases it). And the instant that variable is set and non-empty, the
engine's auth posture changes. There's no separate "enable auth" switch.
The presence of the env var is the lock. One knob.
Inside the engine, every request runs a five-rung ladder, first match wins.
Picture a request arriving and falling down it. Rung one: is the path on the
public allowlist — /health, the two well-known docs? Then it's
open, always. Rung two: is this the desktop app presenting its per-boot token?
Rung three: is the plane locked, and does the bearer match? Rung four: is there
a valid OIDC JWT? Rung five — and only when the plane is not locked
and not multi-tenant — the dev x-tenant header:
flowchart TD
req["incoming request"] --> r1{"path on public
allowlist?"}
r1 -- "yes" --> open["200 — always open
(/health, well-known docs)"]
r1 -- "no" --> r2{"desktop per-boot
token? (WB_DESKTOP=1)"}
r2 -- "match" --> okdesk["200 — desktop"]
r2 -- "no" --> r3{"locked? AND
bearer matches?"}
r3 -- "no bearer, locked" --> deny401["401 authentication required
— NO dev fallback"]
r3 -- "wrong bearer" --> deny["401 unauthorized"]
r3 -- "match" --> oktenant["200 — tenant = WB_TENANT"]
r3 -- "not locked" --> r4{"valid OIDC JWT?"}
r4 -- "yes" --> okjwt["200 — per-user"]
r4 -- "no" --> r5["dev x-tenant fallback
(only when unlocked + single-tenant)"]
style open fill:#d9dbd3,stroke:#121316
style oktenant fill:#13d943,stroke:#121316,stroke-width:2.5px
style deny401 fill:#f3c5a3,stroke:#121316
style deny fill:#f3c5a3,stroke:#121316
The bearer comparison is secure_compare — constant-time, so a
wrong guess leaks no timing about how many leading characters were right. On a
match, the request's tenant becomes WB_TENANT (default
local); any other bearer is a flat 401 unauthorized.
And the line that matters most for your mental model: the moment the lock
exists, the dev x-tenant fallback dies. A locked engine
with no bearer presented returns 401 authentication required with
no fallback whatsoever. Setting the env var didn't add a layer on top of dev
mode — it ended dev mode.
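The ladder reads naturally as a chain of early returns. The sketch below follows the flowchart above in Python; it is an illustration under assumptions, not the engine's code. The Config and Request shapes are hypothetical, the second well-known path in the allowlist is a guess based on the did.json document mentioned later on this page, and JWT verification is reduced to a pre-verified subject field. hmac.compare_digest stands in for secure_compare.

```python
import hmac
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumed allowlist: /health plus two well-known docs.
PUBLIC_PATHS = {"/health", "/.well-known/workbooks-runtime", "/.well-known/did.json"}


@dataclass
class Config:
    public_bearer: str = ""    # WB_PUBLIC_BEARER; non-empty means locked
    tenant: str = "local"      # WB_TENANT
    desktop: bool = False      # WB_DESKTOP=1
    desktop_token: str = ""    # per-boot token
    multi_tenant: bool = False


@dataclass
class Request:
    path: str
    bearer: Optional[str] = None
    x_tenant: Optional[str] = None
    jwt_subject: Optional[str] = None  # set only if a JWT was already verified


def same(a: str, b: str) -> bool:
    """Constant-time equality, so a wrong guess leaks no prefix timing."""
    return hmac.compare_digest(a.encode(), b.encode())


def authorize(req: Request, cfg: Config) -> Tuple[int, str]:
    if req.path in PUBLIC_PATHS:                       # rung 1
        return 200, "public"
    if cfg.desktop and req.bearer and same(req.bearer, cfg.desktop_token):
        return 200, "desktop"                          # rung 2
    if cfg.public_bearer:                              # rung 3: locked
        if req.bearer is None:
            return 401, "authentication required"      # no dev fallback
        if same(req.bearer, cfg.public_bearer):
            return 200, f"tenant={cfg.tenant}"
        return 401, "unauthorized"
    if req.jwt_subject:                                # rung 4
        return 200, f"user={req.jwt_subject}"
    if not cfg.multi_tenant:                           # rung 5: dev fallback
        return 200, f"tenant={req.x_tenant or cfg.tenant}"
    return 401, "authentication required"
```

Note that the locked branch never falls through: once public_bearer is set, every path out of rung three is a return.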
You can watch the whole ladder with four curls against a locked engine:
$ curl https://my-workbooks-engine.fly.dev/health # public allowlist
ok
$ curl https://my-workbooks-engine.fly.dev/api/library # no bearer
401 {"error":"authentication required"} # locked ⇒ no dev fallback
$ curl -H "Authorization: Bearer wrong" …/api/library
401 {"error":"unauthorized"} # secure_compare failed
$ curl -H "Authorization: Bearer $WB_ENGINE_TOKEN" …/api/library
200 … # tenant = WB_TENANT
The /gk/* paths sit outside this ladder entirely — the
groundskeeper router enforces its own credential and fails closed — but for
every ordinary capability, that ladder is the law.
the client HALF
The engine reads the secret as WB_PUBLIC_BEARER. Clients
present the very same secret under a different name: WB_ENGINE_TOKEN.
Same 64 hex characters, two roles — one is the lock's combination, the other is
you dialing it. Both ride the standard Authorization: Bearer <token>
header.
The CLI resolves where to talk and what to present in one step. If
WB_ENGINE_URL is set, it uses that endpoint and your
WB_ENGINE_TOKEN as the bearer. Otherwise it reads the local
discovery file and uses the token written there. The handshake is identical
either way — the difference is only which engine and which
token:
sequenceDiagram
participant C as wbx (client)
participant E as engine (locked)
Note over C: WB_ENGINE_URL + WB_ENGINE_TOKEN set
C->>E: GET /api/library (no Authorization)
E-->>C: 401 authentication required
Note over C: attach Authorization: Bearer $WB_ENGINE_TOKEN
C->>E: GET /api/library (Bearer …e4b7)
E->>E: secure_compare(token, WB_PUBLIC_BEARER)
E-->>C: 200 (tenant = WB_TENANT)
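For a script of your own, the same resolution step might look like the sketch below. It is a hedged illustration, not the CLI's implementation: resolve_engine is a hypothetical helper, and the "url" and "token" field names in the discovery file are assumptions, since this page does not show that file's layout.

```python
import json
from pathlib import Path
from typing import Dict, Mapping, Tuple


def resolve_engine(env: Mapping[str, str], discovery: Path) -> Tuple[str, Dict[str, str]]:
    """Pick the endpoint and build the Authorization header in one step."""
    url = env.get("WB_ENGINE_URL")
    if url:
        # Remote engine: the bearer comes from the environment.
        token = env.get("WB_ENGINE_TOKEN", "")
    else:
        # Local engine: both values come from the discovery file.
        doc = json.loads(discovery.read_text())
        url, token = doc["url"], doc["token"]  # hypothetical field names
    headers = {"Authorization": f"Bearer {token}"} if token else {}
    return url, headers
```

In real use you would pass os.environ as env. Passing the environment in as a mapping keeps the function easy to test without mutating the process.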
When the token is wrong or missing, the CLI doesn't bury the failure. It
exits with a dedicated auth code whose message tells you the two ways out:
the engine rejected your credentials — set WB_ENGINE_TOKEN, or re-run wbx
deploy local to refresh discovery. If you got a 401 wiring a
script into CI, that's the sentence to read — either the token's stale, or
you're pointed at a cloud engine without exporting it.
There's a small piece of dogfooding worth naming: when an apply carries a
#+DEPLOY_TOOLKITS: push, the kit sets WB_ENGINE_URL
for itself mid-apply and talks to the engine it just locked — eating its own
token, the same way any client would.
reading the rung FIRST
Depth rung — skippable. There's a chicken-and-egg buried in the ladder: a
client needs to know which credential an engine wants before it can
present one. The allowlist solves it. Three paths are always open, regardless
of lock state, and one of them is the capabilities doc at
/.well-known/workbooks-runtime. A client reads it with zero
credentials to learn the rung it must satisfy before it has one.
The doc is plain JSON. The auth.rung field tells you the
posture — trusted for a shared-bearer engine, oidc-jwt
for one fronted by an identity provider — and an issuer is only
advertised when a verifier is actually configured:
$ curl https://my-workbooks-engine.fly.dev/.well-known/workbooks-runtime
{ "auth": { "rung": "trusted" },
"tenancy": "single",
"transports": { "http": …, "ws": … },
"capabilities": ["oql","workflow","instances","workbook",
"agent","library","search","publish","browse","telemetry"] }
The JSON is the diagram. A client that reads it knows, before risking a single guarded request, that this engine speaks the trusted rung — so it had better have a bearer ready — and exactly which capabilities it can ask for once it's in.
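A client can turn that document into a decision before it sends anything guarded. The Python sketch below shows one way to do it; required_credential and supports are hypothetical helpers, and only the two rung values named on this page are handled.

```python
import json


def required_credential(doc: dict) -> str:
    """Map the advertised auth rung to the credential a client must hold."""
    rung = doc.get("auth", {}).get("rung")
    if rung == "trusted":
        return "bearer"   # shared secret: present WB_ENGINE_TOKEN
    if rung == "oidc-jwt":
        return "jwt"      # per-user token from the identity provider
    raise ValueError(f"unrecognized auth rung: {rung!r}")


def supports(doc: dict, capability: str) -> bool:
    """True when the engine lists the capability in its doc."""
    return capability in doc.get("capabilities", [])


# The transports field is left out here because its contents are elided above.
sample = json.loads(
    '{"auth": {"rung": "trusted"}, "tenancy": "single",'
    ' "capabilities": ["oql", "workflow", "library", "search"]}'
)
print(required_credential(sample), supports(sample, "library"))
```

Raising on an unknown rung is a deliberate choice for the sketch: a client that cannot tell which credential is wanted should stop rather than guess.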
the one that DOES rotate
Depth rung — skippable, but it's the contrast that proves the whole thesis.
The desktop app's runtime mints a fresh token every single boot: 24
bytes from strong_rand_bytes, base64url, memoized in memory for
the life of the process. New every time the app starts. By the hygiene reflex,
this is the model citizen — maximal rotation.
But ask why it can rotate so freely. The token is written to a discovery
file — runtime.json — on the same local disk the Tauri shell reads
from. There's no client across a network holding a stale copy. When the token
changes, the shell simply re-reads the file. Rotation has no distribution
problem here, so it has no rotation problem. The desktop token rotates
maximally for exactly the reason the cloud bearer can't: nobody is downstream
of it but a sibling process on the same machine. (It's accepted only when
WB_DESKTOP=1 — rung two of the ladder — so it can never be
confused for the cloud bearer.)
Put the two side by side and the design principle is undeniable. Same idea — a random token a runtime mints — two opposite rotation policies, and the only variable that flipped is whether anyone has to be told the token changed.
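The per-boot shape is easy to reproduce. This Python sketch mirrors the description above (24 random bytes, base64url, memoized for the life of the process); it is an analogue for illustration, not the runtime's code, and boot_token is a hypothetical name.

```python
import secrets
from functools import lru_cache


@lru_cache(maxsize=1)
def boot_token() -> str:
    """Mint once per process: 24 random bytes as 32 base64url characters."""
    return secrets.token_urlsafe(24)


first = boot_token()
assert boot_token() == first   # memoized: stable while the process lives

boot_token.cache_clear()       # stand-in for restarting the app
second = boot_token()          # a fresh token for the new "boot"
```

Nothing here has to notify anyone. The only reader is a process that looks the value up again, which is why this token can change on every start.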
the token that is a NAME
The third credential isn't a password at all. WB_SIGNING_KEY is
a 32-byte Ed25519 seed, and from it the engine deterministically regrows the
same keypair on every redeploy. That keypair is the engine's identity
— its did:key:z6Mk…, the same decentralized-identifier form
Radicle and the W3C DID spec use. Rotate it and you haven't changed a lock,
you've become a different person.
Here's why it matters, mechanically. The keystore lives on the container
filesystem, which is wiped on every redeploy. Without a persisted seed,
each redeploy mints a brand-new keypair — a new DID. And every run your engine
has ever sealed records the DID it was signed under. A run's step log is hashed
into a hash-chained ledger whose head is signed; verification checks two
separate properties — tamper_evident (is the chain intact?) and
attributable (does the signature match the recorded DID?). Walk
the fork:
flowchart TD
redeploy["redeploy
(container fs wiped)"] --> seed{"WB_SIGNING_KEY
persisted?"}
seed -- "yes" --> regrow["deterministically regrow
same keypair from seed"]
regrow --> same["did:key:z6Mk… UNCHANGED"]
same --> verify1["old _ledger.json:
tamper_evident ✓ attributable ✓"]
seed -- "no" --> mint["mint a fresh keypair"]
mint --> diff["did:key changes"]
diff --> verify2["old _ledger.json:
tamper_evident ✓ attributable ✗"]
style same fill:#13d943,stroke:#121316,stroke-width:2.5px
style diff fill:#f3c5a3,stroke:#121316
style verify1 fill:#aee5c2,stroke:#121316
style verify2 fill:#f3c5a3,stroke:#121316
Read the two leaves. With the seed, the DID is unchanged and every prior
ledger still verifies on both counts. Without it, the chain is still
intact — tamper_evident passes, nobody touched the data — but
attributable fails for every prior ledger, because the recorded
DID no longer matches the engine's key. One redeploy, all history orphaned. The
chain survives; the attribution dies. That's the asymmetry from the top of the
page, made literal.
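The two checks can be separated in a toy model. The Python sketch below is a deliberately simplified stand-in, not the engine's ledger format: the identity is a SHA-256 of the seed under a made-up did:example prefix (Python's standard library has no Ed25519), the signature check is omitted, and the entry layout is invented for the example. What it does show faithfully is that chain integrity and attribution are independent questions.

```python
import hashlib
import json

GENESIS = "0" * 64


def did_from_seed(seed: bytes) -> str:
    """STAND-IN identity: deterministic in the seed, like the real keypair."""
    return "did:example:" + hashlib.sha256(b"identity:" + seed).hexdigest()[:16]


def link(prev: str, step: dict) -> str:
    return hashlib.sha256((prev + json.dumps(step, sort_keys=True)).encode()).hexdigest()


def seal(steps: list, seed: bytes) -> dict:
    """Hash-chain the steps and record the identity that sealed them."""
    prev, entries = GENESIS, []
    for step in steps:
        digest = link(prev, step)
        entries.append({"step": step, "prev": prev, "hash": digest})
        prev = digest
    return {"entries": entries, "head": prev, "did": did_from_seed(seed)}


def verify(ledger: dict, engine_seed: bytes) -> dict:
    """Report chain integrity and attribution as two separate results."""
    prev, intact = GENESIS, True
    for entry in ledger["entries"]:
        if entry["prev"] != prev or entry["hash"] != link(prev, entry["step"]):
            intact = False
            break
        prev = entry["hash"]
    intact = intact and ledger["head"] == prev
    return {
        "tamper_evident": intact,
        "attributable": ledger["did"] == did_from_seed(engine_seed),
    }
```

Sealing under one seed and verifying under another leaves tamper_evident true and attributable false, which is the right-hand leaf of the flowchart.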
The same key wears two DID forms over one identity. The
did:web document served at /.well-known/did.json lifts
the publicKeyMultibase straight from the did:key and
lists the did:key under alsoKnownAs — a signature
verifies under either. You pin the seed once, before anything signs, and the
identity holds across every redeploy after:
$ elixir -e 'IO.puts(Base.encode64(:crypto.strong_rand_bytes(32)))'
nq3K…0Yw=
$ wbx deploy secrets set WB_SIGNING_KEY=nq3K…0Yw=
staged 1 secret(s) — applied on next `wbx deploy apply`
$ curl https://my-workbooks-engine.fly.dev/.well-known/did.json
{ "id":"did:web:my-workbooks-engine.fly.dev",
"verificationMethod":[{"type":"Ed25519VerificationKey2020","publicKeyMultibase":"z6Mk…"}],
"alsoKnownAs":["did:key:z6Mk…"], … }
One honest caveat lives here: today only the primary tenant's key is
seed-pinned (WB_PRIMARY_TENANT, default dev). Other
tenants' keys still regenerate on redeploy — an acknowledged gap, with
bring-your-own-storage as the next focus. If you're single-tenant, the seed
covers you completely.
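If Elixir isn't installed where you generate the seed, the same value can be produced in Python, and it's worth checking the shape before you stage it. This is a small sketch: new_seed and is_valid_seed are hypothetical helpers, and the length check reflects only what this page states (a 32-byte seed, base64-encoded as in the one-liner above), not whatever validation the kit itself performs.

```python
import base64
import binascii
import secrets


def new_seed() -> str:
    """32 bytes of OS entropy, base64-encoded like the Elixir one-liner."""
    return base64.b64encode(secrets.token_bytes(32)).decode("ascii")


def is_valid_seed(value: str) -> bool:
    """True only when the value is strict base64 for exactly 32 bytes."""
    try:
        raw = base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return False
    return len(raw) == 32


seed = new_seed()
print(len(seed), is_valid_seed(seed))
```

A truncated paste is the failure this catches: a seed that lost characters on the way into the terminal decodes to the wrong length or fails to decode at all.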
rotating ANYWAY
Sometimes you genuinely must rotate the bearer — it leaked, someone left,
the laptop with secrets.env on it walked off. There is, very
deliberately, no rotate verb. The absence is the design,
not an omission. You do it by hand, and the manual steps make you feel the
blast radius the system is protecting you from:
| | bearer rotation (possible, manual) | signing-key “rotation” (don't) |
|---|---|---|
| step 1 | wbx deploy secrets unset WB_PUBLIC_BEARER | there is no step — you don't rotate a name |
| step 2 | wbx deploy apply → ensure_cloud_bearer mints a new one | a new seed is a new engine, identity-wise |
| step 3 | redistribute WB_ENGINE_TOKEN to every client | every prior ledger goes un-attributable |
| cost | every client 401s until updated — no overlap window | you don't become secure, you become someone else |
The bearer path works because unset removes the key, so the
next apply's ensure_cloud_bearer sees no token and mints a fresh
one — the only way to deliberately get a new bearer. There's no automatic
overlap: old clients are rejected the moment the new secret takes effect, and
the exact sequencing of when Fly's staged secret releases is Fly's, not the
kit's — so don't assume atomicity. Plan the redistribution before you unset.
The signing key has no equivalent procedure, on purpose. You don't rotate a name. If you regenerate the seed, you haven't refreshed a credential; you've disowned every signature the engine ever made. The right move is to keep it, persisted and backed up, for the life of the identity.
what this ISN'T
Honesty section. The bearer is a shared secret, not an identity.
Everyone who presents it is the same tenant — WB_TENANT. There is
no per-user attribution under this rung; if you need to know who did
something, that's the JWT rung (rung four), with a real identity provider
behind it. Multi-tenant deployments require that rung — the shared bearer is a
single-tenant lock.
The token has no scopes and no TTL. It doesn't expire, it doesn't carry permissions, it can't be narrowed to one capability. It's 256 bits of randomness compared in constant time, full stop. That's a feature for what it's for — there's nothing to parse, nothing to expire, nothing to get subtly wrong — and a limit you should know: a leaked bearer is total access until you do the manual rotation above.
And secrets.env is plaintext on disk, protected by
0600 file permissions and nothing more. No encryption at rest
beyond the mode bits. Treat the file as the secret it contains: don't sync it to
a shared drive, don't commit it, and if the machine holding it is compromised,
rotate. These aren't bugs — they're the honest shape of a shared secret, and
knowing the shape is how you defend the design in a security review.
questions people actually ASK
Where's my token? I lost the first-apply notice.
It's in secrets.env, in the wbx app dir — the
notice prints the path. The file is plain KEY=VALUE lines.
wbx deploy secrets list shows key names only, never
values, so to read the actual token you open the file (it's yours, mode
0600). The notice is one-time; the file is forever.
Do I set WB_SIGNING_KEY before first apply?
Yes — before anything signs. The seed pins your DID; if the engine signs
its first ledger under an auto-generated key and you set the seed later, that
early history is attributed to the old, now-orphaned identity. Generate it
with elixir -e 'IO.puts(Base.encode64(:crypto.strong_rand_bytes(32)))',
stage it, and apply.
Is the bearer per-user?
No. Everyone presenting it is tenant WB_TENANT — one shared
secret, one identity. Per-user means the JWT rung, with an identity provider
behind it. If your security model needs to attribute actions to people, the
bearer alone won't do it.
What if secrets.env leaks?
Rotate, manually: wbx deploy secrets unset WB_PUBLIC_BEARER,
then wbx deploy apply to mint a new one, then push the new
WB_ENGINE_TOKEN to every client. There's no overlap window, so
line up the redistribution first. If the signing seed leaked, that's a harder
conversation — you can't rotate a name without orphaning history.
Why a hex string and not a JWT?
Because there's nothing to expire and nothing to parse. A JWT carries claims, a signature, and an expiry to verify on every request; this token carries none of that because it doesn't need to — it's a constant-time comparison against one env var. The simplicity is the security surface: fewer moving parts, fewer ways to get it subtly wrong. When you outgrow a shared secret, you don't add scopes to it — you move to the JWT rung.
Can I rotate the bearer without taking clients down?
Not cleanly. There's no overlap window and no rotate verb by
design — the manual unset + apply + redistribute
path means every client 401s until it's updated. Schedule the redistribution,
or front the engine with the JWT rung if you need rolling credentials.
keep GOING
Tokens are the credential mechanics under the Nexus — the lock on the plane you deployed. Up one level, and out to what these tokens guard.