the question every engine INHERITS
The VFS lesson ended with a clean handoff: put the catalog in the file and the warehouse behind the Nexus. But the moment you stand an engine up, you've inherited the oldest question in infrastructure — where do the bytes actually live, and what happens when you outgrow your first answer?
Every platform makes you choose a storage stack on day one and then punishes you for changing your mind. The punishment has a shape: a migration, a fork in the code, a vendor's SDK threaded through every call site so that "switch from a disk to a bucket" means editing the program, not the config. And the scarier version arrives the day you go multi-tenant — does moving to a shared bucket quietly widen who can read what?
This page is about a runtime that refused to answer the question. The code never chose a backend, so you can choose any of them. The same image that a weekend project runs on a single volume is the image a company runs on object storage plus a managed database — and the difference between those two deployments is a handful of environment variables, not a single line of source.
the DEFINITION
1. a storage provider sitting behind one of the runtime's two seams — the blob seam for large opaque files, or the structured seam for small queryable rows — selected entirely by config. The runtime code is identical across every deployment; only the env changes.
The load-bearing distinction is those two data classes. Blobs are
large, opaque, content-addressed — tenant git repos, .wbundles,
signed artifacts, VFS files, sealed ledgers — and they go to an object store
or a volume through Workbooks.Storage. Structured data is
small, relational, queried — vars, agent memory, telemetry, the
command and package registries, tenant metadata — and it goes to Postgres or
SQLite through Workbooks.DB. Keeping them apart is the point:
you can put blobs on R2 and structured data on Railway, or both on a Fly
volume, and the runtime doesn't notice.
one screen of CONFIG
The entire storage and identity posture of a deployment is one file —
storage.env.example — and its header states the contract in one
sentence: the runtime code is identical across every deployment, only this
config changes. You set these as platform secrets; they are never baked into
the image. Here is the real surface, abridged, with what each knob flips:
# ── identity ──────────────────────────────────────────
WB_SIGNING_KEY=… → the Ed25519 seed behind the DID — must survive redeploys
WB_PRIMARY_TENANT=dev → the default tenant scope
# ── blobs (the Workbooks.Storage seam) ────────────────
WB_STORAGE=local → local | s3 | r2 — one word picks the adapter
WB_DATA=/data → where local blobs + sqlite + models live
WB_S3_ENDPOINT=… → s3.us-east-1.amazonaws.com OR <acct>.r2.cloudflarestorage.com
WB_S3_BUCKET=…
WB_S3_KEY=…
WB_S3_SECRET=…
WB_S3_REGION=auto → us-east-1 for AWS, auto for R2
# ── structured (the Workbooks.DB seam) ────────────────
WB_DATABASE_URL=… → set = Postgres + pgvector; unset = SQLite
# ── embeddings (a SEPARATE knob from where vectors live) ─
WB_EMBED=local → hash | local | openrouter — how text becomes a vector
WB_EMBED_MODEL=minishlab/potion-base-8M
Read that file top to bottom and you've read the whole storage design.
One word in WB_STORAGE swaps the blob backend. The presence or
absence of WB_DATABASE_URL swaps the structured one. And
WB_EMBED is deliberately a different knob — it decides
how text becomes a vector, not where the vectors live. Two
orthogonal axes, no hidden third one.
two seams, four VERBS
Underneath the config are two tiny interfaces, and their smallness is
the whole trick. The blob seam, Workbooks.Storage, is a
behaviour with exactly four callbacks — put, get,
list, delete — and every one of them takes
tenant as its first argument. The structured seam,
Workbooks.DB, is a single handle you open and run statements
against. That's the entire surface area a backend has to satisfy.
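To make the four-verb surface concrete, here is a minimal sketch of what such a behaviour could look like, plus a toy adapter that satisfies it. The module names under `Sketch`, the return shapes, and the in-memory adapter are all assumptions for illustration; the real `Workbooks.Storage` callback specs may differ.

```elixir
defmodule Sketch.Storage do
  @moduledoc "Assumed shape of the blob seam: four callbacks, tenant first."

  @type tenant :: String.t()
  @type key :: String.t()

  @callback put(tenant, key, binary) :: :ok | {:error, term}
  @callback get(tenant, key) :: {:ok, binary} | {:error, :not_found}
  @callback list(tenant, prefix :: String.t()) :: {:ok, [key]}
  @callback delete(tenant, key) :: :ok
end

defmodule Sketch.Storage.Memory do
  @moduledoc "Hypothetical in-memory adapter backed by an Agent, for illustration only."
  @behaviour Sketch.Storage

  def start_link, do: Agent.start_link(fn -> %{} end, name: __MODULE__)

  @impl true
  def put(tenant, key, bytes) do
    # Blobs are keyed by {tenant, key}, so two tenants never share an entry.
    Agent.update(__MODULE__, &Map.put(&1, {tenant, key}, bytes))
  end

  @impl true
  def get(tenant, key) do
    case Agent.get(__MODULE__, &Map.fetch(&1, {tenant, key})) do
      {:ok, bytes} -> {:ok, bytes}
      :error -> {:error, :not_found}
    end
  end

  @impl true
  def list(tenant, prefix) do
    keys =
      Agent.get(__MODULE__, fn blobs ->
        # The pinned tenant filters out every other tenant's entries.
        for {{^tenant, key}, _bytes} <- blobs, String.starts_with?(key, prefix), do: key
      end)

    {:ok, Enum.sort(keys)}
  end

  @impl true
  def delete(tenant, key) do
    Agent.update(__MODULE__, &Map.delete(&1, {tenant, key}))
  end
end
```

Because every callback takes the tenant first, an adapter has no way to be called without a scope, which is the property the rest of this page leans on.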
Adapter selection is not a framework, a registry, or a plugin system. It is a five-line case statement on one env var, and it is small enough to quote in full:
case System.get_env("WB_STORAGE") do
  "s3" -> Workbooks.Storage.S3
  "r2" -> Workbooks.Storage.S3   # note: r2 and s3 are the SAME module
  _ -> Workbooks.Storage.Local
end
Adding a provider is one module plus one config line — never a runtime fork. It's the same pattern as the Browse provider slot elsewhere in the system. The flowchart is honest about how little switching there is: code hits a seam, the seam hits a case, the case names a module, and the bytes land wherever that module puts them.
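As a rough illustration of "one module plus one config line", the sketch below adds a made-up provider to a copy of the selection case. The `"memory"` value, the `Sketch.Adapters.*` stand-in modules, and the `describe/0` function are hypothetical; only the `"s3"`, `"r2"`, and fallback clauses come from the real selection logic.

```elixir
# Stand-in adapter modules so the sketch runs on its own (hypothetical).
defmodule Sketch.Adapters.Local do
  def describe, do: "files under WB_DATA"
end

defmodule Sketch.Adapters.S3 do
  def describe, do: "objects in an S3-compatible bucket"
end

# The new provider: one module...
defmodule Sketch.Adapters.Memory do
  def describe, do: "an in-memory map, for tests"
end

defmodule Sketch.Select do
  # ...plus one extra clause. The value is an argument (defaulting to the
  # env var) so the function can be exercised without touching the real env.
  def adapter(value \\ System.get_env("WB_STORAGE")) do
    case value do
      "s3" -> Sketch.Adapters.S3
      "r2" -> Sketch.Adapters.S3
      "memory" -> Sketch.Adapters.Memory
      _ -> Sketch.Adapters.Local
    end
  end
end
```

Calling `Sketch.Select.adapter("r2").describe()` and `Sketch.Select.adapter("s3").describe()` return the same string, since both values resolve to one module. An unset or unknown value falls through to the local adapter.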
flowchart LR
code["runtime code
store a bundle · query a var"]
subgraph seams["two seams — the only interfaces code knows"]
direction TB
st["Workbooks.Storage
put · get · list · delete"]
db["Workbooks.DB
open · execute · query"]
end
case1{"WB_STORAGE"}
case2{"WB_DATABASE_URL?"}
local["Local — files under WB_DATA"]
s3["S3 / R2 — one module"]
sqlite["SQLite — a file"]
pg["Postgres — a URL"]
code --> st --> case1
code --> db --> case2
case1 -- "local" --> local
case1 -- "s3 / r2" --> s3
case2 -- "unset" --> sqlite
case2 -- "set" --> pg
style seams fill:#fbfaf6,stroke:#121316
style st fill:#a8d4f0,stroke:#121316
style db fill:#aee5c2,stroke:#121316
style code fill:#ffffff,stroke:#121316
style local fill:#f2ddb0,stroke:#121316
style s3 fill:#f2ddb0,stroke:#121316
style sqlite fill:#f3c5a3,stroke:#121316
style pg fill:#f3c5a3,stroke:#121316
isolation lives ABOVE the backend
Here is the claim that makes swapping backends safe rather than terrifying. Tenant is the first argument of every storage call and every DB call, by construction. Isolation lives above the backend — there is no code path in which the backend decides who sees what. Swapping a Fly volume for R2 cannot widen access, because access was never the backend's job. Backends are interchangeable precisely because they were never trusted with security.
Auth decides who: a JWT, verified via JWKS, scoped to an
organizationId that is the tenant. Storage decides nothing — it
just stores under whatever scope it was handed. And one small function,
safe_key/1, strips the empty string, .,
.., and stray slashes from every key path, so a hostile key can
never climb out of its tenant prefix on a filesystem backend.
flowchart TD
jwt["a request — JWT verified via JWKS"]
scope["tenant = organizationId
the first argument of every call"]
seam["the seam — put/get/list/delete(tenant, …)
safe_key strips '..' before any adapter runs"]
local["Local — /data/<tenant>/blobs/<key>"]
s3["S3 / R2 — <tenant>/blobs/<key>"]
bad["backend decides access"]
jwt --> scope --> seam
seam --> local
seam --> s3
seam -. "no such code path" .-x bad
style scope fill:#a8d4f0,stroke:#121316,stroke-width:2.5px
style seam fill:#fbfaf6,stroke:#121316,stroke-width:2px
style local fill:#f2ddb0,stroke:#121316
style s3 fill:#f2ddb0,stroke:#121316
style bad fill:#d9dbd3,stroke:#121316,stroke-dasharray:4 4
style jwt fill:#ffffff,stroke:#121316
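The key-sanitizing step described above can be sketched in a few lines. This is one plausible implementation under the stated rules (drop empty segments, `.`, `..`, and stray slashes), not the project's actual `safe_key/1`; the `Sketch.Key` module and the `local_path/3` helper are hypothetical names.

```elixir
defmodule Sketch.Key do
  # Split on "/", drop empty segments (which removes leading, trailing, and
  # doubled slashes), drop "." and "..", then rejoin what is left.
  def safe_key(key) when is_binary(key) do
    key
    |> String.split("/")
    |> Enum.reject(&(&1 in ["", ".", ".."]))
    |> Enum.join("/")
  end

  # How a filesystem adapter might build the final path. The tenant is assumed
  # to come from the verified JWT, so only the key is sanitized here.
  def local_path(data_dir, tenant, key) do
    Path.join([data_dir, tenant, "blobs", safe_key(key)])
  end
end
```

With this shape, a hostile key such as `../../bob/blobs/secret` has its `..` segments removed before any path is built, so it resolves to a file inside the caller's own `blobs` directory rather than climbing into another tenant's.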
The blob seam itself is enforced-by-construction rather than proven by a
cross-tenant test: its one test is a single-tenant store → fetch → install
round-trip through Storage.put/Library.fetch, and
there is no end-to-end cross-tenant-denial or ..-escape test on
the Local/S3 adapters today. The denial holds because tenant is
the first argument of every call and safe_key/1 runs before any
adapter — but that is the shape of the code, not a green test, and the honesty
section says so.
A separate seam does carry a named cross-tenant test. The brokered
KV store — Workbooks.StorageBroker, a guest-facing key/value table
on SQLite, distinct from this blob seam — has a case the suite names in
capitals: TENANT ISOLATION — a tenant cannot read or overwrite another's
keys. Alice and Bob store the same key name with different values; the
scoped listing keeps them apart; a cross-tenant read is denied. The same file
tests durability across a close-and-reopen and per-tenant quotas. That proves
the broker's isolation, not the blob adapters' — the two share the
tenant-first shape, but only one is covered end-to-end.
one adapter, every BUCKET
depth rung · skippable — how the blob adapters actually work
The Local adapter is about forty-five lines. Blobs live at
<WB_DATA>/<tenant>/blobs/<key>, and the
implementation is plain File.write, File.read,
File.rm, and a Path.wildcard for listing. The
simplest durable deploy on earth is this adapter with WB_DATA
mounted on a Fly volume: persistent across redeploys, zero code change — the
volume is the storage.
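For a feel of how little a filesystem adapter needs, here is a rough sketch built on the same four calls named above. It is an approximation, not the real Local adapter: the `Sketch.Local` name, the `{:ok, keys}` return shape, and the `/data` fallback are assumptions, and keys are assumed to have already passed through `safe_key/1`.

```elixir
defmodule Sketch.Local do
  # Assumption: "/data" as the fallback mirrors WB_DATA=/data in the example file.
  defp root, do: System.get_env("WB_DATA") || "/data"

  defp blob_dir(tenant), do: Path.join([root(), tenant, "blobs"])
  defp path(tenant, key), do: Path.join(blob_dir(tenant), key)

  def put(tenant, key, bytes) do
    file = path(tenant, key)
    # Keys may contain slashes, so create intermediate directories first.
    File.mkdir_p!(Path.dirname(file))
    File.write(file, bytes)
  end

  def get(tenant, key), do: File.read(path(tenant, key))

  def delete(tenant, key), do: File.rm(path(tenant, key))

  def list(tenant) do
    base = blob_dir(tenant)

    keys =
      base
      |> Path.join("**/*")
      |> Path.wildcard()
      |> Enum.filter(&File.regular?/1)
      |> Enum.map(&Path.relative_to(&1, base))
      |> Enum.sort()

    {:ok, keys}
  end
end
```

Pointing `WB_DATA` at a mounted volume is the only thing that turns this from scratch space into durable storage; the code does not change.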
The S3 adapter is the surprising one, because it's only one module and it
serves both AWS S3 and Cloudflare R2. R2 is S3-compatible, so the
difference is purely config — a different endpoint, a different region. The
SigV4 signing is hand-rolled on Erlang's :crypto and
:httpc, which means the whole thing ships with no new
dependency; the canonical-request and signing-key chain is right there in
the source, and the signing function is deliberately left public so a known AWS
test vector could verify it — the hook is there, though no such vector
test is in the suite yet. Tenant isolation is the identical key prefix —
<tenant>/blobs/<key> — enforced above the backend,
exactly as on the Local side.
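The signing-key chain mentioned above is the standard SigV4 derivation, and it fits in a handful of lines on `:crypto`. The sketch below shows only that chain and the final signature step; the `Sketch.SigV4` module and its function names are illustrative, and building the canonical request and string-to-sign is left out.

```elixir
defmodule Sketch.SigV4 do
  # HMAC-SHA256 via OTP's :crypto, so no external dependency is needed.
  defp hmac(key, data), do: :crypto.mac(:hmac, :sha256, key, data)

  # Standard SigV4 chain: secret -> date -> region -> service -> "aws4_request".
  # date is "YYYYMMDD"; region is e.g. "us-east-1" for AWS or "auto" for R2.
  def signing_key(secret, date, region, service \\ "s3") do
    ("AWS4" <> secret)
    |> hmac(date)
    |> hmac(region)
    |> hmac(service)
    |> hmac("aws4_request")
  end

  # Final signature: lowercase hex of HMAC(signing_key, string_to_sign).
  def signature(signing_key, string_to_sign) do
    signing_key |> hmac(string_to_sign) |> Base.encode16(case: :lower)
  end
end
```

Since the region is an input to the chain, the same secret yields different signing keys for `us-east-1` and `auto`, which is why the region line in the config matters even though the adapter module is shared.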
| adapter | endpoint | region | where <tenant>/blobs/<key> lands |
|---|---|---|---|
| Local | — (filesystem) | — | <WB_DATA>/<tenant>/blobs/<key> on disk |
| S3 | s3.us-east-1.amazonaws.com | us-east-1 | an object key in the bucket |
| R2 | <acct>.r2.cloudflarestorage.com | auto | an object key in the bucket — same module as S3 |
The verdict of that table is the one-module fact: Local writes a file, S3
and R2 write an object, and the only thing distinguishing the two cloud rows
is two lines of config. The same <tenant>/blobs/<key>
shape names the file on disk and the object in the bucket, so the scope you
reason about never changes when the destination does.
Postgres is just a URL
The structured seam flips on one question:
Workbooks.DB.backend/0 returns :postgres when
WB_DATABASE_URL is set and non-empty, and :sqlite
otherwise — where SQLite files live at
<WB_DATA>/_db/<name>.sqlite. The reason any Postgres
provider works is that it's just a connection URL: CrunchyData, Fly PG,
Railway, Neon, Supabase are indistinguishable to the runtime. One code path,
no per-provider branches.
The entire dialect bridge between the two backends is one regex. Stores
write portable SQL with ?1 and ?2 placeholders; for
Postgres, the pg/1 helper rewrites them to $1 and
$2 — same numbers, different sigil. Rows come back as lists
either way, so the store query code is byte-identical regardless of backend.
SSL defaults on for any non-localhost host, with ?sslmode=disable
in the URL as the opt-out.
sequenceDiagram
participant S as a store
participant D as Workbooks.DB
participant Q as the backend
S->>D: query with ?1 / ?2 placeholders
alt WB_DATABASE_URL set
D->>Q: pg/1 rewrites ?1 → $1, opens Postgres (SSL on)
else unset
D->>Q: runs as-is against SQLite under WB_DATA/_db
end
Q-->>S: rows — as lists, either way
Read that exchange as one promise: the store never learns which backend
answered. It hands down ?1 placeholders and gets back a list of
rows; in between, the seam either translated the sigils and dialed a remote
Postgres over SSL, or ran the statement untouched against a local SQLite
file. The caller's code is the same in both branches — which is exactly why
moving from a file to Neon is a config change, not a rewrite.
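A minimal sketch of that bridge, assuming the behaviour described above: `backend/0` keys off whether the URL is set and non-empty, and `pg/1` swaps the placeholder sigil. The `Sketch.DB` module, the `prepare/2` helper, and the `vars` table in the usage note are hypothetical.

```elixir
defmodule Sketch.DB do
  # :postgres when the URL is set and non-empty, :sqlite otherwise.
  # The URL is an argument (defaulting to the env var) to keep this testable.
  def backend(url \\ System.get_env("WB_DATABASE_URL")) do
    if url in [nil, ""], do: :sqlite, else: :postgres
  end

  # Rewrite ?1, ?2, ... to $1, $2, ... with a single regex.
  # The function form avoids any replacement-string escaping questions.
  def pg(sql) do
    Regex.replace(~r/\?(\d+)/, sql, fn _match, n -> "$" <> n end)
  end

  # Stores always write ?N; only the Postgres branch touches the SQL text.
  def prepare(sql, url \\ System.get_env("WB_DATABASE_URL")) do
    case backend(url) do
      :postgres -> pg(sql)
      :sqlite -> sql
    end
  end
end
```

For example, `Sketch.DB.pg("SELECT v FROM vars WHERE tenant = ?1 AND k = ?2")` yields the same statement with `$1` and `$2`. One caveat of a regex this simple: it would also rewrite a `?1` that appears inside a quoted SQL literal, so the sketch assumes stores never embed that sequence in literals.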
the flip you get for FREE
Setting WB_DATABASE_URL doesn't only move tables — it flips
semantic search from brute-force to ANN, and you get that upgrade for free.
On SQLite, Workbooks.Vector stores vectors as JSON text and
computes cosine similarity in Elixir, scanning every row — O(n) per query.
This is the live-tested default, and it is genuinely fine for one library's
worth of vectors. On Postgres, the engine runs
CREATE EXTENSION IF NOT EXISTS vector on first open, stores a
real vector column, and ranks in the database with
pgvector's <=> cosine-distance operator and an
ORDER BY … LIMIT k. The interface is the same either way, so
callers never branch.
What's elegant is that the linear cost is made visible rather than hidden.
Once a SQLite brute-force search crosses @scan_warn — twenty-five
thousand vectors — the engine logs a one-time warning telling the operator
exactly what to do:
[warning] Vector: brute-force search over 31204 vectors on SQLite (O(n) per query). For sub-linear ANN at scale, set WB_DATABASE_URL → pgvector. See docs/VECTOR-QUERY.org.
A worker pool would only buy a constant factor (the number of cores), so
the real fix at scale is sub-linear ANN, and the warning says so. Then the
flip itself is anticlimactic: the same Vector.search(tenant, query, k: 5)
call, but on Postgres it becomes
… ORDER BY vec <=> $2::vector LIMIT 5, with the score read
as 1 - (vec <=> query). The operator's entire migration was
one URL.
flowchart TD
q["Vector.search(tenant, query_vec, k: 5)"]
br{"pg?()"}
sq["SQLite — load every row
cosine in Elixir, O(n)
warns past 25,000 vectors"]
pg["Postgres — ORDER BY vec <=> query::vector LIMIT k
ranked in-DB, sub-linear"]
q --> br
br -- "unset" --> sq
br -- "set" --> pg
style q fill:#a8d4f0,stroke:#121316
style sq fill:#f3c5a3,stroke:#121316
style pg fill:#13d943,stroke:#121316,stroke-width:2.5px
style br fill:#fbfaf6,stroke:#121316
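The SQLite branch is simple enough to sketch directly: score every row against the query, sort, take the top k. The code below is an illustration of that O(n) scan, not the real `Workbooks.Vector`; it assumes rows arrive as already-decoded `{id, vector}` tuples for one tenant, and that "crosses" means strictly greater than the threshold.

```elixir
defmodule Sketch.Vector do
  @scan_warn 25_000

  # Cosine similarity of two equal-length float lists.
  def cosine(a, b) do
    dot = a |> Enum.zip(b) |> Enum.reduce(0.0, fn {x, y}, acc -> acc + x * y end)
    norm = fn v -> :math.sqrt(Enum.reduce(v, 0.0, fn x, acc -> acc + x * x end)) end
    denom = norm.(a) * norm.(b)
    if denom == 0.0, do: 0.0, else: dot / denom
  end

  # Brute force: one cosine per stored vector, so cost grows linearly with rows.
  def search(rows, query, k) do
    rows
    |> Enum.map(fn {id, vec} -> {id, cosine(vec, query)} end)
    |> Enum.sort_by(fn {_id, score} -> score end, :desc)
    |> Enum.take(k)
  end

  # Assumed reading of the threshold: warn once the count exceeds it.
  def warn?(count), do: count > @scan_warn
end
```

The score here is cosine similarity, which lines up with the Postgres side: pgvector's `<=>` returns cosine distance, and `1 - distance` converts it back to the same similarity scale.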
One last separation worth nailing down: where the vectors live is
this knob; how text becomes a vector is the separate
WB_EMBED knob — hash for a zero-dependency lexical
embedding, local for pure-Elixir static embeddings, or
openrouter for a hosted model. The two axes are orthogonal, and
the embedding side is the vectors deep-dive's subject,
not this page's.
three real POSTURES
Abstraction earns its keep when it collapses to a few concrete choices. There are three realistic postures, and each one is just a few literal env lines — the same runtime image in all three:
# posture 1 — single box (the defaults; mount a Fly volume at /data and you're durable)
WB_STORAGE=local
WB_DATA=/data
# posture 2 — blobs on Cloudflare R2 (same adapter as S3; two lines of difference)
WB_STORAGE=r2
WB_S3_ENDPOINT=https://<account>.r2.cloudflarestorage.com
WB_S3_BUCKET=acme-workbooks
WB_S3_KEY=…
WB_S3_SECRET=…
WB_S3_REGION=auto
# posture 3 — add ANY Postgres; vector search flips to pgvector ANN as a side effect
WB_DATABASE_URL=postgres://user:<password>@<host>/wb?sslmode=require
| posture | what's set | what you get | when to move on |
|---|---|---|---|
| single box | local + a volume | durable across redeploys, zero deps | when blobs need off-box durability |
| durable blobs | r2 + S3 creds | blobs on object storage, still SQLite for rows | when the vector warning fires at 25k |
| scale | add WB_DATABASE_URL | Postgres rows + pgvector ANN — for free | you're at the top of the ladder |
The honest read of that table: most single-box deploys never leave posture one, and shouldn't. SQLite on a volume is the right answer for the great majority of deployments. You climb the ladder when a specific need shows up — off-box blob durability, then sub-linear vector search — and each rung is additive config, not a migration of the runtime.
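If it helps to see the ladder as data, the sketch below reads an env map and reports which backends it selects, plus any S3 variables that are still unset. This is a hypothetical helper, not something the runtime is known to ship; the `Sketch.Posture` module and its return shape are invented for illustration, while the variable names come from the example file.

```elixir
defmodule Sketch.Posture do
  @s3_vars ~w(WB_S3_ENDPOINT WB_S3_BUCKET WB_S3_KEY WB_S3_SECRET)

  # env is a plain map, so the check can run without touching the System env.
  def describe(env) do
    blobs = if env["WB_STORAGE"] in ["s3", "r2"], do: :object_store, else: :local
    rows = if env["WB_DATABASE_URL"] in [nil, ""], do: :sqlite, else: :postgres

    # Only the object-store posture needs the S3 variables.
    missing =
      if blobs == :object_store do
        Enum.filter(@s3_vars, &(env[&1] in [nil, ""]))
      else
        []
      end

    %{blobs: blobs, rows: rows, missing: missing}
  end
end
```

Running `Sketch.Posture.describe(System.get_env())` at boot would print the posture for the current deployment; an empty map describes posture one.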
the knob that isn't STORAGE
depth rung · skippable — the one env var that's about identity, not bytes
One knob in that file isn't storage at all. WB_SIGNING_KEY is
a thirty-two-byte Ed25519 seed, base64-encoded, and it's the seed behind the
deployment's DID. Without it, the DID regenerates on every deploy — and the
moment the identity changes, every signature made under the old one stops
verifying, and sealed ledgers no longer check out. So it has to survive
redeploys.
Generating it is a one-liner —
elixir -e 'IO.puts(Base.encode64(:crypto.strong_rand_bytes(32)))'
— and like every value in this file, it lives in the platform's secret store,
set with something like fly secrets set, never baked into the
image. That's the same rule the whole file follows: the image is byte-identical
everywhere, and the things that differ between deployments are secrets, not
source.
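To see why the seed has to persist, it helps to watch the public key follow from it. The sketch below decodes a base64 seed of the kind the one-liner prints and derives an Ed25519 public key with OTP's `:crypto`. The `Sketch.Identity` module is hypothetical, and how the runtime turns the public key into a DID is not shown here because the source does not specify it.

```elixir
defmodule Sketch.Identity do
  # Decode and validate a base64-encoded 32-byte seed.
  def load_seed(b64) do
    case Base.decode64(b64) do
      {:ok, <<seed::binary-size(32)>>} -> {:ok, seed}
      {:ok, _other} -> {:error, :wrong_length}
      :error -> {:error, :bad_base64}
    end
  end

  # The Ed25519 public key is derived deterministically from the seed,
  # so the same seed always yields the same key.
  def public_key(seed) do
    {pub, _priv} = :crypto.generate_key(:eddsa, :ed25519, seed)
    pub
  end
end
```

Same seed in, same public key out; a fresh random seed on each deploy means a fresh key, which is the failure mode the secret store exists to prevent.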
where the seam ENDS today
Honesty section. The structured seam is migrating incrementally, and the
code says so. Today only the vectors store opens through
Workbooks.DB — it's the sole call site. The other structured
stores — vars, library, lifecycle, telemetry, the registries — still ride
SQLite directly until each is migrated onto the seam. The deploy doc's own
status block is candid: the seam, both blob adapters, the BYO-Postgres path,
and signing-key persistence are built and tested; what remains is migrating
each structured store onto Workbooks.DB incrementally, plus a
live S3/R2 and live Postgres round-trip once operator credentials exist.
So calibrate accordingly. The blob side — Local and S3/R2 with hand-rolled SigV4 — has a single-tenant store-fetch-install round-trip test; its cross-tenant denial and the SigV4 signing are enforced by construction and left verifiable (the signing fn is public for an AWS test vector), but neither has a dedicated test in the suite yet. The named cross-tenant test belongs to the separate brokered KV store, not these blob adapters. The SQLite brute-force vector path is the live-tested default. The pgvector path is shape-tested with the live round-trip documented and pending real creds. And two hardening items are explicitly noted but not yet built: per-tenant at-rest blob encryption with a tenant data key, and a microVM boundary for multi-tenant compute isolation.
The anti-hype, said plainly: this page is not an argument to run Postgres. It's an argument that picking the default doesn't lock you in. SQLite plus a volume is genuinely the right answer for most single-box deploys, and the runtime will tell you — at twenty-five thousand vectors — the one moment it isn't.
questions people actually ASK
Do I need Postgres?
No — and the runtime tells you when you might. SQLite on a mounted
volume is the default and is fine for most single-box deploys. The one
signal to flip WB_DATABASE_URL is the logged warning when a
brute-force vector search crosses twenty-five thousand vectors; setting the
URL there hands semantic search to pgvector's in-database ANN. Until that
moment, you don't need it.
Is R2 different from S3?
Not to the runtime. R2 is S3-compatible, so a single adapter module
serves both — in the selection case, r2 resolves to the exact
same module as s3. The difference is two config lines: the
endpoint (<acct>.r2.cloudflarestorage.com versus
s3.us-east-1.amazonaws.com) and the region
(auto versus us-east-1).
Can a tenant read another tenant's bucket prefix?
No — by construction. The seam scopes every call above the backend —
tenant is the first argument of put,
get, list, and delete, and
safe_key strips any .. before an adapter runs. The
blob adapters have no dedicated cross-tenant test yet; the named test where
Alice cannot read Bob's key lives on a separate seam, the brokered
KV store (StorageBroker). Swapping the backend can't widen the
blob scope either way, because the backend was never the thing deciding
access.
Does swapping backends migrate my data?
No — the seam swaps the destination, not the contents. Change
WB_STORAGE or set WB_DATABASE_URL and new writes
go to the new backend; moving existing bytes over is a separate operation
you run. The example file says to set these as platform secrets; it doesn't
promise automated data migration, so don't assume one.
Where does the embedding model live?
When WB_EMBED=local, the static embedding matrix —
Model2Vec, pure Elixir, no native code — is about thirty megabytes,
downloaded once to <WB_DATA>/_models/<id>/ and
reused thereafter. That's the embedder knob, separate from where the
vectors it produces are stored — see the vectors
lesson for the embedding side in full.
What about keeping the live disk in sync?
That's Litestream, and it uses the same s3:// trick: it
replicates a WAL-mode SQLite VFS continuously off-box, with a replica URL
that's a local file:// path in development and an
s3:// bucket — on R2 — in production. Same command, the only
difference is the URL. The sync lesson is its home.
keep GOING
This page gave the warehouse a floor. Its neighbors tell you what sits on it.