a thousand disks, one MACHINE
The parent lesson closed with a promise wearing a metaphor: dock a workbook and "the disk now has a heartbeat." Heartbeat is a lovely word and a complete non-answer. The ops-literate reader has real questions, and they all arrive at once.
If a thousand tenants dock workbooks into one engine, do a thousand processes sit resident in RAM forever? What happens to a workbook nobody has touched in a week — does it bill memory while it sleeps? What survives the machine actually dying mid-write? And how do a thousand users each get their own copy of one template app without a thousand provisioning jobs grinding in the background?
Every one of those is a famous-hard problem in operations — snapshot, hibernate, migrate, replicate, clone — and every one is hard for the same reason: the unit of state has always been a machine. A VM. A container. A box with a kernel and a process table and gigabytes of dirty RAM. This lesson is what those problems become when the unit of state is instead one SQLite file.
the DEFINITION
1. the lifecycle by which a docked VFS moves between RAM, local disk, cold storage, and off-box replicas — where every move is a file operation, because the disk is a file. The state machine isn't VM orchestration; it's a schedule for when to do which copy.
That definition is the whole page in one sentence. Freeze is
File.cp. Clone is File.cp. Off-box durability is
tailing the file's write-ahead log. Resume is cp back, plus
wiping the scratch volume. The apparatus that keeps software alive doesn't
collapse because we were clever — it collapses because the thing being kept
alive is small enough to copy.
five states, one FILE
A docked workbook is always in exactly one of five live states, with deleted as
the terminal exit, and the legal moves between them are a small, explicit map:
there are no implicit transitions. This lives in Workbooks.Lifecycle, and the
transition table is the spine of everything that follows:
stateDiagram-v2
[*] --> created
created --> active
active --> suspended
active --> archived
suspended --> active
suspended --> frozen
frozen --> active
frozen --> archived
archived --> active
archived --> deleted
deleted --> [*]
Read it left to right as a life. A workbook is created, then goes active — a live process in the engine. From active it can be suspended (stopped, but its file warm on local disk) or archived. A suspended workbook can wake back to active, or sink further to frozen — its file pushed to cold storage. Frozen wakes to active or archives. Archived wakes to active or, finally, is deleted — the one terminal state, the only door that doesn't come back.
The function that enforces this is four lines and returns one of two things:
Workbooks.Lifecycle.transition(:active, :suspended)
# → {:ok, :suspended}     a legal edge in the map
Workbooks.Lifecycle.transition(:frozen, :suspended)
# → {:error, :invalid}    no such edge; the machine refuses
You cannot slide from frozen straight to suspended, because that edge doesn't exist in the map. Every state change is a deliberate, named move — which is exactly what makes the lifecycle auditable instead of emergent.
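To make the "explicit map" idea concrete, here is a minimal sketch of a table-driven transition check. The module name LifecycleSketch and the @edges attribute are hypothetical; only the edge list (copied from the diagram) and the two return shapes come from the text, and the real Workbooks.Lifecycle may well be written differently.

```elixir
defmodule LifecycleSketch do
  # Hypothetical stand-in for the transition table: each state maps to the
  # states it may legally move to, copied from the state diagram.
  @edges %{
    created: [:active],
    active: [:suspended, :archived],
    suspended: [:active, :frozen],
    frozen: [:active, :archived],
    archived: [:active, :deleted],
    deleted: []
  }

  # Returns {:ok, to} for a listed edge and {:error, :invalid} for anything else,
  # including unknown states.
  def transition(from, to) do
    if to in Map.get(@edges, from, []) do
      {:ok, to}
    else
      {:error, :invalid}
    end
  end
end

LifecycleSketch.transition(:active, :suspended)
# → {:ok, :suspended}
LifecycleSketch.transition(:frozen, :suspended)
# → {:error, :invalid}
```

Keeping the edges as data rather than as scattered conditionals is what lets the whole lifecycle be read, and audited, in one place.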
the idle LADDER
Transitions are explicit, but two of them have a policy attached: a workbook nobody is using should not hold expensive resources. That policy is a ladder, and it is two numbers.
After 15 minutes idle, an active workbook should drop
to suspended — stop the process, keep the file warm. After
24 hours idle, a suspended workbook should drop to
frozen — push the file to cold storage. The function that
encodes this is pure — it reads a state and an idle-time and returns the
next state, or nil to stay put:
Workbooks.Lifecycle.auto_next(:active, 16 * 60)              # → :suspended (past 15 min)
Workbooks.Lifecycle.auto_next(:active, 60)                   # → nil (fresh — stay)
Workbooks.Lifecycle.auto_next(:suspended, 2 * 24 * 3600)     # → :frozen (past 24 h)
Workbooks.Lifecycle.auto_next(:frozen, 999_999)              # → nil (frozen needs an explicit wake)
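A pure function with this behaviour can be sketched in a few clauses. This is an illustration under assumptions, not the shipped code: the module name IdleLadderSketch is hypothetical, idle time is taken to be in seconds (as the calls above suggest), and whether the real comparison is strict or inclusive at the threshold is a guess.

```elixir
defmodule IdleLadderSketch do
  # Thresholds from the text, expressed in seconds.
  @suspend_after 15 * 60
  @freeze_after 24 * 60 * 60

  # Assumption: "past" the threshold means strictly greater than it.
  def auto_next(:active, idle) when idle > @suspend_after, do: :suspended
  def auto_next(:suspended, idle) when idle > @freeze_after, do: :frozen
  # Every other combination stays put, including frozen at any idle time.
  def auto_next(_state, _idle), do: nil
end

IdleLadderSketch.auto_next(:active, 16 * 60)
# → :suspended
IdleLadderSketch.auto_next(:frozen, 999_999)
# → nil
```

Because the function takes the idle time as an argument instead of reading a clock, it can be tested without waiting for anything to go idle.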
The ladder is really a cost dial. Each rung trades wake-latency for cheapness — the colder the rung, the less it costs to leave a workbook there, and the longer it takes to bring it back:
| state | where the bytes live | what it costs | wake latency |
|---|---|---|---|
| active | RAM — a live process | memory, the whole time | none — it's already up |
| suspended | warm local disk | disk only — no process | fast — start the process, file is right there |
| frozen | cold storage — a copied-away file | cheapest — bytes at rest | slowest — copy the file back first |
One honest note, stated plainly: these thresholds are the policy,
and the policy is shipped. The pure function above is real and tested. What
is not yet wired in the codebase is the cron-like ticker that walks
the live sessions and applies auto_next on a beat — today the
only caller is a demo. So read this section as the rules of the ladder, not
as a claim that a background loop is already climbing it for you. The
thresholds are the design; the driver is the next turn of the crank.
freeze is a COPY
Here is the move that justifies the whole framing. To freeze a workbook you stop its process and copy its file into cold storage. That's the function, in full:
def freeze(session_id, vfs_path, cold_dir) do
  File.cp(vfs_path, Path.join(cold_dir, "#{session_id}.sqlite"))
end
No VM image. No hypervisor-specific snapshot of dirty pages. The SQLite file is the durable state — freezing is "stop the process, keep the file." Resume is the mirror image: copy the file back into the live directory, reopen it, and clear the scratch volume:
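def resume(session_id, cold_dir, live_dir) do
  # assumed: the live copy keeps the same <id>.sqlite name it has in cold_dir
  live_path = Path.join(live_dir, "#{session_id}.sqlite")
  File.cp(Path.join(cold_dir, "#{session_id}.sqlite"), live_path)
  {:ok, conn} = Workbooks.VFS.open(live_path)
  Workbooks.VFS.clear(conn, "tmp")  # scratch does not survive a resume
end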
Walk the round trip as three actors — the running instance, the cold directory, and the live directory it wakes back into:
sequenceDiagram
participant I as the instance
participant C as cold_dir
participant L as live_dir
Note over I: idle past threshold — freeze
I->>I: stop the process
I->>C: cp vfs.sqlite → cold_dir/<id>.sqlite
Note over C: bytes at rest — a portable file
Note over L: a request arrives — resume
C->>L: cp the file back into live_dir
L->>L: open the VFS
L->>L: clear the tmp volume — scratch is gone
Note over L: process restarts on warm bytes
Contrast the alternative the industry reaches for. A VM snapshot is a multi-gigabyte image of RAM and device state, tied to one hypervisor, slow to capture and slow to move. Ours is a SQLite file you could email. The freeze costs what copying a project-sized database costs, and the frozen artifact is portable to anywhere a file can go.
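The round trip can be tried with nothing but the standard library. The sketch below is a stand-in, not the engine's code: FreezeSketch is a hypothetical module, an ordinary file plays the part of the SQLite VFS, the File.mkdir_p! calls are an added convenience, and the VFS open and tmp-clearing steps are left out because they need the real Workbooks.VFS.

```elixir
defmodule FreezeSketch do
  # Freeze: copy the live file into cold storage under the session id.
  def freeze(session_id, vfs_path, cold_dir) do
    File.mkdir_p!(cold_dir)
    File.cp(vfs_path, Path.join(cold_dir, "#{session_id}.sqlite"))
  end

  # Resume: copy the cold file back into the live directory.
  # Returns {:ok, live_path}, or the {:error, reason} from File.cp.
  def resume(session_id, cold_dir, live_dir) do
    File.mkdir_p!(live_dir)
    live_path = Path.join(live_dir, "#{session_id}.sqlite")

    with :ok <- File.cp(Path.join(cold_dir, "#{session_id}.sqlite"), live_path) do
      {:ok, live_path}
    end
  end
end

root = Path.join(System.tmp_dir!(), "freeze-sketch-#{System.unique_integer([:positive])}")
live = Path.join(root, "live")
cold = Path.join(root, "cold")
File.mkdir_p!(live)

src = Path.join(live, "sess-1.sqlite")
File.write!(src, "pretend these are SQLite pages")

:ok = FreezeSketch.freeze("sess-1", src, cold)
File.rm!(src)  # the live copy is gone; only the cold bytes remain
{:ok, path} = FreezeSketch.resume("sess-1", cold, live)
File.read!(path)
# → "pretend these are SQLite pages"
```

Deleting the live copy between the two calls is the point of the exercise: the workbook's state comes back from the cold file alone.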
warm, cold, and PREFETCH
Waking a workbook has two cases, and naming them is half the work.
A session is warm when its instance is already live in the engine —
a registry lookup finds it. A session is cold when only its VFS
persists, on local disk or frozen away. Sessions.resume branches
on exactly that question:
flowchart TD
start["Sessions.resume(id, bytes)"]
warm{"already warm?
(registry lookup)"}
local{"local file —
a warm cache?"}
frozen{"state == frozen
and a cold_dir?"}
hitwarm["{:warm, :already_active}
zero work"]
fromcold["Lifecycle.resume
cp from cold storage"]
fromcache["open the local file"]
fresh["fresh :memory: VFS"]
starti["start the instance ·
set state active"]
done["{:cold, vfs_path}"]
start --> warm
warm -- yes --> hitwarm
warm -- no --> local
local -- yes --> fromcache --> starti
local -- no --> frozen
frozen -- yes --> fromcold --> starti
frozen -- no --> fresh --> starti
starti --> done
style hitwarm fill:#13d943,stroke:#121316,stroke-width:2.5px
style done fill:#a8d4f0,stroke:#121316
style start fill:#ffffff,stroke:#121316
style warm fill:#fbfaf6,stroke:#121316
style local fill:#fbfaf6,stroke:#121316
style frozen fill:#fbfaf6,stroke:#121316
Follow the tree. If the session is already warm, the call returns
{:warm, :already_active} and does no work at all. If it's cold,
the engine locates the VFS — a local file is a warm cache and opens
directly; a frozen session with a cold directory gets restored by
Lifecycle.resume; failing both, a fresh in-memory store is
born. Then the instance starts, the control plane marks it
active, and the call returns {:cold, vfs_path}.
There is one more move worth its own name: prefetch on auth.
Sessions.prefetch restores or locates the VFS without
starting the instance — so by the time a user's first request lands after
login, the file is already warm and the first interaction doesn't pay the
restore cost. Logging in is the signal; the disk gets ready before the
request arrives.
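The branching in that tree can be written down as a pure decision function, which makes the order of the checks easy to see. This is a sketch under assumptions: ResumePlanSketch and the map keys warm?, local_file?, state, and cold_dir are hypothetical names, and the second element of each cold result names the chosen branch rather than the vfs_path the real Sessions.resume returns.

```elixir
defmodule ResumePlanSketch do
  # Clauses are tried top to bottom, mirroring the flowchart's question order.

  # 1. Already live in the engine: nothing to do.
  def plan(%{warm?: true}), do: {:warm, :already_active}

  # 2. A local file is a warm cache and is opened directly.
  def plan(%{local_file?: true}), do: {:cold, :open_local_file}

  # 3. Frozen with a cold directory: restore from cold storage.
  def plan(%{state: :frozen, cold_dir: dir}) when is_binary(dir),
    do: {:cold, :restore_from_cold}

  # 4. Failing all of the above, start from a fresh in-memory store.
  def plan(_session), do: {:cold, :fresh_memory_vfs}
end

ResumePlanSketch.plan(%{warm?: false, local_file?: false, state: :frozen, cold_dir: "/cold"})
# → {:cold, :restore_from_cold}
```

A prefetch step in this shape would run the same decision and perform the restore, but stop before starting the instance.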
what SURVIVES
depth rung · skippable — the survival contract, volume by volume
The resume code cleared tmp for a reason: not every volume
is supposed to outlive a freeze. The VFS has three volumes, and what each
one does across a freeze/resume cycle is a designed contract, not an
accident. The sibling lesson on volumes treats them in full;
here is only their behaviour through sync:
| volume | survives freeze → resume? | leaves on egress? |
|---|---|---|
| workspace | yes — the workbook's files persist | yes — it's the public part |
| memory | yes — what the agent learned persists | no — stripped at the boundary |
| tmp | no — cleared on resume | no — never leaves |
The most quotable proof of that contract is a single map the
freeze-resume demo returns: after a full cycle,
workspace: {:ok, "the workbook files"},
memory: {:ok, "what the agent learned"}, and
tmp_after_resume: :error. The scratch read fails because the
scratch is gone — by design.
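The table's two columns can be modelled as two small functions over a map of volumes. This is a toy model only: SurvivalSketch is a hypothetical module, and a plain map stands in for the VFS, so it shows the shape of the contract rather than how the engine enforces it.

```elixir
defmodule SurvivalSketch do
  # Freeze then resume: everything persists except the tmp volume.
  def after_resume(volumes), do: Map.delete(volumes, :tmp)

  # Egress: only the workspace volume crosses the boundary.
  def egress(volumes), do: Map.take(volumes, [:workspace])
end

volumes = %{
  workspace: "the workbook files",
  memory: "what the agent learned",
  tmp: "scratch"
}

resumed = SurvivalSketch.after_resume(volumes)
Map.fetch(resumed, :workspace)
# → {:ok, "the workbook files"}
Map.fetch(resumed, :tmp)
# → :error
```

The :error on the last read is the same answer the demo's tmp_after_resume field gives: the scratch volume is absent after a resume, not empty.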
One layer beneath that: components are stateless between
run calls. The VFS is the orthogonal-persistence layer they
checkpoint into. A component that declares :persist is signing
a contract — "I checkpoint my state to the VFS" — not asking for a raw
snapshot of its linear memory. That's why the survival rules can be this
clean: there's nothing to snapshot except a file, and the file's volumes
already say what stays.
the WAL HEARTBEAT
Freeze and resume keep a workbook alive across a process restart. But the parent's word was heartbeat — durability against the machine itself dying. That's a different mechanism, and it's the one that earns the metaphor: the file's write-ahead log, streamed off-box.
Turn a VFS into WAL mode and a long-lived companion process tails that log to a replica. The module is four functions over the litestream binary — enable WAL, replicate, stop, restore:
Workbooks.Litestream.enable_wal(conn)   # PRAGMA journal_mode=WAL
port = Workbooks.Litestream.replicate(db, "file:///tmp/ls-42/replica")
# prod is the SAME call with "s3://my-bucket/vfs/sess-9" + creds in env
Workbooks.Litestream.stop(port)
{:ok, _} = Workbooks.Litestream.restore("file:///tmp/ls-42/replica", "/tmp/restored.db")
# → restored_n2: {:ok, "row 2"} · restored_rows: "[[3]]"
The dev/prod difference is one string. A file:// URL writes
the replica to a local path; an s3:// URL (Cloudflare R2 in
production) writes it to a bucket. Same command; the only difference is the
URL plus storage credentials in the environment. The binary is baked into
the engine container — ARG LITESTREAM_VERSION=0.5.11 in the
runtime Dockerfile, whose headline reads "BEAM + wasmtime + litestream
(off-box VFS durability)." The deploy-kit wires the control-plane registry
onto the durable, replicated volume:
WB_REGISTRY=${WB_DATA}/registry.db, with
WB_DATA the mounted volume litestream tails.
Walk the heartbeat as a story: writes land in the VFS, the WAL records them, the litestream port ships the WAL to the replica — and when the engine dies, restore pulls the file back from the replica and resume picks up where it left off:
sequenceDiagram
participant V as the VFS
participant W as the WAL
participant LS as litestream port
participant R as replica (s3:// or file://)
V->>W: every write appends to the log
W->>LS: the port tails the WAL
LS->>R: ships changes off-box (asynchronously)
Note over V,R: ⚡ the engine dies
R->>V: restore — pull the file back from the replica
Note over V: resume on the restored bytes
Two honesties belong right here. First, the word
asynchronously is load-bearing: the round-trip demo literally calls
Process.sleep(3000) before trusting the replica. That sleep is
the lag window. This is durability, not synchronous consistency — a replica
is a few seconds behind, not a mirror. Second, the engine carries the binary
and the seam, and the round trip is demo-proven, but automatic per-VFS
replication does not appear to be wired in the codebase. Turning it on is
configuration — the engine ships the heart; you decide which disks get one.
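The lag window is easier to reason about with a toy model. The sketch below does not call litestream at all; ReplicaLagSketch is a hypothetical module that treats the WAL as an ordered list of writes and the replica as a prefix of that list, which is enough to show what an asynchronous replica can and cannot give back after a crash.

```elixir
defmodule ReplicaLagSketch do
  # writes:  every write that reached the local WAL, oldest first.
  # shipped: how many of them had reached the replica when the machine died.
  def crash_and_restore(writes, shipped) when is_integer(shipped) and shipped >= 0 do
    {restored, lost} = Enum.split(writes, shipped)
    %{restored: restored, lost: lost}
  end
end

ReplicaLagSketch.crash_and_restore(["row 1", "row 2", "row 3"], 2)
# → %{lost: ["row 3"], restored: ["row 1", "row 2"]}
```

What is lost is always the tail, never the middle: the restored file is an earlier consistent point in the workbook's history, which is why the loss is bounded by the lag rather than by the size of the workbook.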
one template, a thousand TENANTS
Back to the last problem from the top: a thousand users each getting
their own copy of one template app. With a machine as the unit of state
that's a provisioning pipeline. With a file as the unit of state it's a
cp. clone_for copies a read-only base VFS into a
fresh, uniquely-named file per tenant:
def clone_for(base_path, tenant, dir \\ System.tmp_dir!()) do
  dest = Path.join(dir, "tenant-#{tenant}-#{System.unique_integer([:positive])}.sqlite")
  File.cp(base_path, dest)
  {:ok, dest}
end
Every tenant starts from the same seeded template; their writes go to their own isolated copy; the base is never mutated. This is the base-image model — without an image registry, because the image is a file. The isolation isn't a hope; the template demo proves it. Seed a base, clone for two tenants, let each write its own path:
{:ok, conn} = Workbooks.VFS.open(base)
Workbooks.VFS.put(conn, "/seed", "from the template")
{:ok, a} = Workbooks.Lifecycle.clone_for(base, "acme")    # → /tmp/tenant-acme-1234.sqlite
{:ok, b} = Workbooks.Lifecycle.clone_for(base, "globex")
# each writes its own /own, then we check the receipts:
%{a_seed: {:ok, "from the template"}, a_sees_b: false, base_untouched: true}
a_sees_b: false — acme cannot see globex's write.
base_untouched: true — the template is exactly as seeded. One
base file fans out to many isolated copies:
flowchart TD
base["base.sqlite
read-only template · /seed"]
acme["tenant-acme-1234.sqlite
+ its own /own"]
globex["tenant-globex-5678.sqlite
+ its own /own"]
more["tenant-…-….sqlite"]
base -- "File.cp" --> acme
base -- "File.cp" --> globex
base -- "File.cp" --> more
style base fill:#f2ddb0,stroke:#121316,stroke-width:2.5px
style acme fill:#a8d4f0,stroke:#121316
style globex fill:#a8d4f0,stroke:#121316
style more fill:#fbfaf6,stroke:#121316,stroke-dasharray:4 3
One contract to respect: the base must be read-only. The copy is safe
because the template is never written while it's being cloned — that's a
convention the design relies on, not a lock the code enforces. "Provisioning
a tenant" is File.cp; the discipline is keeping the source
still.
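The isolation claim can be checked without the VFS, using plain files as stand-ins. In this sketch CloneSketch is a hypothetical module; its clone_for follows the shape shown earlier but, as a deliberate variation, returns the error from File.cp instead of ignoring it. Appending text to a file plays the role of a tenant writing its own /own path.

```elixir
defmodule CloneSketch do
  # Copy the base into a uniquely named per-tenant file.
  # Variation on the lesson's version: a failed copy is reported, not swallowed.
  def clone_for(base_path, tenant, dir \\ System.tmp_dir!()) do
    dest = Path.join(dir, "tenant-#{tenant}-#{System.unique_integer([:positive])}.sqlite")

    with :ok <- File.cp(base_path, dest) do
      {:ok, dest}
    end
  end
end

base = Path.join(System.tmp_dir!(), "base-#{System.unique_integer([:positive])}.sqlite")
File.write!(base, "from the template")

{:ok, a} = CloneSketch.clone_for(base, "acme")
{:ok, b} = CloneSketch.clone_for(base, "globex")

# Each tenant writes only to its own copy.
File.write!(a, " + acme's own", [:append])
File.write!(b, " + globex's own", [:append])

File.read!(a)
# → "from the template + acme's own"
File.read!(base)
# → "from the template"
```

Nothing here locks the base while it is being copied, which matches the text: the safety comes from the convention that nobody writes to the template.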
who keeps the LEDGER
depth rung · skippable — the registry that tracks who's in what state
Something has to remember which sessions exist and what state each is in. That's the control plane — and it is itself just SQLite. One table, one row per instance:
| column | holds |
|---|---|
| id | the session id — primary key |
| tenant | who it belongs to |
| state | created · active · suspended · frozen · archived |
| vfs_path | where the data plane's file lives |
| updated | last transition time |
The distinction that makes this clean: SQLite is the control
plane — the ledger of who's in what state — and each instance's VFS is
the data plane, the actual bytes. The ledger is small; the data is
distributed across one file per workbook. A new session
registers with state created; every freeze, wake,
and demotion writes a row update here.
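As a rough picture of what "one row per instance, updated on every transition" means, here is an in-memory model of the ledger. It is a sketch under assumptions: LedgerSketch is a hypothetical module, a map keyed by session id stands in for the SQLite table, timestamps are plain integers, and the path is made up for the example.

```elixir
defmodule LedgerSketch do
  # A new session gets one row, in state :created.
  def register(ledger, id, tenant, vfs_path, now) do
    row = %{id: id, tenant: tenant, state: :created, vfs_path: vfs_path, updated: now}
    Map.put(ledger, id, row)
  end

  # Every transition rewrites the state and the timestamp of that one row.
  # Raises KeyError if the session was never registered.
  def set_state(ledger, id, state, now) do
    Map.update!(ledger, id, fn row -> %{row | state: state, updated: now} end)
  end
end

ledger =
  %{}
  |> LedgerSketch.register("sess-9", "acme", "/data/sess-9.sqlite", 100)
  |> LedgerSketch.set_state("sess-9", :active, 160)
  |> LedgerSketch.set_state("sess-9", :suspended, 1_100)

ledger["sess-9"].state
# → :suspended
```

The ledger never holds workbook bytes, only the vfs_path that points at them, which is what keeps it small enough to swap for another backend later.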
And the multi-machine story is a swap, not a rewrite. The moduledoc says
it directly: SQLite is the control plane, Postgres is only needed if we go
multi-machine — the same flow, with the registry backed by Postgres instead
of a local file. That swap is designed, not built; today the ledger is a
single-machine SQLite db pointed at by WB_REGISTRY.
where the heartbeat ENDS
Honesty section. Several pieces of this lesson are real-and-shipped, and a few are real-but-not-yet-wired. Telling them apart is the point.
The idle ticker isn't wired. The 15-minute and 24-hour thresholds are a pure, tested policy function — but no production loop walks live sessions and applies it on a beat. The thresholds are the design; the driver is the next turn of the crank.
Litestream is a seam, not an autopilot. The engine carries the v0.5.11 binary and the four-function module, and the replicate-restore round trip is demo-proven. But attaching replication to every VFS is configuration, not an automatic property — there's no supervisor streaming each disk by default.
The registry is single-machine. Postgres as the multi-machine backend is designed and named, not deployed. Today the ledger is one local SQLite file.
Replication has lag. The replica is asynchronous — seconds behind, the gap the demo's three-second sleep makes visible. This is durable, not synchronously consistent. If the machine dies, you lose the very last, un-shipped writes, not the workbook.
archived is a state without a home yet. It's a legal
node in the machine, but no archival-storage behaviour hangs off it in code
— the slot exists ahead of the shelf.
None of these caveats touch the spine: the unit of state is a file, and every move in the lifecycle is a copy of that file. That part is true all the way down, today.
questions people actually ASK
Does an idle workbook cost RAM forever?
No — that's what the ladder is for. The policy demotes an active workbook to suspended after 15 minutes idle (process stopped, file warm on disk) and to frozen after 24 hours (file pushed to cold storage). Each rung trades wake-latency for cheapness. Caveat in the same breath: the thresholds are shipped policy, but the background ticker that applies them on a beat isn't wired yet — so today this is the design of the cost dial, not an automatic sweep.
Is freeze a VM snapshot?
No. Freeze stops the process and runs File.cp on one
SQLite file into cold storage — no RAM image, no hypervisor format. The
SQLite file is the durable state. Resume copies it back and clears the
scratch volume. The frozen artifact is portable to anywhere a file can go.
What if the machine dies mid-write?
The VFS runs in WAL mode and litestream tails the WAL to an off-box
replica (file:// in dev, s3:// / R2 in prod —
same command, different URL). On a new machine, restore pulls the file
back and resume continues. Honest caveat: replication is asynchronous —
seconds of lag — so you can lose the last un-shipped writes, not the
workbook. Durable, not synchronously consistent.
Do tenants share a disk?
No. clone_for copies a read-only base VFS into a uniquely
named file per tenant — tenant-acme-1234.sqlite — and writes
go to that isolated copy. The template is never mutated. The demo proves
it: one tenant cannot see another's writes (a_sees_b: false)
and the base is untouched. The contract is that the base stays read-only
while it's cloned.
Why isn't this just Postgres?
Because the data plane is one file per workbook, not one shared database. SQLite is the control plane — the small ledger of who's in what state — and each VFS is the data plane. That's what makes freeze, clone, and off-box durability all reduce to file copies. Postgres is the designed swap for the registry when you go multi-machine, not a replacement for the per-workbook disk.
Can I run all of this myself?
Yes. Every mechanism here is an invokable demo from iex -S mix:
freeze-and-resume, the three-volume survival cycle, the idle
ladder, the two-resume warm/cold trace, and the litestream round trip. The
lesson's claims are functions you can call, not diagrams you have to
trust.
keep GOING
Sync is the disk's view of staying alive — its neighbors show you the same machine from other angles.