
the disk GROWS a heartbeat

The VFS lesson ended on a metaphor — dock the workbook and the disk gets a heartbeat. This page is the mechanics under that line. Freeze, hibernate, replicate, clone — every hard problem of keeping software alive collapses into a file copy, because the unit of state is one SQLite file. The heartbeat is cheap because the heart is a file.

sync · 11 min read

a thousand disks, one MACHINE

The parent lesson closed with a promise wearing a metaphor: dock a workbook and "the disk now has a heartbeat." Heartbeat is a lovely word and a complete non-answer. The ops-literate reader has real questions, and they all arrive at once.

If a thousand tenants dock workbooks into one engine, do a thousand processes sit resident in RAM forever? What happens to a workbook nobody has touched in a week — does it bill memory while it sleeps? What survives the machine actually dying mid-write? And how do a thousand users each get their own copy of one template app without a thousand provisioning jobs grinding in the background?

Every one of those is a famous-hard problem in operations — snapshot, hibernate, migrate, replicate, clone — and every one is hard for the same reason: the unit of state has always been a machine. A VM. A container. A box with a kernel and a process table and gigabytes of dirty RAM. This lesson is what those problems become when the unit of state is instead one SQLite file.

the DEFINITION

sync /sɪŋk/ noun

1. the lifecycle by which a docked VFS moves between RAM, local disk, cold storage, and off-box replicas — where every move is a file operation, because the disk is a file. The state machine isn't VM orchestration; it's a schedule for when to do which copy.

That definition is the whole page in one sentence. Freeze is File.cp. Clone is File.cp. Off-box durability is tailing the file's write-ahead log. Resume is cp back, plus wiping the scratch volume. The apparatus that keeps software alive doesn't collapse because we were clever — it collapses because the thing being kept alive is small enough to copy.

five states, one FILE

A docked workbook is always in exactly one of five states, and the legal moves between them are a small, explicit map — there are no implicit transitions. This lives in Workbooks.Lifecycle, and the transition table is the spine of everything that follows:

stateDiagram-v2
  [*] --> created
  created --> active
  active --> suspended
  active --> archived
  suspended --> active
  suspended --> frozen
  frozen --> active
  frozen --> archived
  archived --> active
  archived --> deleted
  deleted --> [*]
  

Read it left to right as a life. A workbook is created, then goes active — a live process in the engine. From active it can be suspended (stopped, but its file warm on local disk) or archived. A suspended workbook can wake back to active, or sink further to frozen — its file pushed to cold storage. Frozen wakes to active or archives. Archived wakes to active or, finally, is deleted — the one terminal state, the only door that doesn't come back.

The function that enforces this is four lines and returns one of two things:

Workbooks.Lifecycle.transition(:active, :suspended)
# => {:ok, :suspended}       a legal edge in the map

Workbooks.Lifecycle.transition(:frozen, :suspended)
# => {:error, :invalid}      no such edge; the machine refuses

You cannot slide from frozen straight to suspended, because that edge doesn't exist in the map. Every state change is a deliberate, named move — which is exactly what makes the lifecycle auditable instead of emergent.
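
One way to picture a function that small is a list of legal edges and a membership check. The sketch below is an assumption about the shape, not the shipped Workbooks.Lifecycle source: the module name LifecycleSketch is hypothetical, and the edge list is copied from the state diagram above.

```elixir
defmodule LifecycleSketch do
  # Legal edges, copied from the state diagram. Anything not listed is refused.
  @edges [
    {:created, :active},
    {:active, :suspended},
    {:active, :archived},
    {:suspended, :active},
    {:suspended, :frozen},
    {:frozen, :active},
    {:frozen, :archived},
    {:archived, :active},
    {:archived, :deleted}
  ]

  # Returns {:ok, to} for a legal edge and {:error, :invalid} otherwise.
  def transition(from, to) do
    if {from, to} in @edges do
      {:ok, to}
    else
      {:error, :invalid}
    end
  end
end

legal = LifecycleSketch.transition(:active, :suspended)
refused = LifecycleSketch.transition(:frozen, :suspended)
```

Keeping the map as data means the audit question "which moves are legal?" is answered by reading one list.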

the idle LADDER

Transitions are explicit, but two of them have a policy attached: a workbook nobody is using should not hold expensive resources. That policy is a ladder, and it is two numbers.

After 15 minutes idle, an active workbook should drop to suspended — stop the process, keep the file warm. After 24 hours idle, a suspended workbook should drop to frozen — push the file to cold storage. The function that encodes this is pure — it reads a state and an idle-time and returns the next state, or nil to stay put:

Workbooks.Lifecycle.auto_next(:active, 16 * 60)           # => :suspended  (past 15 min)
Workbooks.Lifecycle.auto_next(:active, 60)                # => nil         (fresh, so stay)
Workbooks.Lifecycle.auto_next(:suspended, 2 * 24 * 3600)  # => :frozen     (past 24 h)
Workbooks.Lifecycle.auto_next(:frozen, 999_999)           # => nil         (frozen needs an explicit wake)
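
A pure function with that behaviour fits in three clauses. This is a minimal sketch, not the shipped code: the module name IdleLadderSketch is hypothetical, idle time is assumed to be in seconds (as the calls above suggest), and whether the real boundary is strict or inclusive is an assumption.

```elixir
defmodule IdleLadderSketch do
  # Thresholds from the text, in seconds.
  @suspend_after 15 * 60
  @freeze_after 24 * 60 * 60

  # Assumption: the comparison is strict (idle time must exceed the threshold).
  def auto_next(:active, idle_s) when idle_s > @suspend_after, do: :suspended
  def auto_next(:suspended, idle_s) when idle_s > @freeze_after, do: :frozen

  # Every other state, and every fresh session, stays where it is.
  def auto_next(_state, _idle_s), do: nil
end

ladder = [
  IdleLadderSketch.auto_next(:active, 16 * 60),
  IdleLadderSketch.auto_next(:active, 60),
  IdleLadderSketch.auto_next(:suspended, 2 * 24 * 3600),
  IdleLadderSketch.auto_next(:frozen, 999_999)
]
```

Because the function takes the idle time as an argument instead of reading a clock, it can be tested without waiting 24 hours.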

The ladder is really a cost dial. Each rung trades wake-latency for cheapness — the colder the rung, the less it costs to leave a workbook there, and the longer it takes to bring it back:

| state | where the bytes live | what it costs | wake latency |
| --- | --- | --- | --- |
| active | RAM (a live process) | memory, the whole time | none: it's already up |
| suspended | warm local disk | disk only, no process | fast: start the process, file is right there |
| frozen | cold storage (a copied-away file) | cheapest: bytes at rest | slowest: copy the file back first |

One honest note, stated plainly: these thresholds are the policy, and the policy is shipped. The pure function above is real and tested. What is not yet wired in the codebase is the cron-like ticker that walks the live sessions and applies auto_next on a beat — today the only caller is a demo. So read this section as the rules of the ladder, not as a claim that a background loop is already climbing it for you. The thresholds are the design; the driver is the next turn of the crank.

freeze is a COPY

Here is the move that justifies the whole framing. To freeze a workbook you stop its process and copy its file into cold storage. That's the function, in full:

def freeze(session_id, vfs_path, cold_dir) do
  File.cp(vfs_path, Path.join(cold_dir, "#{session_id}.sqlite"))
end

No VM image. No hypervisor-specific snapshot of dirty pages. The SQLite file is the durable state — freezing is "stop the process, keep the file." Resume is the mirror image: copy the file back into the live directory, reopen it, and clear the scratch volume:

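def resume(session_id, cold_dir, live_dir) do
  # Assumed: the live file keeps the name it was frozen under.
  live_path = Path.join(live_dir, "#{session_id}.sqlite")
  File.cp(Path.join(cold_dir, "#{session_id}.sqlite"), live_path)
  {:ok, conn} = Workbooks.VFS.open(live_path)
  Workbooks.VFS.clear(conn, "tmp")     # scratch does not survive a resume
end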

Walk the round trip as three actors — the running instance, the cold directory, and the live directory it wakes back into:

sequenceDiagram
  participant I as the instance
  participant C as cold_dir
  participant L as live_dir
  Note over I: idle past threshold — freeze
  I->>I: stop the process
  I->>C: cp vfs.sqlite → cold_dir/<id>.sqlite
  Note over C: bytes at rest — a portable file
  Note over L: a request arrives — resume
  C->>L: cp the file back into live_dir
  L->>L: open the VFS
  L->>L: clear the tmp volume — scratch is gone
  Note over L: process restarts on warm bytes
  

Contrast the alternative the industry reaches for. A VM snapshot is a multi-gigabyte image of RAM and device state, tied to one hypervisor, slow to capture and slow to move. Ours is a SQLite file you could email. The freeze costs what copying a project-sized database costs, and the frozen artifact is portable to anywhere a file can go.
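
The round trip can be tried without the engine at all, since it is only file copies. The sketch below is an illustration under stated assumptions: FreezeSketch and the directory names are hypothetical, a plain text file stands in for the SQLite file, and the VFS open and tmp-clearing steps are left out because they need the real Workbooks.VFS module.

```elixir
defmodule FreezeSketch do
  # Freeze: copy the live file into cold storage under the session id.
  def freeze(session_id, vfs_path, cold_dir) do
    File.cp(vfs_path, Path.join(cold_dir, "#{session_id}.sqlite"))
  end

  # Resume: copy the frozen file back into the live directory.
  def resume(session_id, cold_dir, live_dir) do
    live_path = Path.join(live_dir, "#{session_id}.sqlite")

    with :ok <- File.cp(Path.join(cold_dir, "#{session_id}.sqlite"), live_path) do
      {:ok, live_path}
    end
  end
end

# Throwaway directories under the system temp dir.
root = Path.join(System.tmp_dir!(), "freeze-sketch-#{System.unique_integer([:positive])}")
cold = Path.join(root, "cold")
live = Path.join(root, "live")
File.mkdir_p!(cold)
File.mkdir_p!(live)

src = Path.join(live, "sess-1.sqlite")
File.write!(src, "stand-in bytes for a SQLite file")

:ok = FreezeSketch.freeze("sess-1", src, cold)
File.rm!(src)                                   # the live copy is gone; only cold bytes remain
{:ok, woke} = FreezeSketch.resume("sess-1", cold, live)
round_trip = File.read!(woke)
```

The bytes that come back are the bytes that went in, which is the whole claim of the section.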

warm, cold, and PREFETCH

Waking a workbook has two cases, and naming them is half the work. A session is warm when its instance is already live in the engine — a registry lookup finds it. A session is cold when only its VFS persists, on local disk or frozen away. Sessions.resume branches on exactly that question:

flowchart TD
  start["Sessions.resume(id, bytes)"]
  warm{"already warm?<br/>(registry lookup)"}
  local{"local file:<br/>a warm cache?"}
  frozen{"state == frozen<br/>and a cold_dir?"}
  hitwarm["{:warm, :already_active}<br/>zero work"]
  fromcold["Lifecycle.resume<br/>cp from cold storage"]
  fromcache["open the local file"]
  fresh["fresh :memory: VFS"]
  starti["start the instance ·<br/>set state active"]
  done["{:cold, vfs_path}"]
  start --> warm
  warm -- yes --> hitwarm
  warm -- no --> local
  local -- yes --> fromcache --> starti
  local -- no --> frozen
  frozen -- yes --> fromcold --> starti
  frozen -- no --> fresh --> starti
  starti --> done
  style hitwarm fill:#13d943,stroke:#121316,stroke-width:2.5px
  style done fill:#a8d4f0,stroke:#121316
  style start fill:#ffffff,stroke:#121316
  style warm fill:#fbfaf6,stroke:#121316
  style local fill:#fbfaf6,stroke:#121316
  style frozen fill:#fbfaf6,stroke:#121316

Follow the tree. If the session is already warm, the call returns {:warm, :already_active} and does no work at all. If it's cold, the engine locates the VFS — a local file is a warm cache and opens directly; a frozen session with a cold directory gets restored by Lifecycle.resume; failing both, a fresh in-memory store is born. Then the instance starts, the control plane marks it active, and the call returns {:cold, vfs_path}.
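
The branch order is the part worth internalising, and it reads well as ordered function clauses. The sketch below is not Sessions.resume itself: ResumeSketch, the input map keys, and the source tags are all hypothetical, and where the real call returns {:cold, vfs_path} this version returns a tag naming which branch located the VFS.

```elixir
defmodule ResumeSketch do
  # Clauses are tried top to bottom, mirroring the decision tree.
  def plan(%{warm?: true}), do: {:warm, :already_active}
  def plan(%{local_file?: true}), do: {:cold, :open_local_file}

  def plan(%{state: :frozen, cold_dir: dir}) when is_binary(dir),
    do: {:cold, :restore_from_cold}

  # Nothing on disk and nothing frozen: start from a fresh in-memory store.
  def plan(_session), do: {:cold, :fresh_memory_vfs}
end

plans = [
  ResumeSketch.plan(%{warm?: true, local_file?: true, state: :active, cold_dir: nil}),
  ResumeSketch.plan(%{warm?: false, local_file?: true, state: :suspended, cold_dir: nil}),
  ResumeSketch.plan(%{warm?: false, local_file?: false, state: :frozen, cold_dir: "/cold"}),
  ResumeSketch.plan(%{warm?: false, local_file?: false, state: :created, cold_dir: nil})
]
```

Note that a frozen session with no cold directory configured falls through to the fresh store, exactly as the "no" edge out of the last diamond does.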

There is one more move worth its own name: prefetch on auth. Sessions.prefetch restores or locates the VFS without starting the instance — so by the time a user's first request lands after login, the file is already warm and the first interaction doesn't pay the restore cost. Logging in is the signal; the disk gets ready before the request arrives.

what SURVIVES

depth rung · skippable — the survival contract, volume by volume

The resume code cleared tmp for a reason: not every volume is supposed to outlive a freeze. The VFS has three volumes, and what each one does across a freeze/resume cycle is a designed contract, not an accident. The volumes sibling treats them in full; here is only their behaviour through sync:

| volume | survives freeze → resume? | leaves on egress? |
| --- | --- | --- |
| workspace | yes: the workbook's files persist | yes: it's the public part |
| memory | yes: what the agent learned persists | no: stripped at the boundary |
| tmp | no: cleared on resume | no: never leaves |

The most quotable proof of that contract is a single map the freeze-resume demo returns: after a full cycle, workspace: {:ok, "the workbook files"}, memory: {:ok, "what the agent learned"}, and tmp_after_resume: :error. The scratch read fails because the scratch is gone — by design.
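
The contract in the table is small enough to hold as data. The sketch below is illustrative only: VolumesSketch, the flag names, and the in-memory map standing in for the VFS are assumptions, not the engine's representation.

```elixir
defmodule VolumesSketch do
  # The survival contract from the table, one entry per volume.
  @contract %{
    "workspace" => %{survives_resume: true, leaves_on_egress: true},
    "memory" => %{survives_resume: true, leaves_on_egress: false},
    "tmp" => %{survives_resume: false, leaves_on_egress: false}
  }

  # Keep only the volumes whose flag is true. Unknown volumes are dropped.
  def keep(volumes, flag) do
    for {name, data} <- volumes,
        get_in(@contract, [name, flag]) == true,
        into: %{},
        do: {name, data}
  end
end

volumes = %{
  "workspace" => "the workbook files",
  "memory" => "what the agent learned",
  "tmp" => "scratch"
}

resumed = VolumesSketch.keep(volumes, :survives_resume)
exported = VolumesSketch.keep(volumes, :leaves_on_egress)
tmp_after_resume = Map.fetch(resumed, "tmp")
```

Fetching tmp from the resumed map gives :error, the same shape the demo reports for the scratch volume.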

One layer beneath that: components are stateless between run calls. The VFS is the orthogonal-persistence layer they checkpoint into. A component that declares :persist is signing a contract — "I checkpoint my state to the VFS" — not asking for a raw snapshot of its linear memory. That's why the survival rules can be this clean: there's nothing to snapshot except a file, and the file's volumes already say what stays.

the WAL HEARTBEAT

Freeze and resume keep a workbook alive across a process restart. But the parent's word was heartbeat — durability against the machine itself dying. That's a different mechanism, and it's the one that earns the metaphor: the file's write-ahead log, streamed off-box.

Switch a VFS into WAL mode and a long-lived companion process tails that log to a replica. The module is four functions over the litestream binary (enable WAL, replicate, stop, restore):

Workbooks.Litestream.enable_wal(conn)          # PRAGMA journal_mode=WAL
port = Workbooks.Litestream.replicate(db, "file:///tmp/ls-42/replica")
# prod is the SAME call with "s3://my-bucket/vfs/sess-9" + creds in env
Workbooks.Litestream.stop(port)
{:ok, _} = Workbooks.Litestream.restore("file:///tmp/ls-42/replica", "/tmp/restored.db")
# demo output => restored_n2: {:ok, "row 2"} · restored_rows: "[[3]]"

The dev/prod difference is one string. A file:// URL writes the replica to a local path; an s3:// URL (Cloudflare R2 in production) writes it to a bucket. The command is the same; only the URL changes, plus the storage credentials supplied in the environment. The binary is baked into the engine container: ARG LITESTREAM_VERSION=0.5.11 in the runtime Dockerfile, whose headline reads "BEAM + wasmtime + litestream (off-box VFS durability)." The deploy-kit wires the control-plane registry onto the durable, replicated volume: WB_REGISTRY=${WB_DATA}/registry.db, with WB_DATA the mounted volume litestream tails.

Walk the heartbeat as a story: writes land in the VFS, the WAL records them, the litestream port ships the WAL to the replica — and when the engine dies, restore pulls the file back from the replica and resume picks up where it left off:

sequenceDiagram
  participant V as the VFS
  participant W as the WAL
  participant LS as litestream port
  participant R as replica (s3:// or file://)
  V->>W: every write appends to the log
  W->>LS: the port tails the WAL
  LS->>R: ships changes off-box (asynchronously)
  Note over V,R: ⚡ the engine dies
  R->>V: restore — pull the file back from the replica
  Note over V: resume on the restored bytes
  

Two honesties belong right here. First, the word asynchronously is load-bearing: the round-trip demo literally calls Process.sleep(3000) before trusting the replica. That sleep is the lag window. This is durability, not synchronous consistency — a replica is a few seconds behind, not a mirror. Second, the engine carries the binary and the seam, and the round trip is demo-proven, but automatic per-VFS replication is not visible as wired in the codebase. Turning it on is configuration — the engine ships the heart; you decide which disks get one.
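
For a sense of what "four functions over the litestream binary" can look like, here is a hedged sketch of two of them. It is not the Workbooks.Litestream source: LitestreamSketch and its function names are hypothetical, and the argument shapes (replicate DB_PATH REPLICA_URL, restore -o OUTPUT REPLICA_URL) follow the litestream CLI as generally documented, so check them against the pinned version before relying on them.

```elixir
defmodule LitestreamSketch do
  # Argument lists are built separately so they can be inspected without the binary.
  def replicate_args(db_path, replica_url), do: ["replicate", db_path, replica_url]
  def restore_args(replica_url, out_path), do: ["restore", "-o", out_path, replica_url]

  # Start litestream as a long-lived OS process owned by a port.
  # Closing the port (or the owner exiting) stops the replication process.
  def replicate(db_path, replica_url) do
    exe = System.find_executable("litestream") || raise "litestream not found on PATH"

    Port.open({:spawn_executable, exe}, [
      :binary,
      :exit_status,
      args: replicate_args(db_path, replica_url)
    ])
  end
end

dev = LitestreamSketch.replicate_args("/tmp/ls-42/vfs.db", "file:///tmp/ls-42/replica")
prod = LitestreamSketch.replicate_args("/tmp/ls-42/vfs.db", "s3://my-bucket/vfs/sess-9")
```

The dev and prod argument lists differ in exactly one element, the replica URL. The database path /tmp/ls-42/vfs.db is a made-up example.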

one template, a thousand TENANTS

Back to the last problem from the top: a thousand users each getting their own copy of one template app. With a machine as the unit of state that's a provisioning pipeline. With a file as the unit of state it's a cp. clone_for copies a read-only base VFS into a fresh, uniquely-named file per tenant:

def clone_for(base_path, tenant, dir \\ System.tmp_dir!()) do
  dest = Path.join(dir, "tenant-#{tenant}-#{System.unique_integer([:positive])}.sqlite")
  # Report success only if the copy actually succeeded.
  with :ok <- File.cp(base_path, dest), do: {:ok, dest}
end

Every tenant starts from the same seeded template; their writes go to their own isolated copy; the base is never mutated. This is the base-image model — without an image registry, because the image is a file. The isolation isn't a hope; the template demo proves it. Seed a base, clone for two tenants, let each write its own path:

{:ok, conn} = Workbooks.VFS.open(base)
Workbooks.VFS.put(conn, "/seed", "from the template")
{:ok, a} = Workbooks.Lifecycle.clone_for(base, "acme")    # a => "/tmp/tenant-acme-1234.sqlite"
{:ok, b} = Workbooks.Lifecycle.clone_for(base, "globex")
# each writes its own /own, then we check the receipts:
# => %{a_seed: {:ok, "from the template"}, a_sees_b: false, base_untouched: true}

a_sees_b: false — acme cannot see globex's write. base_untouched: true — the template is exactly as seeded. One base file fans out to many isolated copies:

flowchart TD
  base["base.sqlite<br/>read-only template · /seed"]
  acme["tenant-acme-1234.sqlite<br/>+ its own /own"]
  globex["tenant-globex-5678.sqlite<br/>+ its own /own"]
  more["tenant-…-….sqlite"]
  base -- "File.cp" --> acme
  base -- "File.cp" --> globex
  base -- "File.cp" --> more
  style base fill:#f2ddb0,stroke:#121316,stroke-width:2.5px
  style acme fill:#a8d4f0,stroke:#121316
  style globex fill:#a8d4f0,stroke:#121316
  style more fill:#fbfaf6,stroke:#121316,stroke-dasharray:4 3

One contract to respect: the base must be read-only. The copy is safe because the template is never written while it's being cloned — that's a convention the design relies on, not a lock the code enforces. "Provisioning a tenant" is File.cp; the discipline is keeping the source still.
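
The isolation claim can be checked with nothing but files. This sketch assumes a plain text file as a stand-in for the base VFS; CloneSketch and the directory layout are hypothetical, and appending bytes stands in for a tenant writing its own /own.

```elixir
defmodule CloneSketch do
  # Copy the base into a uniquely named per-tenant file.
  def clone_for(base_path, tenant, dir) do
    dest = Path.join(dir, "tenant-#{tenant}-#{System.unique_integer([:positive])}.sqlite")
    with :ok <- File.cp(base_path, dest), do: {:ok, dest}
  end
end

dir = Path.join(System.tmp_dir!(), "clone-sketch-#{System.unique_integer([:positive])}")
File.mkdir_p!(dir)

base = Path.join(dir, "base.sqlite")
File.write!(base, "from the template")

{:ok, a} = CloneSketch.clone_for(base, "acme", dir)
{:ok, b} = CloneSketch.clone_for(base, "globex", dir)

# Only acme writes. The other copy and the base should not move.
File.write!(a, " + acme's own write", [:append])

receipts = %{
  a_changed: File.read!(a) != File.read!(base),
  b_matches_base: File.read!(b) == File.read!(base),
  base_untouched: File.read!(base) == "from the template"
}
```

Nothing here stops a process from writing to base.sqlite during a clone, which is why the read-only rule is a discipline and not a guarantee.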

who keeps the LEDGER

depth rung · skippable — the registry that tracks who's in what state

Something has to remember which sessions exist and what state each is in. That's the control plane — and it is itself just SQLite. One table, one row per instance:

| column | holds |
| --- | --- |
| id | the session id (primary key) |
| tenant | who it belongs to |
| state | created · active · suspended · frozen · archived |
| vfs_path | where the data plane's file lives |
| updated | last transition time |

The distinction that makes this clean: SQLite is the control plane — the ledger of who's in what state — and each instance's VFS is the data plane, the actual bytes. The ledger is small; the data is distributed across one file per workbook. A new session registers with state created; every freeze, wake, and demotion writes a row update here.
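
To make the control-plane/data-plane split concrete, here is a toy ledger with the same columns. It is a sketch under loud assumptions: RegistrySketch is hypothetical, a plain map stands in for the SQLite table, the path /data/vfs/sess-9.sqlite is made up, and no transition checking is done here.

```elixir
defmodule RegistrySketch do
  # One row per instance, keyed by session id. New sessions start as :created.
  def register(reg, id, tenant, vfs_path) do
    Map.put(reg, id, %{tenant: tenant, state: :created, vfs_path: vfs_path, updated: now()})
  end

  # Record a state change. Only the ledger row changes; the VFS file is not touched.
  def set_state(reg, id, state) do
    Map.update!(reg, id, fn row -> %{row | state: state, updated: now()} end)
  end

  defp now, do: System.system_time(:second)
end

reg =
  %{}
  |> RegistrySketch.register("sess-9", "acme", "/data/vfs/sess-9.sqlite")
  |> RegistrySketch.set_state("sess-9", :active)
  |> RegistrySketch.set_state("sess-9", :suspended)

row = reg["sess-9"]
```

The row holds a path to the bytes, never the bytes themselves, which is what keeps the ledger small however large the workbooks get.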

And the multi-machine story is a swap, not a rewrite. The moduledoc says it directly: SQLite is the control plane, Postgres is only needed if we go multi-machine — the same flow, with the registry backed by Postgres instead of a local file. That swap is designed, not built; today the ledger is a single-machine SQLite db pointed at by WB_REGISTRY.

where the heartbeat ENDS

Honesty section. Several pieces of this lesson are real-and-shipped, and a few are real-but-not-yet-wired. Telling them apart is the point.

The idle ticker isn't wired. The 15-minute and 24-hour thresholds are a pure, tested policy function — but no production loop walks live sessions and applies it on a beat. The thresholds are the design; the driver is the next turn of the crank.

Litestream is a seam, not an autopilot. The engine carries the v0.5.11 binary and the four-function module, and the replicate-restore round trip is demo-proven. But attaching replication to every VFS is configuration, not an automatic property — there's no supervisor streaming each disk by default.

The registry is single-machine. Postgres as the multi-machine backend is designed and named, not deployed. Today the ledger is one local SQLite file.

Replication has lag. The replica is asynchronous — seconds behind, the gap the demo's three-second sleep makes visible. This is durable, not synchronously consistent. If the machine dies, you lose the very last, un-shipped writes, not the workbook.

archived is a state without a home yet. It's a legal node in the machine, but no archival-storage behaviour hangs off it in code — the slot exists ahead of the shelf.

None of these caveats touch the spine: the unit of state is a file, and every move in the lifecycle is a copy of that file. That part is true all the way down, today.

questions people actually ASK

Does an idle workbook cost RAM forever?

No — that's what the ladder is for. The policy demotes an active workbook to suspended after 15 minutes idle (process stopped, file warm on disk) and to frozen after 24 hours (file pushed to cold storage). Each rung trades wake-latency for cheapness. Caveat in the same breath: the thresholds are shipped policy, but the background ticker that applies them on a beat isn't wired yet — so today this is the design of the cost dial, not an automatic sweep.

Is freeze a VM snapshot?

No. Freeze stops the process and runs File.cp on one SQLite file into cold storage — no RAM image, no hypervisor format. The SQLite file is the durable state. Resume copies it back and clears the scratch volume. The frozen artifact is portable to anywhere a file can go.

What if the machine dies mid-write?

The VFS runs in WAL mode and litestream tails the WAL to an off-box replica (file:// in dev, s3:// / R2 in prod — same command, different URL). On a new machine, restore pulls the file back and resume continues. Honest caveat: replication is asynchronous — seconds of lag — so you can lose the last un-shipped writes, not the workbook. Durable, not synchronously consistent.

Do tenants share a disk?

No. clone_for copies a read-only base VFS into a uniquely named file per tenant — tenant-acme-1234.sqlite — and writes go to that isolated copy. The template is never mutated. The demo proves it: one tenant cannot see another's writes (a_sees_b: false) and the base is untouched. The contract is that the base stays read-only while it's cloned.

Why isn't this just Postgres?

Because the data plane is one file per workbook, not one shared database. SQLite is the control plane — the small ledger of who's in what state — and each VFS is the data plane. That's what makes freeze, clone, and off-box durability all reduce to file copies. Postgres is the designed swap for the registry when you go multi-machine, not a replacement for the per-workbook disk.

Can I run all of this myself?

Yes — every mechanism here is an invokable demo from iex -S mix: freeze-and-resume, the three-volume survival cycle, the idle ladder, the two-resume warm/cold trace, and the litestream round trip. The lesson's claims are functions you can call, not diagrams you have to trust.

keep GOING

Sync is the disk's view of staying alive — its neighbors show you the same machine from other angles.