the disk speaks SQL. say WHAT, exactly?
The parent lesson showed three dreamy queries and a line — your disk speaks SQL — and left it there. That was the poetry. This is the contract. If you're going to hand a SQL string to a component an LLM wrote, you want the answers underneath the slogan: what table, what columns, what exactly do I call, what comes back, and what happens when I get the query wrong.
The reassuring part is how small the truth is. There is no query API with a dozen methods. There is one table, one import, and one error shape — and once you've seen them, you've seen all of it. The parent page even drew the table a little stylized, with columns that don't exist; this page reconciles the illustration with the real four columns, then shows you the one statement that can ask a question of fifty workbooks at once.
the DEFINITION
1. the single Policy-gated Dock import that runs one SQL string against the Instance's own disk and returns JSON — rows, or a caught error, and never a host crash.
It is the working half of the VFS: that lesson described a disk you can query; this names the exact seam through which the query passes. One function, one string in, one string out — and the entire safety story rides on that narrow shape.
the whole SCHEMA, all of it
Here is the disk, in full. Not an inode graph, not a manifest plus a blob
store — one SQLite table, four columns, one file per Instance. The
CREATE TABLE the engine actually runs is this:
CREATE TABLE IF NOT EXISTS vfs ( volume TEXT NOT NULL, path TEXT NOT NULL, content BLOB, mtime INTEGER, PRIMARY KEY (volume, path) );
That is one hundred percent of the schema. The composite primary key
(volume, path) is the whole namespace: a path is unique
within a volume, and the volume column is the switch
between the three regions — workspace, memory, and
tmp, with workspace the default. That column is
deliberately visible to raw SQL so a component can scope a query to any one of
them. Column by column:
| column | type | what it means | example |
|---|---|---|---|
volume | TEXT | the region — workspace · memory · tmp | workspace |
path | TEXT | the file name within that volume | /reports/week-24.org |
content | BLOB | the bytes themselves, one row per file | 18,234 bytes |
mtime | INTEGER | unix seconds, written on every put | 1749600000 |
Now the reconciliation the parent lesson owes you. It drew queries against
a files table with size, dir, and
writer columns. None of those columns exist. Size is
length(content); a directory is a string operation on
path; and writer-attribution simply isn't in the schema at all —
there is no column that records who wrote a row. So the parent's
SELECT path FROM files WHERE mtime > '2026-06-09' becomes, told
truthfully — note mtime is unix seconds, so the date has to be
converted, and the table is vfs:
SELECT path, length(content) AS bytes
FROM vfs
WHERE volume = 'workspace'
AND mtime > strftime('%s', '2026-06-09');
→ [["/reports/week-24.org", 18234], ["/notes/june.org", 4101]]
Two things to carry out of that result. First, rows come back as
arrays of arrays — [[value, value], …] — not objects with
named keys; the host returns the raw row order, not a dictionary. Second, the
size you wanted was derived, not stored, and the volume = 'workspace'
clause is how you stay inside one region. Four columns is the whole grammar.
Everything else is SQL you already know.
one import, granted or ABSENT
There is exactly one way in. The engine's interface declares it as a single typed function in the Dock contract:
// world engine — package workbooks:engine import vfs-query: func(sql: string) -> string
A string in, a string out. But a component doesn't always have
that import — it has it only if its Policy profile
grants the vfs capability. When the host assembles a component's
imports, granting vfs adds precisely one entry to the map:
vfs-query, bound to a closure over this Instance's own
database connection. No grant, no entry.
And here is the part that makes the gating airtight: it is gating by
construction, not by a runtime check. The host provides only the imports
the Policy grants. A component that imports vfs-query without the
grant doesn't get an error when it calls — it fails to instantiate,
because the import it asks for isn't there to link against. The malformed case
is caught even earlier, at wb build, where conformance extracts
the component's WIT and allows only the three engine-level imports
(session-info, vfs-query, run-command).
An unauthorized capability is a wire that was never run.
flowchart TD prof["a Policy profile
compute · minimal · network · posix"] caps["its caps list
vfs is the floor — every profile has it"] imp["the imports map
granting vfs adds one entry: vfs-query"] ok["component instantiates
the import links to this Instance's disk"] no["component fails to instantiate
imported a cap it was never granted"] prof --> caps caps -- "vfs granted" --> imp --> ok caps -- "vfs absent" --> no style prof fill:#ffffff,stroke:#121316 style caps fill:#fbfaf6,stroke:#121316 style imp fill:#a8d4f0,stroke:#121316,stroke-width:2.5px style ok fill:#13d943,stroke:#121316,stroke-width:2.5px style no fill:#f3c5a3,stroke:#121316
Read that graph top to bottom as a single sentence. A profile names a caps
list; the caps list either contains vfs or it doesn't; if it
does, one import — vfs-query — appears in the map and the
component links cleanly to its own disk; if it doesn't, the component asking
for that import never finishes loading. Two endpoints, and which one you reach
was decided before a line of the component ran.
the round TRIP
Now follow one call. The component speaks the SDK's ergonomic surface — in
Rust, vfs::query::<T>(sql) after a one-line
dock::bind!(bindings, vfs); in JS, dock.vfs.query(sql)
after bind(imports). That call crosses the typed boundary into the
host closure, which hands the SQL to VFS.query_json, which prepares
and runs it against SQLite and encodes the result as a JSON string — back across
the same boundary, parsed by the SDK into rows.
sequenceDiagram
participant C as component
participant D as Dock import — vfs-query
participant V as VFS.query_json
participant S as SQLite — this Instance's disk
C->>D: query("SELECT path FROM vfs WHERE volume='memory'")
D->>V: sql string
V->>S: prepare + fetch_all
S-->>V: rows as lists
V-->>D: Jason.encode! → JSON string
D-->>C: parsed rows — [[…],[…]]
Walk it as a story. The component asks for every path in its memory volume.
The Dock import passes the raw string through to query_json on the
host; the host prepares the statement and fetches every row from the single
SQLite file that is this Instance's disk; the rows come back as plain
lists, get encoded into one JSON string, and ride that one
-> string wire home, where the SDK parses them into the
array-of-arrays the component reads. One string out, one string back. The
boundary never carries anything richer than text — which is exactly what makes
the next section possible.
error as a VALUE, crash as an impossibility
Hand SQL to a component an LLM wrote and it will get a query wrong
— a typo, a missing table, a malformed clause. The host treats your SQL as
untrusted by default, and the design line is blunt: the Instance must not be
able to take down its engine. So query_json wraps the prepare and
fetch in a with, and on failure it doesn't trap — it returns an
error envelope down the very same string channel as the rows:
// the good case — a JSON array of row arrays:
[["/notes/june.org", 1749600000]]
// the bad case — one key, exactly:
{"error": "near \"SELEC\": syntax error"}
That envelope is precisely one key, error, mapping to a string.
Both SDKs sniff for exactly that shape and raise it as a typed error: the JS
checkError throws a DockError only when the parsed
result has one key and that key is error; Rust's
check_error does the same inside rows() and
query_raw(). So a broken query surfaces in the component as a
catchable exception, while the engine keeps running, unbothered.
flowchart LR
q["prepare + fetch_all"]
ok{"ok?"}
rows["rows JSON
[[v,v],…]"]
err["error JSON
{error: …}"]
out["one -> string wire
back to the component"]
q --> ok
ok -- "yes" --> rows --> out
ok -- "no" --> err --> out
style q fill:#ffffff,stroke:#121316
style ok fill:#fbfaf6,stroke:#121316
style rows fill:#13d943,stroke:#121316,stroke-width:2.5px
style err fill:#f3c5a3,stroke:#121316,stroke-width:2.5px
style out fill:#a8d4f0,stroke:#121316
Both outcomes leave by the same door. Whether the statement succeeded or
failed, the host produces a JSON string and sends it down the one
-> string wire; the component decides which it got by looking
at the shape. There is no second channel for errors, no exception that crosses
the boundary, no way for a query to reach the engine's own stack.
One honest nuance, because it matters. The design describes this as
a read query, but nothing mechanically rejects a write — prepare
plus fetch_all will happily execute an INSERT or a
DELETE. The guarantee is not statement-kind; it is scope.
The only bytes that SQL can touch are this Instance's own disk. The blast
radius of the worst query you can write is one project-sized file — which is
the same containment the VFS lesson built, said in SQL.
who holds the CAP
depth rung · skippable — the four profiles, for the curious
The vfs capability is the floor. Every Policy profile
grants it; the only thing that changes profile to profile is what else comes
with it, and how much memory and wall-clock time the component gets. The
compute profile is the true sandbox — it grants
only vfs, nothing more — which makes it the right shape
for handing untrusted SQL to a component that should touch nothing but its own
disk.
| profile | caps | memory | timeout |
|---|---|---|---|
| compute | vfs only — the true sandbox | 64 MiB | 5,000 ms |
| minimal | vfs + local caps + raw sockets | 64 MiB | 5,000 ms |
| network | vfs + net | 128 MiB | 30,000 ms |
| posix | vfs + more | 256 MiB | 60,000 ms |
The verdict of that table in one line: vfs is in every row, so
the query surface is always available, but it is the floor, never the
ceiling. And the fail-safe is pointed the right way — a typo'd or unknown
profile fails closed to compute, the vfs-only sandbox,
never up to something more permissive. Memory is capped by the store limits and
CPU by the per-profile timeout, so a runaway query ends as a clean
{:error, :cpu_timeout}, not a hung engine.
one statement, a whole SHELF
Here is the payoff of one-table-per-disk. Because every workbook's disk is
the same four-column SQLite file, asking a question of your whole library needs
no new query engine at all. Library.query takes one SQL statement
and runs it across every member that is a .wbundle — and a
.wbundle is just a plain zip of workbook.html,
vfs.sqlite, and a manifest, so its disk is right there to read.
The mechanism is deliberately dumb, in the good way. For each member it
extracts that vfs.sqlite part, writes it to a throwaway temp
copy, opens it with the same VFS.open, runs your statement with
the same VFS.query_json, tags the rows with the member's id and
workspace, and deletes the temp file. The stored bundle is never mutated. A
member with no VFS — a plain HTML workbook, an unresolved reference — lands in
skipped, never silently dropped. A per-member failure becomes an
error for that member only.
flowchart TD sql["one SQL statement"] m1["member: weekly-orders
is a .wbundle"] m2["member: wulu
is a .wbundle"] m3["member: press-kit
html only — no vfs"] t1["temp copy of its vfs.sqlite
query → tag {member, workspace}"] t2["temp copy of its vfs.sqlite
query → tag {member, workspace}"] merge["rows: tagged + merged"] skip["skipped: [press-kit]"] sql --> m1 --> t1 --> merge sql --> m2 --> t2 --> merge sql --> m3 --> skip style sql fill:#a8d4f0,stroke:#121316,stroke-width:2.5px style m1 fill:#fbfaf6,stroke:#121316 style m2 fill:#fbfaf6,stroke:#121316 style m3 fill:#d9dbd3,stroke:#121316 style t1 fill:#ffffff,stroke:#121316 style t2 fill:#ffffff,stroke:#121316 style merge fill:#13d943,stroke:#121316,stroke-width:2.5px style skip fill:#f3c5a3,stroke:#121316
Read the fan-out as a story. One statement goes out to three members.
weekly-orders and wulu are both bundles, so each gets
its vfs.sqlite restored to its own temp copy, the statement runs,
and the rows come back tagged with which member and which workspace they came
from — then merge into one result. press-kit is HTML only, has no
disk to query, and so it takes the one branch to the side and is reported in
skipped — listed, never lost. The query is real, end to end:
$ wbx library query "SELECT path FROM vfs
WHERE volume='workspace' AND path LIKE '/reports/%'"
{
"rows": [
{"member": "weekly-orders", "workspace": "ops",
"rows": [["/reports/week-24.org"]]},
{"member": "wulu", "workspace": "sites", "rows": []}
],
"skipped": ["press-kit"]
}
weekly-orders is a bundle with a matching report, so it returns
a tagged row; wulu is a bundle with none, so its rows
is empty but it still answered; press-kit is HTML-only, so it
shows up in skipped rather than vanishing. The same query is
reachable over HTTP as POST /api/library/:tenant/query with body
{"sql": "…"} — the surface is the CLI, the API, or the host
directly; the engine underneath is the one you already learned.
when FROM isn't a FILE
depth rung · skippable — routing in front of the sweep
One thing sits in front of the library sweep. Before fanning a statement
out over members' disks, Library.query checks whether the table
named in FROM is a registered data source. If it is, federation
answers the query through a plugin instead, and the result is tagged
federated. If it isn't — :not_federated — the query
falls through to the per-member VFS sweep you just saw. So the same entry point
serves both a question about your files and a question about a foreign table,
and the routing decides which without you choosing a different verb. The depth
of that — what a data source is, how a plugin answers — belongs to the
foreign-tables lesson; here it's enough to know
the fork exists and which branch the file-sweep is.
where the query ENDS
Honesty section. Five edges worth naming plainly.
Rows are arrays, not named objects. The host returns
[[v, v], …] in column order, not [{name: v}, …]. If
you saw an object-shaped fixture in a test, that was a stub, not host output —
bind your results positionally.
There is no writer column. Who wrote a file is not recorded anywhere
in the schema. The parent lesson's writer = 'agent' query was
illustrative; you cannot run it, because the fact it queries doesn't exist.
Read-query is intent, not mechanism. Nothing rejects an
INSERT or DELETE. The guarantee is scope — your own
disk — not statement-kind. Don't rely on the query surface to be read-only;
rely on it to be contained.
vfs-query isn't in the step log. The Dock
logs run-command calls into a steps file; vfs-query
calls are not logged the same way. It's an honest asymmetry in the telemetry,
not a hidden audit trail.
The library sweep is linear. Library.query restores and
scans each member's vfs.sqlite in turn — it's O(members) restore-
and-scan, not a merged cross-bundle index. Lovely for a shelf; not a search
engine for ten thousand bundles.
questions people actually ASK
Can a bad query crash the engine?
No. A syntax error, a missing table, a malformed clause — all come back as
{"error": "…"}, the same string channel as rows. The Instance
cannot trap its host with a query; that's the explicit design line, and both
SDKs raise the one-key envelope as a typed error.
Can the SQL write, not just read?
Yes — and that's the point, not a leak. An INSERT or
DELETE will run, but the only disk it can touch is this
Instance's own VFS. The containment is scope, not statement-kind: the worst a
write can do is rearrange one project-sized file.
Why arrays instead of named objects?
Because the host returns rows as plain lists from fetch_all
and encodes them straight to JSON — [[v, v], …]. It's the raw
row order, no per-row key dictionary. Bind your columns by position.
Can I query another workbook's memory volume?
No. When a bundle ships, only the public volume goes with it — the
memory and tmp volumes are stripped on egress. So a
library sweep over a shared bundle sees its workspace and
nothing private. See privacy for why.
Is there a query budget?
Yes — the profile's wall-clock timeout. A component call is bounded by its
profile (5 seconds on compute, more on the others), so a runaway
query ends as a clean :cpu_timeout, not a hung engine. Memory is
capped the same way by the store limits.
How is this different from semantic search?
This is literal SQL — exact paths, exact predicates. The other query
surface is embedding recall, Library.search, which finds by
meaning rather than match. Same shelf, two ways to ask; see
vectors for the recall side.
keep GOING
This sub-lesson is the contract under one promise — the neighbors fill in the disk it queries and the membrane it passes through.