Queries — one SQL string against your own disk

the disk speaks SQL. say WHAT, exactly?

The parent lesson showed three dreamy queries and a line — your disk speaks SQL — and left it there. That was the poetry. This is the contract. If you're going to hand a SQL string to a component an LLM wrote, you want the answers underneath the slogan: what table, what columns, what exactly do I call, what comes back, and what happens when I get the query wrong.

The reassuring part is how small the truth is. There is no query API with a dozen methods. There is one table, one import, and one error shape — and once you've seen them, you've seen all of it. The parent page even drew the table a little stylized, with columns that don't exist; this page reconciles the illustration with the real four columns, then shows you the one statement that can ask a question of fifty workbooks at once.

the DEFINITION

vfs·query /ˈviː·ef·es ˈkwɪə·ri/ noun

1. the single Policy-gated Dock import that runs one SQL string against the Instance's own disk and returns JSON — rows, or a caught error, and never a host crash.

It is the working half of the VFS: that lesson described a disk you can query; this names the exact seam through which the query passes. One function, one string in, one string out — and the entire safety story rides on that narrow shape.

the whole SCHEMA, all of it

Here is the disk, in full. Not an inode graph, not a manifest plus a blob store — one SQLite table, four columns, one file per Instance. The CREATE TABLE the engine actually runs is this:

CREATE TABLE IF NOT EXISTS vfs (
  volume  TEXT NOT NULL,
  path    TEXT NOT NULL,
  content BLOB,
  mtime   INTEGER,
  PRIMARY KEY (volume, path)
);

That is one hundred percent of the schema. The composite primary key (volume, path) is the whole namespace: a path is unique within a volume, and the volume column is the switch between the three regions — workspace, memory, and tmp, with workspace the default. That column is deliberately visible to raw SQL so a component can scope a query to any one of them. Column by column:

column	type	what it means	example
`volume`	TEXT	the region — workspace · memory · tmp	`workspace`
`path`	TEXT	the file name within that volume	`/reports/week-24.org`
`content`	BLOB	the bytes themselves, one row per file	18,234 bytes
`mtime`	INTEGER	unix seconds, written on every put	`1749600000`
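
To watch the four columns behave, here is a minimal sketch using Python's standard sqlite3 module. Python is only a stand-in (nothing here is the host's own code), and the sample paths and bytes are made up; the CREATE TABLE is the one quoted above.

```python
import sqlite3

# Stand-in for one Instance's disk: an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE IF NOT EXISTS vfs (
      volume  TEXT NOT NULL,
      path    TEXT NOT NULL,
      content BLOB,
      mtime   INTEGER,
      PRIMARY KEY (volume, path)
    )
""")

# The same path can live in two volumes, because the key is (volume, path).
conn.execute("INSERT INTO vfs VALUES ('workspace', '/notes/june.org', ?, 1749600000)", (b"draft",))
conn.execute("INSERT INTO vfs VALUES ('memory', '/notes/june.org', ?, 1749600000)", (b"recall",))

# A second row with the same (volume, path) violates the primary key.
try:
    conn.execute("INSERT INTO vfs VALUES ('workspace', '/notes/june.org', ?, 1749600001)", (b"dup",))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT count(*) FROM vfs").fetchone()[0]
print(duplicate_rejected, row_count)
```

The point of the sketch is the namespace rule: one path per volume, and the volume column is an ordinary value you can filter on.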

Now the reconciliation the parent lesson owes you. It drew queries against a files table with size, dir, and writer columns. None of those columns exist, and neither does a table named files: the table is vfs. Size is length(content); a directory is a string operation on path; and writer attribution isn't in the schema at all, because no column records who wrote a row. The parent's SELECT path FROM files WHERE mtime > '2026-06-09' also compares mtime to a date string, but mtime is unix seconds, so the date has to be converted first. Told truthfully, the query becomes:

SELECT path, length(content) AS bytes
FROM vfs
WHERE volume = 'workspace'
  AND mtime > strftime('%s', '2026-06-09');
   → [["/reports/week-24.org", 18234], ["/notes/june.org", 4101]]

Two things to carry out of that result. First, rows come back as arrays of arrays — [[value, value], …] — not objects with named keys; the host returns the raw row order, not a dictionary. Second, the size you wanted was derived, not stored, and the volume = 'workspace' clause is how you stay inside one region. Four columns is the whole grammar. Everything else is SQL you already know.
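
The same query can be tried against a scratch database. This is a sketch under assumptions: Python's sqlite3 stands in for the host, the three sample rows are invented, and the CAST only makes the text-to-integer conversion of strftime explicit.

```python
import json
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE vfs (volume TEXT NOT NULL, path TEXT NOT NULL,
                content BLOB, mtime INTEGER, PRIMARY KEY (volume, path))""")

def unix(year, month, day):
    """Unix seconds for midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())

# Invented sample rows: one recent workspace file, one old one, one in memory.
conn.executemany("INSERT INTO vfs VALUES (?, ?, ?, ?)", [
    ("workspace", "/reports/week-24.org", b"x" * 18234, unix(2026, 6, 12)),
    ("workspace", "/notes/old.org", b"x" * 10, unix(2026, 5, 1)),
    ("memory", "/reports/private.org", b"x" * 99, unix(2026, 6, 12)),
])

sql = """
    SELECT path, length(content) AS bytes
    FROM vfs
    WHERE volume = 'workspace'
      AND mtime > CAST(strftime('%s', '2026-06-09') AS INTEGER)
"""
# Rows come back positionally: a list of lists, no column names attached.
rows = [list(r) for r in conn.execute(sql).fetchall()]
encoded = json.dumps(rows)
print(encoded)
```

Only the recent workspace file survives both filters, and its size is computed by length(content) at query time rather than read from a column.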

one import, granted or ABSENT

There is exactly one way in. The engine's interface declares it as a single typed function in the Dock contract:

// world engine — package workbooks:engine
import vfs-query: func(sql: string) -> string

A string in, a string out. But a component doesn't always have that import — it has it only if its Policy profile grants the vfs capability. When the host assembles a component's imports, granting vfs adds precisely one entry to the map: vfs-query, bound to a closure over this Instance's own database connection. No grant, no entry.

And here is the part that makes the gating airtight: it is gating by construction, not by a runtime check. The host provides only the imports the Policy grants. A component that imports vfs-query without the grant doesn't get an error when it calls — it fails to instantiate, because the import it asks for isn't there to link against. The malformed case is caught even earlier, at wb build, where conformance extracts the component's WIT and allows only the three engine-level imports (session-info, vfs-query, run-command). An unauthorized capability is a wire that was never run.

flowchart TD
  prof["a Policy profile
compute · minimal · network · posix"]
  caps["its caps list
vfs is the floor — every profile has it"]
  imp["the imports map
granting vfs adds one entry: vfs-query"]
  ok["component instantiates
the import links to this Instance's disk"]
  no["component fails to instantiate
imported a cap it was never granted"]
  prof --> caps
  caps -- "vfs granted" --> imp --> ok
  caps -- "vfs absent" --> no
  style prof fill:#ffffff,stroke:#121316
  style caps fill:#fbfaf6,stroke:#121316
  style imp fill:#a8d4f0,stroke:#121316,stroke-width:2.5px
  style ok fill:#13d943,stroke:#121316,stroke-width:2.5px
  style no fill:#f3c5a3,stroke:#121316

Read that graph top to bottom as a single sentence. A profile names a caps list; the caps list either contains vfs or it doesn't; if it does, one import — vfs-query — appears in the map and the component links cleanly to its own disk; if it doesn't, the component asking for that import never finishes loading. Two endpoints, and which one you reach was decided before a line of the component ran.
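
Gating by construction can be sketched in a few lines. Every name below (build_imports, instantiate, LinkError) is hypothetical and Python is a stand-in; the real linking is done by the host's component runtime, not by code like this.

```python
class LinkError(Exception):
    """Raised when a component asks for an import the map does not contain."""

def build_imports(caps, query_fn):
    """Granting vfs adds exactly one entry; no grant, no entry."""
    imports = {}
    if "vfs" in caps:
        imports["vfs-query"] = query_fn
    return imports

def instantiate(required_imports, imports):
    """Link before running: a missing import means the component never starts."""
    missing = [name for name in required_imports if name not in imports]
    if missing:
        raise LinkError(f"unresolved imports: {missing}")
    return {name: imports[name] for name in required_imports}

fake_query = lambda sql: "[]"  # placeholder for the closure over the Instance's disk

granted = instantiate(["vfs-query"], build_imports(["vfs"], fake_query))

try:
    instantiate(["vfs-query"], build_imports([], fake_query))
    linked_without_grant = True
except LinkError:
    linked_without_grant = False

print(sorted(granted), linked_without_grant)
```

Notice there is no permission check inside the query function itself. The decision is made once, when the map is built, which is the sense in which the unauthorized wire was never run.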

the round TRIP

Now follow one call. The component speaks the SDK's ergonomic surface — in Rust, vfs::query::<T>(sql) after a one-line dock::bind!(bindings, vfs); in JS, dock.vfs.query(sql) after bind(imports). That call crosses the typed boundary into the host closure, which hands the SQL to VFS.query_json, which prepares and runs it against SQLite and encodes the result as a JSON string — back across the same boundary, parsed by the SDK into rows.

sequenceDiagram
  participant C as component
  participant D as Dock import — vfs-query
  participant V as VFS.query_json
  participant S as SQLite — this Instance's disk
  C->>D: query("SELECT path FROM vfs WHERE volume='memory'")
  D->>V: sql string
  V->>S: prepare + fetch_all
  S-->>V: rows as lists
  V-->>D: Jason.encode! → JSON string
  D-->>C: parsed rows — [[…],[…]]

Walk it as a story. The component asks for every path in its memory volume. The Dock import passes the raw string through to query_json on the host; the host prepares the statement and fetches every row from the single SQLite file that is this Instance's disk; the rows come back as plain lists, get encoded into one JSON string, and ride that one -> string wire home, where the SDK parses them into the array-of-arrays the component reads. One string out, one string back. The boundary never carries anything richer than text — which is exactly what makes the next section possible.
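
The round trip reduces to two small functions, one on each side of the boundary. A sketch only: make_vfs_query and query are hypothetical names, Python's sqlite3 and json stand in for the host and the SDK, and the sample rows are invented.

```python
import json
import sqlite3

def make_vfs_query(conn):
    """Host side: a closure over one connection. SQL string in, JSON string out."""
    def vfs_query(sql: str) -> str:
        rows = conn.execute(sql).fetchall()           # prepare + fetch every row
        return json.dumps([list(r) for r in rows])    # rows as lists -> one JSON string
    return vfs_query

def query(vfs_query, sql):
    """Component side: send text, parse the text that comes back."""
    return json.loads(vfs_query(sql))

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE vfs (volume TEXT NOT NULL, path TEXT NOT NULL,
                content BLOB, mtime INTEGER, PRIMARY KEY (volume, path))""")
conn.execute("INSERT INTO vfs VALUES ('memory', '/facts/user.org', ?, 1749600000)", (b"likes tea",))
conn.execute("INSERT INTO vfs VALUES ('workspace', '/notes/june.org', ?, 1749600000)", (b"draft",))

wire = make_vfs_query(conn)
raw = wire("SELECT path FROM vfs WHERE volume='memory'")
rows = query(wire, "SELECT path FROM vfs WHERE volume='memory'")
print(type(raw).__name__, rows)
```

This sketch selects only text and integer columns. How the real host encodes BLOB content into JSON is not covered here.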

error as a VALUE, crash as an impossibility

Hand SQL to a component an LLM wrote and it will get a query wrong: a typo, a missing table, a malformed clause. The host treats your SQL as untrusted by default, and the design line is blunt: the Instance must not be able to take down its engine. So query_json wraps the prepare and fetch in a with expression, and on failure it doesn't trap. It returns an error envelope down the very same string channel as the rows:

// the good case — a JSON array of row arrays:
[["/notes/june.org", 1749600000]]

// the bad case — one key, exactly:
{"error": "near \"SELEC\": syntax error"}

That envelope is precisely one key, error, mapping to a string. Both SDKs sniff for exactly that shape and raise it as a typed error: the JS checkError throws a DockError only when the parsed result has one key and that key is error; Rust's check_error does the same inside rows() and query_raw(). So a broken query surfaces in the component as a catchable exception, while the engine keeps running, unbothered.
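
Both halves of that behavior fit in one sketch: a host-side function that turns a failure into the envelope, and an SDK-side check that raises only on the exact one-key shape. The function and class names mirror the ones mentioned above, but the bodies are assumptions written in Python, not the SDKs' actual code.

```python
import json
import sqlite3

class DockError(Exception):
    """Stand-in for the typed error the SDKs raise."""

def query_json(conn, sql):
    """Host side: never let a bad statement escape; encode it as a value."""
    try:
        rows = conn.execute(sql).fetchall()
        return json.dumps([list(r) for r in rows])
    except sqlite3.Error as exc:
        return json.dumps({"error": str(exc)})

def check_error(parsed):
    """SDK side: raise only for an object whose single key is 'error'."""
    if isinstance(parsed, dict) and list(parsed.keys()) == ["error"]:
        raise DockError(parsed["error"])
    return parsed

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE vfs (volume TEXT NOT NULL, path TEXT NOT NULL,
                content BLOB, mtime INTEGER, PRIMARY KEY (volume, path))""")
conn.execute("INSERT INTO vfs VALUES ('workspace', '/notes/june.org', ?, 1749600000)", (b"draft",))

good = check_error(json.loads(query_json(conn, "SELECT path, mtime FROM vfs")))

try:
    check_error(json.loads(query_json(conn, "SELEC path FROM vfs")))
    caught = None
except DockError as exc:
    caught = str(exc)

print(good, caught)
```

Because a successful result is always a JSON array, a one-key object can never be mistaken for rows, which is why shape-sniffing is enough.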

flowchart LR
  q["prepare + fetch_all"]
  ok{"ok?"}
  rows["rows JSON
[[v,v],…]"]
  err["error JSON
{error: …}"]
  out["one -> string wire
back to the component"]
  q --> ok
  ok -- "yes" --> rows --> out
  ok -- "no" --> err --> out
  style q fill:#ffffff,stroke:#121316
  style ok fill:#fbfaf6,stroke:#121316
  style rows fill:#13d943,stroke:#121316,stroke-width:2.5px
  style err fill:#f3c5a3,stroke:#121316,stroke-width:2.5px
  style out fill:#a8d4f0,stroke:#121316

Both outcomes leave by the same door. Whether the statement succeeded or failed, the host produces a JSON string and sends it down the one -> string wire; the component decides which it got by looking at the shape. There is no second channel for errors, no exception that crosses the boundary, no way for a query to reach the engine's own stack.

One honest nuance, because it matters. The design describes this as a read query, but nothing mechanically rejects a write — prepare plus fetch_all will happily execute an INSERT or a DELETE. The guarantee is not statement-kind; it is scope. The only bytes that SQL can touch are this Instance's own disk. The blast radius of the worst query you can write is one project-sized file — which is the same containment the VFS lesson built, said in SQL.
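
The scope-not-statement-kind point can be shown with two scratch files standing in for two Instances' disks. This is an illustration in Python with invented file names; in the real system the containment comes from the closure being bound to one Instance's connection.

```python
import os
import shutil
import sqlite3
import tempfile

SCHEMA = """CREATE TABLE IF NOT EXISTS vfs (volume TEXT NOT NULL, path TEXT NOT NULL,
            content BLOB, mtime INTEGER, PRIMARY KEY (volume, path))"""

def open_disk(path):
    """Create a one-row disk at the given file path."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    conn.execute("INSERT OR REPLACE INTO vfs VALUES ('workspace', '/a.org', ?, 1749600000)", (b"a",))
    conn.commit()
    return conn

tmpdir = tempfile.mkdtemp()
mine = open_disk(os.path.join(tmpdir, "mine.sqlite"))
other = open_disk(os.path.join(tmpdir, "other.sqlite"))

# The "read query" surface runs a write without complaint...
mine.execute("DELETE FROM vfs")
mine.commit()

# ...but only the disk behind this connection changed.
mine_rows = mine.execute("SELECT count(*) FROM vfs").fetchone()[0]
other_rows = other.execute("SELECT count(*) FROM vfs").fetchone()[0]
print(mine_rows, other_rows)

mine.close()
other.close()
shutil.rmtree(tmpdir)
```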

who holds the CAP

depth rung · skippable — the four profiles, for the curious

The vfs capability is the floor. Every Policy profile grants it; the only thing that changes profile to profile is what else comes with it, and how much memory and wall-clock time the component gets. The compute profile is the true sandbox — it grants only vfs, nothing more — which makes it the right shape for handing untrusted SQL to a component that should touch nothing but its own disk.

profile	caps	memory	timeout
compute	vfs only — the true sandbox	64 MiB	5,000 ms
minimal	vfs + local caps + raw sockets	64 MiB	5,000 ms
network	vfs + net	128 MiB	30,000 ms
posix	vfs + more	256 MiB	60,000 ms

The verdict of that table in one line: vfs is in every row, so the query surface is always available, but it is the floor, never the ceiling. And the fail-safe is pointed the right way: a typo'd or unknown profile fails closed to compute, the vfs-only sandbox, never up to something more permissive. Memory is capped by the store limits and run time by the per-profile wall-clock timeout, so a runaway query ends as a clean {:error, :cpu_timeout}, not a hung engine.
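
Failing closed is a one-line lookup. In this sketch the numbers and cap descriptions are copied from the table above; the dictionary layout and the resolve_profile name are assumptions made for illustration.

```python
# Values from the profile table; the structure itself is hypothetical.
PROFILES = {
    "compute": {"caps": "vfs only", "memory_mib": 64, "timeout_ms": 5_000},
    "minimal": {"caps": "vfs + local caps + raw sockets", "memory_mib": 64, "timeout_ms": 5_000},
    "network": {"caps": "vfs + net", "memory_mib": 128, "timeout_ms": 30_000},
    "posix": {"caps": "vfs + more", "memory_mib": 256, "timeout_ms": 60_000},
}

def resolve_profile(name):
    """Unknown or misspelled names fall back to compute, never to something wider."""
    return PROFILES.get(name, PROFILES["compute"])

typo = resolve_profile("netwrok")    # misspelled on purpose
known = resolve_profile("network")
print(typo["caps"], known["timeout_ms"])
```

The design choice worth noticing is the default: the fallback is the narrowest profile, so a mistake in configuration can only shrink what a component may do.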

one statement, a whole SHELF

Here is the payoff of one-table-per-disk. Because every workbook's disk is the same four-column SQLite file, asking a question of your whole library needs no new query engine at all. Library.query takes one SQL statement and runs it across every member that is a .wbundle — and a .wbundle is just a plain zip of workbook.html, vfs.sqlite, and a manifest, so its disk is right there to read.

The mechanism is deliberately dumb, in the good way. For each member it extracts that vfs.sqlite part, writes it to a throwaway temp copy, opens it with the same VFS.open, runs your statement with the same VFS.query_json, tags the rows with the member's id and workspace, and deletes the temp file. The stored bundle is never mutated. A member with no VFS — a plain HTML workbook, an unresolved reference — lands in skipped, never silently dropped. A per-member failure becomes an error for that member only.

flowchart TD
  sql["one SQL statement"]
  m1["member: weekly-orders
is a .wbundle"]
  m2["member: wulu
is a .wbundle"]
  m3["member: press-kit
html only — no vfs"]
  t1["temp copy of its vfs.sqlite
query → tag {member, workspace}"]
  t2["temp copy of its vfs.sqlite
query → tag {member, workspace}"]
  merge["rows: tagged + merged"]
  skip["skipped: [press-kit]"]
  sql --> m1 --> t1 --> merge
  sql --> m2 --> t2 --> merge
  sql --> m3 --> skip
  style sql fill:#a8d4f0,stroke:#121316,stroke-width:2.5px
  style m1 fill:#fbfaf6,stroke:#121316
  style m2 fill:#fbfaf6,stroke:#121316
  style m3 fill:#d9dbd3,stroke:#121316
  style t1 fill:#ffffff,stroke:#121316
  style t2 fill:#ffffff,stroke:#121316
  style merge fill:#13d943,stroke:#121316,stroke-width:2.5px
  style skip fill:#f3c5a3,stroke:#121316

Read the fan-out as a story. One statement goes out to three members. weekly-orders and wulu are both bundles, so each gets its vfs.sqlite restored to its own temp copy, the statement runs, and the rows come back tagged with which member and which workspace they came from — then merge into one result. press-kit is HTML only, has no disk to query, and so it takes the one branch to the side and is reported in skipped — listed, never lost. The query is real, end to end:

$ wbx library query "SELECT path FROM vfs
    WHERE volume='workspace' AND path LIKE '/reports/%'"

{
  "rows": [
    {"member": "weekly-orders", "workspace": "ops",
     "rows": [["/reports/week-24.org"]]},
    {"member": "wulu", "workspace": "sites", "rows": []}
  ],
  "skipped": ["press-kit"]
}

weekly-orders is a bundle with a matching report, so it returns a tagged row; wulu is a bundle with none, so its rows is empty but it still answered; press-kit is HTML-only, so it shows up in skipped rather than vanishing. The same query is reachable over HTTP as POST /api/library/:tenant/query with body {"sql": "…"} — the surface is the CLI, the API, or the host directly; the engine underneath is the one you already learned.
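
The sweep itself can be approximated in Python's standard library. Treat this as a sketch under assumptions: the sweep and bundle helpers are hypothetical, the fixture bundles omit the manifest, and putting a per-member failure under an "error" key is a guess at the shape, not the documented one.

```python
import json
import os
import sqlite3
import tempfile
import zipfile

SCHEMA = """CREATE TABLE IF NOT EXISTS vfs (volume TEXT NOT NULL, path TEXT NOT NULL,
            content BLOB, mtime INTEGER, PRIMARY KEY (volume, path))"""

def sqlite_bytes(paths):
    """Build a small vfs.sqlite and return its bytes (demo fixture only)."""
    fd, tmp = tempfile.mkstemp(suffix=".sqlite")
    os.close(fd)
    conn = sqlite3.connect(tmp)
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO vfs VALUES ('workspace', ?, x'00', 1749600000)",
                     [(p,) for p in paths])
    conn.commit()
    conn.close()
    with open(tmp, "rb") as f:
        data = f.read()
    os.remove(tmp)
    return data

def sweep(members, sql):
    """Run one statement against a temp copy of each member's vfs.sqlite."""
    result = {"rows": [], "skipped": []}
    for m in members:
        if not zipfile.is_zipfile(m["file"]):
            result["skipped"].append(m["id"])        # HTML only: no disk to query
            continue
        with zipfile.ZipFile(m["file"]) as z:
            if "vfs.sqlite" not in z.namelist():
                result["skipped"].append(m["id"])
                continue
            data = z.read("vfs.sqlite")
        fd, tmp = tempfile.mkstemp(suffix=".sqlite")  # throwaway copy
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        entry = {"member": m["id"], "workspace": m["workspace"]}
        try:
            conn = sqlite3.connect(tmp)
            try:
                entry["rows"] = [list(r) for r in conn.execute(sql).fetchall()]
            finally:
                conn.close()
        except sqlite3.Error as exc:
            entry["error"] = str(exc)                 # failure stays with this member
        finally:
            os.remove(tmp)                            # the stored bundle is never touched
        result["rows"].append(entry)
    return result

shelf = tempfile.mkdtemp()

def bundle(name, paths):
    """Write a fixture .wbundle: a zip holding workbook.html and vfs.sqlite."""
    file = os.path.join(shelf, name + ".wbundle")
    with zipfile.ZipFile(file, "w") as z:
        z.writestr("workbook.html", "<html></html>")
        z.writestr("vfs.sqlite", sqlite_bytes(paths))
    return file

html = os.path.join(shelf, "press-kit.html")
with open(html, "w") as f:
    f.write("<html></html>")

members = [
    {"id": "weekly-orders", "workspace": "ops", "file": bundle("weekly-orders", ["/reports/week-24.org"])},
    {"id": "wulu", "workspace": "sites", "file": bundle("wulu", ["/index.org"])},
    {"id": "press-kit", "workspace": "sites", "file": html},
]
out = sweep(members, "SELECT path FROM vfs WHERE volume='workspace' AND path LIKE '/reports/%'")
print(json.dumps(out, indent=2))
```

Run against these three fixtures, the sketch reproduces the shape of the CLI output above: one tagged hit, one tagged empty result, one skipped member.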

when FROM isn't a FILE

depth rung · skippable — routing in front of the sweep

One thing sits in front of the library sweep. Before fanning a statement out over members' disks, Library.query checks whether the table named in FROM is a registered data source. If it is, federation answers the query through a plugin instead, and the result is tagged federated. If it isn't — :not_federated — the query falls through to the per-member VFS sweep you just saw. So the same entry point serves both a question about your files and a question about a foreign table, and the routing decides which without you choosing a different verb. The depth of that — what a data source is, how a plugin answers — belongs to the foreign-tables lesson; here it's enough to know the fork exists and which branch the file-sweep is.
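
The fork can be pictured as a lookup on the FROM table. This is a loose sketch: the registry contents (crm_contacts), the function names, and the regex used to find the table name are all hypothetical, and the real routing may inspect the statement quite differently.

```python
import re

REGISTERED_SOURCES = {"crm_contacts"}  # hypothetical registered data source

def from_table(sql):
    """Naively pull the first identifier after FROM; None if there is none."""
    match = re.search(r"\bFROM\s+([A-Za-z_][A-Za-z0-9_]*)", sql, re.IGNORECASE)
    return match.group(1) if match else None

def route(sql):
    """Federated if FROM names a registered source, else the per-member sweep."""
    if from_table(sql) in REGISTERED_SOURCES:
        return "federated"
    return "not_federated"

file_question = route("SELECT path FROM vfs WHERE volume='workspace'")
foreign_question = route("select * from crm_contacts")
print(file_question, foreign_question)
```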

where the query ENDS

Honesty section. Five edges worth naming plainly.

Rows are arrays, not named objects. The host returns [[v, v], …] in column order, not [{name: v}, …]. If you saw an object-shaped fixture in a test, that was a stub, not host output — bind your results positionally.

There is no writer column. Who wrote a file is not recorded anywhere in the schema. The parent lesson's writer = 'agent' query was illustrative; you cannot run it, because the fact it queries doesn't exist.

Read-query is intent, not mechanism. Nothing rejects an INSERT or DELETE. The guarantee is scope — your own disk — not statement-kind. Don't rely on the query surface to be read-only; rely on it to be contained.

vfs-query isn't in the step log. The Dock logs run-command calls into a steps file; vfs-query calls are not logged the same way. It's an honest asymmetry in the telemetry, not a hidden audit trail.

The library sweep is linear. Library.query restores and scans each member's vfs.sqlite in turn: it's O(members) restore-and-scan, not a merged cross-bundle index. Lovely for a shelf; not a search engine for ten thousand bundles.
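
For the first edge above, positional rows, a small helper on the component side is usually all you need. The bind function is hypothetical (not part of either SDK); it pairs each row with the column names from your own SELECT list, and the sample string is the result shown earlier in this lesson.

```python
import json

def bind(columns, rows):
    """Pair each positional row with the column names from your own SELECT list."""
    records = []
    for row in rows:
        if len(row) != len(columns):
            raise ValueError(f"expected {len(columns)} values, got {len(row)}")
        records.append(dict(zip(columns, row)))
    return records

# What comes back over the wire for: SELECT path, length(content) AS bytes ...
wire = '[["/reports/week-24.org", 18234], ["/notes/june.org", 4101]]'
records = bind(["path", "bytes"], json.loads(wire))
print(records[0]["path"], records[1]["bytes"])
```

The names live in your code, next to the SELECT that chose the column order, so the two stay in step when the query changes.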

questions people actually ASK

Can a bad query crash the engine?

No. A syntax error, a missing table, a malformed clause — all come back as {"error": "…"}, the same string channel as rows. The Instance cannot trap its host with a query; that's the explicit design line, and both SDKs raise the one-key envelope as a typed error.

Can the SQL write, not just read?

Yes — and that's the point, not a leak. An INSERT or DELETE will run, but the only disk it can touch is this Instance's own VFS. The containment is scope, not statement-kind: the worst a write can do is rearrange one project-sized file.

Why arrays instead of named objects?

Because the host returns rows as plain lists from fetch_all and encodes them straight to JSON — [[v, v], …]. It's the raw row order, no per-row key dictionary. Bind your columns by position.

Can I query another workbook's memory volume?

No. When a bundle ships, only the workspace volume goes with it; the memory and tmp volumes are stripped on egress. So a library sweep over a shared bundle sees its workspace and nothing private. See privacy for why.

Is there a query budget?

Yes — the profile's wall-clock timeout. A component call is bounded by its profile (5 seconds on compute, more on the others), so a runaway query ends as a clean :cpu_timeout, not a hung engine. Memory is capped the same way by the store limits.

How is this different from semantic search?

This is literal SQL — exact paths, exact predicates. The other query surface is embedding recall, Library.search, which finds by meaning rather than match. Same shelf, two ways to ask; see vectors for the recall side.

keep GOING

This sub-lesson is the contract under one promise — the neighbors fill in the disk it queries and the membrane it passes through.

The VFS: the disk this queries

Volumes: the three regions the column names

Vectors: the other query surface (recall, not match)

Foreign tables: when FROM isn't a file