
org goes in, JSON comes out

Three kernel verbs — query, tangle, lint — take - for stdin and emit JSON. So wbx | jq | wbx is a real composition surface, not a trick: generated org never touches a temp file, and the kernel doing the work is the literal same one the engine runs.

[Illustration: a small figure feeding a paper scroll into the mouth of a monumental sorting machine; bright typed cards stream out the far end into labelled bins, no waste bin anywhere; 1970s sci-fi style]

the temp-file TAX

You are generating org. A model emits a workflow; a script assembles headlines from an API; a Makefile wants a structure check before it ships. And the moment you reach for a normal CLI, the same sad loop assembles itself: write the org to a temp file, run the tool, scrape its human-shaped stdout back into structure, delete the file — and hope nothing crashed between step two and step four and left that file behind.

Every hop in that loop loses something. The temp file is a race and a cleanup liability. The prose output was built for eyes, so parsing it back into data is brittle guesswork. Most tools force this on you because their output was never meant to be a pipe's input — it was meant to be read.

Agents feel this most. An agent that has to mint a temp path, shell out, and regex the result burns turns on filesystem ceremony that has nothing to do with the work. The thing it actually wanted was simple — here is some org, tell me what's in it — and the tooling made it a four-step errand.

the DEFINITION

pipe·line /ˈpaɪp·laɪn/ noun

1. a composition of kernel verbs over stdin and stdout: - is the file, JSON is the medium, and the OQL kernel is the function — org flows in one end, a typed build plan or a diagnostics array comes out the other, and nothing is written to disk.

Three facts hide in that one sentence, and the rest of the page is just them unpacked. - means read from the pipe, not the filesystem. JSON means the output is already structure, so the next stage doesn't parse prose. And "the kernel is the function" is the load-bearing one: these verbs are pure functions exposed as processes, which is exactly what makes them safe to chain.

three pure functions in a TRENCH COAT

There are exactly three local kernel verbs that take a pipe: query, tangle, and lint. They sound like three tools. They're closer to one — the whole dispatch lives in a 37-line file (cli/src/kernel.rs), and all three funnel through one helper, read_org. That helper is where - earns its keep: when the argument is a single dash it reads stdin to a string; otherwise it goes through the I/O seam. Same code path on a native binary and the in-sandbox wasm build — for these three verbs only.
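The dash convention is small enough to sketch. What follows is a Python illustration of the behaviour described above, not the Rust source: the function name mirrors the helper, the optional stdin parameter exists only so the example can run without a real pipe, and plain open() stands in for the I/O seam (an assumption).

```python
import io
import sys


def read_org(arg: str, stdin=None) -> str:
    """Return org text: from stdin when arg is a single dash, else from the path."""
    stream = sys.stdin if stdin is None else stdin
    if arg == "-":
        # A lone dash means "the pipe is the file": read it to a string.
        return stream.read()
    # Anything else is treated as a path (stand-in for the real I/O seam).
    with open(arg, encoding="utf-8") as handle:
        return handle.read()


# Usage: simulate a pipe with an in-memory stream.
piped = io.StringIO("* hello :tag1:\n")
text = read_org("-", stdin=piped)
```

Everything after this point in the sketch would be the same whichever branch ran, which is the property the three verbs rely on.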

flowchart LR
  in["stdin (-)  ·  or a path"]
  ro["read_org()"]
  q["oql::parse_headlines()"]
  t["oql::tangle_plan()"]
  l["oql::validate()"]
  out["JSON on stdout"]
  in --> ro
  ro --> q --> out
  ro --> t --> out
  ro --> l --> out
  style in fill:#d9dbd3,stroke:#121316
  style ro fill:#ffffff,stroke:#121316,stroke-width:2.5px
  style out fill:#13d943,stroke:#121316
  

After read_org the three verbs split for one line each: query calls parse_headlines, tangle calls tangle_plan, lint calls validate. Each returns a JSON string and prints it — pure output, no rendering layer, nothing that wants a terminal. That is the entire surface, and its smallness is the point: there is almost no machinery between the org you pipe in and the JSON you pipe out.

One honest nuance the source forces. The CLI's lint calls the kernel's validate, not the kernel's own lint export, which is currently a stub that returns an empty array. The two are separate functions in the kernel's interface; the diagnostics you get from wbx lint come from validate. Don't picture lint calling lint.

what comes OUT

Three verbs, three JSON shapes. Knowing them by heart is what turns a pipe from a guess into a contract.

| verb | kernel fn | shape | one row per… |
| --- | --- | --- | --- |
| query | parse_headlines | array of headline objects | headline in the tree |
| tangle | tangle_plan | {worlds: […]} | compiled world |
| lint | validate | array of diagnostics | problem found |

A query row is a full lens on a headline: level · title · state · id · tags · props · scheduled · deadline. Timestamps don't come back as strings to re-parse — they come pre-cracked into {at, repeat, active}. Here's a real one, verified against wbx 0.13.0:

$ printf '* hello :tag1:\n** TODO child\nSCHEDULED: <2026-06-15 +1w>\n' | wbx query -
[{"level":1,"title":"hello","tags":["tag1"],…},
 {"level":2,"state":"TODO",
  "scheduled":{"active":true,"at":"2026-06-15","repeat":"+1w"},…}]

A tangle plan is the compile of a workflow: each world carries its name, schedule, imports, exports, a list of components (name, lang, deps, uses, in, out, persist, src, dir), the edges between them, and a workflows sub-array for any nested plans. A lint diagnostic is the smallest of the three — just {level, scope, message} — because a problem only needs to say how bad it is, where it is, and what it is.

the plan is COMPUTED, not declared

Depth rung — skippable, but it's where tangle stops looking like a formatter and starts looking like a compiler. You never write an edge. You never list imports or exports. You write components with :in, :out, and :uses header args, and the kernel derives the wiring by matching producers to consumers.

flowchart LR
  subgraph plan["daily digest  :workflow:  (SCHEDULE 0 7 * * *)"]
    f["fetch  :component:<br/>uses: http · out: raw"]
    s["summarize :component:<br/>in: raw · out: digest"]
  end
  http(["import: http"]) -. uses .-> f
  f -- "edge: out raw → in raw" --> s
  s --> dig(["export: digest"])
  style plan fill:#fbfaf6,stroke:#121316
  style http fill:#ffffff,stroke:#121316
  style dig fill:#13d943,stroke:#121316

Read the graph as a derivation. fetch declares :out raw; summarize declares :in raw — so an edge from fetch to summarize falls out of that match, nobody drew it. digest is produced by summarize and consumed by no one, so it's an export. http appears in a :uses and nothing produces it, so it's an import. The schedule comes from the :SCHEDULE: property as cron (or from a SCHEDULED timestamp). The whole topology of the plan is a function of the header args — change a name on one side and the edge appears or vanishes.
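The matching rule in that paragraph can be written down as a small function. This is a minimal Python sketch of the derivation as described here, not the kernel's tangle_plan: derive_wiring and the list-of-dicts input are hypothetical, and how the kernel treats a :uses that does match a producer is not covered in this lesson, so the sketch simply skips that case.

```python
def derive_wiring(components):
    """Derive edges, imports and exports from :in / :out / :uses declarations."""
    # Map each produced name to the component that declares it in :out.
    producers = {}
    for comp in components:
        for name in comp.get("out", []):
            producers[name] = comp["name"]

    edges, imports, consumed = [], [], set()
    for comp in components:
        for name in comp.get("in", []):
            consumed.add(name)
            # An :in that matches some :out becomes an edge.
            if name in producers:
                edges.append({"from": producers[name], "to": comp["name"]})
        for name in comp.get("uses", []):
            # A :uses that nothing produces is an import.
            if name not in producers and name not in imports:
                imports.append(name)

    # Anything produced but never consumed is an export.
    exports = [name for name in producers if name not in consumed]
    return {"edges": edges, "imports": imports, "exports": exports}


digest = [
    {"name": "fetch", "uses": ["http"], "in": [], "out": ["raw"]},
    {"name": "summarize", "uses": [], "in": ["raw"], "out": ["digest"]},
]
plan = derive_wiring(digest)

# Rename one side of the match and the edge vanishes.
renamed = [dict(digest[0]), dict(digest[1], **{"in": ["feed"]})]
broken = derive_wiring(renamed)
```

With the two components from the diagram, plan holds one edge from fetch to summarize, http as the only import, and digest as the only export. In the renamed variant the edge list is empty and raw becomes an export.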

And because headlines nest, so do plans: a :workflow: heading inside another becomes an entry in the parent's workflows sub-array, and the whole tree compiles into one nested DAG. That recursion is the nesting lesson's compile story — tangle is where you watch it happen.

jq is the OTHER half

The verbs emit JSON precisely so the next stage can be jq. That's the composition surface: org in one end, a sentence of structure out the other, with no file in between. Here's the no-temp-file tangle — a whole workflow piped in, a one-line summary piped out:

printf '* daily digest :workflow:
  :PROPERTIES:
  :SCHEDULE: 0 7 * * *
  :END:
** fetch :component:
#+begin_src js :uses http :out raw
export async function run() { return fetch_feed(); }
#+end_src
** summarize :component:
#+begin_src js :in raw :out digest
export function run(raw) { return summarize(raw); }
#+end_src
' | wbx tangle - | jq -r '.worlds[] | "\(.name): \(.components|length) components, exports \(.exports|join(","))"'

daily digest: 2 components, exports digest

The full plan carries the derived parts the summary skipped: "edges":[{"from":"fetch","to":"summarize"}], "imports":["http"], "exports":["digest"], "schedule":{"cron":"0 7 * * *"}. None of it was written by hand.

Or treat org as a database. Extract every open task by state:

$ printf '* plan\n** TODO write the fetch step\n** DONE wire the schedule\n** TODO add lint gate :ci:\n' \
  | wbx query - | jq -r '.[] | select(.state=="TODO") | .title'
write the fetch step
add lint gate

TODO state, tags, and SCHEDULED all survive the pipe — because org is a language, not formatting. The sequence has no filesystem in it, and that absence is the whole feature:

sequenceDiagram
  participant A as agent / script
  participant K as wbx tangle -
  participant J as jq
  A->>K: pipe generated org (stdin)
  K->>J: JSON plan (stdout)
  J->>A: one line of structure
  Note over A,J: no temp file · no cleanup · no disk
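When the far end of the pipe is a script rather than jq, the same selection is a few lines of JSON handling. A sketch in Python, under stated assumptions: open_tasks is a hypothetical helper, and SAMPLE is a hand-written array in the documented row shape, not captured wbx output.

```python
import json

# Hand-written sample rows using documented fields; not real wbx output.
SAMPLE = json.dumps([
    {"level": 1, "title": "plan", "state": None, "tags": [], "scheduled": None},
    {"level": 2, "title": "write the fetch step", "state": "TODO", "tags": [],
     "scheduled": {"active": True, "at": "2026-06-15", "repeat": "+1w"}},
    {"level": 2, "title": "wire the schedule", "state": "DONE", "tags": [],
     "scheduled": None},
    {"level": 2, "title": "add lint gate", "state": "TODO", "tags": ["ci"],
     "scheduled": None},
])


def open_tasks(query_json: str):
    """Return (title, scheduled date or None) for every TODO headline."""
    tasks = []
    for row in json.loads(query_json):
        if row.get("state") != "TODO":
            continue
        # The timestamp arrives as an object, so there is no string to re-parse.
        scheduled = row.get("scheduled") or {}
        tasks.append((row["title"], scheduled.get("at")))
    return tasks


tasks = open_tasks(SAMPLE)
```

In a real pipeline the argument would be sys.stdin.read() on the receiving side of wbx query -, so the script never sees a path either.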
  

a gate that actually FAILS

Here is the gotcha, stated plainly so it never bites you in CI: wbx lint exits 0 even when its diagnostics contain errors. Diagnostics are data, not a process failure — the lint run succeeded; it found problems. Verified:

$ printf '* pipe :workflow:\n** step :component:\n#+begin_src js :in nonexistent :out result\nexport function run(x){return x}\n#+end_src\n' | wbx lint -
[{"level":"error","message":"input `nonexistent` has no upstream producer","scope":"step"}]
$ echo $?
0        # diagnostics are data — the run succeeded

So a naive wbx lint file && deploy never blocks a broken plan. The fix is to let jq turn the data into the exit code. With -e, jq exits 1 when the last value it outputs is false or null:

wbx lint workbook.org | jq -e 'map(select(.level=="error")) | length == 0'

That exits 1 when any error exists and 0 when the array is clean — a gate that actually fails the build. Contrast it with a real failure: a missing file is an I/O problem, and those do set an exit code.

| what went wrong | exit | who catches it |
| --- | --- | --- |
| org has a diagnostic (bad :in, no source) | 0 (data, not failure) | jq -e |
| file not found / can't read | 4 (not found) | the shell (&&) |
| clap usage error | 2 (usage) | the shell |

The verdict: lint's exit code reports whether the tool ran, not whether your plan is healthy. The shell catches I/O; jq catches content. Wire both.
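If the gate lives in a script instead of a jq filter, the content check is the same idea: read the diagnostics array and turn "any error" into an exit code. A minimal Python sketch, assuming only the {level, scope, message} shape shown above; gate_exit_code is a hypothetical name.

```python
import json


def gate_exit_code(lint_json: str) -> int:
    """Return 1 if any diagnostic has level "error", else 0."""
    diagnostics = json.loads(lint_json)
    has_error = any(d.get("level") == "error" for d in diagnostics)
    return 1 if has_error else 0


# The diagnostics array from the verified run above, and a clean result.
BROKEN = (
    '[{"level":"error",'
    '"message":"input `nonexistent` has no upstream producer",'
    '"scope":"step"}]'
)
CLEAN = "[]"

broken_code = gate_exit_code(BROKEN)
clean_code = gate_exit_code(CLEAN)
```

A script built on this would end with sys.exit(gate_exit_code(sys.stdin.read())) and sit downstream of wbx lint. The I/O failures in the table still belong to the shell, so that side of the wiring does not change.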

--json without DOUBLE-WRAPPING

Depth rung. Add --json and the raw output gets lifted into an envelope — {ok, verb, data} on success — but the clever part is what doesn't happen: the already-JSON output is embedded structurally, not stuffed in as an escaped string. data is the parsed array, so a downstream jq walks it directly. Verified:

$ printf '* hello :tag1:\n' | wbx query - --json
{"data":[{"deadline":null,"id":null,"level":1,"props":{},
  "scheduled":null,"state":null,"tags":["tag1"],"title":"hello"}],
 "ok":true,"verb":"query"}

You rarely need to type --json mid-pipeline, though. Mode is auto-detected: when stdout isn't a terminal, wbx assumes a machine is reading and switches to agent mode on its own — so every middle command in a pipe is already structured. The forcing knobs exist for the edges: --json and --agent are global flags that work on any verb in any position, and WBX_AGENT=1 pins it from the environment. The full mode model — the exit-code map, the failure envelope, the TTY rules — is its own page; this lesson only needs that the pipe stays structured by default. See the parent lesson for the whole contract.
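Because data is embedded structurally, a consumer needs exactly one JSON parse to reach the rows. A sketch in Python, assuming only the success envelope {ok, verb, data} shown above; unwrap is a hypothetical helper, and the failure envelope's fields are left alone because this page does not spell them out.

```python
import json


def unwrap(envelope_json: str, expected_verb: str):
    """Return the data payload of a successful envelope, else raise."""
    envelope = json.loads(envelope_json)
    if envelope.get("ok") is not True:
        raise RuntimeError(f"{expected_verb} reported failure")
    if envelope.get("verb") != expected_verb:
        raise ValueError(f"expected verb {expected_verb!r}")
    # data is already a parsed structure, not an escaped string.
    return envelope["data"]


# The envelope from the verified run above.
ENVELOPE = (
    '{"data":[{"deadline":null,"id":null,"level":1,"props":{},'
    '"scheduled":null,"state":null,"tags":["tag1"],"title":"hello"}],'
    '"ok":true,"verb":"query"}'
)

rows = unwrap(ENVELOPE, "query")
first_tags = rows[0]["tags"]
```

Had the output been double-wrapped, rows would be a string and a second json.loads would be needed before indexing into it.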

your pipe is the engine's PARSER

This is the payoff that makes a shell pipeline more than convenient. The kernel doing the work in wbx query is the same Rust crate — package oql, runtime/kernel — that compiles to the wasm Component the engine and desktop embed. The CLI links it as a native rlib; the engine loads it as oql.wasm. One logic, every surface.

flowchart TD
  crate[["oql · one Rust crate<br/>parse_headlines · tangle_plan · validate"]]
  crate --> native["native rlib"]
  crate --> wasm["oql.wasm Component<br/>(WIT world: workbooks:oql)"]
  native --> cli["wbx · your laptop"]
  wasm --> engine["the engine · server"]
  wasm --> desk["desktop · in-browser"]
  style crate fill:#d9dbd3,stroke:#121316,stroke-width:2.5px
  style wasm fill:#ffffff,stroke:#121316
  style cli fill:#13d943,stroke:#121316

The WIT world is deliberately austere — pure string-to-string, no host imports, no WASI context. The kernel can't read your disk or hit the network; faults trap inside the wasm engine. Which means there's no place for the two builds to diverge: wbx tangle - on your laptop and the engine's compile of the same file reach byte-identical conclusions, because they're running the identical function. A pipeline on your machine isn't an approximation of what the engine will do — it's a dry run of the exact same logic.

where the pipe surface ENDS

Honesty section. The pipe is narrow on purpose, and it's worth knowing exactly where it stops today.

Only three verbs take -. query, tangle, and lint special-case stdin; nothing else does. The author and trust verbs — sign, bundle, verify, workflow run, workbook deploy — want real paths. Pipe into one and it fails honestly:

$ printf '* x\n' | wbx sign -
wbx: read -: No such file or directory     (exit 4)

The SPEC names "stdin pipelines across author/trust verbs" as un-shipped reach work — it's a known frontier, not an accident.

Lint's exit code doesn't reflect severity — the gate section exists because of it; reach for jq -e, not &&.

Exit-code classification is a text heuristic today. Under the hood, codes are currently inferred from error text — the source says so outright. The contract is the code (4 means not-found, whatever the message says); the text is not. Branch on the number.

Two kernel exports have no CLI verb. check_upgrade (the upgrade gate) and render live in the kernel but are engine and desktop surfaces — there is no wbx render, and wbx upgrade is self-update of the binary, a different thing entirely. Don't reach for verbs that aren't there.

questions people actually ASK

Why does lint exit 0 when there are errors?

Because diagnostics are data, not a crash — the lint run succeeded and reported problems. The tool worked; your plan didn't. To fail a build on errors, pipe to jq -e 'map(select(.level=="error")) | length == 0', which returns a non-zero exit when any error is present.

Can I pipe into sign or bundle?

Not yet. Only query, tangle, and lint special-case -. Sign, bundle, verify, workflow run, and workbook deploy take real file paths, and piping into them fails with a not-found error. The SPEC lists stdin across the author and trust verbs as named frontier work.

Is the output stable enough to script against?

The JSON shape is the contract — headline rows, the worlds plan, the diagnostic triple — and --json adds the envelope (verb, ok, and on failure a typed code). Branch on the structure and the code, not on prose; the text inside a message may change, the shape is what you build on.

Do I need an engine running for this?

No. All three verbs are local — they link the kernel as a native rlib and read stdin directly. No server, no network, no auth. The engine runs the same logic, but your pipe doesn't call it.

What's the difference between lint and validate?

The CLI's lint calls the kernel's validate function. The kernel exposes a separate lint export too, but it's currently a stub that returns an empty array — reserved. So your diagnostics come from validate; lint is just the verb name you type.

Will my generated org ever hit disk?

Not in a pipeline. - reads stdin straight into memory and the output is JSON on stdout. The whole reason the surface exists is to delete the write-temp / run / parse / cleanup loop — there's no file to race on and nothing to clean up.

keep GOING

Pipelines are one verb-set on a bigger command, feeding a bigger machine.