
the document BECOMES the build plan

Knuth's tangle extracted source files from a document. This one extracts a typed world — components, a capability import list, a dataflow DAG, a schedule — and it derives the whole interface from six header args, never reading the code. Which is exactly why one workflow can mix Rust, JS, and Go and still compile to one thing.


the wiring lives NOWHERE

Everyone's "agent workflow" lives in a YAML or JSON DSL nobody can read, next to docs nobody trusts, next to code in a repo the doc doesn't know about. What feeds what, what capabilities each piece needs, when the thing runs — that wiring is the most important part of the system and the part with no home. It's stranded in a proprietary orchestrator's config, or worse, in one person's head.

The deep cause is a split. The interface of a system — its ports, its dependencies, its schedule — is described in one language, while the logic is written in another, somewhere else. The two drift the instant either changes. The org lesson made a promise about closing splits like this: org holds runnable code, tangled out to something real. This page is where that promise gets weaponized — because the thing org tangles out to is not a pile of files. It's a typed plan for a whole system, derived from the document itself.

the DEFINITION

tan·gling /ˈtæŋ·ɡlɪŋ/ noun

1. compiling a document into a typed build plan — source blocks plus six header args become a world of components, a list of capability imports, and a dataflow DAG, with a schedule and a persistence contract derived along the way.

The word is Knuth's. In literate programming, tangle extracts the runnable code from a prose document and weave renders the human reading. The kernel here keeps both verbs — it can render org to HTML, and it can tangle. The twist is what tangle produces. Classic tangle writes out source files. This one writes out a world: a JSON plan shaped like a WIT world — components in, capability imports declared, a DAG of who feeds whom. Files to world. That's the whole move.

six words on a LINE

A workflow headline tagged :workflow: is a world. The descendants tagged :component: are its components. A component is just a headline title — its name — plus the first source block under it. Everything else about that component lives in six header args written on the source-block line. Extra source blocks are ignored; a component with no block at all gets no language and gets flagged.

Here are the six, each with its one-line contract — what it does on the block, what it becomes in the plan, and who downstream consumes it:

| arg | on the block | in the plan | consumed by |
|-----|--------------|-------------|-------------|
| :deps a,b | one comma-split list | deps array | the compiler lanes: crates.io (rust) / npm (js·ts) |
| :uses cap | repeatable; each one appends | uses per component | the world's imports + the upgrade gate |
| :in name | single value | in port | edge matching: finds its producer |
| :out name | single value | out port | edges, or exports if nobody eats it |
| :persist | bare flag, no value | persist: true | durable_components + the VFS checkpoint contract |
| :dir path | single path | dir field | Mode-2 build: a real project dir, not inline source |

Three of these are worth a second look. :uses is the only repeatable one — a component that touches the network and the VFS writes :uses twice. :persist takes no value at all; the source comment calls it Motoko-style orthogonal persistence — the runtime keeps the component's memory across runs. And :dir is the escape hatch: instead of an inline block, the component points at a real project directory with its own Cargo.toml or package.json. Unknown tokens are silently ignored — the parser is forgiving by design.
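To make the six contracts concrete, here is a minimal Python sketch of a parser for the source-block line. It is not the kernel's parser: the function name parse_header_args and the returned dict shape are assumptions made for illustration, and the only behaviors it encodes are the ones stated above (comma-split :deps, repeatable :uses, bare :persist, unknown tokens skipped).

```python
def parse_header_args(line):
    """Parse an org '#+begin_src' line into a language plus the six args.

    Hypothetical helper: the name and the dict shape are illustrative only.
    """
    tokens = line.split()
    if not tokens or tokens[0].lower() != "#+begin_src":
        raise ValueError("not a source-block line")
    args = {"lang": None, "deps": [], "uses": [], "in": None,
            "out": None, "persist": False, "dir": None}
    rest = tokens[1:]
    # The language is the first bare word, if there is one.
    if rest and not rest[0].startswith(":"):
        args["lang"] = rest[0]
        rest = rest[1:]
    i = 0
    while i < len(rest):
        key = rest[i]
        has_value = i + 1 < len(rest) and not rest[i + 1].startswith(":")
        value = rest[i + 1] if has_value else None
        if key == ":persist":
            args["persist"] = True           # bare flag, consumes no value
            i += 1
            continue
        if key == ":deps" and value is not None:
            args["deps"] = [d for d in value.split(",") if d]
        elif key == ":uses" and value is not None:
            args["uses"].append(value)       # repeatable: each one appends
        elif key in (":in", ":out", ":dir") and value is not None:
            args[key[1:]] = value            # single value
        # Anything else is an unknown token and is silently skipped.
        i += 2 if value is not None else 1
    return args
```

Feeding it a line that writes :uses twice yields a two-element uses list, while a stray :bogus x pair simply disappears from the result.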

One honest note about the port names. A port like events:list is an opaque name:type string. The kernel matches it by exact string and never parses the type half — more on that in the seams.

the signature without the CODE

Here is the part that earns the lesson. The function that builds a component's signature, sig_of, never reads a source body. It works entirely from the header args, and from them it derives three things:

  • imports — the sorted, deduplicated union of every :uses across all components. This is the world's capability surface: every workbook:* the system needs from the host.
  • edges — for every :in, find the component whose :out matches it exactly. That pair becomes a (producer, consumer) edge. The dataflow graph is computed, never drawn.
  • exports — every :out that nobody consumed. A terminal output, eaten by no downstream :in, leaks up to the world's surface as an export.

Because the signature comes from the prose layer and not the code, the language inside the block is irrelevant to wiring. The interface is in the header args; Rust, JS, and Go all declare ports the same way. That is the whole reason one world can be polyglot. Picture the real example — nightly-digest, three blocks in three languages, wired by nothing but matching strings:

flowchart LR
  subgraph world["Nightly digest  :workflow:"]
    direction LR
    fe["Fetch events · rust<br/>:uses vfs/query<br/>:out events:list"]
    su["Summarize · js<br/>:in events:list<br/>:uses llm/complete<br/>:out summary:string"]
    se["Send · go<br/>:in summary:string<br/>:uses net/email"]
    fe -- "events:list" --> su
    su -- "summary:string" --> se
  end
  imp["imports ⟶ llm/complete · net/email · vfs/query"]
  exp["exports ⟶ (empty, every out was eaten)"]
  imp -.-> world
  world -.-> exp
  style world fill:#fbfaf6,stroke:#121316,stroke-width:2.5px
  style fe fill:#f3c5a3,stroke:#121316
  style su fill:#f2ddb0,stroke:#121316
  style se fill:#9fc4e8,stroke:#121316
  style imp fill:#aee5c2,stroke:#121316
  style exp fill:#d9dbd3,stroke:#121316

Read the graph as a chain. Fetch events declares it produces events:list; Summarize declares it consumes events:list — so an edge appears, derived from those two strings meeting. Summarize produces summary:string; Send consumes it — second edge. Send produces nothing, so the chain terminates. Up top, every :uses pooled into the world's three imports. And exports is empty, because both outputs were consumed inside the world. Nobody wrote a single edge down. They fell out of the args.
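The derivation is small enough to sketch in a few lines of Python. This is an illustration of the three rules, not the kernel's sig_of: the function name derive_signature and the component dicts are assumptions, with capability names written the way they appear in the tangled plan.

```python
def derive_signature(components):
    """Derive (imports, edges, exports) from header args alone.

    Hypothetical re-sketch: no source body is ever inspected.
    """
    # imports: the sorted, deduplicated union of every :uses.
    imports = sorted({cap for c in components for cap in c["uses"]})
    # Producer map: port string -> component name. A later duplicate
    # overwrites an earlier one, so the last producer wins.
    producers = {c["out"]: c["name"] for c in components if c["out"]}
    edges, consumed = [], set()
    for c in components:
        port = c["in"]
        if port and port in producers:       # exact string match only
            edges.append({"from": producers[port], "to": c["name"]})
            consumed.add(port)
    # exports: every :out that no :in consumed.
    exports = [port for port in producers if port not in consumed]
    return imports, edges, exports


nightly = [
    {"name": "Fetch events", "uses": ["workbook:vfs/query"],
     "in": None, "out": "events:list"},
    {"name": "Summarize", "uses": ["workbook:llm/complete"],
     "in": "events:list", "out": "summary:string"},
    {"name": "Send", "uses": ["workbook:net/email"],
     "in": "summary:string", "out": None},
]
imports, edges, exports = derive_signature(nightly)
```

Here edges holds the two producer and consumer pairs and exports is empty. Give Send an :out and the same function returns a one-element exports list, with nothing else changed.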

what comes OUT

Run it and you get JSON. Each world is {name, schedule, imports, exports, components, edges, workflows}. The workflows field holds nested sub-worlds, which the nesting lesson covers. The schedule is derived too: a :SCHEDULE: property wins as a cron string; otherwise the org SCHEDULED: timestamp becomes {at, repeat, active}, with repeat cookies like +1d parsed and a bracketed timestamp read as inactive.
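The schedule rule can be sketched as a short Python function. The names derive_schedule and TIMESTAMP are hypothetical, the regex covers only the common org timestamp shape with an English day name, and returning the cron string bare when :SCHEDULE: is present is an assumption about the plan's shape.

```python
import re

# Assumed timestamp shape: <YYYY-MM-DD Day HH:MM +1d> or the [ ] variant.
TIMESTAMP = re.compile(
    r"^(?P<open>[<\[])"
    r"(?P<date>\d{4}-\d{2}-\d{2})"
    r"(?:\s+[A-Za-z]+)?"                          # optional day name
    r"(?:\s+(?P<time>\d{1,2}:\d{2}))?"            # optional time of day
    r"(?:\s+(?P<repeat>(?:\.\+|\+\+|\+)\d+[hdwmy]))?"  # repeat cookie
    r"\s*[>\]]$"
)


def derive_schedule(properties, scheduled):
    """Return a cron string, a {at, repeat, active} dict, or None."""
    cron = properties.get("SCHEDULE")
    if cron:
        return cron                               # the property wins
    if not scheduled:
        return None
    m = TIMESTAMP.match(scheduled.strip())
    if not m:
        return None
    at = m["date"] + ("T" + m["time"] if m["time"] else "")
    # Angle brackets are active timestamps; square brackets are inactive.
    return {"at": at, "repeat": m["repeat"], "active": m["open"] == "<"}


active = derive_schedule({}, "<2026-06-06 Sat 06:00 +1d>")
inactive = derive_schedule({}, "[2026-06-06 Sat 06:00]")
```

The first call produces the same at, repeat, and active values as the nightly-digest plan; the second differs only in active being false and repeat being absent.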

You can run this locally with no server at all. wb tangle links the kernel natively in the Rust CLI; it reads a file or - for stdin. Here is the genuine output of wb tangle nightly-digest.org, trimmed:

$ wb tangle nightly-digest.org
{ "worlds": [{
    "name": "Nightly digest",
    "schedule": { "at": "2026-06-06T06:00", "repeat": "+1d", "active": true },
    "imports": ["workbook:llm/complete", "workbook:net/email", "workbook:vfs/query"],
    "exports": [],
    "components": [
      { "name": "Fetch events", "lang": "rust", "uses": ["workbook:vfs/query"],
        "in": null, "out": "events:list", "persist": false, "deps": [], "dir": null, "src": "..." },
      ...
    ],
    "edges": [ { "from": "Fetch events", "to": "Summarize" },
               { "from": "Summarize",    "to": "Send" } ] }] }

Trace the load-bearing facts. The edges array was written nowhere in the document — it's :out events:list meeting :in events:list, and summary:string meeting its twin. The exports array is empty because every output found a consumer. Add :out digest:string to Send and the world grows an export — that output now leaks to the surface. And removing that export later would trip the upgrade gate, because exports may grow but never shrink. The same surface that wires the system is the surface that governs its evolution.

the plan, EXECUTED

Depth rung — skippable. The plan is a build plan, so the runtime builds and runs it. It schedules the components in topological waves: each wave is the set of components whose producers are all satisfied, and within a wave the steps run in parallel — up to eight at once, each with its own timeout. The edges are plumbing: a producer's standard output pipes into its consumer's standard input. A component with no producer gets the workflow's own input.

sequenceDiagram
  participant W as the workflow input
  participant F as Fetch events (rust)
  participant S as Summarize (js)
  participant D as Send (go)
  Note over F: wave 1 — no producer, runs first
  W->>F: (workflow input)
  F->>S: stdout → stdin  (events:list)
  Note over S: wave 2 — unblocked by Fetch
  S->>D: stdout → stdin  (summary:string)
  Note over D: wave 3 — unblocked by Summarize
  Note over D: terminal — produces no out
  

Read it as three waves falling out of two edges. Wave one is Fetch events — it has no producer, so it runs on the workflow input and emits events:list on stdout. That stdout pipes straight into Summarize's stdin, which unblocks wave two; Summarize emits summary:string, which pipes into Send, unblocking wave three. Send is terminal — it produces no :out, so the chain ends. Each component builds through a content-addressed cache keyed over its language, source, and deps, so a block that hasn't changed is never rebuilt.
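A content-addressed key over language, source, and deps might look like the sketch below. The hash function (SHA-256), the JSON encoding, and the decision to sort deps are all assumptions chosen for illustration; the text only says which three inputs feed the key.

```python
import hashlib
import json


def cache_key(lang, src, deps):
    """Hypothetical cache key: a digest of language, source, and deps."""
    payload = json.dumps(
        {"lang": lang, "src": src, "deps": sorted(deps)},  # order-insensitive deps (assumption)
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


unchanged = cache_key("rust", "fn main() {}", ["serde"])
same_again = cache_key("rust", "fn main() {}", ["serde"])
edited = cache_key("rust", "fn main() { run(); }", ["serde"])
```

Identical inputs give identical keys, so an untouched block hits the cache; any edit to the source, the language, or the deps list produces a different key and forces a rebuild.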

Two special steps are worth naming. A component whose language is agent runs the agent loop instead of a compiled binary — its source block becomes the system prompt, and the piped input is the task. And the composition is honestly untyped today: it's stdin-to-stdout filters, not typed in-WASM linking. The source calls the typed version — wac plug — the named upgrade. Cycles, by the way, don't deadlock: a wave builder that can't resolve a remainder dumps it into one final wave rather than hanging.
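The wave scheduling, including the cycle fallback, can be sketched like this. build_waves is a hypothetical name and the sketch covers ordering only: the parallelism cap, the timeouts, and the stdin-to-stdout piping are left out.

```python
def build_waves(names, edges):
    """Group components into topological waves (illustrative sketch)."""
    producers = {n: set() for n in names}
    for e in edges:
        producers[e["to"]].add(e["from"])
    done, waves = set(), []
    remaining = list(names)
    while remaining:
        # Ready means every producer has already run in an earlier wave.
        ready = [n for n in remaining if producers[n] <= done]
        if not ready:
            # Unresolvable remainder (a cycle): lump it into one final
            # wave instead of hanging.
            waves.append(remaining)
            break
        waves.append(ready)
        done.update(ready)
        remaining = [n for n in remaining if n not in done]
    return waves


digest_waves = build_waves(
    ["Fetch events", "Summarize", "Send"],
    [{"from": "Fetch events", "to": "Summarize"},
     {"from": "Summarize", "to": "Send"}],
)
cyclic_waves = build_waves(
    ["a", "b", "c"],
    [{"from": "a", "to": "b"}, {"from": "b", "to": "a"}],
)
```

The digest yields three single-component waves. In the cyclic case, c runs first on its own and the a and b pair lands together in the final wave, unordered.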

two args that PUNCH up

Depth rung. Two of the six args do more than they look like.

:dir is the literate document citing a real repository. Instead of an inline source block, the component names a project directory — and the sandbox builds that project on its own terms. For Rust, the kernel parses the project's Cargo.toml [dependencies] (skipping optional, git, and path deps), then runs the in-sandbox toolchain. Native cargo is never invoked. Proc-macro and build.rs crates are the documented frontier — not yet, and said plainly.

** Scrape pages                                      :component:
   #+begin_src rust :dir tools/scraper :out pages:json
   #+end_src

The doc cites a real repo; that repo's Cargo.toml owns the deps; the sandbox builds it; native cargo never runs. The component is a whole project, referenced from one line of prose.

:persist is the persistence contract. It's a bare flag, and durable_components walks the plan — worlds and nested sub-workflows — to return the names of every component carrying it:

** Remember                                          :component:
   #+begin_src rust :persist :out memory:json
   #+end_src

For a plan with that block, durable_components(plan) returns ["Remember"] — the runtime's list of who survives a redeploy. But be precise about what persistence means here. Components are stateless between runs by default. The persistence substrate is the VFS, not raw linear memory — :persist is the contract that the component checkpoints its state to the VFS, which freeze and resume copy and which replication keeps durable. The VFS is the orthogonal-persistence layer. No magic memory snapshots.
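The walk itself is a short recursion. The Python below is a re-sketch of what durable_components is described as doing, assuming the plan shape shown in the wb tangle output (a worlds list, each world with components and nested workflows); the sample plan and the Cache component are invented for the example.

```python
def durable_components(plan):
    """Collect the names of every component carrying persist, at any depth."""
    names = []

    def walk(world):
        for component in world.get("components", []):
            if component.get("persist"):
                names.append(component["name"])
        for sub in world.get("workflows", []):   # nested sub-worlds
            walk(sub)

    for world in plan.get("worlds", []):
        walk(world)
    return names


# Invented plan: one durable component at the top, one in a sub-workflow.
plan = {"worlds": [{
    "name": "Outer",
    "components": [{"name": "Remember", "persist": True},
                   {"name": "Fetch", "persist": False}],
    "workflows": [{
        "name": "Inner",
        "components": [{"name": "Cache", "persist": True}],
        "workflows": [],
    }],
}]}
durable = durable_components(plan)
```

The result lists Remember first and Cache second, because the walk visits a world's own components before descending into its sub-workflows. That ordering is a property of this sketch, not a documented guarantee.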

the plan you can REFUSE

Before anything builds, the plan can be validated, and validation runs with no server, on the same native kernel as wb tangle. validate makes exactly two checks per workflow:

| check | level | message |
|-------|-------|---------|
| component has no source block / language | error | component is missing a body |
| :in with no upstream producer | error | input `X` has no upstream producer |

Here is the second one, live. The broken workflow is a single component that consumes events:list with nothing producing it:

$ wb lint broken.org
[{"level":"error","scope":"Summarize","message":"input `events:list` has no upstream producer"}]

That is a test suite running against a document. The org lesson made a claim — you can't write a test suite for Markdown — and this is that claim cashed in: a dangling input is a checkable fact, named and scoped, before a single line of code compiles.
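Both checks fit in a dozen lines. The Python below is a sketch of the two rules in the table, not the kernel's validate: the world dict follows the plan shape from the wb tangle output, and using the component name as the scope of the missing-body diagnostic is an assumption.

```python
def validate(world):
    """Run the two per-workflow checks and return a list of diagnostics."""
    diagnostics = []
    outs = {c["out"] for c in world["components"] if c.get("out")}
    for c in world["components"]:
        if not c.get("lang"):
            # No source block means no language.
            diagnostics.append({"level": "error", "scope": c["name"],
                                "message": "component is missing a body"})
        port = c.get("in")
        if port and port not in outs:            # exact string match only
            diagnostics.append({
                "level": "error", "scope": c["name"],
                "message": f"input `{port}` has no upstream producer"})
    return diagnostics


broken = {"name": "Broken", "components": [
    {"name": "Summarize", "lang": "js",
     "in": "events:list", "out": "summary:string"},
]}
report = validate(broken)
```

For the single-component broken workflow, report contains one error scoped to Summarize, with the same message wb lint printed.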

The same derived signature does deploy-time duty too. A second gate, check_upgrade, compares two versions of a world in WIT-subtyping style: a removed export is an error (exports may grow, never shrink), a new import is a warning (a new capability is now required), a changed output type is an error. That gate belongs to the nesting lesson — the point here is only that it reads the same sig_of surface this page taught. One signature, two jobs: wiring the system, and governing how it changes.
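As a rough picture of the three rules, here is a Python sketch of an upgrade check over two derived surfaces. It is not the real check_upgrade: the message wording is invented, and how the gate decides that an output is the same port with a different type is not described here, so the sketch assumes a name:type split at the first colon.

```python
def split_port(port):
    """Split 'name:type' at the first colon (an assumption of this sketch)."""
    name, _, type_ = port.partition(":")
    return name, type_


def check_upgrade(old, new):
    """Compare two world surfaces; return (level, message) findings."""
    findings = []
    new_exports = dict(split_port(p) for p in new["exports"])
    for port in old["exports"]:
        name, type_ = split_port(port)
        if name not in new_exports:
            # Exports may grow, never shrink.
            findings.append(("error", f"export `{port}` was removed"))
        elif new_exports[name] != type_:
            findings.append(
                ("error", f"output `{name}` changed type: "
                          f"{type_} -> {new_exports[name]}"))
    for cap in new["imports"]:
        if cap not in old["imports"]:
            # A new capability is now required from the host.
            findings.append(("warning", f"new import `{cap}` is now required"))
    return findings


v1 = {"exports": ["digest:string", "log:json"],
      "imports": ["workbook:vfs/query"]}
v2 = {"exports": ["digest:json"],
      "imports": ["workbook:vfs/query", "workbook:net/email"]}
findings = check_upgrade(v1, v2)
```

Going from v1 to v2 produces two errors (digest changed type, log was removed) and one warning for the new net/email import. Adding an export in v2 would produce nothing at all, which is the growth direction the gate allows.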

where the SEAMS are

Honesty section. The model is real but young, and the edges are sharp in named places.

  • Edges match by exact string. The :type half of a port name is an opaque label the kernel never parses. There is no real type system on the wire yet. Typed in-WASM composition — wac plug — is the named upgrade.
  • Duplicate producers — last one wins, undiagnosed. If two components declare the same :out, the producer map keeps the last, silently. Validate does not catch it.
  • :deps is honored for Rust and JS/TS only. Rust deps go to crates.io, JS/TS to npm, both fetched and built in-sandbox. Other languages ignore :deps for now — stated outright in the source.
  • :uses is signature-truth, not yet a grant. Today :uses feeds the world's imports and the upgrade warning. It does not yet wire per-component Dock capability grants — searching the runtime for it returns nothing. It's an honest seam, marked as one.
  • Cycles get lumped, not rejected. An unresolvable dependency remainder is dumped into one final wave so the engine doesn't deadlock — it runs, it just isn't ordered.

None of these are hidden. They're the difference between a typed-world compiler that exists today and the typed-world compiler the architecture is pointed at — and the gap between them is exactly the roadmap.

questions people actually ASK

Is this just org-babel / Babel?

No — there's no execute-in-editor step. The kernel compiles the document into a plan; the engine runs the plan, later, in a sandbox. Tangling is the compile half. Nothing executes while you're writing.

Do I write this JSON by hand?

Never. The plan is derived, not configured. You write org with source blocks and header args; wb tangle is a read-only derivation that emits the JSON. There's no plan file to keep in sync — that's the whole point.

Is events:list a real type?

Not yet. It's an opaque contract string — the kernel matches the whole name:type token by exact string and never parses the type half. A real wire type is the named upgrade; see the seams.

Can a component be a whole repository?

Yes — that's :dir. The component points at a project directory with its own Cargo.toml or package.json, and the sandbox builds it on its own terms. Native cargo is never invoked.

Where do the tangled files go?

Nowhere — and that's the twist. Knuth's tangle wrote source files out to disk. This tangle writes out a world: a plan describing components, imports, and a DAG. There are no extracted files to manage, because the document already is the source.

One world, three languages — really?

Really. The signature is derived from the header args, never the code, so the language inside a block is irrelevant to wiring. The nightly-digest example is Rust, JS, and Go in one world, composed by matching strings.

keep GOING

Tangling is the compiler under a stack of ideas — start with its parent, then the runtime that executes what it emits.