syntax — the subset of org the kernel surfaces

folklore vs GRAMMAR

Org-mode the Emacs thing is twenty years deep. Agenda views, footnotes, clock tables, capture templates, link abbreviations, column views — a folklore so large that no two org users have ever met who used the same tenth of it. Read its manual and you will learn a hundred constructs. Read this ecosystem's parser and you will learn that it honors maybe a dozen.

That gap is the whole problem this page exists to close. A spec tells you what a feature is supposed to do; specs are aspirational, and they lie by omission about what a given implementation actually reads. The kernel here is not org-mode — it's an opinionated lens pointed at org text, and it surfaces exactly the lines that carry meaning to the rest of the system. Every other line is rendered as prose and otherwise ignored.

So we won't learn grammar from a spec. We'll learn it from the parser itself: write a line, run it through the kernel, read the JSON it hands back. A construct that produces a field is grammar here. A construct that vanishes is folklore. The parser is the only authority that never lies, because it's the thing the system actually runs.

the DEFINITION

syn·tax /ˈsɪn·tæks/ noun

1. the subset of org-mode the kernel surfaces as structure — exactly eight fields per headline, plus a handful of section-level constructs. Everything else in the file is prose: rendered, never parsed into meaning.

The headline is the unit. Parse any org file and the kernel returns one row per headline, and every row has exactly these eight keys — no more, no fewer:

field	comes from	example value
`level`	the run of leading stars	`1`
`title`	the headline text, tags stripped	`rotate signing key`
`state`	a leading `TODO` / `DONE` keyword	`TODO` · else `null`
`tags`	the trailing `:a:b:` run	`["ops","security"]`
`props`	the `:PROPERTIES:` drawer, keys uppercased	`{"OWNER":"shane"}`
`id`	sugar — `props["ID"]` lifted to the top	`rotate-key`
`scheduled`	the `SCHEDULED:` timestamp	`{at,repeat,active}`
`deadline`	the `DEADLINE:` timestamp	`{at,repeat,active}`

Eight fields. That's the entire contract for a headline, and it's small enough to hold in your head precisely because the kernel is opinionated. The grammar isn't small by accident — it's small by design, so that what one human writes, another human and every agent read the same way.

the DRILL loop

Here's the method for the whole page, and the method you'll keep after it. Pipe org text into the kernel through the CLI and read the rows it returns:

printf '* TODO water the plants  :home:\n' > task.org
wbx query task.org

Point wbx query at any .org file and it calls the kernel's parse_headlines and prints the JSON array. There's no project, no config, no setup — the kernel is a pure function from text to rows. The same text, the same row, every time, on any of the kernel's surfaces. The flow is exactly this:

flowchart LR
  org["org text
(one line or a whole file)"]
  subgraph kernel["the OQL kernel — one pure function"]
    parse["orgize parse"] --> lens["the lens
(keep 8 fields, drop the rest)"]
  end
  org --> kernel
  kernel --> rows["JSON rows
[{level,title,state,…}]"]
  style kernel fill:#fbfaf6,stroke:#121316
  style rows fill:#f2ddb0,stroke:#121316,stroke-width:2.5px

Two stages, and the second is the one that matters. First the parser (orgize, an off-the-shelf Rust library) reads the org tree. Then the lens walks that tree and keeps only what the system cares about — the eight headline fields, source blocks, the workflow tags — and lets everything else fall through as prose. Every example below this line is real output from that function. Nothing is illustrative; each cell was machine-verified. You can reproduce any of them by saving the org and running wbx query on it.

anatomy of one LINE

A headline is positional. Read left to right, each region is a different field. Here is the most a single headline can carry, annotated:

*      ← stars → level (here: 1)
TODO   ← state keyword (TODO or DONE only)
[#A]  ← priority cookie — silently swallowed, no field
rotate signing key   ← title
:ops:security:       ← trailing tag run → tags

All on one line, that's * TODO [#A] rotate signing key :ops:security:, and it parses to a row where level is 1, state is "TODO", title is "rotate signing key" — clean, cookie gone — and tags is ["ops","security"]. The tags are stripped off the end of the title for you; the title comes back without them.

The edges are where confidence is won or lost, so drill them. These are all verified:

A headline needs a star at column zero and a space after it. *no space is not a headline — it vanishes from the rows entirely. An indented * star is not a headline either — also gone. Column zero, then a space. Nothing else counts.
Tags can't contain spaces. :ops:security: is two tags. :a b: is not a tag at all — it stays in the title as literal text, because the tag scanner only accepts a colon-run with no internal spaces.
The priority cookie has no field. [#A] is real org-mode syntax and orgize swallows it cleanly — the title comes back without it — but the kernel exposes no priority key anywhere on the row. If you lean on priorities, the lens does not see them. Honest limit.

the ONE-line rule

This is the single gotcha most likely to bite you, so it gets its own section. The planning line — the SCHEDULED: and DEADLINE: stamps — must sit on the one line directly under the headline, together. Split them across two lines and the parse quietly falls apart, and not just for the planning fields.

Here is the broken form and the fixed form, same content, side by side. The difference is one newline:

	what you wrote	what the kernel returns
broken	`SCHEDULED:` and `DEADLINE:` on separate lines, drawer below	only `scheduled` parses; `deadline` is `null` — and the `:PROPERTIES:` drawer right below silently stops being recognized: `props:{}`
fixed	both stamps on one line, drawer below	both `scheduled` and `deadline` parse; the drawer parses; `id` lifts. Everything works.

The failure cascades — the second planning line poisons the drawer that comes after it. So your owner property, your ID, your whole record evaporates to {} because of a stray newline two lines up. There's no error; the row just comes back hollow. When a drawer mysteriously parses to nothing, this is the first place to look. One line. Always one line.

headline as RECORD

A :PROPERTIES: drawer turns a headline into a typed record. Two rules carry most of the surprises, both verified:

written form	row field
`:owner: shane`	`props["OWNER"] = "shane"` — keys are uppercased, values trimmed
`:ID: rotate-key`	`props["ID"] = "rotate-key"` and `id = "rotate-key"` — ID is lifted to the top level as sugar

So :owner: becomes OWNER — the key you read in the JSON is never the key you typed, casing-wise. And :ID: appears twice: once inside props like every property, and once promoted to the top-level id field, because the system asks "what is this headline's identity?" often enough to earn the shortcut. This section is deliberately short — the drawers deep dive takes the property mechanism all the way down to toolkit manifests and agent identities.

the CLOCK glyphs

Depth rung — skippable, and the timestamps deep dive goes the whole distance. But the shape is worth seeing once. The brackets decide whether a timestamp is active:

written	parses to
`<2026-06-20 Sat 09:00 +1w>`	`{"at":"2026-06-20T09:00","repeat":"+1w","active":true}`
`[2026-06-20 Sat 09:00]`	`{"at":"2026-06-20T09:00","repeat":null,"active":false}`

Angle brackets mean active:true — this is a thing the engine should act on. Square brackets mean active:false — a record of when something happened, inert. The parser is loose by design: the date is just whichever token is ten characters long with a dash at position four; the time is the first token with a colon; the repeater is a token starting +, .+, or ++. Day names like Sat are skipped, never validated — write Xyz and it shrugs.

One honest leak to know now: a time range passes through raw. <2026-06-20 Sat 09:00-10:30> yields "at":"2026-06-20T09:00-10:30" — the whole range token lands in the time slot. The lens doesn't split it; it just hands you the string. Useful to know before you parse at as a single time downstream.

blocks that COMPILE

Depth rung — the tangling deep dive is where this becomes a build system. The mechanic: a headline's section can carry a source block, and the kernel captures the first one only — its language, its header args, and its body. The header-arg grammar is six keys: :deps (comma-split list), :uses, :in, :out, :persist (a bare flag), and :dir. Anything else is silently ignored.

The interesting part is what happens when you tag two headlines as workflow and components. The kernel infers the data-flow edge by matching one component's :out to another's :in. Watch a two-step plan become a running world:

* etl                       :workflow:
  :PROPERTIES:
  :SCHEDULE: 0 6 * * *
  :END:
** fetch                    :component:
   #+begin_src js :uses http :out prices.json
   export async function run() { return fetch_prices(); }
   #+end_src
** report                   :component:
   #+begin_src python :deps pandas :in prices.json :out report.html :persist
   build_report()
   #+end_src

Run that through wbx tangle and the kernel computes — nobody draws — this:

flowchart LR
  subgraph plan["etl  :workflow:  · schedule 0 6 * * *"]
    fetch["fetch  :component:
uses: http · out: prices.json"]
    report["report  :component:
in: prices.json · out: report.html · persist"]
  end
  plan -- "kernel: tangle_plan()" --> world
  subgraph world["the compiled world"]
    f2["fetch"] -- "prices.json" --> r2["report"]
    imp["imports: http"]
    exp["exports: report.html"]
  end
  style plan fill:#fbfaf6,stroke:#121316
  style world fill:#fbfaf6,stroke:#121316
  style exp fill:#13d943,stroke:#121316

The output carries "edges":[{"from":"fetch","to":"report"}] — inferred purely because fetch's :out prices.json matched report's :in prices.json. imports is ["http"] (the deduped :uses), exports is ["report.html"] (the out nobody consumed), report carries "persist":true, and the :SCHEDULE: property surfaces as {"cron":"0 6 * * *"}. Two tagged headlines, and you have a scheduled, capability-aware build graph.

Now delete fetch and run wbx lint:

[{"level":"error","scope":"report",
  "message":"input `prices.json` has no upstream producer"}]

The validator flags exactly two kinds of error — a component with no source block or language, and an :in with no upstream producer. That's a syntax error that is also a plan error: report wants an input that nothing makes. This is the bridge to the workflows lessons — the same grammar that lists your to-dos also type-checks your build.

the AUTOPSY

The parent lesson argued that markdown can't see structure. Here's the receipt. Take that same rotate-key task and write it the way a careful person writes markdown — heading, bold labels, a sentence of metadata:

# TODO rotate signing key
**Scheduled:** 2026-06-20, repeat weekly. Tags: ops, security.

Run it through the very same kernel and you get:

[]

Zero rows. Not a degraded parse — nothing. Every fact a human reads in that markdown — it's a task, it's a TODO, it's scheduled, it repeats weekly, it's tagged ops and security — is invisible to the machine. A # heading is not a * headline; bold text is not a state keyword; a prose sentence is not a property drawer. The information is all there, perfectly legible to you, and the parser sees an empty array.

	org	markdown
structure seen	one full row, eight fields	nothing — `[]`
state / tags / schedule	parsed into typed fields	readable to humans, invisible to code

That's the whole why-not-markdown argument compressed into one command. Markdown is a format for documents you render. Org, through this lens, is a format for structure you compute on. Same characters on the page; one of them the kernel can answer questions about.

dissect the EASTER egg

The parent page hides a comment in its own HTML source — a note for whoever views the page the honest way. It isn't decoration; it's valid org, and it parses to a real row. Here's the row the kernel returns for it:

{"title":"you found the written layer",
 "tags":["easter","egg"],
 "props":{"DISCOVERED":"by viewing source, the honest way"},
 "level":1, "state":null, "id":null,
 "scheduled":null, "deadline":null}

The title is clean, the trailing :easter:egg: became tags, and a :DISCOVERED: property became DISCOVERED in props — the same uppercasing rule from the drawers section, proving itself on a comment nobody was supposed to find. The lesson inside the lesson: org lives anywhere plain text lives. View-source on the parent is itself the drill — paste that comment into wbx query and you'll get exactly the row above. (This page hides one too. Go look.)

where the lens STOPS

Honesty section. The kernel is a lens, not org-mode, and a lens has edges. Knowing them is the difference between trusting the grammar and being surprised by it:

construct	what the kernel does
custom TODO keywords (`WAIT`, `KILLED`…)	not recognized. Only `TODO` / `DONE`. A `#+TODO:` config line is ignored; `* WAIT foo` → `state:null`, "WAIT" stays in the title
priority cookie `[#A]`	swallowed from the title, exposed as no field — there is no priority on the row
time ranges `09:00-10:30`	pass through raw into `at` as a single string — not split
split planning line	only the first stamp parses, and it kills the drawer below — the one-line rule is load-bearing
a second source block in one section	ignored — only the first is captured
`lint` (the kernel export)	a stub returning `[]`. The CLI's `wbx lint` actually calls `validate` — different function, real output
agenda, links, footnotes, tables-as-data, clock tables…	rendered as HTML by `render()`, never surfaced as structure

That last row is the honest summary of the whole page. Most of org-mode's enormous surface still renders — the kernel's render() is just orgize's to_html(), so footnotes and tables and links all come out as fine HTML. They just don't become structure the system computes on. The lens surfaces what the ecosystem needs to be machine-readable, and renders the rest as the document it is. That's not a missing feature; it's the line between structure and prose, drawn on purpose.

questions people actually ASK

Can I add custom TODO keywords like WAIT or BLOCKED?

No — verified. The kernel recognizes only TODO and DONE. A #+TODO: TODO WAIT | DONE KILLED config line is ignored, and * WAIT foo parses to state:null with "WAIT" left sitting in the title. If you need more states, they live in properties or tags, not in the keyword slot.

Where did my [#A] priority go?

Orgize swallowed it cleanly — your title comes back without it — but the kernel exposes no priority field anywhere on the row. The cookie is valid org syntax that this lens simply doesn't surface. If priority matters to your plan, put it in a property (:PRIORITY: A) where it'll show up in props.

Is this all of org-mode?

No — and that's the point. This is a lens over org, not the twenty-year Emacs feature set. A dozen-ish constructs carry structural meaning; the rest renders as prose. The grammar is small on purpose, so that what one person writes, every person and every agent reads identically.

How do I check my own file?

Run it through the kernel: wbx query plan.org. You'll get the JSON rows back. If a field is missing or a drawer came back empty, you've found a syntax issue — most often the one-line planning rule. The parser is the spec.

Does this run live in the browser?

The machinery exists — the kernel ships a browser build with a raw ptr/len ABI exporting four functions (parse_headlines, tangle_plan, validate, render), built for exactly this kind of in-page drill. An earlier in-page notebook shipped and was then retired the same day, so today this page prints machine-verified static output rather than running the wasm in your tab. Every JSON snippet here is real wbx output; the live in-page version is a wiring decision, not a missing capability.

Why is wbx lint not the same as the kernel's lint?

A fun honest wrinkle. The kernel exports a lint function that is currently a stub — it returns []. The CLI's wbx lint doesn't call it; it calls the kernel's validate, which is real and flags missing producers and source-less components. So wbx lint works — it just routes around the stub.

keep GOING

This page is the receipt for the parent's argument. Each construct it drilled has a deep dive that takes it the whole way down.

Org, the grammarthe parent — why a written layer at all

→

Tanglingheader args + the build plan, all the way down

→

Drawersthe property record mechanism in full

→

Timestampsthe complete date grammar

→

The kernelone crate, three surfaces

→