folklore vs GRAMMAR
Org-mode the Emacs thing is twenty years deep. Agenda views, footnotes, clock tables, capture templates, link abbreviations, column views — a folklore so large that no two org users have ever met who used the same tenth of it. Read its manual and you will learn a hundred constructs. Read this ecosystem's parser and you will learn that it honors maybe a dozen.
That gap is the whole problem this page exists to close. A spec tells you what a feature is supposed to do; specs are aspirational, and they lie by omission about what a given implementation actually reads. The kernel here is not org-mode — it's an opinionated lens pointed at org text, and it surfaces exactly the lines that carry meaning to the rest of the system. Every other line is rendered as prose and otherwise ignored.
So we won't learn grammar from a spec. We'll learn it from the parser itself: write a line, run it through the kernel, read the JSON it hands back. A construct that produces a field is grammar here. A construct that vanishes is folklore. The parser is the only authority that never lies, because it's the thing the system actually runs.
the DEFINITION
1. the subset of org-mode the kernel surfaces as structure — exactly eight fields per headline, plus a handful of section-level constructs. Everything else in the file is prose: rendered, never parsed into meaning.
The headline is the unit. Parse any org file and the kernel returns one row per headline, and every row has exactly these eight keys — no more, no fewer:
| field | comes from | example value |
|---|---|---|
level | the run of leading stars | 1 |
title | the headline text, tags stripped | rotate signing key |
state | a leading TODO / DONE keyword | TODO · else null |
tags | the trailing :a:b: run | ["ops","security"] |
props | the :PROPERTIES: drawer, keys uppercased | {"OWNER":"shane"} |
id | sugar — props["ID"] lifted to the top | rotate-key |
scheduled | the SCHEDULED: timestamp | {at,repeat,active} |
deadline | the DEADLINE: timestamp | {at,repeat,active} |
Eight fields. That's the entire contract for a headline, and it's small enough to hold in your head precisely because the kernel is opinionated. The grammar isn't small by accident — it's small by design, so that what one human writes, another human and every agent read the same way.
the DRILL loop
Here's the method for the whole page, and the method you'll keep after it. Pipe org text into the kernel through the CLI and read the rows it returns:
printf '* TODO water the plants :home:\n' > task.org wbx query task.org
Point wbx query at any .org file and it calls the
kernel's parse_headlines and prints the JSON array. There's no
project, no config, no setup — the kernel is a pure function from text to rows.
The same text, the same row, every time, on any of the kernel's surfaces. The
flow is exactly this:
flowchart LR org["org text
(one line or a whole file)"] subgraph kernel["the OQL kernel — one pure function"] parse["orgize parse"] --> lens["the lens
(keep 8 fields, drop the rest)"] end org --> kernel kernel --> rows["JSON rows
[{level,title,state,…}]"] style kernel fill:#fbfaf6,stroke:#121316 style rows fill:#f2ddb0,stroke:#121316,stroke-width:2.5px
Two stages, and the second is the one that matters. First the parser
(orgize, an off-the-shelf Rust library) reads the org tree. Then the
lens walks that tree and keeps only what the system cares about — the
eight headline fields, source blocks, the workflow tags — and lets everything
else fall through as prose. Every example below this line is real output from
that function. Nothing is illustrative; each cell was machine-verified. You can
reproduce any of them by saving the org and running wbx query on it.
anatomy of one LINE
A headline is positional. Read left to right, each region is a different field. Here is the most a single headline can carry, annotated:
* ← stars → level (here: 1) TODO ← state keyword (TODO or DONE only) [#A] ← priority cookie — silently swallowed, no field rotate signing key ← title :ops:security: ← trailing tag run → tags
All on one line, that's * TODO [#A] rotate signing key :ops:security:,
and it parses to a row where level is 1,
state is "TODO", title is
"rotate signing key" — clean, cookie gone — and tags
is ["ops","security"]. The tags are stripped off the end of the
title for you; the title comes back without them.
The edges are where confidence is won or lost, so drill them. These are all verified:
- A headline needs a star at column zero and a space after it.
*no spaceis not a headline — it vanishes from the rows entirely. An indented* staris not a headline either — also gone. Column zero, then a space. Nothing else counts. - Tags can't contain spaces.
:ops:security:is two tags.:a b:is not a tag at all — it stays in the title as literal text, because the tag scanner only accepts a colon-run with no internal spaces. - The priority cookie has no field.
[#A]is real org-mode syntax and orgize swallows it cleanly — the title comes back without it — but the kernel exposes no priority key anywhere on the row. If you lean on priorities, the lens does not see them. Honest limit.
the ONE-line rule
This is the single gotcha most likely to bite you, so it gets its own
section. The planning line — the SCHEDULED: and
DEADLINE: stamps — must sit on the one line directly under the
headline, together. Split them across two lines and the parse quietly falls
apart, and not just for the planning fields.
Here is the broken form and the fixed form, same content, side by side. The difference is one newline:
| what you wrote | what the kernel returns | |
|---|---|---|
| broken | SCHEDULED: and DEADLINE: on separate
lines, drawer below |
only scheduled parses; deadline is
null — and the :PROPERTIES: drawer right
below silently stops being recognized: props:{} |
| fixed | both stamps on one line, drawer below | both scheduled and deadline parse;
the drawer parses; id lifts. Everything works. |
The failure cascades — the second planning line poisons the drawer that
comes after it. So your owner property, your ID, your whole record evaporates
to {} because of a stray newline two lines up. There's no error;
the row just comes back hollow. When a drawer mysteriously parses to nothing,
this is the first place to look. One line. Always one line.
headline as RECORD
A :PROPERTIES: drawer turns a headline into a typed record.
Two rules carry most of the surprises, both verified:
| written form | row field |
|---|---|
:owner: shane | props["OWNER"] = "shane" — keys are uppercased, values trimmed |
:ID: rotate-key | props["ID"] = "rotate-key" and id = "rotate-key" — ID is lifted to the top level as sugar |
So :owner: becomes OWNER — the key you read in the
JSON is never the key you typed, casing-wise. And :ID: appears
twice: once inside props like every property, and once promoted to
the top-level id field, because the system asks "what is this
headline's identity?" often enough to earn the shortcut. This section is
deliberately short — the drawers deep dive takes the
property mechanism all the way down to toolkit manifests and agent identities.
the CLOCK glyphs
Depth rung — skippable, and the timestamps deep dive goes the whole distance. But the shape is worth seeing once. The brackets decide whether a timestamp is active:
| written | parses to |
|---|---|
<2026-06-20 Sat 09:00 +1w> | {"at":"2026-06-20T09:00","repeat":"+1w","active":true} |
[2026-06-20 Sat 09:00] | {"at":"2026-06-20T09:00","repeat":null,"active":false} |
Angle brackets mean active:true — this is a thing the engine
should act on. Square brackets mean active:false — a record of
when something happened, inert. The parser is loose by design: the date is just
whichever token is ten characters long with a dash at position four; the time
is the first token with a colon; the repeater is a token starting +,
.+, or ++. Day names like Sat are
skipped, never validated — write Xyz and it shrugs.
One honest leak to know now: a time range passes through raw.
<2026-06-20 Sat 09:00-10:30> yields
"at":"2026-06-20T09:00-10:30" — the whole range token lands in the
time slot. The lens doesn't split it; it just hands you the string. Useful to
know before you parse at as a single time downstream.
blocks that COMPILE
Depth rung — the tangling deep dive is where this
becomes a build system. The mechanic: a headline's section can carry a source
block, and the kernel captures the first one only — its language, its
header args, and its body. The header-arg grammar is six keys:
:deps (comma-split list), :uses, :in,
:out, :persist (a bare flag), and :dir.
Anything else is silently ignored.
The interesting part is what happens when you tag two headlines as workflow
and components. The kernel infers the data-flow edge by matching one
component's :out to another's :in. Watch a two-step
plan become a running world:
* etl :workflow:
:PROPERTIES:
:SCHEDULE: 0 6 * * *
:END:
** fetch :component:
#+begin_src js :uses http :out prices.json
export async function run() { return fetch_prices(); }
#+end_src
** report :component:
#+begin_src python :deps pandas :in prices.json :out report.html :persist
build_report()
#+end_src
Run that through wbx tangle and the kernel computes — nobody
draws — this:
flowchart LR
subgraph plan["etl :workflow: · schedule 0 6 * * *"]
fetch["fetch :component:
uses: http · out: prices.json"]
report["report :component:
in: prices.json · out: report.html · persist"]
end
plan -- "kernel: tangle_plan()" --> world
subgraph world["the compiled world"]
f2["fetch"] -- "prices.json" --> r2["report"]
imp["imports: http"]
exp["exports: report.html"]
end
style plan fill:#fbfaf6,stroke:#121316
style world fill:#fbfaf6,stroke:#121316
style exp fill:#13d943,stroke:#121316
The output carries "edges":[{"from":"fetch","to":"report"}] —
inferred purely because fetch's :out prices.json matched report's
:in prices.json. imports is ["http"]
(the deduped :uses), exports is
["report.html"] (the out nobody consumed), report carries
"persist":true, and the :SCHEDULE: property surfaces
as {"cron":"0 6 * * *"}. Two tagged headlines, and you have a
scheduled, capability-aware build graph.
Now delete fetch and run wbx lint:
[{"level":"error","scope":"report",
"message":"input `prices.json` has no upstream producer"}]
The validator flags exactly two kinds of error — a component with no source
block or language, and an :in with no upstream producer. That's a
syntax error that is also a plan error: report wants an input
that nothing makes. This is the bridge to the workflows lessons — the same
grammar that lists your to-dos also type-checks your build.
the AUTOPSY
The parent lesson argued that markdown can't see structure. Here's the receipt. Take that same rotate-key task and write it the way a careful person writes markdown — heading, bold labels, a sentence of metadata:
# TODO rotate signing key **Scheduled:** 2026-06-20, repeat weekly. Tags: ops, security.
Run it through the very same kernel and you get:
[]
Zero rows. Not a degraded parse — nothing. Every fact a human reads
in that markdown — it's a task, it's a TODO, it's scheduled, it repeats weekly,
it's tagged ops and security — is invisible to the machine. A #
heading is not a * headline; bold text is not a state keyword; a
prose sentence is not a property drawer. The information is all there,
perfectly legible to you, and the parser sees an empty array.
| org | markdown | |
|---|---|---|
| structure seen | one full row, eight fields | nothing — [] |
| state / tags / schedule | parsed into typed fields | readable to humans, invisible to code |
That's the whole why-not-markdown argument compressed into one command. Markdown is a format for documents you render. Org, through this lens, is a format for structure you compute on. Same characters on the page; one of them the kernel can answer questions about.
dissect the EASTER egg
The parent page hides a comment in its own HTML source — a note for whoever views the page the honest way. It isn't decoration; it's valid org, and it parses to a real row. Here's the row the kernel returns for it:
{"title":"you found the written layer",
"tags":["easter","egg"],
"props":{"DISCOVERED":"by viewing source, the honest way"},
"level":1, "state":null, "id":null,
"scheduled":null, "deadline":null}
The title is clean, the trailing :easter:egg: became
tags, and a :DISCOVERED: property became
DISCOVERED in props — the same uppercasing rule from
the drawers section, proving itself on a comment nobody was supposed to find.
The lesson inside the lesson: org lives anywhere plain text lives. View-source
on the parent is itself the drill — paste that comment into
wbx query and you'll get exactly the row above. (This page hides
one too. Go look.)
where the lens STOPS
Honesty section. The kernel is a lens, not org-mode, and a lens has edges. Knowing them is the difference between trusting the grammar and being surprised by it:
| construct | what the kernel does |
|---|---|
custom TODO keywords (WAIT, KILLED…) | not recognized. Only TODO / DONE. A #+TODO: config line is ignored; * WAIT foo → state:null, "WAIT" stays in the title |
priority cookie [#A] | swallowed from the title, exposed as no field — there is no priority on the row |
time ranges 09:00-10:30 | pass through raw into at as a single string — not split |
| split planning line | only the first stamp parses, and it kills the drawer below — the one-line rule is load-bearing |
| a second source block in one section | ignored — only the first is captured |
lint (the kernel export) | a stub returning []. The CLI's wbx lint actually calls validate — different function, real output |
| agenda, links, footnotes, tables-as-data, clock tables… | rendered as HTML by render(), never surfaced as structure |
That last row is the honest summary of the whole page. Most of org-mode's
enormous surface still renders — the kernel's render() is
just orgize's to_html(), so footnotes and tables and links all
come out as fine HTML. They just don't become structure the system
computes on. The lens surfaces what the ecosystem needs to be machine-readable,
and renders the rest as the document it is. That's not a missing feature; it's
the line between structure and prose, drawn on purpose.
questions people actually ASK
Can I add custom TODO keywords like WAIT or BLOCKED?
No — verified. The kernel recognizes only TODO and
DONE. A #+TODO: TODO WAIT | DONE KILLED config line
is ignored, and * WAIT foo parses to state:null
with "WAIT" left sitting in the title. If you need more states, they live in
properties or tags, not in the keyword slot.
Where did my [#A] priority go?
Orgize swallowed it cleanly — your title comes back without it — but the
kernel exposes no priority field anywhere on the row. The cookie is valid org
syntax that this lens simply doesn't surface. If priority matters to your
plan, put it in a property (:PRIORITY: A) where it'll show up
in props.
Is this all of org-mode?
No — and that's the point. This is a lens over org, not the twenty-year Emacs feature set. A dozen-ish constructs carry structural meaning; the rest renders as prose. The grammar is small on purpose, so that what one person writes, every person and every agent reads identically.
How do I check my own file?
Run it through the kernel: wbx query plan.org. You'll get the
JSON rows back. If a
field is missing or a drawer came back empty, you've found a syntax issue —
most often the one-line planning rule. The parser is the spec.
Does this run live in the browser?
The machinery exists — the kernel ships a browser build with a raw
ptr/len ABI exporting four functions (parse_headlines,
tangle_plan, validate, render), built
for exactly this kind of in-page drill. An earlier in-page notebook shipped
and was then retired the same day, so today this page prints
machine-verified static output rather than running the wasm in your tab.
Every JSON snippet here is real wbx output; the live in-page
version is a wiring decision, not a missing capability.
Why is wbx lint not the same as the kernel's lint?
A fun honest wrinkle. The kernel exports a lint function that
is currently a stub — it returns []. The CLI's
wbx lint doesn't call it; it calls the kernel's
validate, which is real and flags missing producers and
source-less components. So wbx lint works — it just routes
around the stub.
keep GOING
This page is the receipt for the parent's argument. Each construct it drilled has a deep dive that takes it the whole way down.