
time the ENGINE honors

A schedule is two halves of one page: declared time in the plan (timestamps and a cron property the kernel compiles into data) and the keeper, an engine-side heartbeat that gives agents the ticks to honor it. A file can't fire itself. This lesson is about what actually wakes up at nine.

schedules · 12 min read
A small figure standing beneath a monumental brass orrery and pendulum-clock the height of a cathedral, gears turning in bright daylight, a single illuminated dial reading nine — 1970s sci-fi style

nine a.m., and nobody's HOME

Here is a line in a plan: SCHEDULED: <2026-06-13 09:00 +1w>. It says, plainly, every Saturday at nine. And on Saturday at nine the file sits on a disk, a few kilobytes of text, doing exactly what text always does: nothing. A text file cannot fire itself. Something outside it has to wake up, read the line, and decide to act on it.

The traditional answer to that is cron — a daemon that reads a table of times and runs scripts. But cron is a fourth unreviewable system, added to the tracker and the pipeline and the code: it lives somewhere else, it's edited by one person, and it can only ever do exactly what its line says, blind to whether the line still makes sense. The workflows lesson sold you on time being native to the plan. This page owns the two questions that promise leaves open: what exactly can you write on a headline, and what actually wakes up and runs it.

the DEFINITION

sched·ule /ˈskɛ·dʒuːl/ noun

1. declared time in a plan — org timestamps and a :SCHEDULE: property the kernel compiles to data — paired with the keeper: an engine-side heartbeat that runs an agent on a cadence so the declared time gets honored.

Two halves, one idea, and they are deliberately not the same clock. Declared time lives in the plan and is pure data — the kernel parses it, surfaces it, validates the rest of the tree around it, and never fires anything. Cadence time lives in the keeper, the engine that actually ticks. The keeper isn't a cron daemon reading your timestamps; it's a metronome that runs an agent, and the agent reads the plan. Hold that split — the whole lesson hangs on it.

one character decides EVERYTHING

Start with what you can write. An org timestamp's first character is its whole meaning. Angle brackets — <…> — make it active: a thing to act on. Square brackets — […] — make it inactive: a thing merely recorded, an event that happened. Anything else parses to nothing; there are no bare timestamps. That one character is the difference between an appointment and a diary entry.

Inside the brackets the kernel doesn't care about order. It splits on whitespace and recognizes each token by shape: a ten-character token with a dash at position four is the date (2026-06-13); a token containing a colon is the time (09:00); a token starting with +, .+, or ++ is the repeater. Day names like Fri match none of the three shapes, so they fall through and are silently ignored. The result is a tiny, honest JSON: { at, repeat, active }.

Here is exactly what parse_timestamp yields — real input on the left, the kernel's real output on the right:

what you write              | what the kernel sees
<2026-06-13 Sat 09:00 +1w>  | { at: "2026-06-13T09:00", repeat: "+1w", active: true }
<2026-06-20>                | { at: "2026-06-20", repeat: null, active: true }
[2026-06-11 14:30]          | { at: "2026-06-11T14:30", repeat: null, active: false }
<2026-06-13 09:00 .+2d>     | { at: "2026-06-13T09:00", repeat: ".+2d", active: true }

Note the first row: the day name is gone and the time joined the date with a T. In the third row, active flipped to false purely because of the square brackets. And note the repeater is carried raw: "+1w" is a string the kernel hands forward verbatim. It recognizes the three prefixes; it does not compute the next occurrence. We'll be honest about that below. Both SCHEDULED: and DEADLINE: go on the planning line directly under the headline, and the parser surfaces both on every headline row as { at, repeat, active } or null.
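The token-shape rules above fit in a few lines. What follows is a minimal sketch in Python, not the kernel's own code: the function borrows the name parse_timestamp from the text, but requiring a matching closing bracket and returning None when no date token is found are assumptions of this sketch.

```python
from typing import Optional, TypedDict


class Timestamp(TypedDict):
    at: str
    repeat: Optional[str]
    active: bool


def parse_timestamp(raw: str) -> Optional[Timestamp]:
    """Parse an org timestamp by token shape, in any token order."""
    raw = raw.strip()
    if len(raw) < 2:
        return None
    # The first character decides everything: <...> active, [...] inactive.
    if raw[0] == "<" and raw[-1] == ">":
        active = True
    elif raw[0] == "[" and raw[-1] == "]":
        active = False
    else:
        return None  # no bare timestamps

    date = time = repeat = None
    for token in raw[1:-1].split():
        if len(token) == 10 and token[4] == "-":
            date = token                      # e.g. 2026-06-13
        elif ":" in token:
            time = token                      # e.g. 09:00
        elif token.startswith(("+", ".+")):   # "++" also starts with "+"
            repeat = token                    # carried raw, never computed
        # anything else (a day name, say) falls through and is ignored

    if date is None:
        return None  # assumption: a timestamp without a date parses to nothing
    at = f"{date}T{time}" if time else date
    return {"at": at, "repeat": repeat, "active": active}
```

Calling parse_timestamp("[2026-06-11 14:30]") gives the third table row: the same shape as the others, with active set to False by the brackets alone.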

when a timestamp isn't ENOUGH

Depth rung — skippable. A repeating timestamp says "every week from this date." Sometimes you want the older, denser grammar: a cron expression. For that there's a property, :SCHEDULE:, in the headline's drawer. And it outranks the timestamp. When the kernel asks a headline for its schedule, the precedence is fixed:

flowchart TD
  h["a headline"]
  h --> q1{":SCHEDULE: property?"}
  q1 -- yes --> cron["schedule = { cron: raw-string }"]
  q1 -- no --> q2{"SCHEDULED: timestamp?"}
  q2 -- yes --> ts["schedule = the timestamp JSON"]
  q2 -- no --> nul["schedule = null"]
  style cron fill:#13d943,stroke:#121316,stroke-width:2.5px
  style ts fill:#f2ddb0,stroke:#121316
  style nul fill:#d9dbd3,stroke:#121316
  

So :SCHEDULE: 0 6 * * * on a headline becomes { cron: "0 6 * * *" } and wins even if a SCHEDULED: timestamp sits right beside it — the cron property is the louder voice. One caution worth stating plainly: the kernel does not parse or validate that string. :SCHEDULE: anything at all becomes { cron: "anything at all" }. It is carried data, not interpreted instruction — no cron parser exists anywhere in the runtime.
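The precedence is small enough to write down. A hedged sketch, assuming the headline's drawer arrives as a plain dict keyed by property name and the SCHEDULED: timestamp arrives already parsed; headline_schedule and both parameter names are hypothetical, not the kernel's API.

```python
from typing import Optional


def headline_schedule(properties: dict, scheduled: Optional[dict]) -> Optional[dict]:
    """Resolve a headline's schedule: :SCHEDULE: first, then SCHEDULED:, else None."""
    cron = properties.get("SCHEDULE")
    if cron is not None:
        # Carried verbatim: the string is neither parsed nor validated.
        return {"cron": cron}
    if scheduled is not None:
        return scheduled  # the { at, repeat, active } timestamp JSON
    return None
```

Note what the function does not do: it never inspects the cron string, which is exactly why ":SCHEDULE: anything at all" survives as data.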

This schedule rides the compiled world: when a workflow tree is built, every workflow and sub-workflow carries its own schedule field next to its name, imports, exports, and edges. And it surfaces in the run records the runtime hands back — every result from running a workflow includes its schedule, and you can ask for the plan alone, no execution, with POST /api/workflow?plan=1. The time you declared is queryable like any other structure in the tree.

a METRONOME, not a cron daemon

Now the bigger aha — the one the whole page turns on. The engine that fires schedules is called the keeper, and it is not crond reading your timestamps. It's a metronome. Each beat it does one thing: run an agent definition. Then it records the outcome and schedules the next beat. The agent is what reads the plan, sees what's due, claims it, and does it.

Why a metronome and not a daemon that reads the line directly? Because a cron daemon can only do what the line literally says. An agent woken on a tick can read the whole plan in one pass and notice three things at once: a deadline slipped, a dependency just unblocked, and a schedule arrived. Declared time is for the plan; cadence time is for the worker; the worker is smart enough to reconcile them. Here is the beat:

sequenceDiagram
  participant K as keeper (the heartbeat)
  participant A as agent definition
  participant P as the plan file
  participant G as the tenant's git repo
  Note over K: tick
  K->>G: pull origin (if GitOps on)
  K->>A: run one keeper run — MODE, LIFECYCLE, your loop
  A->>P: read the plan — what is due, ready, overdue?
  A->>G: do the work · commit — the commit IS the changelog
  A-->>K: outcome — done · failed · killed · no-work
  Note over K: schedule the next tick
  

Walk it once. The keeper wakes. If GitOps is enabled it first pulls the tenant's git origin, so a push to GitHub becomes live within one tick. It runs the agent definition in that tenant's repository — the same execution path as a manual run. The agent reads the plan, does its work, and commits; because the workdir is the tenant's public repo, the keeper's commits are the changelog. It reports an outcome, and the keeper schedules the next beat. The keeper exists because some control planes are internal-only and a public GitHub-cron can't reach them — so the engine carries its own heartbeat, on-box.

anatomy of a TICK

Depth rung. A single tick has more rigor than "run a thing on a timer" suggests, because the keeper has to survive restarts, crashes, and runs that hang. Four properties define one beat:

flowchart TD
  boot["restart"] --> grace["catch-up delay = max(60s, interval − elapsed)"]
  grace --> tick["tick fires"]
  tick --> pull["GitOps: pull origin before the run"]
  pull --> run["run agent in a linked Task, wall-clock bounded"]
  run --> kill{"finished before timeout?"}
  kill -- no --> killed[":killed — brutal_kill, worker survives"]
  kill -- yes --> out{"outcome"}
  out --> done[":done"]
  out --> failed[":failed"]
  out --> nowork[":no_work"]
  style killed fill:#f3c5a3,stroke:#121316
  style done fill:#aee5c2,stroke:#121316
  style nowork fill:#d9dbd3,stroke:#121316
  

First, catch-up scheduling. The last-run time is persisted to disk on the engine's data volume. On restart the keeper computes its first delay as max(60s, interval − elapsed): a sixty-second boot grace, but otherwise it picks up where the cadence clock left off. A restart must not reset the rhythm. Second, the wall-clock kill: every run executes in a linked task with a hard timeout (default fifteen minutes); past it the run is killed outright, and because the worker traps exits, a crashing or hung run never takes the worker down with it. Third, the four outcomes, :done, :failed, :killed, and :no_work, which drive what happens next. Fourth, the GitOps pull before each run, where a merge conflict aborts and is left for a human rather than resolved blindly.
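The catch-up rule is one line of arithmetic. Here is a sketch of it in Python, assuming times are plain epoch seconds and that a missing last-run record means a fresh boot; first_delay_s and its parameters are illustrative names, not the engine's.

```python
from typing import Optional

BOOT_GRACE_S = 60.0  # the sixty-second boot grace


def first_delay_s(interval_s: float, last_run_at: Optional[float], now: float) -> float:
    """Delay before the first tick after a (re)start: max(60s, interval - elapsed)."""
    if last_run_at is None:
        # Fresh boot: nothing persisted yet, so only the grace applies.
        return BOOT_GRACE_S
    elapsed = now - last_run_at
    return max(BOOT_GRACE_S, interval_s - elapsed)
```

With an hourly interval and a restart ten minutes after the last run, the first tick lands fifty minutes out. Restart two hours late and the grace takes over: sixty seconds, never zero.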

the breather and the BACKOFF

A keeper runs in one of two modes. Interval mode ticks on a fixed cadence — hourly by default. Continuous mode drops the fixed interval and instead takes a short breather between ticks — forty-five seconds by default — so a busy keeper stays hot without a fixed wait. But continuous-by-default has an obvious failure: an idle agent burning an LLM call every forty-five seconds just to report there's nothing to do.

The fix is the NO-WORK protocol and an exponential idle backoff. When the agent has nothing its run-kind may do, it begins its final answer with the literal token NO-WORK. The keeper counts consecutive NO-WORK ticks and backs off: max(base, 60s · 2^(streak−1)), capped at thirty minutes. A single real :done snaps it back to the hot cadence. The curve, for a continuous worker with a 45-second breather:

consecutive NO-WORK ticks | next tick in
0 (did real work)         | 45s (the breather)
1                         | 1m
2                         | 2m
3                         | 4m
4                         | 8m
5                         | 16m
6 and beyond              | 30m (the cap)
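The curve in the table follows directly from the formula. A small sketch that reproduces it, with one labeled assumption: the text does not say how the thirty-minute cap interacts with a base longer than thirty minutes, so this version caps only the exponential term. The name next_tick_s is hypothetical.

```python
def next_tick_s(streak: int, base_s: float = 45.0,
                floor_s: float = 60.0, cap_s: float = 1800.0) -> float:
    """Seconds until the next tick after `streak` consecutive NO-WORK ticks."""
    if streak <= 0:
        return base_s  # real work was done: back to the hot cadence
    # Assumption: the cap bounds the exponential term, not the base.
    backoff = min(cap_s, floor_s * 2 ** (streak - 1))
    return max(base_s, backoff)


if __name__ == "__main__":
    for streak in range(8):
        print(streak, next_tick_s(streak))
```

For the 45-second breather the values run 45, 60, 120, 240, 480, 960, then 1800 from the sixth idle tick onward, matching the table row for row.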

An idle keeper settles to one wake every half hour; the moment there's real work, the next :done drops it straight back to a 45-second beat. This backoff applies in both continuous and interval modes. Turning a keeper on is a handful of environment variables — this is the block you'd actually paste:

WB_KEEPER_DEF=/data/agents/keeper.org   # required — no def, no ticks
WB_KEEPER_INTERVAL_MS=3600000           # default: hourly
WB_KEEPER_CONTINUOUS=1                   # breather mode…
WB_KEEPER_BREATHER_MS=45000             # …45s between ticks
WB_KEEPER_RUN_TIMEOUT_MS=900000         # 15min wall-clock kill
WB_LIFECYCLE_DEF=/data/lifecycle.org    # optional cadence state machine
WB_TENANT=local                         # whose repo the commits land in

Without WB_KEEPER_DEF the keeper starts and idles forever — no def, no ticks. On a fresh boot the first tick fires after the sixty-second grace, not instantly — and on a restart it picks up the cadence clock with max(60s, interval − elapsed).

one manifest, many WORKERS

One keeper runs one agent. To run several agents on their own cadences, you point WB_CREW_DEF at a manifest — an org file where each heading is an agent and its drawer is the config. The two are mutually exclusive: if the crew variable is set it wins; otherwise the single keeper runs; if neither is set, nothing starts. Here's the manifest, verbatim in shape:

* wren
:PROPERTIES:
:DEF: /data/agents/writer.org
:LIFECYCLE: /data/lifecycles/writer.org
:INTERVAL: 10m
:END:
The writer: drafts new sections on its lifecycle.

* moss
:PROPERTIES:
:DEF: /data/agents/editor.org
:INTERVAL: 20m
:END:

Each heading becomes one supervised worker. :DEF: is required — a member without it is skipped and logged. :LIFECYCLE: is optional; absent, the agent just ticks on its interval. :INTERVAL: accepts 10m, 2h, 90s, or bare milliseconds, defaulting to an hour. Each worker's cadence state is namespaced — keeper-last-run-wren, lifecycle-pos-wren — so the agents tick independently with no shared clock. Two things keep a crew from stampeding the engine at boot and the model under load:

flowchart TD
  sup["crew supervisor"]
  sup --> gate["the gate · max 2 concurrent"]
  sup --> w1["wren · first tick +0s"]
  sup --> w2["moss · first tick +30s — stagger"]
  sup --> w3["a third worker · +60s"]
  w1 -- acquire --> gate
  w2 -- acquire --> gate
  w3 -. "queued — gate full" .-> gate
  style gate fill:#13d943,stroke:#121316,stroke-width:2.5px
  style w3 fill:#d9dbd3,stroke:#121316
  

First, the stagger: the i-th worker's first tick is delayed by i times thirty seconds, so agents overlap on the wire but don't all wake at once. Second, the concurrency gate — the first supervised child — admits at most two runs (by default) and queues the rest; each worker acquires before its run and releases after, even on a crash, so a wedged worker can never starve the others. One last truth about crews: when two agents could grab the same task, the runtime does not lock it. Claiming is a def-level protocol — a board state change plus an :AGENT: property committed to git before the work. The runtime only isolates runs and throttles concurrency; it does not arbitrate who owns a task.
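Three of the crew mechanics above are simple enough to sketch: the :INTERVAL: grammar, the stagger, and the gate's release-even-on-crash guarantee. This is an illustration in Python under stated assumptions, not the supervisor's code: every name is hypothetical, rejecting an unparseable interval with ValueError is this sketch's choice, and a thread semaphore stands in for whatever the engine actually uses.

```python
import re
import threading
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

HOUR_MS = 3_600_000
_UNIT_MS = {"s": 1_000, "m": 60_000, "h": 3_600_000}


def parse_interval_ms(raw: Optional[str]) -> int:
    """Parse 10m, 2h, 90s, or bare milliseconds; default to an hour."""
    if raw is None or not raw.strip():
        return HOUR_MS
    match = re.fullmatch(r"(\d+)([smh]?)", raw.strip())
    if match is None:
        # Assumption: the text does not say how bad values are handled.
        raise ValueError(f"unrecognized interval: {raw!r}")
    value, unit = match.groups()
    return int(value) * _UNIT_MS.get(unit, 1)  # no unit means milliseconds


def first_tick_delay_s(index: int, stagger_s: int = 30) -> int:
    """The i-th worker (counting from zero) waits i times thirty seconds."""
    return index * stagger_s


class Gate:
    """Admit at most `max_concurrent` runs; the rest block until a slot frees."""

    def __init__(self, max_concurrent: int = 2) -> None:
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def run(self, job: Callable[[], T]) -> T:
        self._slots.acquire()
        try:
            return job()
        finally:
            # Released even when the job raises, so a crash never holds a slot.
            self._slots.release()
```

The try/finally is the part worth copying: it is what turns "releases after, even on a crash" from a promise into a property.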

declaring what a tick IS

Depth rung. So far a tick has been opaque — "run the agent." A lifecycle makes the cadence itself declarative: a deterministic state machine, written in native org, that runs one transition per tick. The skeleton is deterministic; what the agent does inside a state stays as open-ended as any agent run. Each heading is a state, and its drawer configures it: :KIND: is wake (run the agent) or rem (dream — no agent); :REPEAT: N holds the state for N successful ticks; :NEXT: names the successor; :MIN-INTERVAL: is a time gate — a tick before the interval is a no-op and the position holds; #+START: picks the first state. Here's the canonical shape as a loop:

stateDiagram-v2
  [*] --> wake_add
  wake_add --> wake_add: REPEAT 3
  wake_add --> wake_audit: after 3 hits
  wake_audit --> rem
  rem --> wake_plan: gated — MIN-INTERVAL 10m
  wake_plan --> wake_add: loop
  

Read it as a story: the agent wakes to add three times, then wakes once to audit what it added, then enters a rem state to dream — but only after ten minutes have actually passed — then wakes to plan, and loops back to adding. The stepping rules are precise. A :done increments the hit count; at :REPEAT: it takes :NEXT: and resets. A :no_work collapses the remaining repeats and takes :NEXT: now — a fast-forward of repeats only, never a skip: every state in the declared order still runs, so audits and dreams keep their cadence. A :failed or :killed holds the position and retries the same state next tick — the cadence position is never lost. The position survives restarts, written as state hits to the data volume; an unknown persisted state resets to the spec's start. And the spec is re-read every tick, kept deliberately dumb, so a hot edit is picked up on the next beat.
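The stepping rules read well as a pure function. A sketch under assumptions: the spec is modeled as a dict of states rather than parsed from org, all names are hypothetical, and the :MIN-INTERVAL: gate is left out because the text does not say what moment the interval is measured from.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class State:
    kind: str        # "wake" (run the agent) or "rem" (no agent)
    next: str        # :NEXT:, the successor state
    repeat: int = 1  # :REPEAT:, successful ticks to hold the state


@dataclass(frozen=True)
class Position:
    state: str
    hits: int = 0


def step(spec: Dict[str, State], start: str, pos: Position, outcome: str) -> Position:
    """Apply one tick's outcome to the persisted position."""
    if pos.state not in spec:
        return Position(start)           # unknown persisted state: reset
    state = spec[pos.state]
    if outcome == "done":
        hits = pos.hits + 1
        if hits >= state.repeat:
            return Position(state.next)  # take :NEXT: and reset the count
        return Position(pos.state, hits)
    if outcome == "no_work":
        return Position(state.next)      # collapse remaining repeats
    return pos                           # failed or killed: hold and retry


# The canonical loop from the diagram above.
SPEC = {
    "wake_add": State("wake", "wake_audit", repeat=3),
    "wake_audit": State("wake", "rem"),
    "rem": State("rem", "wake_plan"),
    "wake_plan": State("wake", "wake_add"),
}
```

Notice that no_work only ever moves to the declared successor. There is no branch that jumps two states ahead, which is why audits and dreams keep their place in the order.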

where it BITES

Honesty section, and this one corrects something. Repeater prefixes are recognized, not computed. The kernel sees that +1w, .+1d, and ++1w are repeaters and carries them as strings, but no code in the runtime distinguishes their org-mode semantics — that + shifts by the interval, ++ shifts to the future, .+ measures from the completion date. Those meanings live in the agent's understanding of org, honored at agent level, never in engine code. Likewise the :SCHEDULE: cron string is uninterpreted data — nothing in this runtime parses it.
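Since the engine never does this math, here is what an agent honoring the org convention might compute. This is a sketch of the convention as described above, not of anything in the runtime: next_occurrence is a hypothetical helper, and it handles only hour, day, and week units because months and years need calendar arithmetic.

```python
import re
from datetime import datetime, timedelta

_UNITS = {"h": "hours", "d": "days", "w": "weeks"}


def next_occurrence(at: datetime, repeat: str, completed: datetime) -> datetime:
    """Next date for a repeater, given when the task was marked done."""
    match = re.fullmatch(r"(\.\+|\+\+|\+)(\d+)([hdw])", repeat)
    if match is None:
        raise ValueError(f"unsupported repeater: {repeat!r}")
    prefix, count, unit = match.groups()
    interval = timedelta(**{_UNITS[unit]: int(count)})

    if prefix == "+":
        return at + interval         # shift by one interval, even into the past
    if prefix == ".+":
        return completed + interval  # measure from the completion date
    # "++": shift at least once, then keep shifting until it is in the future
    nxt = at + interval
    while nxt <= completed:
        nxt += interval
    return nxt
```

Finish a task scheduled for 2026-06-13 09:00 on 2026-07-01 at noon and the three prefixes disagree: +1w says 2026-06-20, ++1w says 2026-07-04, and .+2d says 2026-07-03 at noon. Same string shape, three different futures, all decided outside the kernel.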

The parent lesson said a passed DEADLINE on a TODO is "a fact the validator reports." We'll correct that here: the kernel validator checks DAG facts — missing-language components and dangling inputs with no upstream producer — not time facts. Overdue is arithmetic the reader runs on the parsed headlines, not a diagnostic the kernel emits. The structure is checkable; whether a date has passed is something you compute, not something the validator hands you.
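"Arithmetic the reader runs" can be this small. A sketch assuming a hypothetical headline row with a keyword field and a deadline field holding the { at, repeat, active } JSON; the row shape and the choice to treat a date-only deadline as due through the end of that day are assumptions, not kernel behavior.

```python
from datetime import date, datetime


def is_overdue(row: dict, now: datetime) -> bool:
    """True when an open TODO's deadline has passed."""
    deadline = row.get("deadline")
    if row.get("keyword") != "TODO" or deadline is None:
        return False
    at = deadline["at"]
    if "T" in at:
        # Date and time: overdue the moment the instant passes.
        return now > datetime.fromisoformat(at)
    # Date only (assumption): the whole day counts as on time.
    return now.date() > date.fromisoformat(at)
```

Run it over the parsed headline rows with the current time and you have the overdue list. The validator was never going to hand it to you.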

And keeper time is at-least, never at-most. A +1w at 09:00 does not fire at 09:00 sharp — it fires within a tick of it, when an agent that read the plan next wakes. :MIN-INTERVAL: means "no earlier than," never "exactly at." The engine's own duration grammar only knows seconds, minutes, and hours — no days or weeks in keeper intervals, unlike org repeaters, which the kernel happily carries because it never has to do the math. Last: cadence is not autonomy. A keeper makes follow-through cheap; it does not make judgment optional. The pitch is never software that runs itself — direction stays human, and the keeper is just the part that shows up on time.

questions people actually ASK

Will +1w fire at exactly 9:00?

No — within a tick of it. The keeper doesn't read your timestamp and fire at that instant; it wakes an agent on a cadence, and the agent reads the plan and acts on what's due. If the keeper is on an hourly interval, "9:00" means "the first tick at or after 9:00." Keeper time is at-least, not at-most.

What's the difference between +, .+, and ++?

In org convention: + shifts by the interval, ++ shifts forward until the next future date, and .+ measures from the completion date. But the kernel only recognizes the prefix and carries the string verbatim — it doesn't compute the next occurrence. The agent honors the meaning; the kernel just carries it.

SCHEDULED versus DEADLINE?

SCHEDULED is an appointment — when to start. DEADLINE is a due date — when it's owed. The parser surfaces both on every headline, but only SCHEDULED feeds the headline's schedule (after the :SCHEDULE: cron property, which outranks it). DEADLINE is there to be read and reasoned about — overdue is arithmetic you run.

Keeper or external cron — when each?

The keeper exists for internal-only control planes a public cron can't reach — it carries its own heartbeat on-box. When you can reach the HTTP surface, a triggered drain is the leaner shape: any scheduler — cron, a machine exec, CI — hits the endpoint and no always-on poller idles on an empty backlog. The autopoet is the worked example of that triggered shape.

Do crew agents fight over tasks?

The runtime keeps their runs from colliding: it isolates runs and caps concurrency at the gate. It does not lock tasks, though. Claiming is a def-level protocol: an agent changes the board state and commits an :AGENT: property to git before doing the work. Coordination is a convention the agents follow, enforced by the plan being shared and versioned, not by a runtime lock.

What survives a redeploy?

Two small files on the engine's data volume: the last-run timestamp (so the cadence clock isn't reset — the first tick after restart is max(60s, interval − elapsed)) and, if you run a lifecycle, its position as state hits. A redeploy resumes the rhythm rather than restarting it.

keep GOING

Schedules stand on the plan above them and the engine beneath them — follow either direction.