privacy — sharing the work, never the session

the file that knows too MUCH

The parent lesson ended on a promise that doubles as a threat. An agent's working directory (its notes, its drafts, its long-term memory, its decision logs) lives on the same disk as the deliverable. Hand someone the file and you've handed them the project. Wonderful, until you read it the other way: the file that carries your finished report also carries every wrong turn the agent took getting there, every page it read, every key it signs with.

This is not a hypothetical we dreamed up to sound careful. The module that fixes it says so in its own first lines: the default exists to prevent the exact class of leak this project already hit — beads task data pushed to GitHub. Someone ran a bulk commit, the issue tracker rode along, and a public repo got a tour of the backlog. The lesson was cheap because nothing secret was in those tickets. Next time it might be.

So the worry is real and the stakes are concrete: if egress is all-or-nothing, the whole "the workspace is the artifact" pitch collapses, because no sane person ships a file that also ships their agent's brain. The rest of this page is the one mechanism that makes the pitch survive contact with the send button.

the BOUNDARY

pri·va·cy /ˈprʌɪ·və·si/ noun

1. a single boundary module — one source of truth for what is public and what is private — that every egress path consults before anything leaves, so the line can't drift between them. Its rule, stated once in the code: sharing exposes the work, never the session that produced it.

It is exactly seventy lines, readable in one sitting: Workbooks.Private. That smallness is the design. There is no privacy system with settings and surfaces; there is one list, in one place, and three different exits all ask it the same question. Define the answer once where you can audit it, and no exit can quietly disagree with another. The agent never has to remember the boundary, because the boundary is automatic: appended, stripped, and recorded by the engine, not by anyone's discipline.

three doors, one BOUNCER

There are exactly three ways data leaves a workbook, and the trick is that all three ask the same module the same question first:

Git commit — the work syncs to a tenant repo and, on push, to a host like GitHub.
Bundle ship — the workbook is packed into a .wbundle and sent as a file.
Library pack — a workbook is exported from your library to share or to archive.

Each of those used to be a place where someone could forget. Three exits would mean three chances to leak, and three lists that drift apart over time until one of them is wrong. Instead every door routes through Workbooks.Private before a byte crosses — one bouncer, one guest list, three doors:

flowchart LR
  git["git commit<br/>→ tenant repo, GitHub"]
  bun["bundle ship<br/>→ a .wbundle file"]
  lib["library pack<br/>→ share or archive"]
  priv{"Workbooks.Private<br/>is this private?"}
  out["what actually leaves<br/>the work, never the session"]
  stay["stays home<br/>memory · scratch · telemetry · signing key"]
  git --> priv
  bun --> priv
  lib --> priv
  priv -- "public" --> out
  priv -- "private" --> stay
  style priv fill:#a8d4f0,stroke:#121316,stroke-width:2.5px
  style out fill:#aee5c2,stroke:#121316
  style stay fill:#d9dbd3,stroke:#121316
  style git fill:#ffffff,stroke:#121316
  style bun fill:#ffffff,stroke:#121316
  style lib fill:#ffffff,stroke:#121316

Read the graph as a funnel that narrows on purpose. Three arrows come in from the left — the three ways out — and they all converge on a single blue decision node before anything reaches the world. Whatever the node calls public continues out the right as the work; whatever it calls private drops into the grey box and never moves. One node, asked three times, is the entire guarantee. There is no fourth door that skipped the question.

what stays home, EXACTLY

A boundary you can't enumerate is a boundary you can't trust, so here is the real default list — the constants in the module, with what each thing actually is and why it must not ship. No mystery, nothing you can't audit:

what stays home	what it actually is	why it can't ship
`_steps.jsonl`	per-tool agent telemetry — one line appended per tool call	every move the agent made, including dead ends
`_ledger.json`	the sealed run ledger — `_steps.jsonl` hash-chained and signed	the signed provenance trail of the session
`_status.json`	pipeline stage status	internal pipeline state, not a deliverable
`_trace.jsonl`	web-surface step traces — step, tool, output excerpt	raw working notes from live runs
`_telemetry.db`	session telemetry	the session, not the work
`scratch/`	the agent's thinking-out-loud directory	drafts and half-thoughts, by definition
`.workbooks/`	the Ed25519 signing key — the tenant's identity	the private half of the key. Never.
`.beads/`	the issue tracker export	the exact thing that leaked once
`.claude/`	agent config and session scaffolding	operator config, not the artifact
`memory/` · `tmp/`	the two private VFS volumes	long-term agent memory and scratch — not `workspace`
`_*.{jsonl,json,db}`	any FUTURE session sidecar, caught by the `_` prefix rule	future-proofing — no list edit needed

That last row is the cleverest line in the module. The check is: a basename that starts with _ and ends in .jsonl, .json, or .db is private — full stop. So the day some new agent feature writes _dreams.jsonl, it is already private, with nobody editing a list. The convention does the remembering.
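As a minimal Python sketch of that check (not the module's actual code): `is_private`, `PRIVATE_DIRS`, and `SIDECAR_EXTENSIONS` are hypothetical names, and treating a private directory name as private at any depth is an assumption of the sketch. The directory names and the three extensions come straight from the table above.

```python
from pathlib import PurePosixPath

# Names copied from the table above. Matching them at any depth in the path
# is an assumption of this sketch, not a documented rule.
PRIVATE_DIRS = {"scratch", ".workbooks", ".beads", ".claude", "memory", "tmp"}
SIDECAR_EXTENSIONS = (".jsonl", ".json", ".db")


def is_private(path: str) -> bool:
    """Return True if a file path falls on the private side of the boundary."""
    p = PurePosixPath(path)
    # Any parent directory with a private name keeps the whole subtree home.
    if any(part in PRIVATE_DIRS for part in p.parts[:-1]):
        return True
    # The prefix rule: basename starts with "_" and ends in a sidecar extension.
    return p.name.startswith("_") and p.name.endswith(SIDECAR_EXTENSIONS)
```

Under these assumptions, a brand-new `_dreams.jsonl` is private the moment it exists, while `workspace/_notes.md` is not: the underscore alone isn't enough, the extension has to match too.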

And the module produces the matching .gitignore from those same constants — this exact block, appended automatically to every tenant repo:

scratch/
.workbooks/
.beads/
.claude/
memory/
tmp/
_*.jsonl
_*.json
_*.db

One list, rendered for git. The file form and the volume form of "private" come out of the same place — which is the next section's whole point.

one boundary, both FORMS

depth rung · skippable — how one list keeps two shapes in lockstep

Private data exists in two physical forms, and a sloppy design would protect one and forget the other. Form one: tree files — the _*.jsonl sidecars and the dot-directories, sitting as actual files when a workbook is unpacked. Form two: VFS volumes — memory and tmp living as rows inside the SQLite disk, never unpacked at all. Same data, two skins.

The module keeps them in agreement by emitting the volumes as directory globs in the gitignore output. So memory/ and tmp/ appear in the ignore list even though, inside a packed workbook, they aren't directories — they're database volumes. The names line up across both forms:

VFS form (inside the SQLite disk)	file form (when unpacked to a tree)
volume `memory`	directory `memory/`
volume `tmp`	directory `tmp/`
volume `workspace` — the only one that ships	directory `workspace/` — public

The verdict of that table is one sentence: if the SQLite disk is ever unpacked into a real tree, the volumes that are private inside the disk land as directories the ignore list already names, so git skips them. One boundary, both forms, so there is no clever path where data that's private as a volume becomes public the moment it lands as a file. The volumes lesson defines those three regions; this page only decides which of them crosses.
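One way to picture "one list, both forms" is a single set of constants rendered twice: once as gitignore lines, once as the answer to "does this volume ship?". This is a sketch in Python, not the module itself; every constant and function name here is hypothetical, and only the listed values come from this page.

```python
# One set of constants; both renderings below read from it.
PRIVATE_DIRS = ["scratch", ".workbooks", ".beads", ".claude"]
PRIVATE_VOLUMES = ["memory", "tmp"]
PUBLIC_VOLUMES = ["workspace"]
SIDECAR_EXTENSIONS = ["jsonl", "json", "db"]


def gitignore_lines() -> list:
    """File form: directories, then volumes as directory globs, then sidecars."""
    lines = [f"{d}/" for d in PRIVATE_DIRS]
    lines += [f"{v}/" for v in PRIVATE_VOLUMES]
    lines += [f"_*.{ext}" for ext in SIDECAR_EXTENSIONS]
    return lines


def volume_ships(volume: str) -> bool:
    """VFS form: only a public volume crosses the boundary."""
    return volume in PUBLIC_VOLUMES
```

Because `memory/` and `tmp/` are generated from the same `PRIVATE_VOLUMES` list that the volume check reads, the two forms cannot drift apart in this sketch without someone editing the one list.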

strip, then VACUUM

depth rung · skippable — what stripping a disk actually does

On the bundle door, the whole privacy mechanism is two SQL statements run against a copy of the workbook's own disk. The function writes the disk to a temp SQLite file and does this:

DELETE FROM vfs WHERE volume NOT IN ('workspace');
VACUUM;   -- DELETE leaves recoverable pages; VACUUM rebuilds the file

The first statement is obvious: keep workspace, drop the rest. The second is the one people forget, and forgetting it would be a quiet disaster. By default a SQLite DELETE doesn't shred bytes; it marks the space free, and the deleted rows sit recoverable in the file's slack space until something overwrites them. Ship that file and a curious recipient can carve your agent's memory out of the free pages. VACUUM rebuilds the file from scratch, so the deleted private rows are gone, not merely unlinked. Absence, made real.
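Here is a minimal sketch of those two statements run from Python's standard `sqlite3` module. The `vfs` table and its `volume` column come from the SQL above; the `path` and `content` columns, the function names, and the demo markers are assumptions made up for illustration. Unlike the stripper described on this page, this sketch simply lets errors propagate.

```python
import os
import sqlite3
import tempfile


def public_only(disk_bytes: bytes) -> bytes:
    """Return a copy of the disk holding only the workspace volume."""
    fd, tmp_path = tempfile.mkstemp(suffix=".sqlite")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(disk_bytes)
        # isolation_level=None means autocommit; VACUUM cannot run inside
        # an open transaction.
        conn = sqlite3.connect(tmp_path, isolation_level=None)
        try:
            conn.execute("DELETE FROM vfs WHERE volume NOT IN ('workspace')")
            conn.execute("VACUUM")  # rebuild the file so freed pages are gone
        finally:
            conn.close()
        with open(tmp_path, "rb") as f:
            return f.read()
    finally:
        os.remove(tmp_path)


def make_demo_disk() -> bytes:
    """Build a tiny stand-in disk (hypothetical schema) with three volumes."""
    fd, path = tempfile.mkstemp(suffix=".sqlite")
    os.close(fd)
    try:
        conn = sqlite3.connect(path)
        conn.execute("CREATE TABLE vfs (volume TEXT, path TEXT, content BLOB)")
        conn.executemany(
            "INSERT INTO vfs VALUES (?, ?, ?)",
            [
                ("workspace", "report.html", b"<h1>final report</h1>"),
                ("memory", "notes.md", b"AGENT-MEMORY-MARKER"),
                ("tmp", "draft.txt", b"SCRATCH-MARKER"),
            ],
        )
        conn.commit()
        conn.close()
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)
```

Running `public_only(make_demo_disk())` and searching the returned bytes for `AGENT-MEMORY-MARKER` is a quick way to convince yourself the private rows are absent from the file, not just from query results.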

Then the bundle records the decision in its manifest, so the choice is legible to whoever receives the file:

sequenceDiagram
  participant S as Bundle.ship
  participant V as VFS.public_only
  participant M as manifest.json
  S->>V: hand over the disk bytes
  V->>V: DELETE FROM vfs WHERE volume NOT IN ('workspace')
  V->>V: VACUUM — rebuild, no recoverable slack
  V->>S: stripped disk — workspace only
  S->>M: volumes: ["workspace"], private_included: false

Walk that sequence as a short story. The ship step hands the disk to the stripper. The stripper deletes every volume that isn't workspace, then vacuums the file so nothing is carvable, and hands back a disk that contains only the public region. Finally the ship step writes the receipt into the manifest — which volumes shipped, and a flat private_included: false. The recipient never has to take your word for what's inside; the manifest says so, and the bytes back it up.

One honest caveat lives right here. The stripper ends in a rescue that, if the strip itself crashes on malformed input, returns the original bytes — it fails open. In practice the input is the engine's own well-formed disk, so the rescue rarely fires; but it exists, and we'd rather you know than discover it. The honesty section comes back to this.
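To make "fails open" concrete, here is a small Python sketch of the shape that caveat describes, next to a fail-closed variant for contrast. Both function names are hypothetical, and the fail-closed version only illustrates the alternative; it is not something the module does today.

```python
from typing import Callable

Strip = Callable[[bytes], bytes]


def strip_fail_open(disk: bytes, strip: Strip) -> bytes:
    """The shape described above: if the strip crashes, the original ships."""
    try:
        return strip(disk)
    except Exception:
        return disk  # unstripped bytes leave the building


def strip_fail_closed(disk: bytes, strip: Strip) -> bytes:
    """Contrast case: if the strip crashes, nothing ships at all."""
    try:
        return strip(disk)
    except Exception as exc:
        raise RuntimeError("refusing to ship: strip failed") from exc
```

The difference only shows up on the error path, which is exactly why it is easy to miss in normal use.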

why add -A is SAFE

depth rung · skippable — the git door's one trick

The git door earns its safety with a single habit: on every repo init and before every commit, the engine appends any missing private lines to the tenant repo's .gitignore. It's idempotent — only the missing lines get added, existing content is preserved — so the ignore file converges to the full list and stays there.

That one habit is what makes bulk staging safe. A git add -A is the natural thing an agent reaches for, and ordinarily it's a footgun: it sweeps in everything, including the .workbooks/ signing key. Here it can't, because the auto-gitignore has already excluded session and secret data before the add runs. The same protection covers the mirror-snapshot path the engine uses internally. The signing key is a Fly secret (WB_SIGNING_KEY, a base64 Ed25519 seed) whose private half must never enter version control — the ledger's whole attribution model rests on it staying host-only — and the agent never has to think about any of that. It just commits.

# the tenant repo's .gitignore — appended automatically, idempotently
scratch/        .workbooks/        .beads/        .claude/
memory/         tmp/
_*.jsonl        _*.json            _*.db
#  → a bulk `git add -A` physically cannot stage the signing key

The agent never has to remember the public/private boundary, because the engine writes it into the ignore file for it. Discipline you have to maintain is discipline that eventually fails; discipline the engine enforces on every commit doesn't.
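The idempotent append is simple enough to sketch in a few lines of Python. This is an illustration under assumptions, not the engine's code: `ensure_private_ignored` and `PRIVATE_LINES` are hypothetical names, and the patterns are the ones listed on this page.

```python
from pathlib import Path

PRIVATE_LINES = [
    "scratch/", ".workbooks/", ".beads/", ".claude/",
    "memory/", "tmp/",
    "_*.jsonl", "_*.json", "_*.db",
]


def ensure_private_ignored(repo_dir: str) -> list:
    """Append any missing private patterns to .gitignore; return what was added."""
    gitignore = Path(repo_dir) / ".gitignore"
    current = gitignore.read_text() if gitignore.exists() else ""
    present = {line.strip() for line in current.splitlines()}
    missing = [line for line in PRIVATE_LINES if line not in present]
    if not missing:
        return []  # already converged, nothing to write
    # Existing content is left untouched; only add a newline if one is needed.
    separator = "" if current == "" or current.endswith("\n") else "\n"
    with gitignore.open("a") as f:
        f.write(separator + "\n".join(missing) + "\n")
    return missing
```

Calling it twice in a row is the interesting case: the second call finds nothing missing and writes nothing, which is what lets it run before every commit without the file growing.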

your .gitignore is the BOUNDARY

Here's the part that turns this from architecture into something you actually drive. The library door runs a second filter after the built-in defaults: it honors the repo's own .gitignore — using git's own matcher, git check-ignore, not a reimplementation. That's a deliberate choice: exact git semantics, and DRY, because nobody should re-derive gitignore globbing.

The consequence is the most useful sentence on this page. To mark something "don't share," you mark it exactly the way you already mark "don't track" — a line in .gitignore. No bespoke privacy API to learn, no manifest to hand-edit. The instinct every developer already has — drop a path into .gitignore — is the share boundary. The same logic governs build inputs: on a share, source build inputs are dropped so you ship the compiled .wasm, not the source, unless you opt in.

So the library pack's share path is a double filter — the engine's safe defaults (strip_parts), then your repo's own ignore rules layered on top:

flowchart LR
  parts["everything in the workbook"]
  d1["Workbooks.Private<br/>strip the built-in defaults"]
  d2["git check-ignore<br/>honor YOUR .gitignore"]
  ship["what ships"]
  parts --> d1 --> d2 --> ship
  style d1 fill:#a8d4f0,stroke:#121316
  style d2 fill:#aee5c2,stroke:#121316,stroke-width:2.5px
  style ship fill:#ffffff,stroke:#121316
  style parts fill:#ffffff,stroke:#121316

Read it left to right: the full set of parts enters, the blue node removes the engine's always-private defaults, the green node then removes anything your gitignore names, and only what survives both filters ships. The green node is the one you control. The defaults protect you from mistakes; your gitignore expresses your intent — and the share path respects both without you learning a new tool.
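The double filter can be sketched in Python like this. The function names are hypothetical and the first filter is a stand-in for the engine's defaults, not its real implementation. The second filter shells out to `git check-ignore -q`, which exits 0 when a path is ignored and 1 when it is not; it assumes `git` is installed and `repo_dir` is inside a repository.

```python
import subprocess
from pathlib import PurePosixPath
from typing import Callable, Iterable, List

PRIVATE_DIRS = {"scratch", ".workbooks", ".beads", ".claude", "memory", "tmp"}
SIDECAR_EXTENSIONS = (".jsonl", ".json", ".db")


def builtin_private(path: str) -> bool:
    """First filter: a stand-in for the engine's always-private defaults."""
    p = PurePosixPath(path)
    in_private_dir = any(part in PRIVATE_DIRS for part in p.parts[:-1])
    is_sidecar = p.name.startswith("_") and p.name.endswith(SIDECAR_EXTENSIONS)
    return in_private_dir or is_sidecar


def git_ignored(repo_dir: str, path: str) -> bool:
    """Second filter: ask git's own matcher instead of re-deriving globbing."""
    result = subprocess.run(
        ["git", "check-ignore", "-q", "--", path],
        cwd=repo_dir,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    if result.returncode not in (0, 1):  # 128 signals a fatal git error
        raise RuntimeError(f"git check-ignore failed for {path!r}")
    return result.returncode == 0


def share_parts(
    parts: Iterable[str],
    repo_dir: str,
    is_ignored: Callable[[str, str], bool] = git_ignored,
) -> List[str]:
    """Apply the defaults first, then the repo's own ignore rules."""
    survivors = [p for p in parts if not builtin_private(p)]
    return [p for p in survivors if not is_ignored(repo_dir, p)]
```

The `is_ignored` parameter exists only so the sketch can be exercised without a real repository; in normal use the default, git itself, is the whole point.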

saying yes on PURPOSE

Privacy-by-default would be useless if you could never deliberately include the session — for a handoff, or a backup of your own work. So opt-in exists, and it's always the same explicit shape: include_private: true. That uniformity is intentional; it's the same shape as the identity toolkit's --include-private, which emits the Ed25519 private key with a stderr warning and mode-0600 file writes. Including the session is a thing you say, loudly, on purpose.

The library pack gives that decision a legible name — a purpose, not a raw flag:

purpose	what ships	who it's for
`:share` (default)	the work only — session stripped	anyone you send it to
`:archive`	everything, session included	YOU — a backup of your own work

The verdict of that table: a backup of your own work that forgot your session would be a bad backup, so :archive keeps everything — a re-download of your own snapshot still has your agent's memory and logs intact. :share strips, because the person you're sending it to wants the report, not your run history. Same verb, one explicit word of difference. And whichever one you chose, the manifest's private_included field records it — so the recipient of a bundle can read manifest.json and see whether the session is inside before they trust or open anything.
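As a rough Python sketch of how a purpose could map onto the manifest: the `volumes` and `private_included` field names come from this page, while the `Purpose` enum and `manifest_for` function are hypothetical stand-ins for the engine's `:share` and `:archive` option.

```python
from enum import Enum


class Purpose(Enum):
    SHARE = "share"      # the default: work only
    ARCHIVE = "archive"  # a backup for yourself: session included


PUBLIC_VOLUMES = ["workspace"]
ALL_VOLUMES = ["workspace", "memory", "tmp"]


def manifest_for(purpose: Purpose = Purpose.SHARE) -> dict:
    """Record which volumes ship and whether the session is inside."""
    include_private = purpose is Purpose.ARCHIVE
    volumes = ALL_VOLUMES if include_private else PUBLIC_VOLUMES
    return {"volumes": list(volumes), "private_included": include_private}
```

Making `SHARE` the default argument mirrors the privacy-by-default stance: you have to name `ARCHIVE` to get the session.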

One thing survives the strip in every case: provenance. Even on a default share, the HTML is C2PA-signed with the tenant's DID — the signature ships, the session log doesn't. Proof travels; the session doesn't.

where the boundary ENDS

Honesty section. Five things this boundary is not, stated plainly.

The CLI can't opt in yet. The wbx pack verb always ships the stripped form — it doesn't expose --include-private or --archive today. The opt-in is real, but it's an engine/API option for now; the CLI flag is future work. If you want an archive today, you reach the engine, not the command line.

The strip fails open. As the VACUUM section noted, if the stripper crashes on malformed input it returns the original bytes rather than failing closed. The input is normally the engine's own disk, so this is a narrow window — but it's a real one, and we name it.

There's no dedicated test suite. We looked: there is no test file targeting this module specifically. The guarantee here is architectural — one choke point, three callers, seventy auditable lines — not belt-and-braces suite coverage. That's a real property and a real limit at once; honest is honest.

Private isn't hidden from you. The boundary protects egress, not your own control plane. Your engine's surfaces still read _steps.jsonl to render live agent activity — by design. You can watch every move your agent makes; the point is only that none of it leaves. Private means "doesn't cross the boundary," not "secret from its owner."

Opt-in is irreversible once sent. A recipient of an include_private bundle has everything — the memory, the telemetry, the lot — and you cannot un-send it. The manifest tells them what they got; it can't claw it back for you. Choose :archive for yourself, not for strangers.

questions people actually ASK

Is memory deleted from my copy when I share?

No. Egress rewrites a copy on the way out — the strip happens in a temp file, never against your live disk. Your memory and tmp are untouched by sharing; only the thing that left was slimmer than the thing you kept.

Can I see whether a bundle I received contains private data?

Yes — open manifest.json. The volumes field lists what shipped (["workspace"] for a normal share, all three on opt-in) and private_included is a flat true/false. The receipt is right there; you don't have to spelunk the bytes to know.
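If you would rather script that check than eyeball it, a recipient-side sketch in Python might look like this. `session_inside` is a hypothetical helper, and the consistency check (a private volume listed while the flag says false) is an extra sanity test added for illustration, not something the bundle format is documented to require.

```python
import json


def session_inside(manifest_text: str) -> bool:
    """Read a manifest.json string and report whether the session shipped."""
    manifest = json.loads(manifest_text)
    declared = bool(manifest["private_included"])
    extra = [v for v in manifest["volumes"] if v != "workspace"]
    # Sanity check: the flag and the volume list should tell the same story.
    if extra and not declared:
        raise ValueError(f"manifest lists {extra} but says private_included is false")
    return declared
```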

How do I make my own file private?

Three ways, all things you already know. Give it an _ prefix and a .json/.jsonl/.db extension and it's caught by the prefix rule. Or add a line to .gitignore — the library share honors it via git check-ignore. Or write it into the memory or tmp volume. No special API; the boundary speaks the tools you have.

Is this encryption?

No — it's absence, not ciphertext. Private data simply isn't in the shipped bytes. Encryption-that-ships — where the bytes are present but unreadable without a key — is a different mechanism, covered in sealed sections and escrow. This page is about what leaves; those are about what unlocks.

Does provenance get stripped too?

No. The signature ships even when the session doesn't — the HTML is C2PA-signed with the tenant's DID on a default share. Proof travels; the session log stays home. A recipient can verify who published it without ever seeing how it was made.

Why one module instead of a setting per door?

Because three lists drift and one list can't. If git, bundle, and library each owned their own notion of private, the day one of them fell behind is the day something leaks. One seventy-line module, consulted by all three, means the boundary is defined exactly once — you can read it, audit it, and trust that every door agrees because they're all reading the same sentence.

keep GOING

This page is the sharing consequence of the disk — the neighbors below make it whole.

The VFS: the disk this protects on the way out
Volumes: workspace ships · memory + tmp stay
Sealed sections: ciphertext that ships, absence's opposite
The ledger: the _steps + _ledger this keeps home