you already built your TOOLS
The parent lesson sold you the unit — a real program in a sandbox, wrapped in a manual that teaches people and agents — and ended on an honest note: the parts bin is early, so convert one tool you love. This page is how you convert it.
Because here's the thing the brochure skips: your capabilities don't live in
a vacuum waiting to be written. They already exist, in other people's
formats. A directory of Claude skills. An .mcp.json three of
your projects depend on. An OpenAPI spec a vendor handed you. An npm CLI, a pip
CLI, a .cursorrules file, an AGENTS.md. The implied
price of any new ecosystem is rewrite everything in the new shape — and
that price is exactly why most new formats are born dead. Nobody pays it.
So the design goal isn't a converter that's clever. It's a ramp that's honest: convert what's mechanically convertible, do the real work where it can, and where it genuinely can't, leave instructions instead of a shrug. Convert, don't rewrite.
the DEFINITION
1. the ecosystem's intake ramp: detect a source's shape, translate its docs to org, scaffold a real toolkit — manifest, skills, carried scripts — and write the conversion plan it can't finish for you.
The spec states the philosophy without hedging: parse what's parseable, do the work, and leave a manual — never a shrug. That last clause is the whole ethic. An import that can't fully convert your tool does not fail and does not pretend — it produces a finished toolkit plan, written in the same grammar agents already execute, and hands you the rest of the work as a checklist. Parse-only and local-only at this stage; nothing is built, nothing runs, nothing is trusted yet. A guidance-only toolkit — pure manual, no code — is a valid landing, not a consolation prize.
nine shapes, one SNIFF
Detection is not AI and not magic — it's literal shape-sniffing. The
import verb knows nine source kinds, and the taxonomy maps each
shape to what it becomes. An --as <kind> flag overrides
everything; pass a wrong one and you get a clean mapped error, never a panic
(the test feeds it --as vsix and asserts the exit code is not a
crash).
| kind | what's detected | what it becomes |
|---|---|---|
| claude-skill | SKILL.md + frontmatter | main skill + a reference-* skill per references/*.md, scripts carried |
| markdown | a lone .md | a one-skill toolkit, the doc as its manual |
| folder | a directory of anything | every top-level .md → a skill; a stub if none |
| mcp-server | .mcp.json / mcp.json | a skill per server + a synthesized launch script |
| openapi | "openapi"/"swagger" in the first 4 KB | a skill per operation, Dock-brokered HTTP noted |
| npm-cli | package.json with "bin" | scope-stripped id, bins carried, README → manual |
| pip-cli | pyproject.toml | a synthesized stub per console script — for the audit |
| cursor-rules | .cursorrules / .cursor/rules/ | a guidance-only toolkit of rule skills |
| agents-md | AGENTS.md / CLAUDE.md | a one-skill guidance toolkit |
The order of the sniff is the careful part. Identity files win before the
generic .md catch-all — a folder with a SKILL.md is a
Claude skill, not just a pile of markdown. OpenAPI is matched by
content, not extension: any .json/.yaml whose
first 4096 characters contain "openapi", "swagger", or a
line starting openapi: qualifies. And a directory always lands
somewhere — if nothing else matches, it imports as a plain folder. The one
case that bails is a remote ref, covered under the honest limits below.
One stale corner worth naming so you're not misled: the CLI's inline help for
--as lists only three kinds. The parser accepts all nine, and its
error message is the source of truth — claude-skill, markdown, folder,
mcp-server, openapi, npm-cli, pip-cli, cursor-rules, agents-md.
three honest STAGES
The spec calls import an intake ramp on purpose — a ramp has stages, and these three are kept distinct so none of them can lie on behalf of another:
- Stage 1 — parse + scaffold. Detect the shape, translate the docs,
write a real toolkit to disk: a
manifest.org, askills/tree, and any carried scripts. Local and parse-only. - Stage 2 — the dependency audit. This runs automatically at the end of every import. It classifies each carried script against the wasm lanes and writes the findings into the manifest itself — the artifact carries its own audit, not a side report.
- Stage 3 — the fix-up plan. Everything not already sandbox-ready becomes an org TODO checklist with concrete rewrite recipes. This is the agent manual: work the items, prove the toolkit, done.
Here is the whole ramp as the kernel runs it — source on the left, a finished toolkit-with-findings on the right, and the real done-test still ahead of you:
flowchart LR src["a source
(skill / mcp / openapi / cli …)"] src --> detect{"detect()
sniff the shape"} detect --> parse["per-kind parser
md → org, collect scripts"] parse --> scaffold["write_scaffold()
manifest.org + skills/ + scripts/"] scaffold --> audit["audit_static()
auto-runs, stage 2"] audit --> man[["manifest.org
carries findings + fix-up plan"]] man -. "still ahead" .-> ship["push → build → verify"] style src fill:#f3c5a3,stroke:#121316 style man fill:#f3c5a3,stroke:#121316,stroke-width:2.5px style ship fill:#ffffff,stroke:#121316,stroke-dasharray:4 3
The dotted edge is the honest part of the picture: an import produces an
artifact, not a running tool. Imported is not runnable is not trusted. If
the audit fails to run at all, the import still succeeds — it degrades to a quiet
audit skipped note rather than throwing the conversion away. The
manual always lands.
the honest 90%
Depth rung — skippable, but it's where the word honest earns its
keep. The doc translator, md_to_org, is mechanical and
deterministic. No model, no inference. It walks the markdown and rewrites the
constructs it knows, line by line:
| markdown in | org out | rule |
|---|---|---|
# Heading | * Heading | hashes → stars, same depth |
```bash … ``` | #+begin_src bash … #+end_src | no lang → text; unterminated closed at EOF |
[this](https://x.dev) | [[https://x.dev][this]] | link order flips |
**bold** | *bold* | double star → single |
`code` | =code= | backtick → equals |
The thesis is one comment in the source: what it can't translate it passes
through — org tolerates it. Lists, tables, images: untouched, and they still
read, because org is a superset that doesn't choke on stray markdown. That's why
90% is the right number to promise and 100% the wrong one — the translator never
guesses, so it never mangles. Frontmatter is split off first: a leading
--- YAML block becomes key/value pairs (folded and multiline values
handled), and an unterminated block is simply treated as body, not an error. Ten
percent rides through as-is. That's the deal, stated plainly.
what lands on DISK
A scaffold is a real toolkit, not a sketch. The manifest carries a fixed set
of keys that mark its provenance — most importantly #+STATUS: imported
and #+IMPORTED_FROM: <kind> <source>, so the artifact
always remembers where it came from. It refuses to overwrite an existing output
directory: {dir} already exists, full stop (the test re-imports into
a populated dir and asserts the failure). Walk the claude-skill round trip end to
end. The input — a directory with a SKILL.md, a reference doc, and a
shell script:
my-skill/
SKILL.md ---
name: demo-skill
description: does demo things
---
# Demo …
references/extra.md
scripts/run.sh #!/bin/sh
echo hi
One command. The id comes from the frontmatter name, the tagline
from description (truncated to 220 chars), the body becomes the main
skill, and references/extra.md becomes a reference-extra
skill. The human-mode output:
$ wbx toolkit import my-skill imported claude-skill → demo-skill-toolkit/ manifest.org the toolkit surface skills/ 2 skills scripts/ 1 carried (unaudited — see the TODO) next: review manifest.org, then `wbx toolkit verify demo-skill` audited 1 script — 1 ready · 0 convertible · 0 blocked run.sh ready (sh) findings written into manifest.org
And the manifest that lands carries the trust posture in its own body —
verbatim, not a footnote: anything under scripts/ is carried
verbatim and NOT yet trusted to run in the sandbox. A
** TODO dependency audit placeholder is written first, then replaced
in-place by stage 2's findings. For an all-shell skill like this one, the fix-up
section reads nothing to fix — every script is sandbox-ready. The fence in
SKILL.md came across as #+begin_src bash; a link to
https://x.dev became [[https://x.dev][this]]. Every claim
in this paragraph is a line in the test suite.
manufacturing EVIDENCE
Depth rung. The cleverest mechanic in the whole verb is this: for two kinds, import synthesizes a script that doesn't exist in the source, specifically so the audit has something honest to judge. The system manufactures the evidence for its own honesty pass.
Take an MCP server. Its config never contains a script — it contains a launch
command. But the server IS the bin: running it is the capability. So import
writes the launcher into scripts/<name>.sh, and now the audit
can classify it. Watch one .mcp.json become an auditable verdict:
sequenceDiagram participant J as .mcp.json participant I as import participant A as audit_static participant M as manifest.org J->>I: server gh → command npx -y @x/server I->>I: synthesize scripts/gh.sh
exec npx -y @x/server I->>A: here is a real script to judge A->>A: classify npx → npm lane exists A->>M: gh.sh — convertible
resolve + bundle at build time
The pip CLI uses the same trick toward the opposite verdict. From a
pyproject.toml console script, import synthesizes a tiny Python stub
containing only the entry-point comment — synthesized for the audit; the real
implementation lives in the package. It exists so the audit can say the true
thing: blocked — no python lane today, and hand the rewrite recipe to an
agent. The stub is a placeholder for honesty, not a working tool. (pip-cli parses
the TOML by hand, deliberately taking no toml dependency, to keep the import lane
thin.)
names travel, values DON'T
An MCP config routinely carries secrets — an API token sitting in an
env block. Import documents the env names in the skill and
synthesizes the launcher, but the values never leave the source. The skill
says, in effect, set these engine-side; the secret is never copied into any output
file. Feed it a token and look at what lands:
in → {"mcpServers":{"gh":{"command":"npx",
"args":["-y","@x/server"],"env":{"TOKEN":"hunter2"}}}}
out → scripts/gh.sh
#!/bin/sh
# the MCP server launch — the server is the toolkit bin
exec npx -y @x/server
skills/gh.org
Env (names only; set values engine-side): TOKEN
The string hunter2 appears in zero output files — manifest,
skill, and script are all checked, and the test asserts it. The name
TOKEN is documented so a human knows what to set; the value stays put.
Config values may be secrets, so they're treated as secrets by default.
The same skill is honest about a second limit: MCP tools are enumerated at runtime, not in the config. So the import documents the launcher, not the tool list, and says so — run the server to list its tools, then write each as its own skill.
guidance-only is a LANDING
Not every source has code, and the ramp doesn't punish that. A
.cursorrules file, a .cursor/rules/ directory, an
AGENTS.md, or a bare markdown doc all import into a toolkit that is
pure manual — skills and no scripts. The audit, finding nothing to convert, writes
the cleanest possible verdict: no carried scripts — guidance-only toolkit,
nothing to convert. The test feeds it an AGENTS.md and asserts
the import reports nothing to fix — a guidance-only toolkit is a clean
landing, not a failure.
This is the lowest, most stable rung of the ladder the parent lesson named: the manual is the interface. A toolkit that is only a manual is still a real toolkit — installable, versioned, readable by an agent. Sometimes the most valuable thing you have to convert is know-how, and the ramp carries know-how as a first-class artifact.
the honest LIMITS
House rule on these pages: name the edges in the source's own words. The import verb has real ones.
- Remote refs aren't wired. A skills.sh slug, an
owner/repo, anhttpURL — none import directly. Instead of half-trying, the error teaches the fix: Remote refs (skills.sh, owner-repo) aren't wired yet — clone it locally and import the folder: git clone … && wbx toolkit import <dir>. Clone first. - YAML OpenAPI isn't parsed. Only JSON specs parse today, and a YAML spec doesn't get a half-parse — it gets a recipe: convert YAML first: yq -o=json spec.yaml > spec.json && wbx toolkit import spec.json.
- MCP tools are runtime-enumerated. The config has the launcher, not the tool catalog, so the import documents the launcher and tells you to enumerate the rest after first run.
- No python lane today. A python or ruby or perl script audits as blocked. The import is honest about it and hands the rewrite plan.
- Translation is 90%, not 100%. Lists, tables, and images pass through untranslated. They read fine in org; they're just not rewritten.
- Imported ≠ verified. The scaffold is a plan and a candidate, not a shipped tool. Push, build, and verify are still ahead.
The blocked case deserves the worked detail, because it's the one most likely
to read as a failure when it isn't. From a pyproject.toml exposing
pycli = "pycli.main:run", the manifest gains exactly this — an audit
finding, and the org checklist an agent works against:
** dependency audit (static, auto)
*** pycli.py — blocked (python3)
- interpreter =python3= :: blocked — no python lane today —
rewrite in a covered lane or split the logic
** TODO fix-up plan [0/1]
The agent manual: work each item, check it off, then prove the whole
toolkit — done when =wbx toolkit push pycli = · =wbx toolkit build pycli=
· =wbx toolkit verify pycli= all pass. With an engine reachable,
=wbx toolkit audit --fix= runs that push→build→verify for you.
*** TODO pycli.py (blocked — python3)
- [ ] rewrite in JS for the quickjs lane — keep the script's CLI contract
(same args in, same stdout out) so callers don't change
- [ ] or split the logic into org tasks the engine runs natively
- [ ] re-run =wbx toolkit audit= — pycli.py must classify ready
The import didn't fail. It produced the instructions — in the same org grammar agents already execute. The sibling lesson audits owns the depth of how ready, convertible, and blocked get decided; this page hands off cleanly at stage 2.
questions people actually ASK
Does it use an LLM?
No. Detection is filename and content sniffing; translation is a mechanical, deterministic markdown-to-org pass; the audit is a static classifier. Run it twice on the same input and you get the same scaffold. The intelligence is in the structure, not in a model call.
My import says blocked — did it fail?
No — the plan is the product. A blocked script means import couldn't make it
sandbox-ready by itself, so it wrote the rewrite recipe instead. An agent works
the org checklist, and wbx toolkit audit <dir> --fix proves
the result by running push → build → verify. Blocked is a starting line, not a
dead end.
Can I import straight from skills.sh or a repo URL?
Not yet. Remote refs aren't wired, and rather than half-try, the error tells you the move: clone the source locally, then import the folder. The intake ramp is local-only at this stage by design.
What happens to my MCP server's tools list?
Tools are enumerated at runtime, not in the config, so import can't read them statically. It documents the launch command — the server is the bin — and tells you to run the server once, list its tools, and write each as its own skill.
Will my scripts run after import?
Not immediately, and the manifest says so verbatim: carried scripts are NOT yet trusted to run in the sandbox. Imported is not runnable is not trusted — push, build, and verify are the gates between a scaffold and a working tool.
What do agents get instead of the human report?
The same findings, as a JSON envelope — {ok, verb, data} with a
scripts array and a summary of ready/convertible/blocked counts. Human mode
appends the prose report to the output; agent mode keeps it machine-readable.
The manifest carries the findings either way.
keep GOING
Import is stage one of a longer ramp — these lessons carry it the rest of the way to a shipped, trusted tool.