Prompts¶
The system prompts that drive Ossature's LLM calls are declared as PromptSpecs. A PromptSpec is a structured, versioned object that Ossature renders into a final string at call time, rather than a free-form template.
Each PromptSpec carries a stable id like audit.spec_audit or
build.implementer, a semantic version, and an ordered list of named
blocks (role, instructions, output_format, examples, tools, workflow,
and so on). Each block's content is a literal string that includes its
own XML wrapping tags, so a future variant override can swap the
block wholesale without the renderer having to reconstruct the tags.
The variables field is the declared set of placeholders. Today the
only one in use is language. Substitution uses string.Template
syntax (${language}).
The renderer is a pure function. Given a spec id and variable values, it joins the blocks with a blank line, substitutes the variables, and returns the result.
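The block-and-render model described above can be sketched in a few lines. This is a minimal illustration, not Ossature's actual code: the field names on Block and PromptSpec, the render_spec helper, and the demo.example id are assumptions made for the example. Only the use of string.Template and the blank-line join come from the description above.

```python
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class Block:
    name: str     # e.g. "role" or "instructions"
    content: str  # literal text, including its own XML wrapping tags


@dataclass(frozen=True)
class PromptSpec:
    # Field names are assumptions for this sketch.
    id: str
    version: str
    blocks: tuple[Block, ...]
    variables: frozenset[str] = frozenset()


def render_spec(spec: PromptSpec, **values: str) -> str:
    """Join the blocks with a blank line, then fill ${...} placeholders."""
    body = "\n\n".join(block.content for block in spec.blocks)
    return Template(body).substitute(values)


demo = PromptSpec(
    id="demo.example",  # hypothetical id, not a registered Ossature spec
    version="1.0.0",
    blocks=(
        Block("role", "<role>\nYou review ${language} code.\n</role>"),
        Block("instructions", "<instructions>\nBe concise.\n</instructions>"),
    ),
    variables=frozenset({"language"}),
)

rendered = render_spec(demo, language="python")
print(rendered)
```

Because each block carries its own tags, replacing one Block in the tuple swaps that section wholesale without touching the join or substitution logic.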
PromptSpecs live under src/ossature/promptspec/. The spec.py module
holds the Block and PromptSpec dataclasses, renderer.py holds the
registry and render(), and the per-prompt modules live under
specs/audit/ and specs/build/. Each spec module calls register(...)
at import time, so importing ossature.promptspec is enough to
populate the registry.
To inspect a rendered prompt locally:
from ossature.promptspec import render, registered_ids
print(registered_ids())
# ['audit.cross_spec_audit', 'audit.interface_inference', ...]
print(render("audit.spec_audit", language="python"))
# <role>
# You are a senior technical reviewer ...
If you pass an unknown variable, omit a declared one, or use an
unregistered id, render() raises PromptSpecError, so the failure is
loud rather than silent.
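The three failure modes can be illustrated with a stand-in registry. This is a sketch under stated assumptions: the registry layout, the register signature, and the error messages are invented for the example, and PromptSpecError here is a local stand-in for the real exception.

```python
from string import Template


class PromptSpecError(Exception):
    """Local stand-in for the error type named above."""


# Hypothetical registry layout: id -> (template text, declared variables).
_REGISTRY: dict[str, tuple[str, frozenset[str]]] = {}


def register(spec_id: str, text: str, variables: frozenset[str]) -> None:
    _REGISTRY[spec_id] = (text, variables)


def render(spec_id: str, **values: str) -> str:
    if spec_id not in _REGISTRY:
        raise PromptSpecError(f"unregistered spec id: {spec_id!r}")
    text, declared = _REGISTRY[spec_id]
    unknown = set(values) - declared
    missing = declared - set(values)
    if unknown:
        raise PromptSpecError(f"unknown variables: {sorted(unknown)}")
    if missing:
        raise PromptSpecError(f"missing variables: {sorted(missing)}")
    return Template(text).substitute(values)


def fails(spec_id: str, **values: str) -> bool:
    """Return True when render() rejects the call."""
    try:
        render(spec_id, **values)
    except PromptSpecError:
        return True
    return False


register(
    "demo.example",  # hypothetical id
    "<role>\nYou write ${language}.\n</role>",
    frozenset({"language"}),
)
```

Checking the variable sets before substitution gives one consistent error type instead of the KeyError that string.Template would raise on its own for a missing placeholder.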
System prompt vs user prompt¶
The PromptSpec is the system prompt, rendered once and reused across
calls. Context-specific instructions belong in the user prompt
alongside the data they refer to, not in the system prompt as a
conditional ("if X is provided, do Y"). For the planner, this means
the setup-command instruction, audit-findings instruction, and
verbatim-copy-tasks explanation all live in audit/planner.py's
user-prompt assembly, gated by if config.build.setup, if
audit_report.findings, and if context_inventory. The system prompt
stays a stable core, which keeps the rendered text shorter when those
contexts don't apply and helps prefix-based prompt caching achieve
better hit rates.
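A gated user-prompt assembly along these lines might look like the following. The function name, its parameters, and the instruction wording are placeholders for illustration. The real assembly in audit/planner.py reads from config.build.setup, audit_report.findings, and context_inventory as described above.

```python
from typing import Optional, Sequence


def build_user_prompt(
    spec_text: str,
    setup_command: Optional[str] = None,
    findings: Sequence[str] = (),
    context_inventory: Sequence[str] = (),
) -> str:
    """Append each instruction only when the data it refers to is present."""
    parts = [f"<spec>\n{spec_text}\n</spec>"]
    if setup_command:
        parts.append(
            f"Setup command (placeholder wording): `{setup_command}` "
            "runs before the first task."
        )
    if findings:
        listed = "\n".join(f"- {item}" for item in findings)
        parts.append(f"Audit findings to address:\n{listed}")
    if context_inventory:
        listed = "\n".join(f"- {path}" for path in context_inventory)
        parts.append(f"Files eligible for verbatim-copy tasks:\n{listed}")
    return "\n\n".join(parts)


bare = build_user_prompt("spec body")
full = build_user_prompt(
    "spec body",
    setup_command="make setup",
    findings=["missing error type"],
)
```

The system prompt never sees these branches, so its rendered text is identical across calls regardless of which contexts apply.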
Paired specs for the planner¶
There are two planner specs, audit.plan_initial for fresh planning
and audit.plan_replan for re-planning after a spec change. Both
share their role, instructions, output-format, and examples blocks.
plan_replan also includes a preservation_rules block that tells
the model to emit a PreservedTaskRef for previous tasks the diff
doesn't touch.
Routing happens in audit/planner.py. When the call carries both a
spec diff and a previous task list, the planner renders the re-plan
spec. Otherwise it renders the initial spec. Both target the same
pydantic output type, SpecTaskPlan, whose task list is a
discriminated union of PlannerTask and PreservedTaskRef.
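The routing rule and the two-member task union can be sketched with the standard library alone. Everything except the spec ids and the two type names is an assumption here: the function names, the dataclass fields, and the "kind" tag are hypothetical, and the real output type is a pydantic model rather than plain dataclasses.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Union


def choose_planner_spec(
    spec_diff: Optional[str],
    previous_tasks: Optional[Sequence[str]],
) -> str:
    """Re-plan only when both a diff and a previous task list are present."""
    if spec_diff and previous_tasks:
        return "audit.plan_replan"
    return "audit.plan_initial"


@dataclass(frozen=True)
class PlannerTask:
    id: str
    title: str  # hypothetical field


@dataclass(frozen=True)
class PreservedTaskRef:
    id: str  # refers to a task carried over from the previous plan


def parse_entry(raw: dict) -> Union[PlannerTask, PreservedTaskRef]:
    """Dispatch on a tag field, as a discriminated union would."""
    kind = raw.get("kind")  # the "kind" tag name is an assumption
    if kind == "task":
        return PlannerTask(id=raw["id"], title=raw["title"])
    if kind == "preserved":
        return PreservedTaskRef(id=raw["id"])
    raise ValueError(f"unknown task kind: {kind!r}")
```

Since both specs target the same output type, downstream code does not need to know which spec produced a plan. An initial plan simply never contains preserved references.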
Verify validator¶
After the planner generates a SpecTaskPlan, a post-processing
validator walks each task's verify commands and flags any that would
fail because the source they need doesn't exist yet at that point in
the plan. The planner prompt also states this rule in prose, telling
the model that a task's verify may only reference files produced by
that task or one of its depends_on predecessors, and that
scaffold-only tasks should prefer a test -f check or an empty list
over a build command whose source comes later. The validator enforces
the same rule after generation: when it finds a problem, it raises
ModelRetry, which feeds the error message back through the agent
loop and asks the LLM to fix the affected tasks.
The validator's logic is language-agnostic. It asks the active
LanguageProfile two questions per verify command: is this a build
invocation, and does some task in the chain produce a source file. A
profile answers using three tuple fields: build_invocation_tokens
(substrings like "cargo build" or "npm install"),
source_extensions (file extensions that count as compilable source,
like ".rs" or ".py"), and manifest_filenames (basenames that
share an extension with source but act as manifests, like
"build.zig" and "conf.lua"). Empty tuples disable the check, which
is how the generic profile and any unknown language avoid false
positives.
Supporting the validator for a curated language requires no extra file to edit. The three tuples sit alongside the prompt-facing string fields in the same LanguageProfile dataclass.
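The two questions the validator asks can be sketched as follows. The three tuple field names come from the description above. The Task shape (outputs, verify, depends_on), the helper names, the transitive walk over depends_on, and the Zig-like profile values are assumptions for illustration. The real validator raises ModelRetry instead of returning a list.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class LanguageProfile:
    build_invocation_tokens: tuple[str, ...] = ()
    source_extensions: tuple[str, ...] = ()
    manifest_filenames: tuple[str, ...] = ()


@dataclass(frozen=True)
class Task:  # hypothetical shape for this sketch
    id: str
    outputs: tuple[str, ...] = ()
    verify: tuple[str, ...] = ()
    depends_on: tuple[str, ...] = ()


def is_build_invocation(profile: LanguageProfile, command: str) -> bool:
    return any(token in command for token in profile.build_invocation_tokens)


def is_source_file(profile: LanguageProfile, path: str) -> bool:
    name = os.path.basename(path)
    if name in profile.manifest_filenames:
        return False  # shares a source extension but acts as a manifest
    return name.endswith(profile.source_extensions)


def chain(by_id: dict[str, Task], task_id: str) -> list[Task]:
    """Collect the task plus everything reachable through depends_on."""
    seen: dict[str, Task] = {}
    stack = [task_id]
    while stack:
        current = stack.pop()
        if current in seen or current not in by_id:
            continue
        seen[current] = by_id[current]
        stack.extend(by_id[current].depends_on)
    return list(seen.values())


def premature_builds(profile: LanguageProfile, plan: list[Task]) -> list[str]:
    # Empty tuples disable the check (generic or unknown language).
    if not profile.build_invocation_tokens or not profile.source_extensions:
        return []
    by_id = {task.id: task for task in plan}
    errors = []
    for task in plan:
        has_source = any(
            is_source_file(profile, path)
            for member in chain(by_id, task.id)
            for path in member.outputs
        )
        for command in task.verify:
            if is_build_invocation(profile, command) and not has_source:
                errors.append(
                    f"{task.id}: {command!r} runs before any source file exists"
                )
    return errors


# Illustrative profile values, not the curated Zig profile.
zig_like = LanguageProfile(("zig build",), (".zig",), ("build.zig",))
plan = [
    Task("scaffold", outputs=("build.zig",), verify=("zig build",)),
    Task("main", outputs=("src/main.zig",), verify=("zig build",),
         depends_on=("scaffold",)),
]
print(premature_builds(zig_like, plan))
```

In this sketch the scaffold task is flagged because build.zig counts as a manifest, while the main task passes because its own output is a source file.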
Language profiles¶
Some prompts need more than just the language name. The planner prompt, for example, has to mention the build-invocation commands the scaffold rule forbids, typical manifest filenames, and worked task examples in that language. Hard-coding those for every language into the prompt body would force the model to filter the noise at inference time and risk leaking unrelated language tooling into the output.
A LanguageProfile carries the per-language data the prompts need.
It has a setup-command example, scaffold manifest names,
build-invocation examples, a safe verify-examples paragraph, a
common verify command, and a block of worked task examples. Curated
profiles live under src/ossature/promptspec/profiles/ for python,
rust, javascript, typescript, lua, and zig. TypeScript is split from
JavaScript because the tooling diverges (tsc, tsconfig, type-only
verify) even though both run on the npm/node toolchain. Anything
else falls through to a generic profile whose field values use
directive wording (look at the manifest, prefer single-file checks)
and interpolate the language name where needed, so
language = "elixir" keeps working with weaker but still useful
guidance.
When a spec declares language as a variable, the renderer pulls the
active profile's fields into the substitution namespace. A prompt
template can then write ${build_invocation_examples},
${scaffold_manifests}, ${worked_examples}, and the like alongside
${language}, and the renderer fills each from the resolved profile.
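The namespace merge can be sketched with dataclasses.asdict and string.Template. The two-field profile, the PROFILES mapping, the helper names, and the field values are simplifications invented for this example. The real LanguageProfile carries more fields.

```python
from dataclasses import asdict, dataclass
from string import Template


@dataclass(frozen=True)
class LanguageProfile:
    # Only two of the prompt-facing fields, to keep the sketch short.
    scaffold_manifests: str
    build_invocation_examples: str


# Hypothetical curated entry with illustrative values.
PROFILES = {
    "rust": LanguageProfile(
        scaffold_manifests="Cargo.toml",
        build_invocation_examples="cargo build",
    ),
}


def generic_profile(language: str) -> LanguageProfile:
    """Fallback with directive wording that interpolates the language name."""
    return LanguageProfile(
        scaffold_manifests=f"the {language} project manifest",
        build_invocation_examples=f"the {language} build command",
    )


def render_with_profile(template: str, language: str) -> str:
    profile = PROFILES.get(language) or generic_profile(language)
    # Profile fields join ${language} in one substitution namespace.
    namespace = {"language": language, **asdict(profile)}
    return Template(template).substitute(namespace)


TEMPLATE = (
    "Language: ${language}. Manifests: ${scaffold_manifests}. "
    "Avoid: ${build_invocation_examples}."
)
print(render_with_profile(TEMPLATE, "rust"))
print(render_with_profile(TEMPLATE, "elixir"))
```

The same template serves curated and uncurated languages. Only the resolved profile changes.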
Adding a new curated language touches two files. Drop a new module
under profiles/ that fills in the LanguageProfile dataclass and calls
register_profile, then add an import for it to profiles/__init__.py
so the module loads and registers at import time. No prompts need
editing.
Snapshot coverage¶
Every prompt has fixtures under tests/unit/fixtures/promptspec/,
capturing the rendered output for each language-bearing spec across
each curated language plus a fallback case that exercises the
generic profile. A parametrized snapshot test in
tests/unit/test_promptspec_snapshots.py re-renders each spec and
compares against its fixture. A dedicated test in
tests/unit/test_language_profiles.py enforces the cross-language
guarantee that a render targeted at one language never mentions
another curated language's tooling.
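The cross-language guarantee amounts to a substring scan over each rendered prompt. The sketch below is one way such a check could be written, not the contents of test_language_profiles.py. The TOOLING table and the helper name are hypothetical.

```python
# Hypothetical per-language tooling tokens. A real table would have to
# exclude tokens that related languages legitimately share.
TOOLING = {
    "python": ("pytest", "pyproject.toml"),
    "rust": ("cargo", "Cargo.toml"),
}


def foreign_mentions(rendered: str, language: str) -> list[str]:
    """Return tooling tokens of other languages found in the render."""
    text = rendered.lower()
    return sorted(
        token
        for other, tokens in TOOLING.items()
        if other != language
        for token in tokens
        if token.lower() in text
    )


clean = foreign_mentions("Run pytest after editing pyproject.toml.", "python")
leaky = foreign_mentions("Run cargo build to verify.", "python")
```

A test built on this helper would assert an empty list for every curated language and spec. Languages that share a toolchain, such as JavaScript and TypeScript on npm, would need their common tokens left out of the table to avoid false failures.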