Schema, rules, and retry config: none of these needs a live LLM to test. Your contract logic is pure, and the entire pipeline is exposed through verify and a mockable RunFn.
Test schema + rules without an LLM
verify validates data directly, with no RunFn or model call involved.
Test the full loop with a fake RunFn
RunFn is just (attempt) => Promise<string | null>. Replace it with a function that returns canned strings:
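A fake RunFn can be as small as a closure over a list of canned outputs. The sketch below is one way to write it, not the library's own helper: the Attempt type is a local stand-in for whatever the loop actually passes in, and fakeRunFn and its calls array are hypothetical names introduced here for illustration.

```typescript
// Local stand-in types (assumption): the library exports its own RunFn and
// attempt types, and their exact shape may differ from this.
type Attempt = { repairs?: unknown[] };
type RunFn = (attempt: Attempt) => Promise<string | null>;

// Hypothetical helper: returns a RunFn that yields the canned outputs in order
// and repeats the last one once the list is exhausted.
// `calls` records every attempt object the fake received.
function fakeRunFn(outputs: Array<string | null>): RunFn & { calls: Attempt[] } {
  const calls: Attempt[] = [];
  const run = async (attempt: Attempt): Promise<string | null> => {
    const index = Math.min(calls.length, outputs.length - 1);
    calls.push(attempt);
    // An empty outputs list falls through to null.
    return outputs[index] ?? null;
  };
  return Object.assign(run, { calls });
}
```

Pass the result wherever a contract expects a RunFn. After the run, the length of calls shows how many attempts the loop made.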
Assert on repair context
Your RunFn sees attempt.repairs, so you can assert that the loop is actually sending repair messages back.
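One way to check this is a recording RunFn that fails on the first attempt and keeps every attempt object it receives. In the sketch below, toyLoop is a hypothetical stand-in for the library's retry loop, included only so the example runs on its own; the Repair and Attempt shapes are assumptions too. In a real test you would hand recordingRunFn to your contract instead.

```typescript
// Assumed shapes, for illustration only; the library's attempt object may differ.
type Repair = { message: string };
type Attempt = { number: number; repairs: Repair[] };
type RunFn = (attempt: Attempt) => Promise<string | null>;

// Hypothetical stand-in for the library's loop: retries until the output parses
// as JSON, adding one repair message per failed attempt.
async function toyLoop(run: RunFn, maxAttempts: number): Promise<unknown> {
  const repairs: Repair[] = [];
  for (let n = 1; n <= maxAttempts; n++) {
    const raw = await run({ number: n, repairs: [...repairs] });
    try {
      return JSON.parse(raw ?? "");
    } catch {
      repairs.push({ message: `attempt ${n} did not return valid JSON` });
    }
  }
  throw new Error("all attempts failed");
}

// Recording RunFn: returns prose when it sees no repairs, valid JSON otherwise,
// and stores every attempt so the test can inspect it afterwards.
const seen: Attempt[] = [];
const recordingRunFn: RunFn = async (attempt) => {
  seen.push(attempt);
  return attempt.repairs.length === 0 ? "not json" : '{"ok": true}';
};
```

After the run, assert that the first recorded attempt has no repairs and the second has at least one. That shows the failure was fed back rather than silently retried.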
Test without defineContract at all
Use enforce inline for one-shot tests, without defining a contract first.
Test failure categories
The 8 failure categories each have distinct triggers. Use fake RunFn outputs to hit them:
| Category | RunFn returns | Why |
|---|---|---|
| EMPTY_RESPONSE | null or "" | nothing to parse |
| REFUSAL | "I can't help with that." | detected refusal language |
| NO_JSON | "just prose, no json" | no parseable JSON |
| TRUNCATED | "{\"a\": 1, \"b\":" | obviously cut off |
| PARSE_ERROR | "{a: 1}" | malformed JSON |
| VALIDATION_ERROR | valid JSON but wrong types | schema rejected |
| RULE_ERROR | passes schema, fails a rule | rule rejected |
| RUN_ERROR | throw new Error("boom") | your RunFn threw |
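The table translates naturally into a fixture map with one fake RunFn per category, which a parameterized test can iterate over. This is a sketch under assumptions: RunFn is redeclared locally, categoryFixtures and returns are hypothetical names, and the VALIDATION_ERROR and RULE_ERROR payloads assume an illustrative schema { score: number } with a rule that score must not exceed 100. Swap in values that break your own contract.

```typescript
// Local stand-in for the library's RunFn type (assumption).
type RunFn = (attempt: unknown) => Promise<string | null>;

type Category =
  | "EMPTY_RESPONSE" | "REFUSAL" | "NO_JSON" | "TRUNCATED"
  | "PARSE_ERROR" | "VALIDATION_ERROR" | "RULE_ERROR" | "RUN_ERROR";

// Small helper: a RunFn that always resolves to the same canned value.
const returns = (value: string | null): RunFn => async () => value;

// One trigger per category, taken from the table above.
const categoryFixtures: Record<Category, RunFn> = {
  EMPTY_RESPONSE: returns(null),
  REFUSAL: returns("I can't help with that."),
  NO_JSON: returns("just prose, no json"),
  TRUNCATED: returns('{"a": 1, "b":'),
  PARSE_ERROR: returns("{a: 1}"),
  // Hypothetical contract: schema { score: number }, rule score <= 100.
  VALIDATION_ERROR: returns('{"score": "high"}'),
  RULE_ERROR: returns('{"score": 250}'),
  // The only fixture that rejects instead of resolving.
  RUN_ERROR: async () => {
    throw new Error("boom");
  },
};
```

Keying the map by a union type means the compiler flags a missing category if the list ever changes in your test code.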
Disable retries in tests
The default maxAttempts: 3 means a failing RunFn runs three times. For tighter feedback, lower it; maxAttempts: 1 makes a single attempt with no retries.
Turn off the logger
If your test environment has the Boundary API key set (e.g. the CI environment leaks BOUNDARY_API_KEY), you don’t want tests shipping events to the real dashboard. Three options:
- Scrub the env: vi.stubEnv("BOUNDARY_API_KEY", "") so createBoundaryLogger returns null.
- Pass logger: undefined at call time, which overrides the defined logger for this test.
- Use a capture sink, so events are collected in the test rather than shipped.
See also
- Engine primitives: verify, classify, clean used directly
- ContractLogger hooks: assert on specific lifecycle events