## Use Boundary when

### LLM output feeds into application logic

Database writes, API calls, routing decisions, workflow triggers — anywhere the output drives behavior, not just display.

### Incorrect values would break workflows

Not just incorrect types (Zod handles that), but incorrect values. A lead scored as “hot” with a score of 25. An invoice where subtotal + tax ≠ total.

### Cross-field consistency is required

When fields must be consistent with each other — tier matches score, currency matches account, action matches status.

### You need correctness, not just structure

Structured outputs guarantee valid JSON. Boundary guarantees the values are actually right.
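The value-level and cross-field checks above are the kind of logic a structural schema skips. A minimal sketch in plain TypeScript (the `LeadInvoice` shape, field names, and the score threshold are assumptions for illustration, not Boundary's API):

```typescript
// Hypothetical output shape for illustration — field names and the
// tier/score rule are assumptions, not part of Boundary's API.
interface LeadInvoice {
  tier: "hot" | "warm" | "cold";
  score: number; // 0–100
  subtotal: number;
  tax: number;
  total: number;
}

// Value-level checks that a purely structural validator would pass over:
function validateValues(o: LeadInvoice): string[] {
  const errors: string[] = [];
  // Cross-field consistency: a "hot" tier should imply a high score
  // (the threshold of 70 is an assumed business rule).
  if (o.tier === "hot" && o.score < 70) {
    errors.push(`tier "hot" is inconsistent with score ${o.score}`);
  }
  // Arithmetic consistency: subtotal + tax must equal total
  // (with a small tolerance for floating-point rounding).
  if (Math.abs(o.subtotal + o.tax - o.total) > 0.005) {
    errors.push(`subtotal + tax (${o.subtotal + o.tax}) does not equal total (${o.total})`);
  }
  return errors;
}
```

Both objects below would satisfy a type-only schema; only the value checks distinguish them.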
## Don’t use Boundary when

### Chat / conversational UX

Free-form text responses shown directly to users. No structured output to validate.

### Creative generation

Poetry, stories, marketing copy — there’s no “correct” output to enforce.

### Free-form text without structure

If the LLM output isn’t JSON, there’s no schema to validate against.

### Human makes the final decision

If a person reviews every output before it enters your system, the human is the boundary.
## Performance and cost

Boundary adds zero overhead on the happy path beyond validation (which is synchronous and in-memory). Extra LLM calls happen only on failure:

| Scenario | LLM calls |
|---|---|
| Output is correct on first try | 1 call (no extra cost) |
| Output needs one repair | 2 calls |
| Output needs two repairs | 3 calls (default max) |
| All attempts fail | 3 calls + structured error |
The `result.durationMS` field reports the exact duration of each call.
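The call counts in the table follow from a repair loop: validate, and on failure feed the errors back for another attempt, up to a cap. A simplified model of that pattern (a sketch, not Boundary's implementation; `callModel`, `validate`, and `maxAttempts` are hypothetical names):

```typescript
// Simplified repair-loop model — illustrative only, not Boundary's real API.
// callModel: makes one LLM call, optionally with validation feedback.
// validate: returns [] when the output is correct, error messages otherwise.
async function withRepair<T>(
  callModel: (feedback?: string) => Promise<T>,
  validate: (output: T) => string[],
  maxAttempts = 3 // matches the default max of 3 calls in the table above
): Promise<{ output: T; calls: number }> {
  let feedback: string | undefined;
  for (let calls = 1; calls <= maxAttempts; calls++) {
    const output = await callModel(feedback);
    const errors = validate(output);
    if (errors.length === 0) return { output, calls }; // happy path: 1 call
    feedback = errors.join("; "); // feed errors back for the repair attempt
  }
  throw new Error(`validation failed after ${maxAttempts} attempts`);
}
```

A correct first response costs exactly one call; each repair adds one more, and the cap bounds the worst case.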
## The right mental model

Boundary is not:

- A testing framework
- A prompt engineering tool
- A model evaluation suite