You do not need the Boundary dashboard to get useful feedback while building a contract. The local package can show attempts, repair messages, rule failures, and test failures entirely inside your process. Use this workflow before you add @withboundary/sdk.

Use debug: true first

For a new contract, start with the built-in debug logger:
import { defineContract } from "@withboundary/contract";

export const leadContract = defineContract({
  name: "lead-scoring",
  schema,
  rules,
  debug: true,
});
This is the shortest path to useful output. It prints the run lifecycle to the console and does not send anything over the network. Use it while you are answering basic questions:
  • Which rule failed?
  • Did the model receive the repair message?
  • Did it accept on the second or third attempt?
  • Is the reject path carrying enough detail?

Use createConsoleLogger when you need more detail

debug: true enables the default console logger. When you need to inspect prompt instructions, raw output, or cleaned output, pass createConsoleLogger directly:
import {
  createConsoleLogger,
  defineContract,
} from "@withboundary/contract";

const logger = createConsoleLogger({
  showInstructions: true,
  showRepairs: true,
  showRawOutput: true,
  showCleanedOutput: true,
  maxStringLength: 800,
});

export const leadContract = defineContract({
  name: "lead-scoring",
  schema,
  rules,
  logger,
});
Turn raw output on only while debugging. It can contain customer data, prompts, or provider metadata.
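One way to keep raw output off by default is to derive the logger options from an environment variable. The sketch below is an illustration only: the option names are copied from the example above, while `consoleLoggerOptions` and the `BOUNDARY_DEBUG_RAW` variable are hypothetical names that the package does not know about.

```typescript
// Option names are taken from the createConsoleLogger example above.
// BOUNDARY_DEBUG_RAW is a hypothetical variable name, not something the package reads.
type Env = Record<string, string | undefined>;

function consoleLoggerOptions(env: Env) {
  return {
    showInstructions: true,
    showRepairs: true,
    // Raw output stays off unless the developer opts in for this run.
    showRawOutput: env["BOUNDARY_DEBUG_RAW"] === "1",
    showCleanedOutput: true,
    maxStringLength: 800,
  };
}
```

Under these assumptions you would pass the result to the logger, for example `createConsoleLogger(consoleLoggerOptions(process.env))`, and set the variable only for the debugging session that needs it.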

Test rules without calling a model

Most contract bugs are rule bugs. Test them with verify() and plain objects:
import { describe, expect, it } from "vitest";
import { verify } from "@withboundary/contract";
import { leadSchema, leadRules } from "../src/contracts/lead";

describe("lead rules", () => {
  it("rejects a hot lead below the score threshold", () => {
    const result = verify(
      { tier: "hot", score: 25, reason: "Low intent" },
      leadSchema,
      leadRules,
    );

    expect(result.ok).toBe(false);
    if (result.ok) return;

    expect(result.error.attempts[0].category).toBe("RULE_ERROR");
    expect(result.error.attempts[0].issues).toContain(
      'tier is "hot" but score is 25; set tier to warm/cold or raise score to at least 70',
    );
  });
});
This test is deterministic. No provider key, no network, no flaky prompt behavior.
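This page does not show how leadRules is defined. Purely as an illustration of the logic the test exercises, and not as the rule format the package expects, the check behind that issue string could look like the plain function below. The `Lead` type, `HOT_MIN_SCORE`, and `hotTierIssue` are hypothetical names; the threshold of 70 is taken from the expected message.

```typescript
// Hypothetical shape of the object under test, inferred from the test input.
type Lead = { tier: "hot" | "warm" | "cold"; score: number; reason: string };

// Threshold taken from the issue message in the test.
const HOT_MIN_SCORE = 70;

// Returns an issue string when the lead violates the rule, or null when it passes.
// The message names the conflicting fields and says how to fix them,
// which is what makes it usable as a repair instruction.
function hotTierIssue(lead: Lead): string | null {
  if (lead.tier === "hot" && lead.score < HOT_MIN_SCORE) {
    return `tier is "hot" but score is ${lead.score}; set tier to warm/cold or raise score to at least ${HOT_MIN_SCORE}`;
  }
  return null;
}
```

Writing the check as a pure function of the parsed object is what lets the test run without a provider key or network access.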

Test the repair loop with a fake RunFn

Once rules pass, test the full loop with canned model responses:
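import { expect, it } from "vitest";
// Assumes leadContract is exported next to the schema and rules; adjust the path to your project.
import { leadContract } from "../src/contracts/lead";

it("repairs then accepts", async () => {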
  const responses = [
    JSON.stringify({ tier: "hot", score: 25, reason: "" }),
    JSON.stringify({ tier: "cold", score: 25, reason: "Low intent" }),
  ];

  const result = await leadContract.accept(async () => responses.shift()!);

  expect(result.ok).toBe(true);
  if (!result.ok) return;
  expect(result.attempts).toBe(2);
});
This tells you whether the contract loop behaves correctly before a real model is involved.
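With `responses.shift()!`, a test that makes more calls than you prepared responses for silently passes `undefined` to the contract. A small helper can make that case fail loudly instead. This is a minimal sketch: `cannedRun` is a hypothetical helper, not part of @withboundary/contract, and it assumes the run function is an async function that resolves to the raw output string, as in the test above.

```typescript
// Hypothetical test helper, not part of @withboundary/contract.
type FakeRun = (() => Promise<string>) & { served: () => number };

function cannedRun(responses: readonly string[]): FakeRun {
  let index = 0;

  const run = async (): Promise<string> => {
    const next = responses[index];
    if (next === undefined) {
      // Fail loudly instead of handing `undefined` to the code under test.
      throw new Error(`cannedRun: no canned response left for call ${index + 1}`);
    }
    index += 1;
    return next;
  };

  // `served` reports how many canned responses have been handed out so far.
  return Object.assign(run, { served: () => index });
}
```

In the test above, you could then write `const run = cannedRun([...])`, pass `run` to `leadContract.accept`, and assert on `run.served()` to confirm how many model calls the loop actually made.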

Capture local metrics with ContractLogger

When console logs get noisy, write a small logger for the signals you care about:
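import { defineContract } from "@withboundary/contract";
import type { ContractLogger } from "@withboundary/contract";
// `metrics` is your application's own metrics client (not shown here).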

const metricsLogger: ContractLogger = {
  onRunSuccess(ctx) {
    metrics.increment("boundary.accepted", {
      contract: ctx.contractName,
      attempts: String(ctx.attempts),
    });
  },
  onRunFailure(ctx) {
    metrics.increment("boundary.rejected", {
      contract: ctx.contractName,
      category: ctx.category ?? "unknown",
    });
  },
  onVerifyFailure(ctx) {
    metrics.increment("boundary.verify_failed", {
      contract: ctx.contractName,
      category: ctx.category,
    });

    console.debug("boundary verify failed", {
      contract: ctx.contractName,
      issues: ctx.issues,
      ruleIssues: ctx.ruleIssues,
    });
  },
};

export const leadContract = defineContract({
  name: "lead-scoring",
  schema,
  rules,
  logger: metricsLogger,
});
Keep logger hooks lightweight. They run during the contract lifecycle and should not perform slow work inline.
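One way to keep hooks cheap is to have them append to an in-memory buffer and send the batch later, outside the contract lifecycle. The sketch below assumes nothing about the package: `BufferedSink`, `LogEvent`, and `SendBatch` are hypothetical names, and `send` stands in for whatever slow destination you use (a file, an HTTP endpoint, a metrics agent).

```typescript
// Hypothetical helper: hooks record events in memory; a slow sink drains them later.
type LogEvent = { name: string; tags: Record<string, string> };
type SendBatch = (batch: LogEvent[]) => Promise<void>;

class BufferedSink {
  private buffer: LogEvent[] = [];
  private readonly send: SendBatch;
  private readonly maxBuffered: number;

  constructor(send: SendBatch, maxBuffered = 1000) {
    this.send = send;
    this.maxBuffered = maxBuffered;
  }

  // Safe to call from a logger hook: synchronous, no I/O.
  record(name: string, tags: Record<string, string> = {}): void {
    // Bound memory by dropping the oldest event when the buffer is full.
    if (this.buffer.length >= this.maxBuffered) this.buffer.shift();
    this.buffer.push({ name, tags });
  }

  get size(): number {
    return this.buffer.length;
  }

  // Call from a timer or at shutdown. Resolves to the number of events sent.
  async flush(): Promise<number> {
    const batch = this.buffer;
    this.buffer = [];
    if (batch.length > 0) await this.send(batch);
    return batch.length;
  }
}
```

A hook would then call something like `sink.record("boundary.rejected", { contract: ctx.contractName })`, while a `setInterval` or a shutdown handler calls `sink.flush()`. Dropping the oldest events when the buffer fills is a design choice that favors bounded memory over complete logs; pick a different policy if you cannot afford to lose events.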

What to log locally

In development, useful logs are usually:
  • contract name
  • final ok state
  • attempt count
  • final failure category
  • failing rule names or issue strings
  • repair messages
  • raw output only when debugging a specific issue
Avoid logging full prompts and outputs by default. They are often the most sensitive data in the system.
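If you write your own logger, a small truncation helper makes it harder to dump a whole prompt or output by accident. The function below is a sketch under that assumption: `summarizeForLog` and its default limit of 200 characters are hypothetical and not part of the package.

```typescript
// Hypothetical helper for custom loggers: cap how much of any value reaches the log.
function summarizeForLog(value: unknown, maxLength = 200): string {
  let text: string;
  if (typeof value === "string") {
    text = value;
  } else {
    // JSON.stringify returns undefined for values such as `undefined` or functions,
    // so fall back to String() in that case.
    text = JSON.stringify(value) ?? String(value);
  }

  if (text.length <= maxLength) return text;

  // Keep the head of the text and state how much was cut.
  const omitted = text.length - maxLength;
  return `${text.slice(0, maxLength)}... [${omitted} more chars]`;
}
```

Truncation limits how much sensitive text lands in a log line, but it does not redact anything: the first characters of a prompt can still contain customer data, so the advice above about leaving raw output off by default still applies.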

When to add the hosted SDK

Local logging is enough while one developer is tuning a contract. Add @withboundary/sdk when you need shared production visibility:
  • acceptance rate over time
  • top failing rules across traffic
  • model or prompt regressions
  • alerting when correctness degrades
  • team debugging without searching server logs
See Production observability when you are ready to wire that in.