Production Observability

Local logging helps while you are building. The hosted SDK helps after real traffic starts. Use @withboundary/sdk when you need to answer production questions without reading application logs:

Which contracts are rejecting most often?
Which rule is causing retries?
Did a prompt or model change reduce acceptance rate?
Are failures isolated to one environment?
Which runs need human review?

Install the SDK

npm install @withboundary/sdk

@withboundary/sdk is separate from @withboundary/contract. Installing the local contract package never enables cloud telemetry by itself.

Add a Boundary logger

import { defineContract } from "@withboundary/contract";
import { createBoundaryLogger } from "@withboundary/sdk";

const boundaryLogger = createBoundaryLogger({
  apiKey: process.env.BOUNDARY_API_KEY,
  environment: "production",
});

// `schema` and `rules` are defined elsewhere in your application.
export const leadContract = defineContract({
  name: "lead-scoring",
  schema,
  rules,
  logger: boundaryLogger,
});

If BOUNDARY_API_KEY is missing and no custom write sink is configured, createBoundaryLogger() returns null. Passing that to defineContract is safe. The contract still runs; it just sends no events.
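Because a missing key disables telemetry silently, it can help to surface that state at startup. The following is a minimal sketch, not part of the SDK: `LoggerLike` and `telemetryStatus` are hypothetical names, and the only assumption about the real logger is that it is either an object or null.

```typescript
// Stand-in for the logger returned by the SDK; the real type may differ.
interface LoggerLike {
  flush(): Promise<void>;
}

// Hypothetical helper: describe whether telemetry is active so a missing key
// is visible at startup instead of silently sending nothing.
function telemetryStatus(logger: LoggerLike | null, environment: string): string {
  return logger === null
    ? `Boundary telemetry disabled in ${environment}: no API key or write sink`
    : `Boundary telemetry enabled in ${environment}`;
}
```

Logging the returned string once during boot, for example with console.warn, makes a misconfigured deployment easy to spot.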

Keep raw data off by default

The SDK’s defaults are conservative:

createBoundaryLogger({
  apiKey: process.env.BOUNDARY_API_KEY,
  environment: "production",
  capture: {
    inputs: false,
    outputs: false,
    repairs: true,
  },
});

With these defaults, Boundary receives run metadata, failure categories, failing rule names, repair messages, duration, attempt count, model label, and SDK/runtime attribution. Raw prompts and raw model outputs stay in your process unless you opt in.
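To make the capture buckets concrete, here is a rough sketch of how a capture policy could gate what leaves the process. The `RunEvent` field names and the `applyCapturePolicy` helper are illustrative assumptions based on the list above, not the SDK's actual wire format or implementation.

```typescript
// Hypothetical event shape; field names are illustrative only.
interface RunEvent {
  contract: string;
  environment: string;
  accepted: boolean;
  failureCategory?: string;
  failingRules: string[];
  repairMessages: string[];
  durationMs: number;
  attempts: number;
  model?: string;
  input?: unknown; // raw prompt data
  output?: unknown; // raw model output
}

interface CapturePolicy {
  inputs: boolean;
  outputs: boolean;
  repairs: boolean;
}

// Returns a copy of the event that contains only the buckets the policy allows.
function applyCapturePolicy(event: RunEvent, capture: CapturePolicy): RunEvent {
  const { input, output, ...metadata } = event;
  return {
    ...metadata,
    repairMessages: capture.repairs ? metadata.repairMessages : [],
    ...(capture.inputs && input !== undefined ? { input } : {}),
    ...(capture.outputs && output !== undefined ? { output } : {}),
  };
}
```

With `inputs` and `outputs` set to false, the returned object has no `input` or `output` keys at all, so raw payloads cannot be serialized by accident.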

Opt into raw payloads only when needed

Raw inputs and outputs are useful for staging or short debugging windows:

const boundaryLogger = createBoundaryLogger({
  apiKey: process.env.BOUNDARY_API_KEY,
  environment: "staging",
  capture: {
    inputs: true,
    outputs: true,
    repairs: true,
  },
  redact: {
    fields: ["email", "phone", "ssn", "apiKey"],
    // Matches US Social Security numbers written as 123-45-6789.
    patterns: [/\b\d{3}-\d{2}-\d{4}\b/],
  },
});

Do not turn raw capture on just to make the dashboard useful. Rule failures, categories, repairs, and acceptance rates are enough for most production monitoring.
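For intuition about what field and pattern redaction do to a payload, the sketch below masks values by key name and by regular expression. It is a standalone illustration under assumed semantics (exact key match, every pattern match replaced); the `redact` function, `RedactOptions` type, and `[REDACTED]` mask are hypothetical, and the SDK's real behavior may differ.

```typescript
interface RedactOptions {
  fields: string[];
  patterns: RegExp[];
}

const MASK = "[REDACTED]";

// Recursively masks values whose key is listed in `fields` and
// replaces every pattern match inside string values.
function redact(value: unknown, opts: RedactOptions): unknown {
  if (typeof value === "string") {
    return opts.patterns.reduce((text, pattern) => {
      // Force the global flag so all matches are replaced, not just the first.
      const flags = pattern.flags.includes("g") ? pattern.flags : pattern.flags + "g";
      return text.replace(new RegExp(pattern.source, flags), MASK);
    }, value);
  }
  if (Array.isArray(value)) {
    return value.map((item) => redact(item, opts));
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value)) {
      out[key] = opts.fields.includes(key) ? MASK : redact(inner, opts);
    }
    return out;
  }
  return value; // numbers, booleans, null, undefined pass through unchanged
}
```

Running a function like this over a sample payload in a unit test is a cheap way to check that your field list and patterns cover the data you actually send.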

Flush in short-lived runtimes

Long-running Node servers can rely on normal batching. Serverless and edge runtimes need an explicit flush near the end of the request:

try {
  return await handler(req);
} finally {
  // Runs on success and on error; the optional call covers a null logger.
  await boundaryLogger?.flush();
}
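If several handlers need the same treatment, the try/finally pattern can be factored into a wrapper. This is a sketch under two assumptions: the logger exposes a `flush()` that returns a promise, and flush errors should not fail the request. The `withFlush` name and `Flushable` type are hypothetical and not part of the SDK.

```typescript
// Minimal shape the wrapper relies on; the real logger type may expose more.
interface Flushable {
  flush(): Promise<void>;
}

// Wraps a handler so buffered events are flushed after every invocation,
// whether the handler resolves or throws.
function withFlush<Req, Res>(
  logger: Flushable | null,
  handler: (req: Req) => Promise<Res>,
): (req: Req) => Promise<Res> {
  return async (req: Req): Promise<Res> => {
    try {
      return await handler(req);
    } finally {
      // Design choice: swallow flush errors so telemetry problems
      // never replace the handler's own result or error.
      await logger?.flush().catch(() => undefined);
    }
  };
}
```

Awaiting the flush adds its latency to the response. Runtimes that offer a waitUntil-style hook can hand the flush promise to that hook instead.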

See the runtime guides for exact patterns:

Node.js: Long-running processes
Next.js: Route handlers, server actions, and edge runtime
Vercel & Lambda: Per-invocation flushing
Workers & Edge: waitUntil and timer constraints

What to watch first

Start with a small dashboard checklist:

acceptance rate by contract
top failing rules
failures by category
p95 duration and average attempt count
recent rejected runs with repair messages

If acceptance drops, inspect the top rule failures first. If format failures rise, inspect provider output, schema instructions, max tokens, and prompt changes.
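To pin down what the first two checklist items measure, the sketch below computes them from a list of run records. The `RunRecord` shape and both function names are hypothetical. A hosted dashboard would normally compute these for you, so treat this as a definition of the metrics rather than something you need to run.

```typescript
// Hypothetical per-run summary; field names are illustrative.
interface RunRecord {
  contract: string;
  accepted: boolean;
  failingRules: string[];
}

// Fraction of accepted runs per contract, as a number between 0 and 1.
function acceptanceRateByContract(runs: RunRecord[]): Record<string, number> {
  const totals: Record<string, { accepted: number; total: number }> = {};
  for (const run of runs) {
    const entry = totals[run.contract] ?? (totals[run.contract] = { accepted: 0, total: 0 });
    entry.total += 1;
    if (run.accepted) entry.accepted += 1;
  }
  const rates: Record<string, number> = {};
  for (const name of Object.keys(totals)) {
    rates[name] = totals[name].accepted / totals[name].total;
  }
  return rates;
}

// Rules ordered by how many runs they failed in, most frequent first.
function topFailingRules(runs: RunRecord[], limit: number): Array<[string, number]> {
  const counts: Record<string, number> = {};
  for (const run of runs) {
    for (const rule of run.failingRules) {
      counts[rule] = (counts[rule] ?? 0) + 1;
    }
  }
  return Object.entries(counts)
    .sort((a, b) => b[1] - a[1])
    .slice(0, limit);
}
```

Comparing these numbers before and after a prompt or model change is the quickest way to see whether the change helped.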

Related pages:

SDK quickstart: Minimal hosted setup
Capture policy: Exact data buckets and defaults
Redaction: Scrub data before transmission
Security & data handling: Network behavior and trust boundaries
