The HTTP transport is built to fail gracefully. You don’t configure any of this — these are the guarantees the SDK ships with.

Retry schedule

Every batch gets up to 3 attempts with exponential backoff and up to 50% jitter.
| Attempt | Base delay | With jitter |
| ------- | ---------- | ----------- |
| 1       | (no delay) |             |
| 2       | 100ms      | 100-150ms   |
| 3       | 400ms      | 400-600ms   |
After three failed attempts the batch is dropped and onError fires once.
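The schedule above can be sketched as a small function. This is illustrative, not the SDK's internal code; `backoffDelayMs` is a hypothetical name:

```typescript
// Sketch of the documented schedule: attempt 1 has no delay; attempts 2 and 3
// wait 100ms and 400ms respectively, plus up to 50% random jitter.
function backoffDelayMs(attempt: number): number {
  if (attempt <= 1) return 0;
  const base = 100 * Math.pow(4, attempt - 2); // 100ms, 400ms
  const jitter = Math.random() * 0.5 * base;   // up to +50%
  return base + jitter;
}
```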

What gets retried

| Condition                                    | Action                                       |
| -------------------------------------------- | -------------------------------------------- |
| Network error (ENOTFOUND, ECONNRESET, etc.)  | Retry                                        |
| 5xx response                                 | Retry                                        |
| 408, 429                                     | Retry (429 uses Retry-After, see below)      |
| 401 / 403                                    | No retry. Logger disabled.                   |
| Other 4xx                                    | No retry. Batch dropped.                     |
| Timeout (per attempt)                        | Retry                                        |
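The table above amounts to a small classification step. A sketch of that decision, assuming a `null` status stands in for a network error or timeout (the names here are illustrative, not SDK exports):

```typescript
type Decision = "retry" | "drop" | "disable";

// Classify a failed attempt: pass the HTTP status of the failed response,
// or null for a network error / per-attempt timeout.
function retryDecision(status: number | null): Decision {
  if (status === null) return "retry";                     // network error or timeout
  if (status === 401 || status === 403) return "disable";  // auth: logger disabled
  if (status === 408 || status === 429) return "retry";
  if (status >= 500 && status <= 599) return "retry";      // 5xx
  return "drop";                                           // any other 4xx: batch dropped
}
```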

429 + Retry-After

On HTTP 429, the transport honors the server’s Retry-After header before the next attempt:
  • Seconds (Retry-After: 30) → wait 30 seconds.
  • HTTP-date (Retry-After: Wed, 21 Oct 2026 07:28:00 GMT) → wait until that time.
  • Missing / unparseable → fall back to the default backoff.
Capped at 60 seconds so a pathological header can’t stall your process indefinitely.
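The header handling described above can be sketched like this (an illustrative parser, not the SDK's source; `retryAfterMs` is a hypothetical name):

```typescript
const MAX_RETRY_AFTER_MS = 60_000; // documented 60s cap

// Returns the wait in ms for a 429, or null when the caller should
// fall back to the default backoff.
function retryAfterMs(header: string | null, now: number = Date.now()): number | null {
  if (!header) return null;
  const secs = Number(header);                 // seconds form, e.g. "30"
  if (Number.isFinite(secs) && secs >= 0) {
    return Math.min(secs * 1000, MAX_RETRY_AFTER_MS);
  }
  const date = Date.parse(header);             // HTTP-date form
  if (!Number.isNaN(date)) {
    return Math.min(Math.max(date - now, 0), MAX_RETRY_AFTER_MS);
  }
  return null;                                 // unparseable
}
```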

Circuit breaker

After 5 consecutive failures, the breaker opens. While open (for 30 seconds), the transport short-circuits — no network, no retry — and onError fires with a BreakerOpenError. When the cooldown ends, the breaker flips to half-open. The next batch becomes a probe:
  • Probe succeeds → breaker closes, normal operation resumes.
  • Probe fails → breaker opens again for another 30s.
This is the protection against retry storms during backend outages. You’d otherwise stack up queued events, retry them 3x each, and amplify the outage from your side. Auth errors bypass the breaker — bad credentials aren’t transient, and there’s no point counting them against the breaker state.
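The state machine above is small enough to sketch in full. This is a minimal illustration with the documented thresholds, not the SDK's internal class:

```typescript
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  constructor(private threshold = 5, private cooldownMs = 30_000) {}

  // true → short-circuit this batch (the transport fires onError with BreakerOpenError)
  shouldShortCircuit(now: number = Date.now()): boolean {
    if (this.openedAt === null) return false;
    if (now - this.openedAt < this.cooldownMs) return true;
    return false; // cooldown over: half-open, the next batch becomes the probe
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null; // probe succeeded → breaker closes
  }

  recordFailure(now: number = Date.now()): void {
    this.failures += 1;
    if (this.failures >= this.threshold) this.openedAt = now; // open, or re-open after a failed probe
  }
}
```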

Auth failures (401/403)

The first 401 or 403 disables the logger permanently for the process’s lifetime:
  • The specific batch that hit the error is dropped.
  • Every subsequent onRunSuccess / onRunFailure is a no-op — no events are enqueued.
  • onError fires once with an AuthError.
This prevents spamming the dashboard with auth failures. Fix the key and redeploy.
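The disable behavior is effectively a one-way latch. A minimal sketch, with hypothetical names (the SDK does not expose this type):

```typescript
class AuthLatch {
  private disabled = false;

  // Returns true only on the first trip, so onError fires exactly once with the AuthError.
  trip(): boolean {
    if (this.disabled) return false;
    this.disabled = true;
    return true;
  }

  // The enqueue paths (onRunSuccess / onRunFailure) check this and no-op once tripped.
  isDisabled(): boolean {
    return this.disabled;
  }
}
```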

Per-attempt timeout

Every HTTP request runs under a 10-second AbortController timeout. If the request doesn’t complete in 10s, the fetch is aborted, the attempt counts as a failure, and the backoff begins. This is not configurable today. It’s tuned conservatively because the ingest endpoint is designed to respond in < 500ms under normal load.
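A per-attempt timeout of this shape can be sketched with an AbortController; the wrapper below is illustrative, not the SDK's implementation:

```typescript
// Runs a request under a timeout: the controller aborts after the budget
// elapses, which rejects the in-flight fetch so the attempt counts as a failure.
function withTimeout<T>(
  run: (signal: AbortSignal) => Promise<T>,
  timeoutMs = 10_000,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  return run(controller.signal).finally(() => clearTimeout(timer));
}

// Usage against the ingest endpoint would look like:
// await withTimeout((signal) => fetch(url, { method: "POST", body, signal }));
```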

onError — permanent drops

onError is called when the transport gives up on a batch — after retries exhaust, when the breaker is open, or when a non-retryable status returns.
```typescript
createBoundaryLogger({
  onError(err) {
    metrics.increment("boundary.drop");
    console.warn("Boundary event dropped:", err);
  },
});
```
Default behavior (no onError supplied): the SDK calls console.warn once per process, then falls silent to avoid flooding logs during an outage. Errors that surface here:
| Error name              | Meaning                                 |
| ----------------------- | --------------------------------------- |
| AuthError               | 401/403 — logger is now disabled        |
| RateLimitError          | 429 after all retries exhausted         |
| BreakerOpenError        | Breaker short-circuited the batch       |
| NonRetryableStatusError | 4xx other than 401/403/408/429          |
| Generic Error           | Network/timeout/5xx after all retries   |

Keep-alive

Native fetch in Node 18+ and every runtime the SDK targets reuses TCP connections automatically. There’s no Agent to configure and no connection pool to tune.

Summary

  • 3 attempts per batch (100ms, then 400ms), up to 50% jitter
  • 429 honors Retry-After (capped at 60s)
  • Breaker: 5 fails → open 30s → probe
  • 401/403 → disable logger permanently, no retry
  • 10s timeout per attempt via AbortController
  • Permanent drops surface via onError

See also

  • Batching: queue overflow behavior during outages
  • Shutdown: drain timeouts during exit