## Retry schedule
Every batch gets up to 3 attempts with exponential backoff and up to 50% jitter.

| Attempt | Base delay | With jitter |
|---|---|---|
| 1 | — | (no delay) |
| 2 | 100ms | 100-150ms |
| 3 | 400ms | 400-600ms |
If every attempt fails, `onError` fires once for the batch.
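The schedule above can be sketched as a small helper. The function name and the injectable `rand` parameter are illustrative, not the SDK's internals:

```typescript
// Base delays per attempt, per the table above; attempt 1 sends immediately.
const BASE_DELAYS_MS = [0, 100, 400];

// Jitter stretches the base delay by a random factor in [1.0, 1.5).
// `rand` is injectable purely so the sketch is testable.
function backoffDelay(attempt: number, rand: () => number = Math.random): number {
  const base = BASE_DELAYS_MS[attempt - 1] ?? 0;
  return Math.round(base * (1 + 0.5 * rand()));
}
```

So attempt 2 waits somewhere in 100–150ms and attempt 3 somewhere in 400–600ms, matching the table.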
## What retries
| Condition | Action |
|---|---|
| Network error (`ENOTFOUND`, `ECONNRESET`, etc.) | Retry |
| 5xx response | Retry |
| 408, 429 | Retry (429 uses Retry-After, see below) |
| 401 / 403 | No retry. Logger disabled. |
| Other 4xx | No retry. Batch dropped. |
| Timeout (per attempt) | Retry |
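The table reads naturally as a decision function. The `Decision` labels and `classify` name below are assumptions for illustration; a `null` status stands for an attempt that never got an HTTP response:

```typescript
type Decision = "retry" | "drop" | "disable";

// status === null: no HTTP response at all — a network error such as
// ENOTFOUND/ECONNRESET, or the per-attempt timeout.
function classify(status: number | null): Decision {
  if (status === null) return "retry";
  if (status >= 500) return "retry";
  if (status === 408 || status === 429) return "retry";
  if (status === 401 || status === 403) return "disable"; // logger disabled
  if (status >= 400) return "drop"; // other 4xx: batch dropped, no retry
  return "drop"; // non-error statuses never reach this path in practice
}
```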
## 429 + Retry-After
On HTTP 429, the transport honors the server's `Retry-After` header before the next attempt:

- Seconds (`Retry-After: 30`) → wait 30 seconds.
- HTTP-date (`Retry-After: Wed, 21 Oct 2026 07:28:00 GMT`) → wait until that time.
- Missing / unparseable → fall back to the default backoff.

In either form, the wait is capped at 60 seconds.
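A minimal parser for the two header forms might look like this. The function name is illustrative, and the 60-second cap mirrors the summary at the end of this page:

```typescript
// Returns the wait in milliseconds, or null to fall back to default backoff.
function retryAfterMs(header: string | null, now: number = Date.now()): number | null {
  if (header === null || header.trim() === "") return null;
  const seconds = Number(header);
  if (Number.isFinite(seconds) && seconds >= 0) {
    return Math.min(seconds * 1000, 60_000); // seconds form, capped at 60s
  }
  const date = Date.parse(header); // HTTP-date form
  if (!Number.isNaN(date)) {
    return Math.min(Math.max(date - now, 0), 60_000);
  }
  return null; // unparseable
}
```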
## Circuit breaker
After 5 consecutive failures, the breaker opens. While open (for 30 seconds), the transport short-circuits: no network, no retry. `onError` fires with a `BreakerOpenError`.
When the cooldown ends, the breaker flips to half-open. The next batch becomes a probe:
- Probe succeeds → breaker closes, normal operation resumes.
- Probe fails → breaker opens again for another 30s.
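The lifecycle above can be sketched as a toy state machine. The 5-failure threshold and 30s cooldown mirror the doc; the class shape and method names are assumptions:

```typescript
type BreakerState = "closed" | "open" | "half-open";

class Breaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;

  constructor(private threshold = 5, private cooldownMs = 30_000) {}

  // Called before each batch; returns false when the transport must short-circuit.
  canSend(now: number): boolean {
    if (this.state === "open" && now - this.openedAt >= this.cooldownMs) {
      this.state = "half-open"; // next batch becomes the probe
    }
    return this.state !== "open";
  }

  onSuccess(): void {
    this.state = "closed"; // probe succeeded (or normal operation)
    this.failures = 0;
  }

  onFailure(now: number): void {
    // A failed probe reopens immediately; otherwise count toward the threshold.
    if (this.state === "half-open" || ++this.failures >= this.threshold) {
      this.state = "open";
      this.openedAt = now;
      this.failures = 0;
    }
  }
}
```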
## Auth failures (401/403)
The first 401 or 403 disables the logger permanently for the process's lifetime:

- The specific batch that hit the error is dropped.
- Every subsequent `onRunSuccess` / `onRunFailure` is a no-op: no events are enqueued.
- `onError` fires once with an `AuthError`.
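A sketch of the permanent-disable behavior; the class, field names, and public queue are hypothetical stand-ins for the SDK's internals:

```typescript
class SketchLogger {
  queue: unknown[] = []; // stand-in for the real event queue
  private disabled = false;

  onRunSuccess(event: unknown): void {
    if (this.disabled) return; // no-op after the first 401/403
    this.queue.push(event);
  }

  // Called by the transport when a batch comes back with an HTTP status.
  handleAuthStatus(status: number): void {
    if (status === 401 || status === 403) {
      // The failing batch is dropped; onError fires once with an AuthError.
      this.disabled = true;
    }
  }
}
```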
## Per-attempt timeout
Every HTTP request runs under a 10-second `AbortController` timeout. If the request doesn't complete in 10s, the fetch is aborted, the attempt counts as a failure, and the backoff begins.
This is not configurable today. It’s tuned conservatively because the ingest endpoint is designed to respond in < 500ms under normal load.
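One way to implement a per-attempt timeout with native `fetch` and `AbortController`. The default constant mirrors the doc's 10s figure, but the helper itself is a sketch, not the SDK's code:

```typescript
async function fetchWithTimeout(
  url: string,
  init: RequestInit = {},
  timeoutMs = 10_000,
): Promise<Response> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    // An abort here rejects the fetch; the caller counts the rejection
    // as a failed attempt and starts the backoff.
    return await fetch(url, { ...init, signal: controller.signal });
  } finally {
    clearTimeout(timer); // don't leak the timer on fast responses
  }
}
```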
## onError — permanent drops
`onError` is called when the transport gives up on a batch: after retries are exhausted, when the breaker is open, or when a non-retryable status returns.
Default behavior (no `onError` supplied): the SDK calls `console.warn` once per process, then falls silent to avoid flooding logs during an outage.
Errors that surface here:
| Error name | Meaning |
|---|---|
| `AuthError` | 401/403 — logger is now disabled |
| `RateLimitError` | 429 after all retries exhausted |
| `BreakerOpenError` | Breaker short-circuited the batch |
| `NonRetryableStatusError` | 4xx other than 401/403/408/429 |
| Generic `Error` | Network/timeout/5xx after all retries |
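When wiring an `onError` handler, the error's `name` distinguishes the cases. This dispatcher is a hypothetical example, and the returned strings are illustrative:

```typescript
// Map each documented error name to a short operator-facing description.
function describeDrop(err: Error): string {
  switch (err.name) {
    case "AuthError":
      return "auth failed: logger is now disabled; check the API key";
    case "RateLimitError":
      return "rate limited: all retries exhausted";
    case "BreakerOpenError":
      return "breaker open: batch skipped without a network call";
    case "NonRetryableStatusError":
      return "non-retryable 4xx: batch dropped";
    default:
      return "network/timeout/5xx after all retries";
  }
}
```

Passing something like `(err) => console.warn(describeDrop(err))` as the handler keeps the default once-per-process `console.warn` from being your only signal during an outage.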
## Keep-alive
Native `fetch` in Node 18+, and in every other runtime the SDK targets, reuses TCP connections automatically. There's no `Agent` to configure and no connection pool to tune.
## Summary
- 3 attempts per batch (backoff 100ms, 400ms), 50% jitter
- 429 honors `Retry-After` (capped at 60s)
- Breaker: 5 fails → open 30s → probe
- 401/403 → disable logger permanently, no retry
- 10s timeout per attempt via `AbortController`
- Permanent drops surface via `onError`
## See also
- Batching: queue overflow behavior during outages
- Shutdown: drain timeouts during exit