LowRouter

Routing

Every request goes through the router. For an explicit model the router’s job is small (pick a healthy upstream that serves it); for lowrouter/auto the router picks the model too. This page describes both.

The router’s inputs

For each request:

  • The model string in the request (lowrouter/auto, an explicit ID, or one of the auto-* pseudo-models).
  • The route object, if present (provider, region, prefer_low_carbon, fallback).
  • The key’s policy: any of the per-key fields from API key management.
  • The current health of every upstream (ok, degraded, unavailable).
  • The current grid carbon intensity for each (provider, region) pair.

Decision order

The router applies constraints from the most specific to the least:

  1. Per-request route. A pinned provider or region removes anything that doesn’t match.
  2. Per-key policy. A key’s region pin or models allowlist is then applied.
  3. Account policy. Defaults set in auto-routing settings — for instance, “prefer EU regions when possible.”
  4. Auto-router scoring. Whatever survives the above is scored on:
    • Capability match (does the model support the request shape — vision input, tool use, structured output?).
    • Provider health.
    • Latency (median over the last 5 minutes per upstream).
    • Carbon (grams per 1K tokens for that provider × region pair, with the bias controlled by prefer_low_carbon).
    • Price (matters when the request used auto-cheap).
  5. Tie-break. When two candidates score within 1% of each other, the more recently used one wins (sticky routing within a session when user is supplied; otherwise random).

What happens on failure

If the chosen upstream returns a 5xx or times out:

  • The router marks that upstream’s slot temporarily unavailable (decaying over a few minutes).
  • It tries the next eligible candidate in the same region (region is never violated silently).
  • If none, it tries other regions only if route.fallback != false and the per-key/account policy allows it.
  • If still none, it returns 503 service_unavailable with a code describing what’s missing.

The full chain of attempts is recorded in the generation’s routing_trace, visible on the dashboard’s transaction-detail page.

prefer_low_carbon

The auto-router’s carbon score is a weighted term in its overall score. Setting prefer_low_carbon: true on a request increases that weight, which pushes traffic toward providers serving from lower-grid-intensity regions when capability and latency are comparable.

It does not override pinned regions or providers. It does not guarantee the lowest-carbon option in absolute terms — only that, all else equal, lower carbon wins.

Worked example

A request for lowrouter/auto with a vision input:

  1. Drop models that don’t support vision.
  2. Drop providers in degraded or unavailable state.
  3. Among the rest, score on (capability fit, latency, carbon).
  4. The top-scored option wins.

If the top-scored option later returns 502 mid-request:

  1. Mark its (provider, region) slot unavailable.
  2. Re-score the surviving candidates.
  3. Retry with the new top option (capped at two retries per request).
  4. If retries are exhausted, return 502 to the caller.

Pinning recipes

GoalRecipe
EU residencySet route.region: eu-west per request, or pin the key’s region.
Specific providerroute.provider: anthropic. Combine with route.region for region too.
Hard pin (no failover)route.provider, route.region, route.fallback: false.
Lower carbonroute.prefer_low_carbon: true. Combine with no region pin to let the router pick the cleanest available region.
Cheapest acceptableUse lowrouter/auto-cheap.

What the router does not do

  • It does not benchmark output quality. The auto router optimises for capability, latency, carbon, and price — not “is the answer good”.
  • It does not silently swap models mid-conversation. If you’ve been routed to model A on the first turn, the auto router prefers sticking with model A on the second when you supply a stable user.