
# Routing

Every request goes through the router. For an explicit model, the
router's job is small: pick a healthy upstream that serves it. For
`lowrouter/auto`, the router picks the model too. This page describes
both cases.

## The router's inputs

For each request:

- The model string in the request (`lowrouter/auto`, an explicit ID,
  or one of the `auto-*` pseudo-models).
- The `route` object, if present (`provider`, `region`,
  `prefer_low_carbon`, `fallback`).
- The key's policy: any of the per-key fields from
  [API key management](../guides/api-keys).
- The current health of every upstream (`ok`, `degraded`,
  `unavailable`).
- The current grid carbon intensity for each (provider, region) pair.

## Decision order

The router applies constraints from the most specific to the least:

1. **Per-request `route`.** A pinned `provider` or `region` removes
   anything that doesn't match.
2. **Per-key policy.** A key's `region` pin or `models` allowlist is
   then applied.
3. **Account policy.** Defaults set in
   [auto-routing settings](/dashboard/auto-routing) — for instance,
   "prefer EU regions when possible."
4. **Auto-router scoring.** Whatever survives the above is scored on:
   - Capability match (does the model support the request shape —
     vision input, tool use, structured output?).
   - Provider health.
   - Latency (median over the last 5 minutes per upstream).
   - Carbon (grams per 1K tokens for that provider × region pair, with
     the bias controlled by `prefer_low_carbon`).
   - Price (matters when the request used `lowrouter/auto-cheap`).
5. **Tie-break.** When two candidates score within 1% of each other
   and the request supplies `user`, the more recently used candidate
   wins, which keeps routing sticky within a session. Without `user`,
   the choice between them is random.
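
The ordering above can be sketched as a filter followed by a pick. This is a minimal illustration, not the router's implementation: `Candidate`, `apply_constraints`, and `pick` are hypothetical names, the single `score` field stands in for the combined step 4 score, account policy (step 3) is omitted, and the random tie-break used when `user` is absent is replaced by "highest score wins" to keep the sketch deterministic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    model: str
    provider: str
    region: str
    score: float  # stand-in for the combined step 4 score (assumption)


def apply_constraints(candidates, route=None, key_policy=None):
    """Steps 1 and 2: per-request pins first, then per-key policy."""
    route = route or {}
    key_policy = key_policy or {}
    survivors = list(candidates)
    # Step 1: a pinned provider or region removes anything that doesn't match.
    for name in ("provider", "region"):
        pin = route.get(name)
        if pin is not None:
            survivors = [c for c in survivors if getattr(c, name) == pin]
    # Step 2: the key's region pin and models allowlist.
    if key_policy.get("region") is not None:
        survivors = [c for c in survivors if c.region == key_policy["region"]]
    if key_policy.get("models") is not None:
        allowed = set(key_policy["models"])
        survivors = [c for c in survivors if c.model in allowed]
    return survivors


def pick(candidates, last_used=None):
    """Steps 4 and 5: top score wins; a near-tie goes to the last-used candidate."""
    if not candidates:
        return None
    best = max(candidates, key=lambda c: c.score)
    # "Within 1%" is read here as a relative margin on positive scores (assumption).
    near_ties = [c for c in candidates if c.score >= best.score * 0.99]
    if last_used in near_ties:
        return last_used
    return best
```

Because filtering happens before scoring, a high-scoring candidate in an excluded region never competes: constraints narrow the field, and scoring only ranks what is left.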

## What happens on failure

If the chosen upstream returns a 5xx or times out:

- The router marks that upstream's slot temporarily unavailable; the
  mark decays over a few minutes.
- It tries the next eligible candidate **in the same region** (region
  is never violated silently).
- If none, it tries other regions **only if** `route.fallback != false`
  and the per-key/account policy allows it.
- If still none, it returns `503 service_unavailable` with a code
  describing what's missing.

The full chain of attempts is recorded in the generation's
`routing_trace`, visible on the dashboard's transaction-detail page.
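
The failover order described above can be sketched as a function that builds the list of candidates to try next. This is an illustration under stated assumptions: `plan_fallback` and `policy_allows_other_regions` are hypothetical names, candidates are plain `(provider, region)` tuples, and an absent `route.fallback` is treated as "fallback allowed".

```python
def plan_fallback(failed, eligible, route=None, policy_allows_other_regions=True):
    """Order the candidates to try after `failed`; an empty plan means 503."""
    route = route or {}
    remaining = [c for c in eligible if c != failed]
    # Same-region candidates always come first.
    plan = [c for c in remaining if c[1] == failed[1]]
    # Other regions are considered only when route.fallback is not false
    # and the per-key/account policy allows it.
    if route.get("fallback") is not False and policy_allows_other_regions:
        plan += [c for c in remaining if c[1] != failed[1]]
    return plan
```

With `route={"fallback": False}`, the plan never leaves the failed upstream's region, so a request pinned that way fails with an error rather than crossing regions.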

## `prefer_low_carbon`

The auto-router's carbon score is a weighted term in its overall
score. Setting `prefer_low_carbon: true` on a request increases that
weight, which pushes traffic toward providers serving from
lower-grid-intensity regions when capability and latency are
comparable.

It does **not** override pinned regions or providers. It does **not**
guarantee the lowest-carbon option in absolute terms — only that, all
else equal, lower carbon wins.
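
To make the "weighted term" idea concrete, here is a toy score in which the carbon weight grows when the flag is set. The weights (0.1 and 0.5), the function name, and the idea of collapsing all non-carbon terms into one `other_score` are assumptions chosen for illustration; the router's real weights are not documented here.

```python
BASE_CARBON_WEIGHT = 0.1     # illustrative value, not the router's
BOOSTED_CARBON_WEIGHT = 0.5  # illustrative value, not the router's


def overall_score(other_score, carbon_score, prefer_low_carbon=False):
    """Blend a carbon score with everything else; both inputs in [0, 1], higher is better."""
    weight = BOOSTED_CARBON_WEIGHT if prefer_low_carbon else BASE_CARBON_WEIGHT
    return (1 - weight) * other_score + weight * carbon_score


# Candidate A is slightly better on capability/latency but on a dirtier grid;
# candidate B is slightly worse on those terms but on a cleaner grid.
a = {"other_score": 0.9, "carbon_score": 0.2}
b = {"other_score": 0.8, "carbon_score": 0.9}
```

In this toy setup, A outscores B by default, and B outscores A once `prefer_low_carbon=True` raises the carbon weight. A candidate that is far worse on the other terms would still lose, which matches the "all else equal" caveat above.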

## Worked example

A request for `lowrouter/auto` with a vision input:

1. Drop models that don't support vision.
2. Drop providers in `degraded` or `unavailable` state.
3. Among the rest, score on (capability fit, latency, carbon).
4. The top-scored option wins.

If the top-scored option later returns 502 mid-request:

1. Mark its `(provider, region)` slot unavailable.
2. Re-score the surviving candidates.
3. Retry with the new top option (capped at two retries per request).
4. If retries are exhausted, return 502 to the caller.
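
The retry steps can be sketched as a bounded loop. This is a simplified illustration: `send_with_retries`, `UpstreamError`, and the `call`/`score` callables are hypothetical, candidates are `(provider, region)` tuples, and the region and fallback constraints from the failure section are left out to keep the loop short.

```python
MAX_RETRIES = 2  # "capped at two retries per request"


class UpstreamError(Exception):
    """Hypothetical stand-in for a 5xx or timeout from an upstream."""


def send_with_retries(candidates, score, call):
    """Try the top-scored candidate, re-scoring survivors after each failure."""
    unavailable = set()
    trace = []
    for _ in range(1 + MAX_RETRIES):  # one initial attempt plus the retries
        live = [c for c in candidates if c not in unavailable]
        if not live:
            break
        choice = max(live, key=score)  # re-score the surviving candidates
        try:
            body = call(choice)
        except UpstreamError:
            unavailable.add(choice)  # mark the (provider, region) slot unavailable
            trace.append((choice, "failed"))
            continue
        trace.append((choice, "ok"))
        return {"status": 200, "body": body, "trace": trace}
    # Retries exhausted (or no candidates left): surface 502 to the caller.
    return {"status": 502, "trace": trace}
```

The `trace` list plays the role of the `routing_trace` mentioned earlier: one entry per attempt, in order.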

## Pinning recipes

| Goal | Recipe |
|------|--------|
| EU residency | Set `route.region: eu-west` per request, or pin the key's region. |
| Specific provider | `route.provider: anthropic`. Combine with `route.region` for region too. |
| Hard pin (no failover) | `route.provider`, `route.region`, `route.fallback: false`. |
| Lower carbon | `route.prefer_low_carbon: true`. Leave the region unpinned to let the router pick the cleanest available region. |
| Cheapest acceptable | Use `lowrouter/auto-cheap`. |
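
As a sketch, the recipes in the table translate into `route` fragments on the request body. The helper name `build_request` and the `RECIPES` mapping are hypothetical, and any fields beyond `model`, `route`, and `user` are passed through untouched because this page does not define the rest of the request schema.

```python
RECIPES = {
    "eu_residency": {"region": "eu-west"},
    "specific_provider": {"provider": "anthropic"},
    "hard_pin": {"provider": "anthropic", "region": "eu-west", "fallback": False},
    "lower_carbon": {"prefer_low_carbon": True},
}


def build_request(model, recipe=None, **fields):
    """Build a request body, attaching the route fragment for a named recipe."""
    body = {"model": model, **fields}
    if recipe is not None:
        body["route"] = dict(RECIPES[recipe])  # copy so callers can't mutate RECIPES
    return body


# "Cheapest acceptable" needs no route object, only the pseudo-model.
cheap = build_request("lowrouter/auto-cheap")
pinned = build_request("lowrouter/auto", recipe="hard_pin", user="session-123")
```

Here `pinned` carries all three hard-pin fields, so the router would neither change provider nor leave `eu-west` on failure.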

## What the router does not do

- It does not benchmark output quality. The auto-router optimises for
  capability, latency, carbon, and price, not for whether the answer
  is good.
- It does not silently swap models mid-conversation. If you were
  routed to model A on the first turn, the auto-router prefers to
  stay with model A on the second, provided you supply a stable
  `user`.
