Sustainability-first
“Sustainable AI” is doing a lot of marketing work right now. This page describes what the phrase means inside LowRouter — concretely, what we measure, what we report, and what we have decided not to claim.
What we measure
For every inference request, we estimate two numbers:
- Energy per output token (Wh), derived from the model’s active parameter count using the EcoLogits methodology.
- Grid carbon intensity (gCO₂e/kWh) for the region the provider serves the request from, sourced from the International Energy Agency.
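Multiplying the two gives gCO₂e per 1,000 tokens for the request: because 1 kWh is 1,000 Wh, Wh per token times gCO₂e per kWh is numerically grams per 1,000 tokens. The exact formula and the confidence we attach to each estimate are in the methodology page.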
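The unit arithmetic can be sketched in a few lines of Python. This is an illustration only, not the formula from the methodology page: the function names and the input values (0.002 Wh per token, 400 gCO₂e/kWh, 500 output tokens) are assumptions chosen to make the numbers easy to follow.

```python
def gco2e_per_1k_tokens(energy_wh_per_token: float, grid_g_per_kwh: float) -> float:
    """Wh/token times gCO2e/kWh is numerically gCO2e per 1,000 tokens (1 kWh = 1,000 Wh)."""
    return energy_wh_per_token * grid_g_per_kwh


def request_gco2e(energy_wh_per_token: float, grid_g_per_kwh: float, output_tokens: int) -> float:
    """Estimated gCO2e for one request, given its output token count."""
    energy_kwh = energy_wh_per_token * output_tokens / 1000.0  # Wh -> kWh
    return energy_kwh * grid_g_per_kwh


# Illustrative inputs only; these are not LowRouter figures.
per_1k = gco2e_per_1k_tokens(0.002, 400.0)        # roughly 0.8 gCO2e per 1,000 tokens
per_request = request_gco2e(0.002, 400.0, 500)    # roughly 0.4 gCO2e for 500 output tokens
print(f"{per_1k:.3f} g per 1k tokens, {per_request:.3f} g for this request")
```

Both inputs are estimates, so the result inherits the uncertainty of each.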
These numbers are exposed:
- On every API response, in an eco block.
- On the dashboard, aggregated by day, model, provider, and region.
- On the public model browser, as a comparable estimate per model.
What we don’t measure (yet)
- Training emissions. We report inference only. Training is a separate, larger, and harder-to-attribute footprint, and folding it into per-request numbers would be misleading.
- Hardware embodied carbon. GPU manufacturing has a real footprint; we don’t yet have a defensible per-token number for it.
- Real-time grid mix. We use annual averages by region. Live carbon-aware routing is a future feature, not a current one.
- Embedding workloads. Encoder-only models have a different compute profile and our formula does not yet model them well.
We list these limits because they are part of what an honest number looks like. The sustainable-AI page repeats them next to every chart.
Constraints that fall out of this position
A few platform decisions follow from taking energy and carbon seriously:
- The site itself is on a tight transfer budget. Pages like this one ship under 100 KB total. The docs site renders server-side with no client framework. Heavy interactive widgets are not added without a reason.
- The default route is auto-mode, not “the largest model.” If a smaller model can do the job for a fraction of the energy, we’d rather it be the default.
- Providers and regions with very high grid intensity are deprioritised in routing unless explicitly pinned by the caller. See provider routing.
- We are slower to add features than we’d like. Every new background job, every dashboard widget, every fancy visualisation is a per-user energy cost; we add them when they earn the cost.
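As a rough illustration of the routing constraint above, the sketch below pushes high-intensity candidates to the back of the preference order unless the caller pins a provider. It is a minimal sketch, not LowRouter's router: the Candidate type, the order_candidates function, and the 600 gCO₂e/kWh cut-off are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    provider: str
    region: str
    grid_g_per_kwh: float  # annual-average grid intensity for the region


# Hypothetical cut-off for "very high" grid intensity; not a LowRouter value.
HIGH_INTENSITY_G_PER_KWH = 600.0


def order_candidates(
    candidates: List[Candidate], pinned_provider: Optional[str] = None
) -> List[Candidate]:
    """Return candidates in routing order.

    A pinned provider is honoured regardless of grid intensity. Otherwise,
    high-intensity candidates move to the back; sorted() is stable, so the
    incoming preference order is kept within each group.
    """
    if pinned_provider is not None:
        return [c for c in candidates if c.provider == pinned_provider]
    return sorted(candidates, key=lambda c: c.grid_g_per_kwh > HIGH_INTENSITY_G_PER_KWH)


candidates = [
    Candidate("a", "region-1", 700.0),
    Candidate("b", "region-2", 300.0),
    Candidate("c", "region-3", 650.0),
    Candidate("d", "region-4", 100.0),
]
print([c.provider for c in order_candidates(candidates)])       # ['b', 'd', 'a', 'c']
print([c.provider for c in order_candidates(candidates, "a")])  # ['a']
```

Deprioritising rather than excluding keeps high-intensity providers available as a fallback when nothing cleaner can serve the request.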
What this is not
A pledge that any single request is green. A claim that the estimate is exact. An assertion that running an LLM through LowRouter is meaningfully better for the planet than running it directly.
What it is: a measured number, an open formula, and a default that prefers the smaller model and the cleaner grid when other constraints allow.