See exactly what GPT-4o, Claude, Gemini, and 200+ models cost at official rates, then how much less you'd pay routing each request through LLM Gateway's cheapest provider.
Official (OpenAI)
$0.15/M in · $0.60/M out
LLM Gateway (OpenAI)
$0.15/M in · $0.60/M out
Official Provider Pricing
$0.2100
direct from providers
LLM Gateway Pricing
$0.2100
cheapest provider per model, no markup
You Save
$0.00
same price, more features
| Model | Tokens | Official | LLM Gateway | Saved |
|---|---|---|---|---|
| GPT-4o Mini via OpenAI | 1.0M in · 100K out | $0.2100 | $0.2100 | — |
| Total | | $0.2100 | $0.2100 | — |
Smart Routing
Automatically routes to the cheapest available provider for each model
No Markup
Zero platform fees — you pay exactly what the provider charges
Automatic Failover
If one provider is down, requests automatically route to the next-cheapest provider
Processing high volume?
Enterprise plans include volume discounts, dedicated support, custom SLAs, and extended data retention.
Start for free with no platform fees. No credit card required.
Estimate your monthly spend across any model in three steps, then see how much routing through LLM Gateway saves you.
1. Pick from 200+ LLMs across OpenAI, Anthropic, Google, Mistral, Meta, and more. Stack as many models as you want to compare side by side.
2. Set the input and output tokens you expect to process, or start from a Light, Medium, Heavy, or Intensive preset and adjust from there.
3. Instantly see official provider pricing next to LLM Gateway's cheapest routed provider, plus the exact amount you would save each cycle.
Every large language model is billed by the token, the small chunk of text a model reads and writes. Roughly speaking, one token is about four characters of English, so 1,000 tokens is around 750 words. Providers quote prices per million tokens, and they charge separately for the tokens you send (input) and the tokens the model generates (output).
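If you only know how much text you have, the rules of thumb above can be turned into a rough token estimate. The sketch below assumes English text and uses only those two ratios; the function names are illustrative, and a real tokenizer will give different counts.

```python
CHARS_PER_TOKEN = 4          # rule of thumb from the text, English only
WORDS_PER_1K_TOKENS = 750    # rule of thumb from the text


def estimate_tokens_from_chars(text: str) -> int:
    """Rough token count: one token per ~4 characters, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def estimate_tokens_from_words(word_count: int) -> int:
    """Rough token count: 1,000 tokens per ~750 words."""
    return round(word_count * 1000 / WORDS_PER_1K_TOKENS)


print(estimate_tokens_from_chars("a" * 4000))  # 1000
print(estimate_tokens_from_words(750))         # 1000
```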
Output tokens are usually two to four times more expensive than input tokens, so the ratio between your prompt size and response size has a big impact on your bill. A summarization workload that reads a lot and writes a little costs very differently from a code-generation workload that writes long responses. The calculator above keeps the two separate so your estimate reflects how you actually use each model.
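To see how much the input/output split matters, the sketch below prices two workloads with the same total token count, using the $0.15/M input and $0.60/M output rates quoted for GPT-4o Mini at the top of this page. The read-heavy volumes match the example row in the table; the write-heavy volumes are an assumption made for contrast.

```python
INPUT_PRICE_PER_M = 0.15    # USD per million input tokens (GPT-4o Mini example rate)
OUTPUT_PRICE_PER_M = 0.60   # USD per million output tokens (GPT-4o Mini example rate)


def cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of one workload at the rates above."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


# Read-heavy (summarization-style): 1.0M tokens in, 100K out.
summarization = cost_usd(1_000_000, 100_000)
# Write-heavy (code-generation-style): 100K in, 1.0M out (assumed volumes).
code_generation = cost_usd(100_000, 1_000_000)

print(f"summarization:   ${summarization:.4f}")    # $0.2100
print(f"code generation: ${code_generation:.4f}")  # $0.6150
```

Both workloads move 1.1M tokens, yet the write-heavy one costs roughly three times as much at these rates.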
Prices also vary by provider. A single popular model is often hosted by several providers at different rates, and those rates change as providers compete on price. Instead of locking yourself into one provider, LLM Gateway routes each request to the cheapest available provider for that model through one OpenAI-compatible API, with no platform markup. That is the gap the calculator shows: the official list price versus the lowest live price you would actually pay.
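The gap between list price and the lowest live price can be sketched as a simple comparison. The official rate below is the GPT-4o Mini example from this page; the provider names and their live rates are invented for illustration and do not describe any real host.

```python
from typing import Dict, Tuple

Rate = Dict[str, float]  # {"input": USD per M tokens, "output": USD per M tokens}


def cost_usd(rate: Rate, input_tokens: int, output_tokens: int) -> float:
    return (input_tokens * rate["input"]
            + output_tokens * rate["output"]) / 1_000_000


OFFICIAL: Rate = {"input": 0.15, "output": 0.60}  # list price from the example above

# Hypothetical live rates for the same model at different hosts (made-up numbers).
LIVE_RATES: Dict[str, Rate] = {
    "provider-a": {"input": 0.15, "output": 0.60},
    "provider-b": {"input": 0.12, "output": 0.50},
}


def savings(input_tokens: int, output_tokens: int) -> Tuple[str, float]:
    """Return the cheapest host for this workload and the saving vs. list price."""
    official = cost_usd(OFFICIAL, input_tokens, output_tokens)
    name, rate = min(
        LIVE_RATES.items(),
        key=lambda item: cost_usd(item[1], input_tokens, output_tokens),
    )
    return name, official - cost_usd(rate, input_tokens, output_tokens)


name, saved = savings(1_000_000, 100_000)
print(f"{name}: save ${saved:.4f}")
```

When every host charges the list price, as in the example table on this page, the saving is zero.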
Use it to budget a new feature, compare GPT-4o against Claude or Gemini before you commit, or build the business case for switching providers. When the numbers look good, you can start for free and keep the same estimate in production.
Everything you need to know about estimating and lowering your LLM token costs.
Providers bill separately for input tokens (your prompt) and output tokens (the model's response), priced per million tokens. Your total cost is (input tokens × input price) + (output tokens × output price). This calculator runs that math for every model you add and sums the result.
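As a worked example of that formula, the sketch below prices each model and sums the result. The GPT-4o Mini line uses the rates and volumes shown in the table on this page; `example-model-b`, its rates, and its volumes are hypothetical and only there to show the sum.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ModelUsage:
    name: str
    input_tokens: int
    output_tokens: int
    input_price_per_m: float   # USD per million input tokens
    output_price_per_m: float  # USD per million output tokens

    def cost(self) -> float:
        """(input tokens x input price) + (output tokens x output price)."""
        return (self.input_tokens * self.input_price_per_m
                + self.output_tokens * self.output_price_per_m) / 1_000_000


def total_cost(usages: List[ModelUsage]) -> float:
    return sum(u.cost() for u in usages)


usages = [
    ModelUsage("GPT-4o Mini", 1_000_000, 100_000, 0.15, 0.60),
    ModelUsage("example-model-b", 500_000, 50_000, 1.00, 2.00),  # hypothetical
]

for u in usages:
    print(f"{u.name}: ${u.cost():.4f}")
print(f"Total: ${total_cost(usages):.4f}")
```

The first line reproduces the $0.2100 figure from the table: $0.15 for 1.0M input tokens plus $0.06 for 100K output tokens.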
Input tokens are everything you send to the model, including your prompt, system message, and conversation history. Output tokens are what the model generates back. Output tokens almost always cost more than input tokens, which is why the split matters when you estimate spend.
Popular models are often served by several providers at different rates, and prices change as providers compete. LLM Gateway routes each request to the cheapest available provider for that model, so you pay the lowest live rate without changing any code.
No. LLM Gateway passes through provider pricing with zero platform markup, so you pay exactly what the provider charges (and less when a cheaper provider or volume discount is available). You only add a payment method once you start sending real traffic.
Estimates use current published per-token prices for each model and provider. Real-world cost depends on your exact token counts, caching, and any negotiated rates, so treat the numbers as a close planning estimate rather than a final invoice.
Route through a gateway that compares providers and picks the lowest price per request. Because LLM Gateway supports 200+ models behind one OpenAI-compatible API, you can switch models or providers based on cost without rewriting your integration.
Yes, the calculator is completely free and requires no signup. You can compare as many models and token volumes as you like, then create a free LLM Gateway account when you are ready to start sending requests.