NewEarly pilot teams averaging $44 saved per $100 spent

Your LLM bill
is too high.

Drop-in proxy that makes every inference call cheaper by default — exact cache, prompt cache, in-flight dedup, smart routing. One line of code, zero rewrite.

live request
POST /v1/chat/completions
model: claude-sonnet-4-5
tokens: 18,440 input
↳ cache lookup...
✓ cache hit (prefix, 18.1k tokens)
provider call: skipped
x-gateway-savings: $1.344
x-gateway-cost: $0.156
latency: 38ms
routing ···
$0
Saved per $100 spent — average across early pilot teams
0%
Cheaper on cache hits — same output, cache-read pricing
0 ms
Added latency — imperceptible to end users
0 lines
Of code changed — drop in your base URL, nothing else moves

Transparent routing across every major provider

OpenAIAnthropicAWS BedrockGoogle VertexTogether AIGroqDeepInfraFireworks AIOpenAIAnthropicAWS BedrockGoogle VertexTogether AIGroqDeepInfraFireworks AI
( 01 / 04 )

From zero to saving
in 10 minutes.

No SDK swap. No proxy config. No vendor onboarding. Every request is cheaper by default from the moment you flip the URL.

01
Deposit credits

Buy credits upfront. $5 free on signup, no card required. Provider costs deducted at actual price with zero markup — you pay what they charge.

02
Change one line

Point baseURLat your gateway endpoint. Same SDK, same model names, same auth format. Your application doesn't know anything changed.

03
Watch the savings header

Every response comes back with x-gateway-savings — the exact dollar amount saved on that request. The dashboard shows your lifetime breakdown by cache tier.

Request lifecycle
live
Your app
OpenAI SDK · unchanged
origin
BTL Runtime
Auth · rate-limit · audit
gateway
Router
Eligibility → scoring
routing
Cache layer
Exact · semantic · prefix
cache
Provider
Failover · retry
provider
Response
+ savings headers
output
( 02 / 04 )

Same model. Same output.
90% less cost.

A 100K-document RAG query via Claude Sonnet. Real numbers — Anthropic's published rates with cache-read pricing applied.

Line itemDirect to AnthropicVia BTL Runtime
Provider cost (per query)$1.50$0.15
Platform fee$0.00$0.006
Total per query$1.50$0.156
1,000 queries / month$1,500$156
Annual (100K queries / mo)$180,000$18,720
Without BTL Runtime$180kannual spend at 100K queries/mo
Provider cost per query$1.50
Cache savings$0
Annual$180,000
With BTL Runtime$18.7kannual spend — $161k back in your budget
Provider cost per query$0.15
BTL platform fee (4%)$0.006
Annual$18,720
( 03 / 04 )

Not another OpenRouter.

OpenRouter is a marketplace. BTL Runtime is an inference layer you control — with features no public proxy has shipped.

01
Not a reseller
We pass provider costs through at actual price. Our fee is a flat 4% on credits used — not a per-token markup. OpenRouter charges 5.5%.
02
Caching is automatic
Exact response cache, Anthropic/Vertex prompt cache, in-flight dedup — all apply without config. Every response header tells you what saved.
03
Self-hostable
Full Postgres control plane, all provider adapters, MIT licensed. Run it on your own infra with no telemetry and no phone-home.
04
Savings you can see
Every response includes x-gateway-savings and x-gateway-cost. The dashboard breaks lifetime savings down by cache tier.
FeatureBTLOpenRouter
Provider pass-through pricing
Zero-code drop-in
Exact response cache
Prompt cache (Anthropic/Vertex)
In-flight request dedup
Per-response savings headers
Semantic cache per tenant
Cross-request prefix matching
Postgres control plane
Platform fee4% → 2%5.5%
_

What shipping teams say.

MA

"We dropped our Claude costs 40% in the first week. The prompt cache integration was worth the switch on its own — we didn't change a single line of application code."

Michael A.
Founder, RAG Labs
SK

"Spent an afternoon migrating from OpenRouter. The savings headers immediately showed my team where money was going. Semantic cache was a complete unlock for our support bot."

Sarah K.
Lead Engineer, FinQuery
DR

"We run 2M+ tokens a day across five models. BTL Runtime's routing and in-flight dedup cut month-one spend by 35%. Self-hosted deployment took 20 minutes."

David R.
CTO, Textly AI
( 04 / 04 )

Provider cost pass-through.
4% platform fee.

No markups on tokens, no arbitrage on model pricing. You pay what the provider charges — we take a small fee on credits used.

Credits · Cloud
Pay-as-you-go

$5 free on signup. Provider costs at actual price.

  • $0–$500: 4% fee
  • $500–$5,000: 3% fee
  • $5,000+: 2% fee
  • All caching layers included
  • No card required to start
Create workspace
Self-hosted · OSS
Open source

Your infra, your control. MIT license.

  • All provider adapters
  • Prompt & response cache
  • Full routing engine
  • Postgres-backed
  • No telemetry or phone-home
View on GitHub
Enterprise
Custom

Dedicated instance, SLA-backed. SSO, audit, priority support.

  • Everything in cloud
  • Negotiated fee tier
  • SLA & uptime guarantee
  • SSO / SAML
  • Dedicated onboarding
Talk to the team

Start saving on your
first request.

Sign up with email and password, open the dashboard, and generate your first machine key in a few minutes.