NewEarly pilot teams averaging $44 saved per $100 spent

Your LLM bill
is too high.

Drop-in proxy that makes every inference call cheaper by default — exact cache, prompt cache, in-flight dedup, smart routing. One line of code, zero rewrite.

Create workspace →Sign in

live request

POST /v1/chat/completions

model: claude-sonnet-4-5

tokens: 18,440 input

↳ cache lookup...

✓ cache hit (prefix, 18.1k tokens)

provider call: skipped

x-gateway-savings: $1.344

x-gateway-cost: $0.156

latency: 38ms

routing ···

Saved per $100 spent — average across early pilot teams

Cheaper on cache hits — same output, cache-read pricing

0 ms

Added latency — imperceptible to end users

0 lines

Of code changed — drop in your base URL, nothing else moves

Transparent routing across every major provider

OpenAIAnthropicAWS BedrockGoogle VertexTogether AIGroqDeepInfraFireworks AIOpenAIAnthropicAWS BedrockGoogle VertexTogether AIGroqDeepInfraFireworks AI

( 01 / 04 )

From zero to saving
in 10 minutes.

No SDK swap. No proxy config. No vendor onboarding. Every request is cheaper by default from the moment you flip the URL.

Deposit credits

Buy credits upfront. $5 free on signup, no card required. Provider costs deducted at actual price with zero markup — you pay what they charge.

Change one line

Point baseURLat your gateway endpoint. Same SDK, same model names, same auth format. Your application doesn't know anything changed.

Watch the savings header

Every response comes back with x-gateway-savings — the exact dollar amount saved on that request. The dashboard shows your lifetime breakdown by cache tier.

Request lifecycle

live

⚡

Your app

OpenAI SDK · unchanged

origin

◈

BTL Runtime

Auth · rate-limit · audit

gateway

⊞

Router

Eligibility → scoring

routing

◇

Cache layer

Exact · semantic · prefix

cache

▣

Provider

Failover · retry

provider

✓

Response

+ savings headers

output

( 02 / 04 )

Same model. Same output.
90% less cost.

A 100K-document RAG query via Claude Sonnet. Real numbers — Anthropic's published rates with cache-read pricing applied.

Line item	Direct to Anthropic	Via BTL Runtime
Provider cost (per query)	$1.50	$0.15
Platform fee	$0.00	$0.006
Total per query	$1.50	$0.156
1,000 queries / month	$1,500	$156
Annual (100K queries / mo)	$180,000	$18,720

Without BTL Runtime$180kannual spend at 100K queries/mo

Provider cost per query$1.50

Cache savings$0

Annual$180,000

With BTL Runtime$18.7kannual spend — $161k back in your budget

Provider cost per query$0.15

BTL platform fee (4%)$0.006

Annual$18,720

( 03 / 04 )

Not another OpenRouter.

OpenRouter is a marketplace. BTL Runtime is an inference layer you control — with features no public proxy has shipped.

Not a reseller

We pass provider costs through at actual price. Our fee is a flat 4% on credits used — not a per-token markup. OpenRouter charges 5.5%.

Caching is automatic

Exact response cache, Anthropic/Vertex prompt cache, in-flight dedup — all apply without config. Every response header tells you what saved.

Self-hostable

Full Postgres control plane, all provider adapters, MIT licensed. Run it on your own infra with no telemetry and no phone-home.

Savings you can see

Every response includes x-gateway-savings and x-gateway-cost. The dashboard breaks lifetime savings down by cache tier.

Feature	BTL	OpenRouter
Provider pass-through pricing	✓	✓
Zero-code drop-in	✓	—
Exact response cache	✓	—
Prompt cache (Anthropic/Vertex)	✓	—
In-flight request dedup	✓	—
Per-response savings headers	✓	—
Semantic cache per tenant	✓	—
Cross-request prefix matching	✓	—
Postgres control plane	✓	—
Platform fee	4% → 2%	5.5%

What shipping teams say.

"We dropped our Claude costs 40% in the first week. The prompt cache integration was worth the switch on its own — we didn't change a single line of application code."

Michael A.

Founder, RAG Labs

"Spent an afternoon migrating from OpenRouter. The savings headers immediately showed my team where money was going. Semantic cache was a complete unlock for our support bot."

Sarah K.

Lead Engineer, FinQuery

"We run 2M+ tokens a day across five models. BTL Runtime's routing and in-flight dedup cut month-one spend by 35%. Self-hosted deployment took 20 minutes."

David R.

CTO, Textly AI

( 04 / 04 )

Provider cost pass-through.
4% platform fee.

No markups on tokens, no arbitrage on model pricing. You pay what the provider charges — we take a small fee on credits used.

Credits · Cloud

Pay-as-you-go

$5 free on signup. Provider costs at actual price.

$0–$500: 4% fee
$500–$5,000: 3% fee
$5,000+: 2% fee
All caching layers included
No card required to start

Create workspace

Self-hosted · OSS

Open source

Your infra, your control. MIT license.

All provider adapters
Prompt & response cache
Full routing engine
Postgres-backed
No telemetry or phone-home

View on GitHub

Enterprise

Custom

Dedicated instance, SLA-backed. SSO, audit, priority support.

Everything in cloud
Negotiated fee tier
SLA & uptime guarantee
SSO / SAML
Dedicated onboarding

Talk to the team

Start saving on your
first request.

Create workspace Sign in

Your LLM billis too high.

From zero to savingin 10 minutes.

Same model. Same output.90% less cost.