Bearer auth
Every runtime request uses the same `Authorization` header shape.
Authorization: Bearer <GATEWAY_API_KEY>
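For clients that do not go through an SDK, the header shape above can be built in one place. The following is a minimal sketch: `gatewayHeaders` is a hypothetical helper name, and the `Content-Type` value is an assumption based on the JSON request bodies used elsewhere on this page.

```typescript
// Hypothetical helper (not part of any published SDK): builds the headers
// for a JSON request using the Bearer shape shown above.
function gatewayHeaders(apiKey: string | undefined): Record<string, string> {
  // Fail early instead of sending "Bearer undefined" upstream.
  if (!apiKey) {
    throw new Error("GATEWAY_API_KEY is not set");
  }
  return {
    Authorization: `Bearer ${apiKey}`,
    // Assumption: request bodies are JSON, as in the curl examples.
    "Content-Type": "application/json",
  };
}
```

A caller would typically pass `process.env.GATEWAY_API_KEY` and hand the result to the `headers` option of `fetch`.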
Runtime is meant to stay boring at the API boundary and useful everywhere underneath it: routing, pricing, cache wins, benchmark comparison, and live customer-facing visibility.
Start by putting Runtime behind the same OpenAI-compatible client you already use. The fastest production path is still one safe traffic slice at a time.
Drop-in quickstart
Keep the OpenAI-compatible client you already use, point it at Gateway, and start with one slice of production traffic.
import OpenAI from "openai";
const client = new OpenAI({
  apiKey: process.env.GATEWAY_API_KEY,
  baseURL: "https://api.badtheorylabs.com/v1",
});
const response = await client.chat.completions.create({
  model: "openai/gpt-4.1-mini",
  messages: [{ role: "user", content: "Hello from Gateway" }],
});
Start with the routes where cheaper execution, repeated prompts, or provider failover matter first.
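One possible way to send only a slice of production traffic through Gateway is to bucket requests by a stable key and compare the bucket against a rollout percentage. This is a sketch under stated assumptions, not a Gateway feature: `inGatewaySlice`, the choice of key, and the hash are all illustrative.

```typescript
// FNV-1a-style 32-bit hash over UTF-16 code units. Any stable hash works;
// it is used here only to spread keys evenly across 100 buckets.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Hypothetical helper: returns true when the request identified by
// `stableKey` (for example a customer or route ID) falls inside the slice.
function inGatewaySlice(stableKey: string, percent: number): boolean {
  if (percent <= 0) return false;
  if (percent >= 100) return true;
  return fnv1a(stableKey) % 100 < percent;
}
```

Because the bucket depends only on the key, a given customer stays on the same side of the split between requests, and raising `percent` only adds keys to the slice rather than reshuffling them.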
Dashboard sessions manage credits, requests, and machine keys. Your app should use scoped API keys created inside the workspace.
The public model directory is grouped by shared slug, while each detail page exposes route-level provider variants, context windows, capabilities, and pricing context.
Use the public page for humans and the API route for programmatic discovery.
Slugs like `openai/gpt-4.1-mini` map cleanly to `/models/openai/gpt-4.1-mini`.
curl https://api.badtheorylabs.com/v1/models/openai/gpt-4.1-mini
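The same lookup can be done from application code. The sketch below assumes the slug maps onto the path exactly as shown above; `modelDetailUrl` and `fetchModelDetail` are hypothetical names, and the response is typed as `unknown` because the payload shape is not specified here.

```typescript
const GATEWAY_BASE_URL = "https://api.badtheorylabs.com/v1";

// Maps a slug such as "openai/gpt-4.1-mini" to its detail URL.
// Each segment is encoded separately so the "/" separators survive
// (an assumption about how unusual characters should be handled).
function modelDetailUrl(slug: string): string {
  const path = slug
    .split("/")
    .map((segment) => encodeURIComponent(segment))
    .join("/");
  return `${GATEWAY_BASE_URL}/models/${path}`;
}

// Fetches the detail document. No Authorization header is sent,
// mirroring the curl example above.
async function fetchModelDetail(slug: string): Promise<unknown> {
  const res = await fetch(modelDetailUrl(slug));
  if (!res.ok) {
    throw new Error(`Model lookup failed with status ${res.status}`);
  }
  return res.json();
}
```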
| Method | Path | Purpose |
|---|---|---|
| GET | /v1/models | Public catalog of grouped model slugs and route metadata. |
| GET | /v1/models/{modelId} | Model detail, route variants, capability flags, and pricing context. |
| POST | /v1/chat/completions | OpenAI-compatible chat completions with Runtime routing and savings. |
| POST | /v1/responses | OpenAI-compatible responses API surface. |
| POST | /v1/account/quote | Estimate route options, customer charge, and savings before live traffic. |
The quote route is the easiest way to show what Runtime would charge, what the benchmark direct cost looks like, and where savings come from before you cut over real traffic.
curl https://api.badtheorylabs.com/v1/account/quote \
  -H "Authorization: Bearer $GATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4.1-mini",
    "messages": [
      {"role":"user","content":"Summarize this support ticket in 3 bullets."}
    ]
  }'
Runtime responses can include benchmark cost, customer charge, and customer savings headers, so teams can measure what the gateway changed without building a second reporting system first.
x-btl-benchmark-cost: 0.014
x-btl-customer-charge: 0.0056
x-btl-saved: 0.0084
`x-btl-benchmark-cost` is the naive direct upstream baseline for that request. `x-btl-customer-charge` is what Runtime billed. `x-btl-saved` is the visible delta between the two.
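To record these values per request, the three headers can be read off any response. The following is a minimal sketch, assuming the header values are plain decimal numbers as in the example above; `readSavings` and the `HeaderReader` interface are hypothetical names. Since the headers may be absent, the helper returns `null` rather than guessing.

```typescript
// Minimal read-only view of response headers. The standard fetch
// `Headers` object satisfies this interface.
interface HeaderReader {
  get(name: string): string | null;
}

interface SavingsReport {
  benchmarkCost: number;  // x-btl-benchmark-cost
  customerCharge: number; // x-btl-customer-charge
  saved: number;          // x-btl-saved
}

// Hypothetical helper: returns null when any header is missing or not numeric.
function readSavings(headers: HeaderReader): SavingsReport | null {
  const parse = (name: string): number | null => {
    const raw = headers.get(name);
    if (raw === null || raw.trim() === "") return null;
    const value = Number(raw);
    return Number.isFinite(value) ? value : null;
  };
  const benchmarkCost = parse("x-btl-benchmark-cost");
  const customerCharge = parse("x-btl-customer-charge");
  const saved = parse("x-btl-saved");
  if (benchmarkCost === null || customerCharge === null || saved === null) {
    return null;
  }
  return { benchmarkCost, customerCharge, saved };
}
```

With the example values, `benchmarkCost - customerCharge` equals `saved` up to floating-point rounding, so a tolerance-based comparison is a cheap consistency check before the numbers are written to a log or metrics system.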