Public docs

Integrate like OpenAI.
See pricing and savings clearly.

Runtime is meant to stay boring at the API boundary and useful everywhere underneath it: routing, pricing, cache wins, benchmark comparison, and live customer-facing visibility.

Quickstart

Swap the base URL first.

Start by putting Runtime behind the same OpenAI-compatible client you already use. The fastest production path is still one safe traffic slice at a time.

Drop-in quickstart

Swap the base URL and start routing.

Keep the OpenAI-compatible client you already use, point it at Gateway, and start with one slice of production traffic.

Node SDK
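import OpenAI from "openai";

// Same OpenAI client as before; only the API key and base URL change.
const client = new OpenAI({
  apiKey: process.env.GATEWAY_API_KEY,
  baseURL: "https://api.badtheorylabs.com/v1",
});

// Model IDs use the provider/model slug from the catalog.
const response = await client.chat.completions.create({
  model: "openai/gpt-4.1-mini",
  messages: [{ role: "user", content: "Hello from Gateway" }],
});

// Top-level await requires an ES module (.mjs or "type": "module").
console.log(response.choices[0].message.content);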

Start with the routes where cheaper execution, repeated prompts, or provider failover matter most.
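One way to send a single slice of traffic through Gateway is to bucket requests deterministically and choose the base URL per bucket. The sketch below is illustrative only: `pickBaseURL`, `hash32`, and the idea of bucketing on a customer ID are hypothetical and not part of the Gateway API. The direct URL is OpenAI's public endpoint.

```javascript
// Hypothetical helpers: not part of Gateway or the OpenAI SDK.
const GATEWAY_BASE_URL = "https://api.badtheorylabs.com/v1";
const DIRECT_BASE_URL = "https://api.openai.com/v1";

// FNV-1a-style 32-bit hash over UTF-16 code units; stable across processes.
function hash32(key) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Returns the Gateway URL for roughly `fraction` of keys and the direct URL otherwise.
// The same key always lands on the same side, so a customer does not flip between paths.
function pickBaseURL(key, fraction) {
  const bucket = (hash32(key) % 10000) / 10000; // value in [0, 1)
  return bucket < fraction ? GATEWAY_BASE_URL : DIRECT_BASE_URL;
}
```

Passing the result as `baseURL` when constructing the client keeps the rest of the call site unchanged. Raising `fraction` widens the slice without reshuffling keys already routed through Gateway. The direct path would presumably need the provider's own model name rather than the `openai/...` slug; that mapping is not covered here.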

Auth

Machine keys on the request path.

Header

Bearer auth

Every runtime request uses the same `Authorization` header shape.

Authorization
Authorization: Bearer <GATEWAY_API_KEY>
Workspace path

Human sign-in stays in the dashboard.

Dashboard sessions manage credits, requests, and machine keys. Your app should use scoped API keys created inside the workspace.
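For callers that use plain HTTP instead of the SDK, the header shape above can be built in one place. This is a minimal sketch: `buildAuthHeaders` is a hypothetical helper, and failing fast on a missing key is a design choice of the example, not documented Gateway behavior.

```javascript
// Hypothetical helper: builds request headers from a workspace machine key.
function buildAuthHeaders(apiKey) {
  // Fail early instead of sending "Bearer undefined" to the API.
  if (typeof apiKey !== "string" || apiKey.trim() === "") {
    throw new Error("GATEWAY_API_KEY is not set");
  }
  return {
    Authorization: `Bearer ${apiKey.trim()}`,
    "Content-Type": "application/json",
  };
}

// Usage sketch (not executed here; global fetch needs Node 18+):
//   const headers = buildAuthHeaders(process.env.GATEWAY_API_KEY);
//   const res = await fetch("https://api.badtheorylabs.com/v1/models", { headers });
```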

Models

Catalog and detail routes are public.

The public model directory is grouped by shared slug, while each detail page exposes route-level provider variants, context windows, capabilities, and pricing context.

Catalog

Browse `/models` or hit `/v1/models`.

Use the public page for humans and the API route for programmatic discovery.

Detail

OpenRouter-style slug paths.

Slugs like `openai/gpt-4.1-mini` map cleanly to `/models/openai/gpt-4.1-mini`.

Model detail
curl https://api.badtheorylabs.com/v1/models/openai/gpt-4.1-mini
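The slug-to-path mapping can be expressed as a pair of small functions. The helper names below are hypothetical, and the check for at least two slug segments assumes every slug has the `provider/model` shape shown above.

```javascript
const API_BASE = "https://api.badtheorylabs.com/v1";

// Hypothetical helper: validates a slug and URL-encodes each segment,
// keeping "/" as the separator between provider and model.
function encodeSlug(slug) {
  const parts = String(slug).split("/");
  if (parts.length < 2 || parts.some((part) => part === "")) {
    throw new Error(`Expected a provider/model slug, got "${slug}"`);
  }
  return parts.map((part) => encodeURIComponent(part)).join("/");
}

// Public page path for humans.
function modelPagePath(slug) {
  return `/models/${encodeSlug(slug)}`;
}

// API URL for programmatic lookup.
function modelDetailUrl(slug) {
  return `${API_BASE}/models/${encodeSlug(slug)}`;
}
```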
Requests

Core endpoints

| Method | Path | Purpose |
| --- | --- | --- |
| GET | `/v1/models` | Public catalog of grouped model slugs and route metadata. |
| GET | `/v1/models/{modelId}` | Model detail, route variants, capability flags, and pricing context. |
| POST | `/v1/chat/completions` | OpenAI-compatible chat completions with Runtime routing and savings. |
| POST | `/v1/responses` | OpenAI-compatible Responses API surface. |
| POST | `/v1/account/quote` | Estimate route options, customer charge, and savings before live traffic. |
Billing

Quote before you send live traffic.

The quote route is the easiest way to show what Runtime would charge, what the benchmark direct cost looks like, and where savings come from before you cut over real traffic.

Quote preview
curl https://api.badtheorylabs.com/v1/account/quote \
  -H "Authorization: Bearer $GATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4.1-mini",
    "messages": [
      {"role":"user","content":"Summarize this support ticket in 3 bullets."}
    ]
  }'
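The same quote call can be prepared from Node. This sketch mirrors the curl request above and nothing more: `buildQuoteRequest` is a hypothetical helper, and because the quote response shape is not described here, the usage note prints the raw body instead of assuming field names.

```javascript
// Hypothetical helper: builds the fetch() arguments for the quote route.
function buildQuoteRequest(apiKey, model, messages) {
  return {
    url: "https://api.badtheorylabs.com/v1/account/quote",
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      // Same payload shape as a chat completion request.
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Usage sketch (not executed here; global fetch needs Node 18+):
//   const { url, init } = buildQuoteRequest(process.env.GATEWAY_API_KEY, "openai/gpt-4.1-mini", [
//     { role: "user", content: "Summarize this support ticket in 3 bullets." },
//   ]);
//   const res = await fetch(url, init);
//   console.log(res.status, await res.text());
```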
Savings headers

Every response can explain the economics.

Runtime responses can include benchmark cost, customer charge, and customer savings headers, so teams can measure what the gateway changed without building a second reporting system first.

Headers

Three useful values

Response headers
x-btl-benchmark-cost: 0.014
x-btl-customer-charge: 0.0056
x-btl-saved: 0.0084
Interpretation

Benchmark vs actual charge

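`x-btl-benchmark-cost` is the naive direct upstream baseline for that request: what the same call would cost if sent straight to the provider. `x-btl-customer-charge` is what Runtime billed. `x-btl-saved` is the difference between the two; in the example above, 0.014 - 0.0056 = 0.0084.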
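A small parser makes these headers easy to log next to each request. This is a sketch under assumptions: `readSavings` and the `savingsRate` field are hypothetical, and returning `null` when a header is absent reflects the statement that responses "can" include these headers, not a documented guarantee.

```javascript
const SAVINGS_HEADERS = {
  benchmarkCost: "x-btl-benchmark-cost",
  customerCharge: "x-btl-customer-charge",
  saved: "x-btl-saved",
};

// Hypothetical helper: accepts anything with a get(name) method,
// such as the Headers object on a fetch Response.
// Returns null if any of the three headers is missing or not numeric.
function readSavings(headers) {
  const out = {};
  for (const [field, name] of Object.entries(SAVINGS_HEADERS)) {
    const raw = headers.get(name);
    const value = raw == null || String(raw).trim() === "" ? NaN : Number(raw);
    if (!Number.isFinite(value)) return null;
    out[field] = value;
  }
  // Fraction of the benchmark cost that was saved on this request.
  out.savingsRate = out.benchmarkCost > 0 ? out.saved / out.benchmarkCost : 0;
  return out;
}
```

With the example values above, `savingsRate` comes out at roughly 0.6, meaning about 60% of the benchmark cost was saved. With the OpenAI Node SDK, the raw response (and its headers) should be reachable through `.withResponse()` on the request promise; confirm this against the SDK version in use.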