status live · claude-opus-4-8, gpt-5.5, deepseek-v4-flash, MiniMax-M3, kimi-k2.6 · from $0.05/M

OpenAI & Anthropic,
at half the price.

GPT, Claude, DeepSeek, Qwen, GLM, Kimi. Same SDK, same response shape. Just change base_url and your bill drops by roughly 50% — we route to cheap inference providers and pass the savings through.

Get an API key Read the docs

3 free models. No credit card. 1 minute to integrate.

curlbash

# Before, OpenAI
curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_KEY" \
  -d '{"model": "gpt-4o-mini", "messages": [...]}'

# After, Capio
curl https://capiollm.tech/v1/chat/completions \
  -H "Authorization: Bearer $CAPIO_KEY" \
  -d '{"model": "deepseek-v4-flash", "messages": [...]}'
#                                  ^^^^^^^^^^^^^^^^^
#                                  same response shape, 6× cheaper

Anthropic

OpenAI

DeepSeek

xAI

Qwen

Kimi

OpenClaw

Cursor

OpenCode

MiniMax

Anthropic

OpenAI

DeepSeek

xAI

Qwen

Kimi

OpenClaw

Cursor

OpenCode

MiniMax

Every model you'll need

OpenAI and Anthropic, plus frontier models from Chinese labs. One endpoint, per-million-token prices, no markup on context, no surprise tiers.

Model	Context	Input / M	Output / M
Claude Opus 4.8 claude-opus-4-8	200K	$3.50	$17.50	docs →
MiniMax M3 MiniMax-M3	1M	$0.30	$1.20	docs →
DeepSeek V4 Flash [Best value] deepseek-v4-flash	1M	$0.07	$0.14	docs →
Kimi K2.6 kimi-k2.6	262K	$0.48	$2.39	docs →
Gemini 3.1 Pro gemini-3.1-pro	1M	$1.40	$8.40	docs →
DeepSeek V4 Pro deepseek-v4-pro	1M	$0.31	$0.61	docs →
GPT-5.5 [Long context] gpt-5.5	1M	$3.50	$21.00	docs →

Embeddings, image, and audio models available, see docs.

Pricing that scales down

Prepaid tokens, spend them on any model. No subscription, no monthly fee, no usage cliffs.

// recommended

Pay-as-you-go

from $0.05

/M tokens

Buy a token pack, spend it on any model.

No monthly fee, no subscription
All models, all endpoints
Streaming + function calling
Tokens never expire
Pay with card (Paddle)

Start free

Volume

Custom

talk to us

For sustained high-volume workloads.

Volume discount up to 30%
Dedicated capacity
Custom DPA
Invoicing on request
SLA + status page

Contact sales

vs. the alternatives

GPT-5.5, Claude Opus 4.8, or DeepSeek V4 Pro — flagship quality, a fraction of the price.

Provider · Model	Input / M	Output / M
OpenAI GPT-5.5	$3.50	$21.00
Anthropic Claude Opus 4.8	$3.50	$17.50
▸Capio · DeepSeek V4 Pro	$0.31	$0.61

Public list prices, June 2026. We add ≈20% margin on upstream cost.

Common questions

Is this just a DeepSeek reseller?

No. We route to 12+ Chinese and US LLM providers, picking the best one per request. Today DeepSeek V4 Flash is usually the winner on cost, Kimi K2.6 on long context, MiniMax M3 on low latency, Claude Opus 4.8 on quality. You can also pin a specific model in your request.

Do I need to change my code?

Just the base URL and the API key. Everything else (request body, response shape, streaming, function calling, tool use) is OpenAI-compatible, so your existing code works.

How is data handled?

Your prompts and responses are routed to whichever provider is best for that request. We don't log request contents. We do log metadata (tokens, latency, model, status) for billing and debugging, see privacy policy.

What about rate limits?

PAYG has no rate limit beyond the upstream providers' per-minute caps (usually 500-2000 RPM). Volume customers get dedicated capacity and priority routing. If you need more, talk to us.

Do you support function calling, vision, embeddings?

Yes for all three, on models that support them. See /models for the matrix.

Can I get a refund?

Unspent tokens are refundable within 30 days. Past that, no. Once we pay the upstream provider, we can't recover the cost.

Cheap tokens, today.

Create an account, get an API key, send your first request in under a minute. We give you 1M free tokens to try it out.