Skip to main content
Capio
status live · claude-opus-4-8, gpt-5.5, deepseek-v4-flash, MiniMax-M3, kimi-k2.6 · from $0.05/M

OpenAI & Anthropic,
at half the price.

GPT, Claude, DeepSeek, Qwen, GLM, Kimi. Same SDK, same response shape. Just change base_url and your bill drops by roughly 50% — we route to cheap inference providers and pass the savings through.

3 free models. No credit card. 1 minute to integrate.

curlbash
# Before, OpenAI
curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_KEY" \
  -d '{"model": "gpt-4o-mini", "messages": [...]}'

# After, Capio
curl https://capiollm.tech/v1/chat/completions \
  -H "Authorization: Bearer $CAPIO_KEY" \
  -d '{"model": "deepseek-v4-flash", "messages": [...]}'
#                                  ^^^^^^^^^^^^^^^^^
#                                  same response shape, 6× cheaper
Anthropic
OpenAI
DeepSeek
xAI
Qwen
Kimi
OpenClaw
Cursor
OpenCode
MiniMax
Anthropic
OpenAI
DeepSeek
xAI
Qwen
Kimi
OpenClaw
Cursor
OpenCode
MiniMax

Every model you'll need

OpenAI and Anthropic, plus frontier models from Chinese labs. One endpoint, per-million-token prices, no markup on context, no surprise tiers.

ModelInput / MOutput / M
Claude Opus 4.8
claude-opus-4-8
$3.50$17.50
MiniMax M3
MiniMax-M3
$0.30$1.20
DeepSeek V4 Flash
[Best value]
deepseek-v4-flash
$0.07$0.14
Kimi K2.6
kimi-k2.6
$0.48$2.39
Gemini 3.1 Pro
gemini-3.1-pro
$1.40$8.40
DeepSeek V4 Pro
deepseek-v4-pro
$0.31$0.61
GPT-5.5
[Long context]
gpt-5.5
$3.50$21.00

Embeddings, image, and audio models available, see docs.

Pricing that scales down

Prepaid tokens, spend them on any model. No subscription, no monthly fee, no usage cliffs.

// recommended
Pay-as-you-go
from $0.05
/M tokens

Buy a token pack, spend it on any model.

  • No monthly fee, no subscription
  • All models, all endpoints
  • Streaming + function calling
  • Tokens never expire
  • Pay with card (Paddle)
Volume
Custom
talk to us

For sustained high-volume workloads.

  • Volume discount up to 30%
  • Dedicated capacity
  • Custom DPA
  • Invoicing on request
  • SLA + status page

vs. the alternatives

GPT-5.5, Claude Opus 4.8, or DeepSeek V4 Pro — flagship quality, a fraction of the price.

Provider · ModelInput / MOutput / M
OpenAI GPT-5.5
$3.50$21.00
Anthropic Claude Opus 4.8
$3.50$17.50
Capio · DeepSeek V4 Pro
$0.31$0.61

Public list prices, June 2026. We add ≈20% margin on upstream cost.

Common questions

Is this just a DeepSeek reseller?

No. We route to 12+ Chinese and US LLM providers, picking the best one per request. Today DeepSeek V4 Flash is usually the winner on cost, Kimi K2.6 on long context, MiniMax M3 on low latency, Claude Opus 4.8 on quality. You can also pin a specific model in your request.

Do I need to change my code?

Just the base URL and the API key. Everything else (request body, response shape, streaming, function calling, tool use) is OpenAI-compatible, so your existing code works.

How is data handled?

Your prompts and responses are routed to whichever provider is best for that request. We don't log request contents. We do log metadata (tokens, latency, model, status) for billing and debugging, see privacy policy.

What about rate limits?

PAYG has no rate limit beyond the upstream providers' per-minute caps (usually 500-2000 RPM). Volume customers get dedicated capacity and priority routing. If you need more, talk to us.

Do you support function calling, vision, embeddings?

Yes for all three, on models that support them. See /models for the matrix.

Can I get a refund?

Unspent tokens are refundable within 30 days. Past that, no. Once we pay the upstream provider, we can't recover the cost.

Cheap tokens, today.

Create an account, get an API key, send your first request in under a minute. We give you 1M free tokens to try it out.