OpenAI & Anthropic,
at half the price.
GPT, Claude, DeepSeek, Qwen, GLM, Kimi. Same SDK, same response shape. Just change base_url and your bill drops by roughly 50% — we route to cheap inference providers and pass the savings through.
3 free models. No credit card. 1 minute to integrate.
# Before, OpenAI
curl https://api.openai.com/v1/chat/completions \
-H "Authorization: Bearer $OPENAI_KEY" \
-d '{"model": "gpt-4o-mini", "messages": [...]}'
# After, Capio
curl https://capiollm.tech/v1/chat/completions \
-H "Authorization: Bearer $CAPIO_KEY" \
-d '{"model": "deepseek-v4-flash", "messages": [...]}'
# ^^^^^^^^^^^^^^^^^
# same response shape, 6× cheaperEvery model you'll need
OpenAI and Anthropic, plus frontier models from Chinese labs. One endpoint, per-million-token prices, no markup on context, no surprise tiers.
| Model | Context | Input / M | Output / M | |
|---|---|---|---|---|
Claude Opus 4.8 claude-opus-4-8 | 200K | $3.50 | $17.50 | docs → |
MiniMax M3 MiniMax-M3 | 1M | $0.30 | $1.20 | docs → |
DeepSeek V4 Flash [Best value]deepseek-v4-flash | 1M | $0.07 | $0.14 | docs → |
Kimi K2.6 kimi-k2.6 | 262K | $0.48 | $2.39 | docs → |
Gemini 3.1 Pro gemini-3.1-pro | 1M | $1.40 | $8.40 | docs → |
DeepSeek V4 Pro deepseek-v4-pro | 1M | $0.31 | $0.61 | docs → |
GPT-5.5 [Long context]gpt-5.5 | 1M | $3.50 | $21.00 | docs → |
Embeddings, image, and audio models available, see docs.
Pricing that scales down
Prepaid tokens, spend them on any model. No subscription, no monthly fee, no usage cliffs.
Buy a token pack, spend it on any model.
- No monthly fee, no subscription
- All models, all endpoints
- Streaming + function calling
- Tokens never expire
- Pay with card (Paddle)
For sustained high-volume workloads.
- Volume discount up to 30%
- Dedicated capacity
- Custom DPA
- Invoicing on request
- SLA + status page
vs. the alternatives
GPT-5.5, Claude Opus 4.8, or DeepSeek V4 Pro — flagship quality, a fraction of the price.
| Provider · Model | Input / M | Output / M |
|---|---|---|
OpenAI GPT-5.5 | $3.50 | $21.00 |
Anthropic Claude Opus 4.8 | $3.50 | $17.50 |
▸Capio · DeepSeek V4 Pro | $0.31 | $0.61 |
Public list prices, June 2026. We add ≈20% margin on upstream cost.
Common questions
Is this just a DeepSeek reseller?
No. We route to 12+ Chinese and US LLM providers, picking the best one per request. Today DeepSeek V4 Flash is usually the winner on cost, Kimi K2.6 on long context, MiniMax M3 on low latency, Claude Opus 4.8 on quality. You can also pin a specific model in your request.
Do I need to change my code?
Just the base URL and the API key. Everything else (request body, response shape, streaming, function calling, tool use) is OpenAI-compatible, so your existing code works.
How is data handled?
Your prompts and responses are routed to whichever provider is best for that request. We don't log request contents. We do log metadata (tokens, latency, model, status) for billing and debugging, see privacy policy.
What about rate limits?
PAYG has no rate limit beyond the upstream providers' per-minute caps (usually 500-2000 RPM). Volume customers get dedicated capacity and priority routing. If you need more, talk to us.
Do you support function calling, vision, embeddings?
Yes for all three, on models that support them. See /models for the matrix.
Can I get a refund?
Unspent tokens are refundable within 30 days. Past that, no. Once we pay the upstream provider, we can't recover the cost.
Cheap tokens, today.
Create an account, get an API key, send your first request in under a minute. We give you 1M free tokens to try it out.