About

A cheaper path to the same models

What we do

Capio is a thin proxy in front of large-language-model providers. You send a request to capiollm.tech using the same payload you'd send to OpenAI or Anthropic, and we forward it to whichever upstream is cheapest and healthy for that model. You pay us in credits; we pay the upstream in fiat. The difference is the margin.

The interface is the OpenAI and Anthropic shape. Existing client libraries work without code changes. The model names are the model names. Streaming, function calling, tool use, system prompts — all pass through. The only difference is the base URL and the price.

Why this exists

Frontier-tier inference is sold at commodity margins by the model labs, but the price European developers actually pay includes 50-100% markups from the US-based "compute platforms" that wrap them. Those markups pay for dashboards, sales teams, and a "neutral" routing layer that often does the wrong thing for your workload.

We skip the markup. The routing is honest (cheapest healthy upstream, with your stated preferences respected), the accounting is line-item, and the dashboard exists so you can see your bill — not so we can upsell you.

Who runs it

Capio is operated by veritas, a single-person operator based in Frankfurt, Germany. The codebase, the database, and the upstream contracts are all under one person's control. That has trade-offs:

Pro: the person who takes your support email is the person who wrote the auth code, who can read the logs, and who can ship a fix the same day.
Con: bus-factor of one. The codebase, secrets, and billing relationships are documented well enough that a successor could take over inside two weeks of full-time work, but the transition would not be instantaneous.

Until a second operator comes on, the service runs on a single-region production deployment in Frankfurt. We do not yet have a multi-region failover; the on-call coverage is also single-person. The status page reflects what's actually running.

What we don't do

We don't train models. Everything we serve is produced by someone else's weights. We don't fine-tune, we don't RLHF, and we don't have a "differentiation" model. We're a router and a billing layer.
We don't collect prompt data. The privacy policy spells this out, and the database schema is structured to make it impossible for us to suddenly start: there's no column to put a prompt in, and the usage_events table is sized to hold metadata, not payloads.
We don't run on tokens you haven't bought.The credit system is prepaid. A credit balance can't go negative; we can't accidentally end the month with an unpaid bill.

How to reach us

General: [email protected]

Billing / refunds: [email protected]

Privacy / GDPR: [email protected]

Security disclosures: [email protected]. PGP fingerprint is on request.

Read the docs →