Changelog
What shipped, when
We deploy continuously; the newest entries are at the top. Each version matches the Docker image tag, so pin a specific tag in production to keep rollback a one-line change.
- v0.6.0 (2026-06-07)
Sub-cent billing accuracy
- Switched internal billing from integer cents to millicents (1/1000 of a cent) end-to-end. A 33K-token prompt on deepseek-v4-flash at $0.07/M input used to round to $0.00, so the account was effectively never debited; it now costs 231 millicents and the dashboard shows $0.0023.
- Existing customer balances were multiplied by 1000 in the same migration. A $10 starting balance went from 1_000 cents to 1_000_000 millicents — same dollar value, higher precision below the cent.
- Dashboard and admin views format sub-cent amounts with 4 fractional digits so the cheap-model spend is visible. Whole-dollar amounts still show 2 decimals.
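The millicent arithmetic in this release can be sketched roughly as follows. This is an illustration, not the shipped billing code: the names `costMillicents` and `formatMillicents`, rounding partial millicents up, and treating any exact-cent amount as a 2-decimal value are all assumptions.

```typescript
// Hypothetical helpers; names and rounding rules are assumptions, not the shipped code.
const MILLICENTS_PER_DOLLAR = 100_000; // 100 cents x 1000 millicents per cent

// Price is given in dollars per million tokens, e.g. 0.07 for $0.07/M.
function costMillicents(tokens: number, dollarsPerMillionTokens: number): number {
  // Convert the price to integer millicents per million tokens first,
  // so the multiplication below stays in integer arithmetic.
  const millicentsPerMillion = Math.round(dollarsPerMillionTokens * MILLICENTS_PER_DOLLAR);
  // Assumption: partial millicents are rounded up so no request is free.
  return Math.ceil((tokens * millicentsPerMillion) / 1_000_000);
}

// Non-negative amounts only. Exact-cent amounts get 2 decimals, everything else 4.
function formatMillicents(millicents: number): string {
  if (millicents % 1000 === 0) {
    return `$${(millicents / MILLICENTS_PER_DOLLAR).toFixed(2)}`;
  }
  // Work in units of 1/10,000 of a dollar (10 millicents) to avoid float formatting.
  const tenThousandths = Math.round(millicents / 10);
  const dollars = Math.floor(tenThousandths / 10_000);
  const fraction = String(tenThousandths % 10_000).padStart(4, "0");
  return `$${dollars}.${fraction}`;
}

const cost = costMillicents(33_000, 0.07); // 231 millicents
console.log(cost, formatMillicents(cost)); // 231 "$0.0023"
console.log(formatMillicents(1_000_000));  // "$10.00"
```

Under the old integer-cent scheme the same request is 0.231 cents, which rounds to 0 cents.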
- v0.5.0 (2026-06-06)
Production hardening — billing safety, retry, observability
- Closed a billing edge case where a customer at zero balance could fire 0-prompt requests and accumulate cost via the completion side. checkAndReserve(0) now requires a positive balance.
- Wired up UPSTREAM_MAX_CONNECT_RETRIES for transient network blips. ECONNRESET/ECONNREFUSED/EAI_AGAIN now retry with a 50–350ms backoff instead of falling through to the next (more expensive) model in the chain.
- Paddle webhook + return-URL now share a single deduplication key (source, event_type, transaction_id). A concurrent POST + GET for the same transaction can't double-credit any more — the credit path runs in a Postgres transaction with FOR UPDATE on the customer row.
- SSE stream parser now handles CRLF-terminated events. Streams from upstreams that use \r\n\r\n between events are no longer buffered forever, and the final [DONE] is no longer dropped.
- Production fail-fast: the process refuses to boot with a placeholder PADDLE_WEBHOOK_SECRET, a test-fixture SESSION_SECRET, or a non-HTTPS PUBLIC_BASE_URL.
- Optional METRICS_AUTH_TOKEN bearer gate on /metrics for public-internet deployments.
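As an illustration of the CRLF fix in the SSE parser, a minimal buffering parser that accepts both \n\n and \r\n\r\n as event boundaries might look like this. The class name `SseParser` and its `feed` method are hypothetical, and the sketch only extracts `data:` fields; the real parser is likely to handle more of the SSE format.

```typescript
// Hypothetical sketch; not the proxy's actual parser.
class SseParser {
  private buffer = "";

  // Feed a decoded text chunk; returns the data payloads of every event
  // that is complete so far. Incomplete trailing data stays buffered.
  feed(chunk: string): string[] {
    this.buffer += chunk;
    const payloads: string[] = [];
    // A blank line ends an event. Accept LF and CRLF line endings.
    const boundary = /\r?\n\r?\n/;
    let match: RegExpExecArray | null;
    while ((match = boundary.exec(this.buffer)) !== null) {
      const rawEvent = this.buffer.slice(0, match.index);
      this.buffer = this.buffer.slice(match.index + match[0].length);
      const data = rawEvent
        .split(/\r?\n/)
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice("data:".length).replace(/^ /, ""))
        .join("\n");
      if (data !== "") payloads.push(data);
    }
    return payloads;
  }
}

const parser = new SseParser();
// The boundary is split across two chunks, as can happen on a real socket.
const first = parser.feed("data: hello\r\n\r");
const second = parser.feed("\ndata: [DONE]\r\n\r\n");
console.log(first, second); // [] [ "hello", "[DONE]" ]
```

A parser that searches only for "\n\n" never finds a boundary in a CRLF stream, which matches the symptom described above: the buffer grows and the final [DONE] is never emitted.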
- v0.4.0 (2026-05-22)
Phase 2 — Postgres, Redis, Paddle, Resend
- Migrated the in-memory rate limiter to Redis/Valkey with atomic Lua scripts. RPM/TPM/RPD limits are now correct across restarts and across multiple app containers.
- Migrated the API key store from an env-var map to Postgres. Keys are stored as SHA-256 hashes; the raw key only exists in the customer's email and their code.
- Per-request usage events are persisted to usage_events (Postgres). The customer dashboard and the admin usage view both read from this table.
- Paddle sandbox checkout + webhook + Resend email delivery. Customers buy credits via Paddle, the webhook credits the account, and the API key is emailed via Resend.
- Atomic credit reservation: checkAndReserve moves spent_millicents up and remaining_millicents down in a single UPDATE; commit adjusts; refund restores. The reservation is never partial.
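The reserve / commit / refund cycle can be modelled in memory as below. This is a sketch of the semantics only: the class `Ledger` is hypothetical, the SQL string assumes a `customers` table keyed by `id` (neither is confirmed by this changelog), and the positive-balance guard reflects the behaviour described for v0.5.0.

```typescript
// Assumed shape of the single-statement reservation; table and key names are hypothetical.
const RESERVE_SQL = `
  UPDATE customers
     SET spent_millicents = spent_millicents + $1,
         remaining_millicents = remaining_millicents - $1
   WHERE id = $2
     AND remaining_millicents > 0
     AND remaining_millicents >= $1
  RETURNING remaining_millicents`;

// In-memory stand-in that mirrors the same rules for one customer.
class Ledger {
  constructor(
    public remainingMillicents: number,
    public spentMillicents = 0,
  ) {}

  // All-or-nothing: either the full estimate is reserved or nothing changes.
  checkAndReserve(estimate: number): boolean {
    if (this.remainingMillicents <= 0 || this.remainingMillicents < estimate) return false;
    this.remainingMillicents -= estimate;
    this.spentMillicents += estimate;
    return true;
  }

  // Replace the reserved estimate with the actual cost once it is known.
  commit(reserved: number, actual: number): void {
    const delta = actual - reserved;
    this.spentMillicents += delta;
    this.remainingMillicents -= delta;
  }

  // Undo a reservation entirely, e.g. when the upstream call failed.
  refund(reserved: number): void {
    this.spentMillicents -= reserved;
    this.remainingMillicents += reserved;
  }
}

const ledger = new Ledger(1_000);
ledger.checkAndReserve(400); // reserve a worst-case estimate
ledger.commit(400, 231);     // actual cost turned out lower
console.log(ledger.remainingMillicents, ledger.spentMillicents); // 769 231
```

In the database version, a row count of zero from the UPDATE plays the role of the `false` return here.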
- v0.3.0 (2026-04-30)
Observability and routing polish
- Prometheus metrics at /metrics — request counts and latency histograms per normalised path. /readyz probes the upstream so the orchestrator can tell 'process up' from 'upstream reachable'.
- IP trust: with TRUST_PROXY=true, the X-Forwarded-For from the reverse proxy is honoured; otherwise XFF is ignored (so an attacker can't spoof their IP by setting the header themselves).
- Log sampling at LOG_SAMPLE_PCT for production. Errors are always logged; 2xx is sampled.
- Routing: aligned the catalog with the real upstream model list, dropping models the upstream doesn't actually expose and adding the ones it does.
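The TRUST_PROXY rule from this release can be sketched as a small pure function. The name `clientIp` and its parameter shape are hypothetical. Taking the rightmost X-Forwarded-For entry (the one a single trusted reverse proxy appends) is an assumption, and the actual code may select the address differently.

```typescript
// Hypothetical helper; the real implementation may differ.
interface IpInputs {
  trustProxy: boolean;              // value of TRUST_PROXY
  forwardedFor: string | undefined; // raw X-Forwarded-For header, if any
  socketAddress: string;            // address of the direct TCP peer
}

function clientIp({ trustProxy, forwardedFor, socketAddress }: IpInputs): string {
  if (trustProxy && forwardedFor) {
    const hops = forwardedFor
      .split(",")
      .map((hop) => hop.trim())
      .filter((hop) => hop !== "");
    // Assumption: one trusted proxy, which appends the address it saw last.
    const last = hops[hops.length - 1];
    if (last) return last;
  }
  // Without a trusted proxy the header is client-controlled, so ignore it.
  return socketAddress;
}

// Direct exposure: a spoofed header has no effect.
console.log(clientIp({ trustProxy: false, forwardedFor: "1.2.3.4", socketAddress: "198.51.100.7" }));
// Behind the proxy: the entry appended by the proxy wins over a client-supplied one.
console.log(clientIp({ trustProxy: true, forwardedFor: "1.2.3.4, 203.0.113.9", socketAddress: "10.0.0.2" }));
```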
- v0.2.0 (2026-03-15)
Phase 1 — the actual proxy
- First end-to-end cut of the proxy. Auth via Bearer API key, RPM/TPM/RPD/spend limits in-process, fallback chains across upstreams, credit accounting on a per-customer balance.
- Catalog and routing: the model field can be a pinned alias (deepseek-v4-flash) or a strategy (auto:cost, auto:quality, auto:speed, auto:balanced). The router resolves the strategy against the live catalog.
- Web UI: landing page, /docs (OpenRouter-style with code blocks), /models (full catalog with filters and pricing).
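To make the alias-versus-strategy distinction concrete, here is a minimal sketch of how a model field could be resolved against a catalog. Everything beyond the `auto:*` names and `deepseek-v4-flash` is hypothetical: the `CatalogEntry` fields, the other model ids, and especially the scoring formulas (the changelog does not say how each strategy ranks models).

```typescript
// Hypothetical catalog shape and scoring; the real router's criteria are not documented here.
interface CatalogEntry {
  id: string;
  inputDollarsPerM: number; // input price, dollars per million tokens
  quality: number;          // made-up quality score, higher is better
  latencyMs: number;        // made-up typical latency
}

// Lower score wins for every strategy.
const strategyScore = new Map<string, (entry: CatalogEntry) => number>([
  ["cost", (e) => e.inputDollarsPerM],
  ["quality", (e) => -e.quality],
  ["speed", (e) => e.latencyMs],
  // Assumption: "balanced" trades price and latency against quality.
  ["balanced", (e) => (e.inputDollarsPerM * e.latencyMs) / e.quality],
]);

function resolveModel(model: string, catalog: CatalogEntry[]): string | undefined {
  if (!model.startsWith("auto:")) {
    // Pinned alias: must exist in the catalog as-is.
    return catalog.find((e) => e.id === model)?.id;
  }
  const score = strategyScore.get(model.slice("auto:".length));
  if (!score || catalog.length === 0) return undefined;
  return catalog.reduce((best, e) => (score(e) < score(best) ? e : best)).id;
}

const catalog: CatalogEntry[] = [
  { id: "deepseek-v4-flash", inputDollarsPerM: 0.07, quality: 60, latencyMs: 300 },
  { id: "example-large", inputDollarsPerM: 3, quality: 90, latencyMs: 900 },
  { id: "example-tiny", inputDollarsPerM: 0.2, quality: 40, latencyMs: 120 },
];

console.log(resolveModel("auto:cost", catalog));    // "deepseek-v4-flash"
console.log(resolveModel("auto:quality", catalog)); // "example-large"
console.log(resolveModel("auto:speed", catalog));   // "example-tiny"
```

Because the strategy is resolved per call against whatever catalog is passed in, a catalog change takes effect without touching client code that requests `auto:*`.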
- v0.1.0 (2026-02-01)
Phase 0 — skeleton
- Bun + Hono + Postgres + Redis skeleton. Hono routing, Pino structured logging, env validation, Docker multi-stage build, DEPLOY.md runbook.
- Nothing user-facing. The /healthz endpoint returns 200 and the process doesn't crash on a missing DB connection.