How TokenOne® settles every AI call.
Six stages, one decision per call. Classify the request, decide the route (TokenOne AI tier or external brand), execute it, QA the response, escalate if it falls short, and feed the result back into the next decision. Payment-scheme rails for AI consumption, visible end to end and chain-verified.
The routing pipeline, six stages.
You can read the whole story in your dashboard. Every call is timestamped, stage-tagged, and explorable per project.
- 1
Read the request
Each request is read for what it actually needs · task family, latency tolerance, cost ceiling, regulatory scope. The plan follows the read, not gut feel.
Telemetry inputs: request shape, working context, prior delivery decisions, any user hints.
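As a rough illustration of what a request read could produce, the sketch below builds a profile from the four dimensions named above. The class, field names, defaults, and the `read_request` helper are all hypothetical; the source names only the dimensions, not a schema.

```python
from dataclasses import dataclass, fields, replace


# Hypothetical schema: one field per dimension listed in stage 1.
@dataclass(frozen=True)
class RequestProfile:
    task_family: str           # e.g. "chat", "code", "summarize"
    latency_tolerance_ms: int  # how long the caller is willing to wait
    cost_ceiling_tokens: int   # most the call may burn
    regulatory_scope: str      # e.g. "none", "eu-only"


# Assumed defaults, used when the caller gives no hint for a dimension.
DEFAULT_PROFILE = RequestProfile(
    task_family="chat",
    latency_tolerance_ms=5_000,
    cost_ceiling_tokens=10_000,
    regulatory_scope="none",
)


def read_request(hints: dict) -> RequestProfile:
    """Overlay explicit user hints on the defaults; reject unknown hint keys."""
    known = {f.name for f in fields(RequestProfile)}
    unknown = set(hints) - known
    if unknown:
        raise ValueError(f"unknown hints: {sorted(unknown)}")
    return replace(DEFAULT_PROFILE, **hints)
```

A real classifier would also weigh request shape, working context, and prior delivery decisions; this sketch covers only the user-hint input.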
- 2
Plan delivery
TokenOne plans a primary path plus one fallback up the quality ladder. Quality floor cleared first; cheapest passing option wins. Region, latency, and budget caps enforced before anything leaves the tenant.
Telemetry inputs: the request profile, eligible delivery paths, current health signals, per-project budget.
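The planning rule (caps first, then the quality floor, then cheapest passing option, plus one fallback up the quality ladder) can be sketched as below. This is a minimal illustration, not TokenOne's planner: the `DeliveryPath` type, the 0 to 100 quality scale, the tie-breaks, and reading "up the ladder" as "the next-highest quality among passing paths" are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DeliveryPath:          # hypothetical shape of an eligible path
    name: str
    quality: int             # expected rubric score, 0..100 (assumed scale)
    cost: int                # tokens per call, integer
    latency_ms: int
    region: str


def plan(paths, floor, max_latency_ms, budget, allowed_regions):
    """Return (primary, fallback); either may be None."""
    # 1. Enforce region, latency, and budget caps before anything else.
    eligible = [
        p for p in paths
        if p.region in allowed_regions
        and p.latency_ms <= max_latency_ms
        and p.cost <= budget
    ]
    # 2. Clear the quality floor.
    passing = [p for p in eligible if p.quality >= floor]
    if not passing:
        return None, None
    # 3. Cheapest passing option wins (higher quality breaks cost ties).
    primary = min(passing, key=lambda p: (p.cost, -p.quality))
    # 4. Fallback: the next step up in quality from the primary, if any.
    higher = [p for p in passing if p.quality > primary.quality]
    fallback: Optional[DeliveryPath] = (
        min(higher, key=lambda p: (p.quality, p.cost)) if higher else None
    )
    return primary, fallback
```

With four made-up paths of quality 60, 75, 85, and 95 and a floor of 70, the 75 path becomes the primary when it is the cheapest passing option, and the 85 path becomes the fallback.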
- 3
Deliver
TokenOne delivers the call. Streaming or batch. Every token in and out is captured in a signed BigInt ledger entry · your single source of truth for spend and audit.
Telemetry outputs: response, full delivery metadata, latency, token counts (in/out), wallet burn.
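To make the ledger idea concrete, here is a small sketch of integer-only ledger entries. It assumes that "signed" refers to signed integer amounts (burns negative, top-ups positive) and that one token in or out burns one wallet token; the source confirms neither, and the names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LedgerEntry:           # hypothetical row shape
    call_id: str
    tokens_in: int
    tokens_out: int
    delta: int               # signed: negative = burn, positive = top-up


def burn_entry(call_id: str, tokens_in: int, tokens_out: int) -> LedgerEntry:
    """Record a delivered call as a negative wallet delta (assumed 1:1 burn)."""
    if tokens_in < 0 or tokens_out < 0:
        raise ValueError("token counts must be non-negative")
    return LedgerEntry(call_id, tokens_in, tokens_out, -(tokens_in + tokens_out))


def top_up_entry(ref: str, amount: int) -> LedgerEntry:
    """Record a wallet top-up as a positive delta."""
    if amount <= 0:
        raise ValueError("top-up must be positive")
    return LedgerEntry(ref, 0, 0, amount)


def balance(entries) -> int:
    """The wallet balance is the sum of all signed deltas."""
    return sum(e.delta for e in entries)
```

Python integers are arbitrary precision, so they stand in for a BigInt column here without rounding or overflow concerns.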
- 4
Verify
A read-replica analysis loop scores every response against the quality rubric · accuracy, structure, hallucination risk, instruction fidelity. Runs asynchronously; never blocks the user.
Telemetry outputs: rubric score, findings, severity flags, optional escalation trigger.
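One way to picture the rubric check is the sketch below. The four dimensions come from the text; the 0 to 100 scale, the equal weighting, the inversion of hallucination risk, and the default floor of 70 are assumptions made for illustration.

```python
DIMENSIONS = ("accuracy", "structure", "hallucination_risk", "instruction_fidelity")


def verify(scores: dict, floor: int = 70) -> dict:
    """Score a response on the four rubric dimensions (each 0..100, assumed)."""
    # Hallucination risk is "lower is better", so invert it before averaging.
    normalized = {
        d: (100 - scores[d]) if d == "hallucination_risk" else scores[d]
        for d in DIMENSIONS
    }
    # Equal-weight integer mean; a real rubric may weight dimensions differently.
    rubric = sum(normalized.values()) // len(normalized)
    # Flag every dimension that individually falls below the floor.
    flags = sorted(d for d, v in normalized.items() if v < floor)
    return {
        "rubric_score": rubric,
        "severity_flags": flags,
        "escalate": rubric < floor,   # the optional escalation trigger
    }
```

In a deployment matching the description, this function would run off the request path (for example in a background worker reading from a replica), so the caller never waits on it.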
- 5
Escalate (if needed)
If verification flags a response below the quality floor, TokenOne retries against the configured fallback. The user gets the better answer; the original is logged for governance.
Telemetry outputs: replacement response, escalation reason, delivery-path delta.
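The escalation step can be sketched as a retry against the fallback when the score lands below the floor. The function below is a simplified, synchronous illustration; `primary`, `fallback`, and `score` are hypothetical callables supplied by the caller, and the log row shape is invented.

```python
def deliver_with_escalation(request, primary, fallback, score, floor, audit_log):
    """Try the primary path; retry on the fallback if the score is below the floor."""
    original = primary(request)
    original_score = score(original)
    if original_score >= floor or fallback is None:
        return {"response": original, "escalated": False}
    # Keep the original response for governance before replacing it.
    audit_log.append({
        "request": request,
        "original": original,
        "score": original_score,
        "reason": "below_quality_floor",
    })
    return {
        "response": fallback(request),
        "escalated": True,
        "reason": "below_quality_floor",
    }
```

For example, with a toy scorer that returns the response length, a five-character answer against a floor of 10 is replaced by the fallback's answer and the original is appended to `audit_log`.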
- 6
Feedback
Every delivery decision and verification score is written back into the performance profile. Tomorrow’s decisions get sharper. Paths that drift below the floor lose priority automatically.
Telemetry outputs: feedback row, updated path-quality profile, drift signal.
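A feedback loop of this kind could be modeled with an exponential moving average per path. The source does not say how the profile is updated, so the averaging rule, the `alpha` parameter, and the function names below are assumptions chosen to show the idea.

```python
def record_feedback(profile: dict, path: str, score: float, floor: float,
                    alpha: float = 0.2) -> dict:
    """Blend a new verification score into the path's running quality (EMA)."""
    old = profile.get(path, score)            # first observation seeds the profile
    new = (1 - alpha) * old + alpha * score
    profile[path] = new
    # The returned row mirrors the telemetry outputs: feedback, profile, drift.
    return {"path": path, "score": score, "profile": new, "drift": new < floor}


def prioritize(profile: dict, floor: float) -> list:
    """Paths at or above the floor come first; within each group, best quality first."""
    return sorted(profile, key=lambda p: (profile[p] < floor, -profile[p]))
```

A path whose running quality slips under the floor stays in the profile but sorts after every path that still clears it, which is one simple reading of "lose priority automatically".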
Each pillar plays its part.
A single call touches every pillar · but you only see the result. Here’s which pillar does what.
Token Wallet
Pays the call. Signed BigInt ledger. Every burn writes a row keyed to the call ID. The reconciler closes the loop nightly.
TokenOne Delivery
Picks the tier. Matches each call to the right tier (Apex/Core/Edge/Lite) across TokenOne AI and BYOK brands. Classify → decide → execute → feedback.
QA
Scores the result. Read-replica analysis. Rubric-scored. Feeds back into compute decisions.
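The nightly reconciliation mentioned for the Token Wallet could look roughly like the sketch below. The source does not describe what the reconciler compares; this assumes it checks ledger burns per call ID against the token counts reported in delivery telemetry, with a 1:1 burn rate. Row shapes and names are hypothetical.

```python
from collections import defaultdict


def reconcile(ledger_rows, delivery_rows) -> dict:
    """Return {call_id: (burned, expected)} for every call that does not match.

    ledger_rows:   iterable of (call_id, burn) with burn as a positive integer
    delivery_rows: iterable of (call_id, tokens_in, tokens_out)
    """
    burned = defaultdict(int)
    for call_id, burn in ledger_rows:
        burned[call_id] += burn            # a call may have several ledger rows
    expected = {cid: t_in + t_out for cid, t_in, t_out in delivery_rows}

    mismatches = {}
    for cid in sorted(set(burned) | set(expected)):
        b, e = burned.get(cid, 0), expected.get(cid, 0)
        if b != e:                         # integer comparison, no tolerance
            mismatches[cid] = (b, e)
    return mismatches
```

An empty result means every burn is accounted for; anything else is a list of call IDs to investigate.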
Five ways to plug in.
Whichever tier you land on, the pipeline above is the same. Only the surface changes.
- Live: TokenOne Delivery API (OpenAI-shape · Claude variant for Anthropic clients)
- Live: Reverse proxy (Anthropic / OpenRouter / Google-compat)
- Live: MCP server
- v0.2 (separate repo): Browser extension
- v0.2 (separate repo): Desktop bridge (Electron)
The 5% margin model.
TokenOne® charges 5% on top of the underlying provider cost · not per seat, not per tier. When TokenOne Delivery picks a cheaper upstream, the savings are split: most go back to your wallet, and 5% is retained as margin.
We publish a reference-model index · the cost of buying each upstream directly. Every call shows you the index and the actual route, so “savings vs. buying direct” is a number you can audit, not a marketing claim.
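A worked example of the arithmetic, under stated assumptions: the 5% is applied to the actual provider cost of the chosen route, amounts are integers in some small unit, and fractions round down. The function name and the rounding rule are illustrative only.

```python
MARGIN_BPS = 500  # 5% expressed in basis points


def price_call(provider_cost: int, index_cost: int) -> dict:
    """Price one call and compare it with buying the reference model directly.

    provider_cost: what the chosen upstream actually cost (integer units)
    index_cost:    the published reference-model index for the same call
    """
    margin = provider_cost * MARGIN_BPS // 10_000   # floor rounding (assumed)
    charged = provider_cost + margin
    return {
        "margin": margin,
        "charged": charged,
        "savings_vs_direct": index_cost - charged,
    }
```

For a call whose index cost is 1,000,000 units and whose actual route cost 600,000, the margin is 30,000, the wallet is charged 630,000, and the auditable saving against buying direct is 370,000.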
- Per-project API keys: to_dev_xxx for dev environments, to_live_xxx for prod
- 24-hour overlap window on key rotation · zero downtime
- Tokens never expire. Top up via Stripe; auto-top-up optional
- No per-seat pricing. No tier lock-in. No minimum spend.
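The key conventions above can be checked client-side with a few lines. The prefixes and the 24-hour overlap come from the list; the helper names and the choice to treat the window's end as exclusive are assumptions.

```python
from datetime import datetime, timedelta, timezone

ROTATION_OVERLAP = timedelta(hours=24)


def key_environment(api_key: str) -> str:
    """Infer the environment from the key prefix."""
    if api_key.startswith("to_live_"):
        return "prod"
    if api_key.startswith("to_dev_"):
        return "dev"
    raise ValueError("unrecognized key prefix")


def old_key_still_valid(rotated_at: datetime, now: datetime) -> bool:
    """After rotation, the previous key keeps working for the overlap window."""
    return now < rotated_at + ROTATION_OVERLAP   # end of window is exclusive (assumed)
```

A guard like `key_environment(key) == "prod"` in a deploy script is a cheap way to avoid shipping a dev key to production.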
Tokens burnt per day. The single MVP KPI.
We measure ourselves on one number: how many tokens TokenOne® routes for paying customers in a day. Everything else is in service of that.