AI middleware · utility layer

How TokenOne® settles every AI call.

Six stages, one decision per call. Classify the request, decide the route (a TokenOne AI tier or an external brand), execute it, QA the response, escalate if it falls short, and feed the result back into the next decision. Payment-scheme rails for AI consumption, visible end to end and chain-verified.

The routing pipeline, six stages.

You can read the whole story in your dashboard. Every call is timestamped, stage-tagged, and explorable per project.

1
Read the request
Each request is read for what it actually needs · task family, latency tolerance, cost ceiling, regulatory scope. The plan follows the read, not gut feel.
Telemetry inputs: request shape, working context, prior delivery decisions, any user hints.
2
Plan delivery
TokenOne plans a primary path plus one fallback up the quality ladder. Quality floor cleared first; cheapest passing option wins. Region, latency, and budget caps enforced before anything leaves the tenant.
Telemetry inputs: the request profile, eligible delivery paths, current health signals, per-project budget.
3
Deliver
TokenOne delivers the call. Streaming or batch. Every token in and out is captured in a signed BigInt ledger entry · your single source of truth for spend and audit.
Telemetry outputs: response, full delivery metadata, latency, token counts (in/out), wallet burn.
4
Verify
A read-replica analysis loop scores every response against the quality rubric · accuracy, structure, hallucination risk, instruction fidelity. Runs asynchronously; never blocks the user.
Telemetry outputs: rubric score, findings, severity flags, optional escalation trigger.
5
Escalate (if needed)
If verification flags a response below the quality floor, TokenOne retries against the configured fallback. The user gets the better answer; the original is logged for governance.
Telemetry outputs: replacement response, escalation reason, delivery-path delta.
6
Feedback
Every delivery decision and verification score is written back into the performance profile. Tomorrow’s decisions get sharper. Paths that drift below the floor lose priority automatically.
Telemetry outputs: feedback row, updated path-quality profile, drift signal.
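
The planning rule in stage 2 (quality floor cleared first, cheapest passing option wins, one fallback up the quality ladder) can be sketched in a few lines of Python. This is a minimal illustration, not TokenOne's implementation: the `DeliveryPath` and `RequestProfile` types, their field names, and the tie-breaking order are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class DeliveryPath:
    # All fields are hypothetical; the source does not define a path schema.
    name: str
    expected_quality: float  # assumed 0..1 score from the performance profile
    cost: int                # assumed cost per call, in wallet tokens
    latency_ms: int
    region: str


@dataclass(frozen=True)
class RequestProfile:
    quality_floor: float
    max_latency_ms: int
    budget_left: int
    allowed_regions: frozenset


def plan_delivery(
    profile: RequestProfile, paths: Sequence[DeliveryPath]
) -> Tuple[Optional[DeliveryPath], Optional[DeliveryPath]]:
    """Return (primary, fallback). Either may be None."""
    # Region, latency, and budget caps are enforced before anything is chosen,
    # and only paths that clear the quality floor are considered.
    eligible = [
        p for p in paths
        if p.region in profile.allowed_regions
        and p.latency_ms <= profile.max_latency_ms
        and p.cost <= profile.budget_left
        and p.expected_quality >= profile.quality_floor
    ]
    if not eligible:
        return None, None

    # Cheapest passing option wins; ties go to the higher expected quality.
    primary = min(eligible, key=lambda p: (p.cost, -p.expected_quality))

    # The fallback is the next rung up the quality ladder, if one exists.
    better = [p for p in eligible if p.expected_quality > primary.expected_quality]
    fallback = min(better, key=lambda p: (p.expected_quality, p.cost)) if better else None
    return primary, fallback
```

In this sketch the fallback must pass the same caps as the primary, so an escalation in stage 5 can never exceed the project's region, latency, or budget limits. Whether the real planner applies the caps to the fallback in the same way is not stated in the text.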

Each pillar plays its part.

A single call touches every pillar · but you only see the result. Here’s which pillar does what.

Token Wallet

Pays the call

Signed BigInt ledger. Every burn writes a row keyed to the call ID. Reconciler closes the loop nightly.
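
As a rough illustration of a ledger with one row per burn keyed to the call ID, plus a reconciliation pass, consider the sketch below. It reads "signed" as signed integer amounts (top-ups positive, burns negative), which is an assumption; it also assumes one wallet token is burned per model token, and every class and function name is hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class LedgerRow:
    call_id: str  # call ID for burns; a top-up reference for credits
    amount: int   # signed integer: positive = top-up, negative = burn


class Ledger:
    def __init__(self) -> None:
        self._rows: List[LedgerRow] = []

    def top_up(self, reference: str, tokens: int) -> None:
        if tokens <= 0:
            raise ValueError("top-up must be positive")
        self._rows.append(LedgerRow(reference, tokens))

    def burn(self, call_id: str, tokens_in: int, tokens_out: int) -> None:
        # Assumption: burn equals tokens in plus tokens out, unweighted.
        self._rows.append(LedgerRow(call_id, -(tokens_in + tokens_out)))

    def balance(self) -> int:
        # Python ints are arbitrary precision, so the sum cannot overflow.
        return sum(row.amount for row in self._rows)

    def burned_by_call(self) -> Dict[str, int]:
        totals: Dict[str, int] = {}
        for row in self._rows:
            if row.amount < 0:
                totals[row.call_id] = totals.get(row.call_id, 0) - row.amount
        return totals


def reconcile(ledger: Ledger, metered: Dict[str, int]) -> List[str]:
    """Return call IDs whose ledger burn differs from the metered usage."""
    burned = ledger.burned_by_call()
    call_ids = burned.keys() | metered.keys()
    return sorted(c for c in call_ids if burned.get(c, 0) != metered.get(c, 0))
```

An empty list from `reconcile` means every metered call has a matching burn and no burn lacks a metered call, which is one way a nightly reconciler could "close the loop".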

TokenOne Delivery

Picks the tier

Matches each call to the right tier (Apex/Core/Edge/Lite) across TokenOne AI and BYOK brands. Classify → decide → execute → feedback.

Scores the result

Read-replica analysis. Rubric-scored. Feeds back into compute decisions.
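
To make "rubric-scored" concrete, here is one possible scoring function over the four dimensions named in the Verify stage (accuracy, structure, hallucination risk, instruction fidelity). The weights, the 0..1 scale, and the function names are assumptions for illustration only; the source does not publish the rubric.

```python
from typing import Dict

# Hypothetical weights; they sum to 1.0 so the score stays in 0..1.
WEIGHTS: Dict[str, float] = {
    "accuracy": 0.4,
    "structure": 0.2,
    "hallucination_risk": 0.2,
    "instruction_fidelity": 0.2,
}


def rubric_score(findings: Dict[str, float]) -> float:
    """Weighted score in 0..1; higher is better."""
    total = 0.0
    for dimension, weight in WEIGHTS.items():
        value = findings[dimension]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{dimension} must be between 0 and 1")
        # Risk is a "lower is better" dimension, so invert it before weighting.
        if dimension == "hallucination_risk":
            value = 1.0 - value
        total += weight * value
    return total


def needs_escalation(findings: Dict[str, float], quality_floor: float) -> bool:
    """True when the response scores below the quality floor."""
    return rubric_score(findings) < quality_floor
```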

Five ways to plug in.

Whichever tier you land on, the pipeline above is the same. Only the surface changes.

TokenOne Delivery API (OpenAI-shape · Claude variant for Anthropic clients) · Live
Reverse proxy (Anthropic / OpenRouter / Google-compat) · Live
MCP server · Live
Browser extension · v0.2 (separate repo)
Desktop bridge (Electron) · v0.2 (separate repo)

The 5% margin model.

TokenOne® charges 5% on top of the underlying provider cost · not per seat, not per tier. When TokenOne Delivery picks a cheaper upstream, the savings split: most back to your wallet, 5% retained as margin.

We publish a reference-model index · the cost of buying each upstream directly. Every call shows you the index and the actual route, so “savings vs. buying direct” is a number you can audit, not a marketing claim.
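
The arithmetic behind "savings vs. buying direct" can be checked with a short worked example. The sketch below reads the 5% as a markup on the actual upstream cost and compares the charged amount with the reference-index cost. Integer minor units, rounding the margin down, and the names `settle` and `MARGIN_BPS` are assumptions for the example, not documented behavior.

```python
from typing import Dict

MARGIN_BPS = 500  # 5% expressed in basis points


def settle(provider_cost: int, reference_cost: int) -> Dict[str, int]:
    """Compute the charge for one call, in integer minor units.

    provider_cost:  what the chosen upstream actually cost
    reference_cost: the reference-index cost of buying direct
    """
    # Assumption: the margin is floored to a whole minor unit.
    margin = provider_cost * MARGIN_BPS // 10_000
    charged = provider_cost + margin
    return {
        "charged": charged,
        "margin": margin,
        "savings_vs_direct": reference_cost - charged,
    }
```

For instance, a call whose chosen upstream costs 2,000 units against a reference-index cost of 3,000 units settles at 2,100 charged, 100 of it margin, and 900 saved relative to buying direct.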

Per-project API keys: to_dev_xxx for dev environments, to_live_xxx for prod
24-hour overlap window on key rotation · zero downtime
Tokens never expire. Top up via Stripe; auto-top-up optional
No per-seat pricing. No tier lock-in. No minimum spend.
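
A client or gateway could apply the key conventions above roughly as follows. This is a sketch under stated assumptions: only the `to_dev_` and `to_live_` prefixes and the 24-hour overlap come from the text, while the function names and the choice to treat the end of the window as exclusive are hypothetical.

```python
from datetime import datetime, timedelta, timezone

ROTATION_OVERLAP = timedelta(hours=24)


def key_environment(api_key: str) -> str:
    """Map a per-project API key to its environment by prefix."""
    if api_key.startswith("to_dev_"):
        return "dev"
    if api_key.startswith("to_live_"):
        return "live"
    raise ValueError("unrecognized key prefix")


def old_key_still_valid(rotated_at: datetime, now: datetime) -> bool:
    """True while a rotated-out key is inside the overlap window.

    Assumption: the window starts at rotation time and its end is exclusive.
    Both datetimes should be timezone-aware.
    """
    return rotated_at <= now < rotated_at + ROTATION_OVERLAP
```

During the overlap both the old and the new key are accepted, which is what lets a deployment roll the new key out gradually without downtime.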

See the pricing page

Tokens burnt per day. The single MVP KPI.

We measure ourselves on one number: how many tokens TokenOne® routes for paying customers in a day. Everything else is in service of that.
