AI Token Usage
Tokens per workflow. Tokens per model. Live.
P402 records input tokens, output tokens, cost, latency, cache hit, and retry waste for every AI call, attributed to a workflow, customer, feature, or employee.
For platform engineers and SRE teams shipping AI-backed features who need per-workflow token visibility without bolting on a vendor SDK.
The problem
Token usage is invisible until the bill lands.
Each provider exposes its own usage UI, at its own granularity, on its own delay. The dashboards do not federate. Engineers ship features without knowing the per-call cost.
A retry storm in one workflow can quietly burn through a department's budget. The signal lives in the application logs nobody reads, not in the cost ledger finance reviews.
What P402 does
One ledger. Owner, budget, policy, outcome, evidence.
input_tokens, output_tokens, total_tokens, cost_usd, latency_ms, and cache_hit are bound to every event. Filter, group, export.
retry_cost_usd and context_waste_usd are first-class fields. The retry storm shows up as a row in Optimize, not on your SRE pager.
Semantic cache hits are recorded with cache_savings_usd. Finance sees the savings, not just the cost.
OpenAI, Anthropic, Gemini, Bedrock, OpenRouter: one schema. ai_economic_events has the same shape regardless of who served the request.
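As a rough illustration of what "one schema" could look like in practice, here is a minimal Python sketch of a single event row and a group-by over it. The field names come from the list above; the EconomicEvent class, the provider field, the cost_by_workflow helper, and the rule that total_tokens equals input plus output are assumptions made for illustration, not P402's actual SDK or table definition.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable


@dataclass
class EconomicEvent:
    """Hypothetical in-memory mirror of one ai_economic_events row."""
    provider: str             # assumed field, e.g. "openai" or "anthropic"
    workflow_id: str
    input_tokens: int
    output_tokens: int
    cost_usd: float
    latency_ms: int
    cache_hit: bool = False
    retry_cost_usd: float = 0.0
    context_waste_usd: float = 0.0
    cache_savings_usd: float = 0.0
    metadata: Dict[str, Any] = field(default_factory=dict)

    @property
    def total_tokens(self) -> int:
        # Assumption: total is simply input plus output.
        return self.input_tokens + self.output_tokens


def cost_by_workflow(events: Iterable[EconomicEvent]) -> Dict[str, float]:
    """Sum cost_usd per workflow_id, regardless of which provider served it."""
    totals: Dict[str, float] = defaultdict(float)
    for event in events:
        totals[event.workflow_id] += event.cost_usd
    return dict(totals)
```

Because every row has the same shape, the grouping code never branches on the provider; the same function works for any attribution key by swapping workflow_id for customer_id or feature_id.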
Proof
Per-event usage and economics columns. Not 3. Not 60.
Events land in the ledger in seconds, not on the next billing cycle.
ai_economic_events is the canonical row. Every dashboard reads it.
Questions
AI token usage: FAQ
How does this differ from LLM observability tools?
Observability tools record traces of prompts, responses, latency, and errors. P402 records the economic event around each call (owner, budget, policy decision, retry waste, cache savings) in a finance-ready ledger.
Do we have to send prompts to P402?
No. Meter-only mode persists token counts and economics with zero prompt content. It is the default for regulated workflows.
Can we send our own custom attribution fields?
Yes. The standard attribution fields are action_type, task_type, workflow_id, project_id, feature_id, customer_id, employee_id, and department_id. The metadata field is a JSONB column for anything custom.
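To make the split between standard and custom fields concrete, the sketch below routes any non-standard key into a metadata dictionary. The standard field names are taken from the answer above; the split_attribution helper and the example keys (experiment_arm, region) are hypothetical and only illustrate one way a caller might prepare a payload.

```python
from typing import Any, Dict

# Standard attribution fields named in the answer above.
STANDARD_FIELDS = {
    "action_type", "task_type", "workflow_id", "project_id",
    "feature_id", "customer_id", "employee_id", "department_id",
}


def split_attribution(fields: Dict[str, Any]) -> Dict[str, Any]:
    """Keep standard fields at the top level; move everything else to metadata."""
    standard = {k: v for k, v in fields.items() if k in STANDARD_FIELDS}
    custom = {k: v for k, v in fields.items() if k not in STANDARD_FIELDS}
    return {**standard, "metadata": custom}
```

For example, passing workflow_id, customer_id, and a custom experiment_arm key would leave the first two at the top level and place experiment_arm under metadata.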
Does P402 support streaming responses?
Yes. Output tokens are accumulated through the stream; the economic event is finalized when the stream closes.
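The accumulate-then-finalize behavior described above can be sketched as a generator wrapper. This is an illustration under stated assumptions, not P402's implementation: the metered_stream function, the (text, token_count) chunk shape, and the on_final callback are all hypothetical.

```python
from typing import Callable, Dict, Iterable, Iterator, Tuple


def metered_stream(
    chunks: Iterable[Tuple[str, int]],
    on_final: Callable[[Dict[str, int]], None],
    input_tokens: int,
) -> Iterator[str]:
    """Pass chunks through while counting output tokens; emit one event at close."""
    output_tokens = 0
    try:
        for text, n_tokens in chunks:
            output_tokens += n_tokens  # accumulate as the stream progresses
            yield text
    finally:
        # Runs when the stream is exhausted or the consumer closes it early,
        # so a partial stream still produces an event with the tokens seen so far.
        on_final({
            "input_tokens": input_tokens,
            "output_tokens": output_tokens,
            "total_tokens": input_tokens + output_tokens,
        })
```

Finalizing in the finally block is a deliberate choice in this sketch: a client that disconnects midway still yields exactly one event, counting only the tokens actually streamed.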
How do we filter by retry waste?
The /dashboard/optimize surface ranks workflows by retry_cost_usd + context_waste_usd. Click through to a workflow to see the contributing events.
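The same ranking can be approximated over exported events. The sketch below assumes events are available as plain dictionaries carrying workflow_id, retry_cost_usd, and context_waste_usd; the rank_by_waste helper is hypothetical and is not the query behind /dashboard/optimize.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Tuple


def rank_by_waste(events: Iterable[Dict[str, Any]]) -> List[Tuple[str, float]]:
    """Rank workflows by summed retry_cost_usd + context_waste_usd, highest first."""
    waste: Dict[str, float] = defaultdict(float)
    for event in events:
        waste[event["workflow_id"]] += (
            event["retry_cost_usd"] + event["context_waste_usd"]
        )
    return sorted(waste.items(), key=lambda item: item[1], reverse=True)
```

Filtering the original events by the top-ranked workflow_id then gives the contributing rows for that workflow.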