
Cost tracking

ProofRail tracks token usage and estimated cost on every LLM call your agents make. The tracking is automatic when your framework adapter reports tokens; you can also report tokens manually. Costs accumulate per chain, per agent, and per organization, with email alerts when you approach your monthly budget.

What gets tracked

For every LLM call recorded as a governance event, ProofRail captures:
  • Input tokens — prompt tokens sent to the model
  • Output tokens — completion tokens generated by the model
  • Model name — which model was called (e.g., gpt-4o, claude-sonnet-4-20250514)
  • Estimated cost in USD — computed from the backend’s pricing table
The cost computation happens server-side at event ingestion time. Once the event is stored, the cost is locked in even if pricing tables change later.

Supported models

Pricing is built in for major model families. The backend currently prices:
  • OpenAI: GPT-4o, GPT-4o-mini, GPT-4 Turbo, GPT-4, GPT-3.5 Turbo, and the o1/o3 reasoning models
  • Anthropic: Claude 4 Opus and Sonnet (May 2025), Claude Haiku 4.5, plus the Claude 3.5 and Claude 3 families
  • Google: Gemini 2.0 Pro, 2.0 Flash, 2.0 Flash Lite, 1.5 Pro, 1.5 Flash, 1.5 Flash 8B
  • Open-weight: Llama 3.3 70B, Llama 3.1 8B, Mixtral 8x7B
The exact model strings priced are in the backend pricing config and updated as providers ship new models. If your agent uses a model not in the pricing table, the event is still recorded — input_tokens and output_tokens are captured for your own analysis, but estimated_cost_usd will be 0.
Estimated cost is exactly that — an estimate. ProofRail’s pricing table tracks published per-token rates. Actual costs from your LLM provider may differ slightly due to volume discounts, prompt caching, batch pricing, or other arrangements you have with the provider. Use ProofRail cost data for governance decisions and trend analysis, not as an authoritative billing figure.
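To make the estimate concrete, here is a minimal sketch of how a per-token pricing lookup could work, including the zero-cost fallback for models missing from the table. The model names, the rates, and the estimate_cost_usd function are hypothetical placeholders for illustration; they are not ProofRail's pricing config or backend code.

```python
from decimal import Decimal

# Hypothetical rates in USD per one million tokens. Placeholders, not real prices.
PRICING = {
    "example-model-large": {"input": Decimal("3.00"), "output": Decimal("15.00")},
    "example-model-small": {"input": Decimal("0.25"), "output": Decimal("1.25")},
}

MILLION = Decimal(1_000_000)


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate cost from token counts; return 0 for models not in the table."""
    rates = PRICING.get(model)
    if rates is None:
        # Unpriced model: tokens are still known, but no cost can be computed.
        return Decimal("0")
    input_cost = Decimal(input_tokens) * rates["input"]
    output_cost = Decimal(output_tokens) * rates["output"]
    return (input_cost + output_cost) / MILLION
```

Using Decimal rather than float avoids rounding drift when many small per-call costs are summed.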

Where token data comes from

The LangChain, LangGraph, and MCP adapters automatically capture token data when the underlying framework reports it (most commonly via the model’s response metadata). For the LangChain adapter specifically, on_llm_end callbacks include token_usage from the LLM response, which the adapter captures and attaches to the event.
For manual instrumentation, pass the model name and token counts in metadata:
await chain.record_agent_action(
    agent_name="research-agent",
    action_type="llm_call",
    action_name="anthropic.messages.create",
    payload={"messages": [...]},
    metadata={
        "model": "claude-sonnet-4-20250514",
        "input_tokens": 1234,
        "output_tokens": 567,
    },
)
The backend reads model, input_tokens, and output_tokens from the metadata and computes cost. The computed value comes back on the returned PolicyDecision as estimated_cost_usd.
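When reporting tokens manually, the counts usually come from the provider's response. The helper below is a sketch, not part of the ProofRail SDK: build_cost_metadata is a hypothetical name, and it assumes the usage object has already been converted to a plain dict. It accepts either input_tokens/output_tokens (as in Anthropic Messages responses) or prompt_tokens/completion_tokens (as in OpenAI chat completions) and produces the three metadata keys described above.

```python
def build_cost_metadata(model: str, usage: dict) -> dict:
    """Map a provider usage dict onto model / input_tokens / output_tokens."""
    # Prefer input_tokens/output_tokens; fall back to the
    # prompt_tokens/completion_tokens naming used by chat completions.
    input_tokens = usage.get("input_tokens", usage.get("prompt_tokens"))
    output_tokens = usage.get("output_tokens", usage.get("completion_tokens"))
    if input_tokens is None or output_tokens is None:
        raise ValueError("usage dict has no recognizable token counts")
    return {
        "model": model,
        "input_tokens": int(input_tokens),
        "output_tokens": int(output_tokens),
    }
```

The returned dict can be passed as the metadata argument of record_agent_action.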

Per-chain cost visibility

Open any chain in the dashboard and the per-chain total cost appears in the metrics card. For chains that span many LLM calls, this is the fastest way to spot expensive workflows. A chain that costs $0.50 to complete is fine; one that costs $5 is something to investigate.

Daily and monthly aggregates

The dashboard’s /dashboard/cost page shows cost aggregates:
  • Today — running total for the current day
  • This week — running total for the calendar week
  • This month — running total for the calendar month
  • Trend over recent days
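If you export event data and want to reproduce these running totals yourself, the sketch below shows one way to bucket costs. It is an illustration under stated assumptions: cost_aggregates is a hypothetical helper, events are assumed to be (date, cost) pairs, and the calendar week is assumed to start on Monday, which may not match the dashboard's definition.

```python
from datetime import date, timedelta
from decimal import Decimal


def cost_aggregates(events, today: date) -> dict:
    """Sum (day, cost_usd) pairs into today / this_week / this_month totals."""
    week_start = today - timedelta(days=today.weekday())  # Monday (assumption)
    month_start = today.replace(day=1)
    totals = {
        "today": Decimal("0"),
        "this_week": Decimal("0"),
        "this_month": Decimal("0"),
    }
    for day, cost in events:
        if day > today:
            continue  # ignore anything dated in the future
        if day == today:
            totals["today"] += cost
        if day >= week_start:
            totals["this_week"] += cost
        if day >= month_start:
            totals["this_month"] += cost
    return totals
```

Note that a week can straddle a month boundary, so this_week is not always a subset of this_month.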

Setting a monthly budget

On the same /dashboard/cost page, organization admins can set a monthly LLM budget in the Budget Settings card. Enter a dollar amount and click Save; clear it to remove the cap. Once set, ProofRail tracks your organization’s monthly LLM spend against the budget and sends email alerts:
  • 80% of budget — heads-up to your org admins
  • 100% of budget — budget exceeded alert
When the budget is exceeded, ProofRail triggers a require_approval policy on new chains — workflows pause for human review until the budget is raised, cleared, or the next month rolls over. (See Limitations for the current single-mode enforcement constraint.) Only admins can change the budget. Non-admin users see the current cap in read-only form.
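The threshold behavior described above can be sketched as a small classification function. This is an illustrative model only: budget_status and its return values are hypothetical names, and treating spend exactly equal to the budget as exceeded is an assumption.

```python
from decimal import Decimal
from typing import Optional

WARNING_FRACTION = Decimal("0.8")  # the 80% heads-up threshold


def budget_status(month_spend: Decimal, budget: Optional[Decimal]) -> str:
    """Classify monthly spend against an optional budget cap."""
    if budget is None:
        return "no_budget"  # cap cleared: nothing to enforce
    if month_spend >= budget:
        return "exceeded"  # alert sent; new chains would require approval
    if month_spend >= budget * WARNING_FRACTION:
        return "warning"  # heads-up alert to org admins
    return "ok"
```

For a $100 budget, spend of $80 falls in the warning band and spend of $100 or more is classified as exceeded.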

Cost in receipts

Every chain receipt includes the cumulative cost as part of cumulative_metrics. This means months later, when you verify a receipt, the cost data is part of the signed payload. Tampering with cost figures breaks the signature. Useful for accounting, chargeback to teams, and compliance reporting.
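To illustrate why editing a cost figure invalidates a receipt, the sketch below signs a payload and then checks a modified copy. The receipt signing scheme is not described on this page, so HMAC-SHA256 over canonical JSON is used purely as a stand-in; the key, the chain_id field, and the layout inside cumulative_metrics are hypothetical.

```python
import hashlib
import hmac
import json


def sign_receipt(payload: dict, key: bytes) -> str:
    """Sign a canonical JSON encoding of the payload (stand-in scheme)."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hmac.new(key, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_receipt(payload: dict, signature: str, key: bytes) -> bool:
    """Recompute the signature and compare in constant time."""
    return hmac.compare_digest(sign_receipt(payload, key), signature)


key = b"demo-key-not-a-real-secret"  # hypothetical key for the example
receipt = {
    "chain_id": "chain-123",  # hypothetical field
    "cumulative_metrics": {"estimated_cost_usd": "0.42"},
}
signature = sign_receipt(receipt, key)

# Same receipt with the cost figure altered after signing.
tampered = {
    "chain_id": "chain-123",
    "cumulative_metrics": {"estimated_cost_usd": "0.04"},
}
```

Verifying receipt against signature succeeds, while verifying tampered against the same signature fails, because the cost value is part of the signed bytes.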

Cost as a governance signal

Cost data isn’t just for budgeting; it’s a governance signal too. Cost tracking catches several kinds of problems that other governance checks can miss:
  • Runaway agents: a workflow that should cost $0.10 suddenly costs $5, which indicates the agent is in an unintended loop. Visible immediately on the cost dashboard.
  • Prompt injection: an attack that hijacks an agent to do work for the attacker shows up as anomalous cost patterns.
  • Misconfigured prompts: a prompt change that suddenly increases output tokens 10x shows up as a cost spike.
The cost trend view on the /dashboard/cost page is the fastest place to notice these. If your normal daily cost is $5 and you see $50 today, something needs investigation.
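If you want to automate this kind of check on exported cost data, a simple baseline comparison is often enough. The sketch below is one possible heuristic, not a ProofRail feature: is_cost_spike is a hypothetical helper, and both the median baseline and the 5x default factor are assumptions to tune for your own workloads.

```python
from statistics import median


def is_cost_spike(recent_daily_costs, today_cost, factor=5.0):
    """Flag today's cost when it is at least `factor` times the recent median."""
    if not recent_daily_costs:
        return False  # no history, so no baseline to compare against
    baseline = median(recent_daily_costs)
    if baseline <= 0:
        # Normally zero spend: treat any positive cost as worth a look.
        return today_cost > 0
    return today_cost >= factor * baseline
```

The median is used rather than the mean so that a single earlier spike in the history does not inflate the baseline and hide the next one.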

Where to go next

Policies

Authoring custom rules.

Audit receipts

How cost data gets recorded in tamper-evident form.

Kill switch

Halting agent activity when something goes wrong.

Configuration

Per-action-class fail modes, sanitization, fast-path tuning.