Policies

A policy is a rule that maps an agent action to a decision. ProofRail ships with default policies that cover common harm categories out of the box, and lets you add custom policies for organization-specific rules.

The four decision outcomes

Every action goes through policy evaluation and gets one of four outcomes:
| Decision | What happens in your code | Visible in dashboard |
|---|---|---|
| allow | The action proceeds. record_agent_action returns normally. | No |
| allow_with_flag | The action proceeds. A flag is attached to the event for review. | Yes (flagged in dashboard) |
| require_approval | The action is paused. An email is sent. The call blocks until the approver responds. | Yes (pending approvals) |
| deny | The action is blocked. ActionDeniedError is raised in your code. | Yes (denied events) |

The decision arrives as a PolicyDecision object when require_approval resolves or as an ActionDeniedError exception when an action is denied.
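
A minimal sketch of how calling code might handle these outcomes. The source names record_agent_action, PolicyDecision, and ActionDeniedError but does not show their signatures, so the classes and function below are local stand-ins, and the `outcome` and `reason` fields are assumptions made for illustration.

```python
from dataclasses import dataclass


class ActionDeniedError(Exception):
    """Stand-in for the SDK exception raised on a deny decision."""


@dataclass
class PolicyDecision:
    """Stand-in decision object; the field names are assumptions."""
    outcome: str
    reason: str = ""


def record_agent_action(action: dict) -> PolicyDecision:
    """Toy stand-in for the SDK call: denies deletes, flags writes, allows the rest."""
    if action["type"] == "delete":
        raise ActionDeniedError("production deletion")
    if action["type"] == "write":
        return PolicyDecision("allow_with_flag", "production write")
    return PolicyDecision("allow")


def run_step(action: dict) -> str:
    """Proceed on allow or allow_with_flag; stop cleanly on deny."""
    try:
        decision = record_agent_action(action)
    except ActionDeniedError as exc:
        return f"blocked: {exc}"
    return f"proceeded: {decision.outcome}"
```

Catching the exception at the step boundary lets the agent stop or choose another path without crashing the whole chain.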

Three-stage evaluation

Every action runs through a three-stage pipeline before a decision is returned:

Stage 1: Risk classification. The action is scored 0-100 based on what it does, what’s in the payload, and which agent is running it. Categories like financial, destructive, communication, exfiltration, privilege_escalation, credential_exposure are assigned where they apply.

Stage 2: Chain context update. Running totals for the chain are updated with this action’s contribution: financial amounts, external communications, records modified, etc.

Stage 3: Policy application. Rules are evaluated in priority order: hard denies first, approval triggers second, audit flags third, default allow last. The first matching rule wins.

The full algorithm is in proofrail/policies.py. The same algorithm runs on the backend; parity tests verify they stay in sync.
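
The three stages can be sketched roughly as follows. This is an illustrative toy, not the algorithm in proofrail/policies.py: the scoring weights, the action fields (`amount_usd`, `operation`, `env`), and the rule set are all assumptions chosen to show the first-match-wins ordering.

```python
from dataclasses import dataclass


@dataclass
class ChainContext:
    """Running totals for one chain (stage 2)."""
    financial_total_usd: float = 0.0
    events: int = 0


def classify(action: dict) -> tuple[int, set[str]]:
    """Stage 1: toy risk score (0-100) plus categories."""
    score, categories = 10, set()
    if action.get("amount_usd", 0) > 0:
        categories.add("financial")
        score += 40
    if action.get("operation") == "delete":
        categories.add("destructive")
        score += 40
    return min(score, 100), categories


# Stage 3 rules, listed in priority order: hard denies, approval triggers, audit flags.
RULES = [
    ("deny", lambda a, cats, ctx: a.get("amount_usd", 0) > 50_000),
    ("deny", lambda a, cats, ctx: "destructive" in cats and a.get("env") == "production"),
    ("require_approval", lambda a, cats, ctx: ctx.financial_total_usd > 10_000),
    ("allow_with_flag", lambda a, cats, ctx: a.get("operation") == "write" and a.get("env") == "production"),
]


def evaluate(action: dict, ctx: ChainContext) -> str:
    _score, categories = classify(action)                    # stage 1
    ctx.financial_total_usd += action.get("amount_usd", 0)   # stage 2
    ctx.events += 1
    for outcome, matches in RULES:                           # stage 3: first match wins
        if matches(action, categories, ctx):
            return outcome
    return "allow"                                           # default allow comes last
```

Because the chain context is updated before the rules run, a transaction that is small on its own can still trigger approval once the chain total crosses the cumulative limit.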

Default policies

These ship with ProofRail and cover the categories most agent workflows need to be careful about. You don’t have to configure them — they’re active by default.

Hard denies

Actions that are blocked immediately, with no human-in-the-loop option.
  • High-value transactions — single transactions over $50,000
  • Production deletions — DELETE operations against production resources
  • Unauthorized domains — sending data to domains not in the allowlist
  • Credential exposure — payloads containing patterns matching API keys, passwords, credit card numbers
  • IAM modifications — changes to permissions, roles, or access control
  • Impersonation attempts — agents sending communications with a claimed identity different from their declared identity
  • Restricted data access — accessing data classified as restricted without authorization
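
To make two of these checks concrete, here is a hedged sketch of credential-pattern detection and a domain allowlist test. The regular expressions are illustrative only (the real detectors are not documented here), and treating subdomains of an allowlisted domain as allowed is an assumption.

```python
import re
from urllib.parse import urlparse

# Illustrative patterns only; not the patterns ProofRail actually uses.
CREDENTIAL_PATTERNS = [
    re.compile(r"\b(?:sk|pk|prail)_[A-Za-z0-9]{16,}\b"),  # API-key-like tokens
    re.compile(r"(?i)\bpassword\s*[:=]\s*\S+"),           # inline passwords
    re.compile(r"\b(?:\d[ -]?){15}\d\b"),                 # 16-digit card-like numbers
]


def contains_credentials(payload: str) -> bool:
    """True if any credential-like pattern appears in the payload."""
    return any(p.search(payload) for p in CREDENTIAL_PATTERNS)


def domain_allowed(url: str, allowlist: list[str]) -> bool:
    """True if the URL's host is an allowlisted domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in allowlist)
```

Matching on `"." + d` rather than a bare suffix keeps a lookalike host such as evilvendor.com from passing as vendor.com.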

Approval triggers

Actions that pause for human review.
  • Medium financial transactions — single transactions between $1,000 and $50,000 (threshold configurable)
  • Cumulative financial threshold — chain cumulative exposure exceeds the configured limit (default $10,000)
  • First external communication — the first time a chain sends data outside your organization
  • Bulk operations — actions touching 100+ records
  • Bulk deletions — deleting 10+ items at once
  • Irreversible actions — contract signing, legal filings, public posts, financial commitments
  • High-risk agent actions — actions by agents you’ve marked as high-risk
  • Out-of-scope access — agents accessing resources outside their declared scope
  • PII sent externally — content containing detected PII going to external domains
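
As a worked example of how the single-transaction and cumulative financial triggers interact, consider the sketch below. It assumes the bounds are inclusive at $1,000 and $50,000, and that the chain total passed in already includes the current transaction; neither detail is stated explicitly in the source.

```python
def financial_decision(
    amount_usd: float,
    chain_total_usd: float,
    approval_threshold_usd: float = 1_000,
    deny_threshold_usd: float = 50_000,
    cumulative_threshold_usd: float = 10_000,
) -> str:
    """Decide on one transaction; chain_total_usd includes this transaction."""
    if amount_usd > deny_threshold_usd:
        return "deny"                      # hard deny: single transaction over the cap
    if amount_usd >= approval_threshold_usd:
        return "require_approval"          # medium single transaction
    if chain_total_usd > cumulative_threshold_usd:
        return "require_approval"          # chain exposure over the limit
    return "allow"


# Twelve $900 payments: each is under the single-transaction threshold,
# but the twelfth brings the chain total to $10,800.
total = 0
decisions = []
for amount in [900] * 12:
    total += amount
    decisions.append(financial_decision(amount, total))
```

In this run the first eleven payments are allowed and the twelfth pauses for approval, because only the cumulative rule catches a series of small transactions.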

Audit flags

Actions allowed but flagged for review.
  • Production writes — any write to production resources
  • External API calls — any call to an external service
  • New agent first action — the first time a previously unseen agent appears
  • External communication content — outgoing content to external recipients

Auto-pause conditions

Chains that hit any of these limits are paused automatically and raised for review.
  • Event count — 1,000 events in a single chain (likely runaway behavior)
  • Chain duration — 10 minutes of activity without completion
  • Token budget — cumulative token spend exceeds the configured budget
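
A rough sketch of how these limits could be checked against a chain's running statistics. The `ChainStats` fields and the reason strings are hypothetical, and whether each comparison is inclusive is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChainStats:
    """Hypothetical per-chain counters."""
    started_at: float   # seconds, same clock as `now` below
    events: int = 0
    tokens: int = 0


def pause_reason(
    stats: ChainStats,
    now: float,
    max_events: int = 1_000,
    max_seconds: float = 600,
    token_budget: Optional[int] = None,
) -> Optional[str]:
    """Return the first limit the chain has reached, or None if it may continue."""
    if stats.events >= max_events:
        return "event_count"
    if now - stats.started_at >= max_seconds:
        return "chain_duration"
    if token_budget is not None and stats.tokens > token_budget:
        return "token_budget"
    return None
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the check easy to test.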

Configuring thresholds

Many default policies use thresholds you can tune in proofrail.init():
import proofrail

proofrail.init(
    api_key="prail_...",

    # Financial thresholds
    financial_approval_threshold_usd=5000,
    cumulative_financial_threshold_usd=10000,

    # Domain allowlist (external comms)
    external_domains_allowlist=["clients.com", "vendor.com"],

    # High-risk agents (always require approval)
    high_risk_agents=["payment-agent", "deploy-agent"],

    # Approval workflow
    default_approval_timeout_hours=24,
    fallback_approvers=["lead@company.com"],
)
See Configuration Reference for the complete parameter list.

Custom policies

Beyond the defaults, you can define organization-specific rules. Custom policies match on event attributes (event type, tool name, agent name, chain metadata) and apply the action you choose.

Custom policies are configured in the dashboard at app.proofrail.dev and then enforced automatically by the policy engine on every event. They support the same four decision outcomes as default policies and can be scoped to specific chains, environments, or agents.

For policy authoring details (match syntax, rule priority, scope rules), see Policy Configuration.
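
Conceptually, a custom policy pairs a match on event attributes with a scope and an action. The sketch below illustrates that idea only: the dictionary layout, the glob-style patterns, and the field names are hypothetical, since the actual match syntax is covered in Policy Configuration and is not shown here.

```python
from fnmatch import fnmatchcase

# Hypothetical policy shape, for illustration only.
POLICY = {
    "name": "review-crm-exports",
    "match": {"event_type": "tool_call", "tool_name": "crm.export*", "agent_name": "*"},
    "scope": {"environment": "production"},
    "action": "require_approval",
}


def policy_matches(policy: dict, event: dict) -> bool:
    """True if the event is in scope and every match pattern fits its attribute."""
    in_scope = all(event.get(key) == value for key, value in policy["scope"].items())
    attrs_match = all(
        fnmatchcase(str(event.get(key, "")), pattern)
        for key, pattern in policy["match"].items()
    )
    return in_scope and attrs_match


event = {
    "event_type": "tool_call",
    "tool_name": "crm.export_contacts",
    "agent_name": "sales-agent",
    "environment": "production",
}
decision = POLICY["action"] if policy_matches(POLICY, event) else "allow"
```

Here the event is a production tool call whose name fits the `crm.export*` pattern, so the policy's action applies; the same event in another environment would fall outside the scope.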

Policy modes: enforce, shadow, disabled

Every policy (default or custom) can run in one of three modes:
  • enforce — Decisions are applied. This is what you usually want.
  • shadow — The policy evaluates normally, but the decision is logged without being applied. The SDK always returns allow to the agent. Useful for testing new policies against real traffic before turning them on.
  • disabled — The policy is skipped entirely.
New custom policies default to shadow mode so you can calibrate before enforcement. Default policies ship in enforce mode.
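
The difference between the three modes can be sketched as below. The function and its arguments are hypothetical, and it models a single policy in isolation: it returns what the agent would see if this were the only policy in play.

```python
from typing import Callable


def effective_decision(
    mode: str,
    evaluate: Callable[[dict], str],
    event: dict,
    shadow_log: list,
) -> str:
    """Decision the agent sees for one policy running in the given mode."""
    if mode == "disabled":
        return "allow"                 # skipped entirely: evaluate() is never called
    decision = evaluate(event)
    if mode == "shadow":
        shadow_log.append(decision)    # logged for calibration, not applied
        return "allow"
    if mode == "enforce":
        return decision                # applied as evaluated
    raise ValueError(f"unknown mode: {mode}")
```

Comparing the shadow log with what actually happened is one way to see how often a new policy would have fired before switching it to enforce.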

Where to go next

Audit receipts

Tamper-evident records of every chain.

Human approval guide

How approvals work end-to-end.

Kill switch

Halting all agent activity in an emergency.

Configuration

Every threshold and parameter, explained.