Policies

A policy is a rule that maps an agent action to a decision. ProofRail ships with default policies that cover common harm categories out of the box, and lets you add custom policies for organization-specific rules.

The four decision outcomes

Every action goes through policy evaluation and gets one of four outcomes:

Decision	What happens in your code	Visible in dashboard
`allow`	The action proceeds. `record_agent_action` returns normally.	No
`allow_with_flag`	The action proceeds. A flag is attached to the event for review.	Yes — flagged in dashboard
`require_approval`	The action is paused. An email is sent. The call blocks until the approver responds.	Yes — pending approvals
`deny`	The action is blocked. `ActionDeniedError` is raised in your code.	Yes — denied events

In your code, a `require_approval` decision arrives as a `PolicyDecision` object once the approver responds, and a `deny` decision arrives as an `ActionDeniedError` exception.
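The sketch below shows one way calling code might handle these outcomes. It does not import the SDK: `ActionDeniedError`, `PolicyDecision`, and `record_agent_action` are local stand-ins with the same names, and the `outcome` field and the stub's trigger conditions are assumptions made for illustration only.

```python
from dataclasses import dataclass


class ActionDeniedError(Exception):
    """Local stand-in for the SDK exception of the same name."""


@dataclass
class PolicyDecision:
    """Local stand-in. The `outcome` field is hypothetical."""
    outcome: str  # "allow", "allow_with_flag", or "require_approval"


def record_agent_action(action: dict) -> PolicyDecision:
    """Hypothetical stub that mimics the outcomes for demonstration."""
    if action.get("tool") == "delete_prod_db":
        raise ActionDeniedError("production deletion")
    if action.get("amount_usd", 0) >= 1000:
        # The real call would block here until the approver responds.
        return PolicyDecision(outcome="require_approval")
    return PolicyDecision(outcome="allow")


def run_step(action: dict) -> str:
    """Record the action first, and only proceed if it was not denied."""
    try:
        decision = record_agent_action(action)
    except ActionDeniedError as exc:
        return f"blocked: {exc}"
    if decision.outcome == "require_approval":
        return "executed after approval"
    return "executed"
```

Wrapping the call in `try`/`except` keeps the deny path explicit, so the agent can skip the step or report the block instead of crashing.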

Three-stage evaluation

Every action runs through a three-stage pipeline before a decision is returned:

Stage 1 — Risk classification. The action is scored 0-100 based on what it does, what’s in the payload, and which agent is running it. Categories like financial, destructive, communication, exfiltration, privilege_escalation, credential_exposure are assigned where they apply.
Stage 2 — Chain context update. Running totals for the chain are updated with this action’s contribution: financial amounts, external communications, records modified, etc.
Stage 3 — Policy application. Rules are evaluated in priority order: hard denies first, approval triggers second, audit flags third, default allow last. The first matching rule wins.

The full algorithm is in proofrail/policies.py. The same algorithm runs on the backend; parity tests verify that the SDK and backend implementations stay in sync.
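The three stages can be sketched roughly as follows. This is not the code in proofrail/policies.py: `ChainContext`, `Rule`, `classify`, `update_chain`, `apply_policies`, `evaluate`, and the example rules are hypothetical names, and the 0-100 risk score is left out to keep the sketch short. The point it illustrates is the priority order and the first-match-wins behavior.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ChainContext:
    """Running totals for one chain (hypothetical shape)."""
    financial_total_usd: float = 0.0


@dataclass
class Rule:
    name: str
    decision: str  # "deny", "require_approval", or "allow_with_flag"
    matches: Callable[[dict, ChainContext], bool]


# Hard denies first, approval triggers second, audit flags third.
TIER = {"deny": 0, "require_approval": 1, "allow_with_flag": 2}


def classify(action: dict) -> set:
    """Stage 1: assign categories (toy heuristics, assumed for illustration)."""
    categories = set()
    if action.get("amount_usd", 0) > 0:
        categories.add("financial")
    if action.get("method") == "DELETE":
        categories.add("destructive")
    return categories


def update_chain(ctx: ChainContext, action: dict) -> None:
    """Stage 2: add this action's contribution to the chain totals."""
    ctx.financial_total_usd += action.get("amount_usd", 0)


def apply_policies(action: dict, ctx: ChainContext, rules: list) -> tuple:
    """Stage 3: walk rules in tier order; the first match wins."""
    for rule in sorted(rules, key=lambda r: TIER[r.decision]):
        if rule.matches(action, ctx):
            return rule.decision, rule.name
    return "allow", "default"


def evaluate(action: dict, ctx: ChainContext, rules: list) -> tuple:
    action = {**action, "categories": classify(action)}
    update_chain(ctx, action)
    return apply_policies(action, ctx, rules)


# Deliberately listed out of priority order; apply_policies sorts them.
RULES = [
    Rule("production-write", "allow_with_flag",
         lambda a, c: a.get("env") == "production"),
    Rule("cumulative-financial", "require_approval",
         lambda a, c: c.financial_total_usd > 10_000),
    Rule("production-delete", "deny",
         lambda a, c: "destructive" in a["categories"]
         and a.get("env") == "production"),
]
```

In this sketch, two $6,000 actions in the same chain produce `allow` and then `require_approval`, because the second one pushes the running total past $10,000. A later production DELETE in that chain is denied even though the cumulative rule also matches, since denies are checked first.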

Default policies

These ship with ProofRail and cover the categories most agent workflows need to be careful about. You don’t have to configure them — they’re active by default.

Hard denies

Actions that are blocked immediately, with no human-in-the-loop option.

High cumulative exposure — chain-wide financial exposure over $50,000
Production deletions — DELETE operations against production resources
Unauthorized domains — sending data to domains not in the allowlist
Credential exposure — payloads containing patterns matching API keys, passwords, credit card numbers
IAM modifications — changes to permissions, roles, or access control
Impersonation attempts — agents sending communications with a claimed identity different from their declared identity
Restricted data access — accessing data classified as restricted without authorization
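To make the credential exposure check more concrete, here is a minimal sketch of pattern-based detection. The regular expressions, the key prefixes, and the `find_credentials` helper are assumptions chosen for illustration; the patterns ProofRail actually uses are not documented on this page. Card-number candidates are filtered with the standard Luhn checksum to reduce false positives on arbitrary digit runs.

```python
import re

# Hypothetical patterns; real detectors are typically broader than this.
API_KEY_RE = re.compile(r"\b(?:sk|pk|prail)_[A-Za-z0-9]{16,}\b")
PASSWORD_RE = re.compile(r"(?i)\bpassword\s*[:=]\s*\S+")
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_ok(digits: str) -> bool:
    """Standard Luhn checksum: double every second digit from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_credentials(payload: str) -> list:
    """Return the kinds of credential-like content found in the payload."""
    hits = []
    if API_KEY_RE.search(payload):
        hits.append("api_key")
    if PASSWORD_RE.search(payload):
        hits.append("password")
    for match in CARD_RE.finditer(payload):
        digits = re.sub(r"\D", "", match.group())
        if 13 <= len(digits) <= 19 and luhn_ok(digits):
            hits.append("card_number")
            break
    return hits
```

A non-empty result would correspond to the credential_exposure category, which the default policies treat as a hard deny.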

Approval triggers

Actions that pause for human review.

Medium financial transactions — single transactions between $1,000 and $50,000 (threshold configurable)
Cumulative financial threshold — chain cumulative exposure exceeds the configured limit (default $10,000)
First external communication — the first time a chain sends data outside your organization
Bulk operations — actions touching 100+ records
Bulk deletions — deleting 10+ items at once
Irreversible actions — contract signing, legal filings, public posts, financial commitments
High-risk agent actions — actions by agents you’ve marked as high-risk
Out-of-scope access — agents accessing resources outside their declared scope
PII sent externally — content containing detected PII going to external domains
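The numeric triggers in this list can be expressed as simple threshold checks. The sketch below uses the default values stated above; the `Thresholds` class, the `approval_reasons` function, the action field names, and the choice of inclusive boundaries are assumptions, not ProofRail's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Defaults taken from the list above; field names are hypothetical."""
    single_txn_min_usd: float = 1_000
    single_txn_max_usd: float = 50_000
    cumulative_usd: float = 10_000
    bulk_records: int = 100
    bulk_deletes: int = 10


def approval_reasons(action: dict, chain_total_usd: float,
                     t: Thresholds = Thresholds()) -> list:
    """Return every numeric approval trigger this action would hit."""
    reasons = []
    amount = action.get("amount_usd", 0)
    if t.single_txn_min_usd <= amount <= t.single_txn_max_usd:
        reasons.append("medium_financial_transaction")
    # The cumulative check includes the current action's amount.
    if chain_total_usd + amount > t.cumulative_usd:
        reasons.append("cumulative_financial_threshold")
    if action.get("records_touched", 0) >= t.bulk_records:
        reasons.append("bulk_operation")
    if action.get("records_deleted", 0) >= t.bulk_deletes:
        reasons.append("bulk_deletion")
    return reasons
```

Note that a small transaction can still require approval: a $500 payment in a chain that has already spent $9,800 crosses the $10,000 cumulative limit even though it is below the single-transaction range.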

Audit flags

Actions allowed but flagged for review.

Production writes — any write to production resources
External API calls — any call to an external service
New agent first action — the first time a previously unseen agent appears
External communication content — outgoing content to external recipients

Auto-pause conditions

Chains that hit any of these limits are paused automatically and raised for review.

Event count — 1,000 events in a single chain (likely runaway behavior)
Chain duration — 10 minutes of activity without completion
Token budget — cumulative token spend exceeds the configured budget
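A minimal sketch of how these three limits might be checked on each event. `ChainStats` and `pause_reason` are hypothetical names, and measuring duration as time elapsed since the chain started is an assumption about what "10 minutes of activity" means.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChainStats:
    """Per-chain counters (hypothetical shape)."""
    event_count: int
    started_at: float  # seconds, e.g. from time.time()
    tokens_spent: int


def pause_reason(stats: ChainStats, now: float, token_budget: int,
                 max_events: int = 1000,
                 max_duration_s: float = 600.0) -> Optional[str]:
    """Return the first auto-pause limit the chain has hit, or None."""
    if stats.event_count >= max_events:
        return "event_count"
    if now - stats.started_at >= max_duration_s:
        return "chain_duration"
    if stats.tokens_spent > token_budget:
        return "token_budget"
    return None
```

Returning a reason string rather than a boolean lets the reviewer see which limit caused the pause.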

Configuring thresholds

Many default policies use thresholds you can tune in proofrail.init():

import proofrail

proofrail.init(
    api_key="prail_...",

    # Financial thresholds
    financial_approval_threshold_usd=5000,
    cumulative_financial_threshold_usd=10000,

    # Domain allowlist (external comms)
    external_domains_allowlist=["clients.com", "vendor.com"],

    # High-risk agents (always require approval)
    high_risk_agents=["payment-agent", "deploy-agent"],

    # Approval workflow
    default_approval_timeout_hours=24,
    fallback_approvers=["lead@company.com"],
)

See Configuration Reference for the complete parameter list.

Custom policies

Beyond the defaults, you can define organization-specific rules in the dashboard at app.proofrail.dev. Custom policies match on event attributes (event type, tool name, agent name, chain metadata) and apply the decision you choose. The policy engine then enforces them automatically on every event.

Custom policies support the same four decision outcomes as default policies and can be scoped to specific chains, environments, or agents.

For policy authoring details (match syntax, rule priority, scope rules), see Policy Configuration.
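Conceptually, attribute matching works something like the sketch below. This is not ProofRail's match syntax, which is covered in Policy Configuration. The dictionary layout, the attribute keys, and the `matches` and `first_decision` helpers are all hypothetical, and the sketch supports only exact equality.

```python
def matches(policy_match: dict, event: dict) -> bool:
    """True when every attribute in the match block equals the event's value."""
    return all(event.get(key) == expected
               for key, expected in policy_match.items())


def first_decision(policies: list, event: dict, default: str = "allow") -> str:
    """Return the decision of the first policy whose match block fits."""
    for policy in policies:
        if matches(policy["match"], event):
            return policy["decision"]
    return default


# Two illustrative policies, listed in priority order.
POLICIES = [
    {"match": {"tool_name": "send_email", "agent_name": "outreach-agent"},
     "decision": "require_approval"},
    {"match": {"event_type": "tool_call"},
     "decision": "allow_with_flag"},
]
```

An event that satisfies both policies gets the decision of the one listed first, which mirrors the first-match-wins rule described under Three-stage evaluation.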

Policy modes: enforce, shadow, disabled

Every policy (default or custom) can run in one of three modes:

enforce — Decisions are applied. This is what you usually want.
shadow — The policy evaluates normally, but the decision is logged without being applied. The SDK always returns allow to the agent. Useful for testing new policies against real traffic before turning them on.
disabled — The policy is skipped entirely.

New custom policies default to shadow mode so you can calibrate before enforcement. Default policies ship in enforce mode.
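The difference between the three modes can be sketched for a single policy as follows. The `decide` function, the `shadow_log` list, and the example policy are hypothetical; the sketch also assumes that when the only policy is disabled, the default allow applies.

```python
from typing import Callable


def decide(mode: str, policy: Callable[[dict], str],
           action: dict, shadow_log: list) -> str:
    """Return the decision the agent sees for one policy in the given mode."""
    if mode == "disabled":
        # The policy is skipped entirely: it is never evaluated.
        return "allow"
    decision = policy(action)
    if mode == "shadow":
        # Evaluate and log, but do not apply the decision.
        shadow_log.append((decision, action))
        return "allow"
    if mode == "enforce":
        return decision
    raise ValueError(f"unknown policy mode: {mode!r}")


def deny_big_refunds(action: dict) -> str:
    """Illustrative custom policy: block refunds over $500."""
    return "deny" if action.get("amount_usd", 0) > 500 else "allow"
```

Running a new policy in shadow mode and reviewing the logged decisions shows how often it would have fired before any agent is actually blocked.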

Where to go next

Audit receipts

Tamper-evident records of every chain.

Human approval guide

How approvals work end-to-end.

Kill switch

Halting all agent activity in an emergency.

Configuration

Every threshold and parameter, explained.
