Policies
A policy is a rule that maps an agent action to a decision. ProofRail ships with default policies that cover common harm categories out of the box, and lets you add custom policies for organization-specific rules.
The four decision outcomes
Every action goes through policy evaluation and gets one of four outcomes:

| Decision | What happens in your code | Visible in dashboard |
|---|---|---|
| allow | The action proceeds. record_agent_action returns normally. | No |
| allow_with_flag | The action proceeds. A flag is attached to the event for review. | Yes (flagged in dashboard) |
| require_approval | The action is paused. An email is sent. The call blocks until the approver responds. | Yes (pending approvals) |
| deny | The action is blocked. ActionDeniedError is raised in your code. | Yes (denied events) |

Your code receives the outcome as a PolicyDecision object when require_approval resolves, or as an ActionDeniedError exception when an action is denied.
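A minimal sketch of how calling code might handle these outcomes. It does not import the ProofRail SDK: `record_agent_action`, `PolicyDecision`, and `ActionDeniedError` are local stand-ins that reuse the names above, and their signatures and fields (including `approved`) are assumptions made so the example runs offline. The page does not describe what a rejected approval returns, so that branch is illustrative only.

```python
from dataclasses import dataclass


class ActionDeniedError(Exception):
    """Local stand-in for the exception named in the table (assumption)."""


@dataclass
class PolicyDecision:
    """Local stand-in; the real object's fields are not documented here."""
    outcome: str           # "allow", "allow_with_flag", or "require_approval"
    approved: bool = True  # hypothetical field: did the approver say yes?


def record_agent_action(action: dict) -> PolicyDecision:
    """Stub that imitates the four outcomes via a 'simulate' key."""
    outcome = action.get("simulate", "allow")
    if outcome == "deny":
        raise ActionDeniedError(f"blocked: {action['name']}")
    # A real require_approval call would block here until the approver responds.
    return PolicyDecision(outcome=outcome)


def run_step(action: dict) -> str:
    """Perform the action only if policy evaluation lets it through."""
    try:
        decision = record_agent_action(action)
    except ActionDeniedError:
        return "skipped"   # denied: do not perform the action
    if decision.outcome == "require_approval" and not decision.approved:
        return "rejected"  # hypothetical: approver declined
    return "executed"      # allow, allow_with_flag, or approved
```

Wrapping the call in `try`/`except` keeps a denied action from crashing the agent loop while still making sure the blocked step is never performed.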
Three-stage evaluation
Every action runs through a three-stage pipeline before a decision is returned:
Stage 1 — Risk classification. The action is scored 0-100 based on what it does, what’s in the payload, and which agent is running it. Categories like financial, destructive, communication, exfiltration, privilege_escalation, and credential_exposure are assigned where they apply.
Stage 2 — Chain context update. Running totals for the chain are updated with this action’s contribution: financial amounts, external communications, records modified, etc.
Stage 3 — Policy application. Rules are evaluated in priority order: hard denies first, approval triggers second, audit flags third, default allow last. The first matching rule wins.
The full algorithm is in proofrail/policies.py. The same algorithm runs on the backend; parity tests verify that the two implementations stay in sync.
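The three stages can be illustrated with a toy pipeline. This is a sketch under stated assumptions, not the code in proofrail/policies.py: the `Action` and `ChainContext` types, the scoring formula, and the four rules are hypothetical. The thresholds ($50,000 hard deny, $10,000 cumulative default) are borrowed from the default policies on this page.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Action:                      # hypothetical event shape
    kind: str                      # e.g. "payment", "delete", "write"
    amount: float = 0.0            # financial amount, if any
    environment: str = "staging"


@dataclass
class ChainContext:                # hypothetical running totals for a chain
    cumulative_amount: float = 0.0
    event_count: int = 0


def classify(action: Action) -> Tuple[int, List[str]]:
    """Stage 1: toy risk score (0-100) plus categories."""
    score, categories = 0, []
    if action.amount > 0:
        categories.append("financial")
        score += min(60, int(action.amount // 1000))
    if action.kind == "delete":
        categories.append("destructive")
        score += 40
    return min(score, 100), categories


def update_chain(ctx: ChainContext, action: Action) -> None:
    """Stage 2: fold this action into the chain's running totals."""
    ctx.cumulative_amount += action.amount
    ctx.event_count += 1


# Stage 3: (decision, predicate) pairs in priority order:
# hard denies, then approval triggers, then audit flags.
Rule = Tuple[str, Callable[[Action, ChainContext, List[str]], bool]]
RULES: List[Rule] = [
    ("deny", lambda a, c, cats: a.amount > 50_000),
    ("deny", lambda a, c, cats: "destructive" in cats
        and a.environment == "production"),
    ("require_approval", lambda a, c, cats: c.cumulative_amount > 10_000),
    ("allow_with_flag", lambda a, c, cats: a.kind == "write"
        and a.environment == "production"),
]


def evaluate(action: Action, ctx: ChainContext) -> Tuple[str, int]:
    """Run all three stages; the first matching rule wins."""
    score, categories = classify(action)
    update_chain(ctx, action)
    for decision, matches in RULES:
        if matches(action, ctx, categories):
            return decision, score
    return "allow", score          # default allow comes last
```

Because the chain totals are updated before the rules run, a second $6,000 payment in the same chain crosses the $10,000 cumulative limit and returns `require_approval`, even though this toy rule set allowed the first.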
Default policies
These ship with ProofRail and cover the categories most agent workflows need to be careful about. You don’t have to configure them — they’re active by default.
Hard denies
Actions that are blocked immediately, with no human-in-the-loop option.
- High-value transactions — single transactions over $50,000
- Production deletions — DELETE operations against production resources
- Unauthorized domains — sending data to domains not in the allowlist
- Credential exposure — payloads containing patterns matching API keys, passwords, credit card numbers
- IAM modifications — changes to permissions, roles, or access control
- Impersonation attempts — agents sending communications with a claimed identity different from their declared identity
- Restricted data access — accessing data classified as restricted without authorization
Approval triggers
Actions that pause for human review.
- Medium financial transactions — single transactions from the approval threshold (configurable) up to $50,000
- Cumulative financial threshold — chain cumulative exposure exceeds the configured limit (default $10,000)
- First external communication — the first time a chain sends data outside your organization
- Bulk operations — actions touching 100+ records
- Bulk deletions — deleting 10+ items at once
- Irreversible actions — contract signing, legal filings, public posts, financial commitments
- High-risk agent actions — actions by agents you’ve marked as high-risk
- Out-of-scope access — agents accessing resources outside their declared scope
- PII sent externally — content containing detected PII going to external domains
Audit flags
Actions allowed but flagged for review.
- Production writes — any write to production resources
- External API calls — any call to an external service
- New agent first action — the first time a previously unseen agent appears
- External communication content — outgoing content to external recipients
Auto-pause conditions
Chains that hit any of these limits pause automatically and are raised for review.
- Event count — 1,000 events in a single chain (likely runaway behavior)
- Chain duration — 10 minutes of activity without completion
- Token budget — cumulative token spend exceeds the configured budget
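The three limits above amount to a simple check against per-chain counters. The sketch below is illustrative only: `ChainStats` and `auto_pause_reason` are hypothetical names, and whether the real engine uses `>=` or `>` at each boundary is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

MAX_EVENTS = 1_000           # events in a single chain
MAX_DURATION_SECONDS = 600   # 10 minutes of activity without completion


@dataclass
class ChainStats:            # hypothetical per-chain counters
    event_count: int
    elapsed_seconds: float
    tokens_spent: int


def auto_pause_reason(stats: ChainStats, token_budget: int) -> Optional[str]:
    """Return which limit the chain hit, or None if it may keep running."""
    if stats.event_count >= MAX_EVENTS:
        return "event_count"
    if stats.elapsed_seconds >= MAX_DURATION_SECONDS:
        return "chain_duration"
    if stats.tokens_spent > token_budget:   # "exceeds the configured budget"
        return "token_budget"
    return None
```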
Configuring thresholds
Many default policies use thresholds you can tune in proofrail.init(). See Configuration for every threshold and parameter.
Custom policies
Beyond the defaults, you can define organization-specific rules in your dashboard at app.proofrail.dev. Custom policies match on event attributes (event type, tool name, agent name, chain metadata) and apply the action you choose. Once configured, they are enforced automatically by the policy engine on every event. They support the same four decision outcomes as default policies and can be scoped to specific chains, environments, or agents.
For policy authoring details — match syntax, rule priority, scope rules — see Policy Configuration.
Policy modes: enforce, shadow, disabled
Every policy (default or custom) can run in one of three modes:
- enforce — Decisions are applied. This is what you usually want.
- shadow — The policy evaluates normally, but the decision is logged without being applied. The SDK always returns allow to the agent. Useful for testing new policies against real traffic before turning them on.
- disabled — The policy is skipped entirely.
Start new custom policies in shadow mode so you can calibrate before enforcement. Default policies ship in enforce mode.
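The difference between the three modes can be shown for a single policy. This is a sketch, not the SDK's implementation: `apply_policy`, the `policy` callable, and the `shadow_log` list are hypothetical, and a real engine would combine many policies instead of evaluating one in isolation.

```python
from typing import Callable, List


def apply_policy(
    mode: str,
    policy: Callable[[dict], str],
    event: dict,
    shadow_log: List[str],
) -> str:
    """Return the decision the agent sees for one policy in the given mode."""
    if mode == "disabled":
        return "allow"               # policy skipped entirely, never evaluated
    if mode not in ("enforce", "shadow"):
        raise ValueError(f"unknown mode: {mode}")
    decision = policy(event)         # evaluated normally in both modes
    if mode == "shadow":
        shadow_log.append(decision)  # logged for calibration, not applied
        return "allow"
    return decision                  # enforce: the decision is applied


# Usage: a policy that would deny every event.
log: List[str] = []
always_deny = lambda event: "deny"
shadow_result = apply_policy("shadow", always_deny, {"tool": "wire"}, log)
enforce_result = apply_policy("enforce", always_deny, {"tool": "wire"}, log)
```

Comparing what accumulates in the shadow log against what the agent actually did is one way to judge whether a rule is too strict before switching it to enforce.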
Where to go next
Audit receipts
Tamper-evident records of every chain.
Human approval guide
How approvals work end-to-end.
Kill switch
Halting all agent activity in an emergency.
Configuration
Every threshold and parameter, explained.