AgentBoundary v0.1 Specification
Status: Draft. v0.1 is the first published spec; v0.2 will fold in feedback from the 90-day organic-distribution window. The JSON Schema at docs/schemas/action-receipt-v0.1.json is the normative source for receipt field syntax. This document is the normative source for everything else: lifecycle, conformance, and semantic intent.
License: This specification is licensed under Apache License 2.0. The reference implementation (Python package agentboundary) is also Apache 2.0.
Table of Contents
- Introduction
- Definitions
- Controlled Action Lifecycle
- Action Receipt requirements
- Conformance Levels
- Versioning
- Open questions for v0.2
1. Introduction
AgentBoundary is an open specification for portable, tamper-evident proof of AI-initiated production actions. Its purpose is to let any party — an internal auditor, an external compliance reviewer, a regulator, an insurer, a customer — verify what an AI agent was allowed to do in production, without trusting the agent’s framework or model provider.
1.1 What this spec defines
- A canonical Action Receipt schema (JSON, versioned).
- A Controlled Action Lifecycle that runtimes must implement to claim conformance.
- Four Conformance Levels (Level 1 → Level 4) describing increasingly tamper-resistant implementations.
- A threat model (see threat-model.md) describing the adversaries and attacks AgentBoundary defends against.
1.2 What this spec does NOT define
- A specific UI for approval flows (Slack, email, web — implementor’s choice).
- A specific policy language (YAML rules, OPA Rego, Cedar, custom DSL — all valid).
- The transport between agent and gateway (HTTP, gRPC, message queue, in-process — all valid).
- A specific cryptographic algorithm for arguments_hash or receipt_hash beyond requiring SHA-256.
- A retention or storage scheme for receipts beyond requiring they be independently verifiable.
1.3 Who this spec is for
Two audiences:
- Engineers building agent runtimes, governance toolkits, or compliance bundles. This spec tells you what your system must do to produce receipts that downstream consumers can rely on.
- Security, compliance, audit, and risk owners evaluating agent governance tools. This spec tells you what to demand from any tool claiming to “govern” agent actions, and how to verify the claim independently.
1.4 Relationship to other agent governance work
AgentBoundary is complementary to, not competitive with:
- Microsoft Agent Governance Toolkit (AGT): provides a governance runtime; AgentBoundary defines the portable proof format that runtimes (AGT or otherwise) can emit.
- LangSmith Fleet / Gateway: provides observability and runtime middleware; AgentBoundary’s receipts can be persisted in LangSmith and verified by anyone with the schema.
- Anthropic Managed Agents permission_policy: provides in-Claude permission flow; AgentBoundary defines the cross-platform receipt that survives outside Claude.
- OWASP LLM Top 10: defines risks; AgentBoundary’s conformance levels map to specific OWASP risks (see owasp-mapping.md).
- MCP elicitation: defines a tool-call approval interaction; AgentBoundary defines the artifact that interaction should produce.
AgentBoundary is about the artifact, not about the runtime. Any runtime that emits compliant Action Receipts contributes to a shared evidence base auditors can rely on.
1.5 Document conventions
- The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this document are to be interpreted as described in RFC 2119 and RFC 8174 when, and only when, they appear in all capitals.
- Code examples are illustrative; the canonical JSON Schema at docs/schemas/action-receipt-v0.1.json governs syntax.
2. Definitions
This section defines the terms used throughout the specification. These definitions are normative.
2.1 Action
An Action is a request to mutate a production system, originated by an AI agent or a human-via-AI workflow. Actions are the unit of governance.
Examples of actions:
- Merging a pull request via the GitHub API
- Refunding a charge via the Stripe API
- Updating a customer record via the Salesforce API
- Triggering an Airflow DAG run
- Modifying a Spring service entity
- Sending a customer-facing email
A read-only operation (e.g., listing PRs, querying a database, retrieving a customer record) is NOT an Action under this spec — read-only operations do not require Action Receipts.
2.2 Actor
The Actor is the entity that initiated the Action. Three values are defined:
- human: a person initiated the Action (e.g., a developer typing in Claude Code; the receipt records the human’s identity).
- system: an automated process initiated the Action (e.g., a scheduled job, a webhook handler).
- agent: an AI agent initiated the Action autonomously (no immediate human in the loop at the moment of initiation; the receipt records the agent’s identity).
The Actor is the entity to whom causation flows back. For an agent run that a human triggered, the Actor SHOULD be agent if the agent ran autonomously after the trigger, or human if every Action was individually authorized. Implementations MUST document their convention.
2.3 Agent
The Agent is the AI software stack that produced the Action. Recorded as {framework, framework_version, model, model_version?}. The framework is the agent runtime (e.g., claude-code, spring-ai, openai-agents-sdk, langgraph). The model is the underlying LLM (e.g., claude-opus-4-7, gpt-5.5, gemini-2.5-pro).
For multi-model agents (e.g., a router model + an executor model), the receipt records the model that decided to invoke the tool, not every model in the call chain. The full call chain MAY be recorded out-of-band; the receipt is concerned with attribution at the action boundary.
2.4 Tool
The Tool is the named capability the Agent invoked to perform the Action. Recorded as {name, version?, capability}.
- name is the implementation-level name (e.g., github-mcp, stripe-mcp, claims-service).
- capability is the dotted-namespace identifier of WHAT the tool did (e.g., github.merge, stripe.refund, spring.service.mutation). Capabilities are the unit policies operate on.
A given name MAY support multiple capability values (e.g., github-mcp supports github.merge, github.comment, github.close). Each Action emits one receipt with one capability.
2.5 Target
The Target is the production system the Action affects. Recorded as {system, environment, resource_id?}.
- system is a stable identifier for the affected system (a hostname, a service name, a registered identifier). MUST be specific enough that an auditor can identify the system later (i.e., not just "api").
- environment is one of prod, staging, dev. AgentBoundary is primarily concerned with prod; staging and dev receipts are useful for testing the pipeline but SHOULD be retained separately from prod receipts.
- resource_id is OPTIONAL but RECOMMENDED. If present, it identifies the specific resource within the system (a PR number, a charge ID, a customer record ID).
2.6 Policy
A Policy is the named, versioned rule that decided whether the Action was allowed. Recorded as {name, version, decision}.
- name is a stable, dot-namespaced identifier (e.g., acme.refunds.under-500-auto-approve). Policies are first-class objects in the runtime; the spec does not prescribe HOW they are written or stored, only that they have a stable name.
- version is a string. Policies change over time; receipts capture the version that decided. Auditors MUST be able to look up the named-and-versioned policy and inspect its text.
- decision is one of:
  - allow: the policy allowed the Action without further checks. The Action proceeded.
  - deny: the policy refused the Action. The Action did not proceed. Receipt MUST still be emitted (denial is itself evidence).
  - escalate: the policy could not decide; the Action was escalated for human review out-of-band. Receipt MUST still be emitted, with execution.status typically blocked if no follow-up occurred within the runtime’s window.
  - require-approval: the policy required human approval before proceeding. If the Action did proceed, the receipt MUST include an approval block. If approval was denied or timed out, the receipt’s policy.decision SHOULD still be require-approval and execution.status SHOULD be blocked.
2.7 Approval
When policy.decision == "require-approval" AND the Action proceeded, the receipt MUST include an Approval block: {approver, approved_at, context?}.
- approver records who approved. MUST include id (a stable identifier for the approver). SHOULD include display_name and role for audit readability.
- approved_at is the RFC 3339 timestamp when the approval was granted. MUST precede execution.completed_at.
- context is OPTIONAL free-text justification the approver provided.
The Approval block is the spec’s tamper-resistance hinge. Without it, an Action requiring approval has no evidence that approval was granted — and the receipt is invalid.
2.8 Receipt
An Action Receipt is the JSON object that documents one Action. It MUST conform to the schema at docs/schemas/action-receipt-v0.1.json. It MUST be emitted for every Action that reached the production action boundary, including denied and escalated Actions.
Receipts are content-addressable via receipt_hash (SHA-256 over the canonicalized preceding fields). Two correct implementations MUST produce identical receipt_hash values for the same receipt content.
2.9 Capability
A Capability is a dotted-namespace identifier for the kind of Action a Tool can perform. The first segment of the namespace SHOULD identify the system being acted on (e.g., github.*, stripe.*, spring.service.*). Capabilities are not enumerated by the spec; implementations introduce capabilities as needed.
Capability identifiers MUST be lowercase ASCII, dot-separated, and stable across versions of the same Tool. Renaming a capability is a breaking change for any policy referencing it.
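The identifier rules above can be checked mechanically. The following is a minimal sketch, not part of the spec: the spec requires only lowercase ASCII and dot separation, so the per-segment character set (letters, digits, hyphen, underscore), the leading-letter rule, and the two-segment minimum are all assumptions, and is_valid_capability is a hypothetical helper name.

```python
import re

# Assumptions (not mandated by the spec): each segment starts with a letter,
# may contain digits, "-" or "_", and an identifier has at least two segments.
CAPABILITY_RE = re.compile(r"[a-z][a-z0-9_-]*(\.[a-z][a-z0-9_-]*)+")


def is_valid_capability(identifier: str) -> bool:
    """Return True if the identifier is lowercase ASCII and dot-separated."""
    return CAPABILITY_RE.fullmatch(identifier) is not None


print(is_valid_capability("github.merge"))             # lowercase, dotted
print(is_valid_capability("spring.service.mutation"))  # three segments
print(is_valid_capability("GitHub.Merge"))             # uppercase is rejected
```

An implementation that permits other characters inside a segment would adjust the pattern and document that choice.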
3. Controlled Action Lifecycle
Every controlled Action passes through a state machine. Implementations MUST traverse this machine in order; they MAY add internal states but MUST surface the canonical states in the receipt.
3.1 The seven canonical states
| State | Description | Emits receipt? |
|---|---|---|
| proposed | Agent has decided to invoke a Tool capability that targets a production system. The Action has been formed but not yet evaluated by policy. | No (internal) |
| policy_evaluated | A named, versioned Policy has produced a decision of allow, deny, escalate, or require-approval. | No (internal) |
| awaiting_approval | Policy returned require-approval. The Action is paused, awaiting an approver. | No (internal) |
| approved | An approver granted approval. Approver identity, timestamp, and context are recorded. | No (internal) |
| denied | Either Policy returned deny, OR an approver refused, OR the approval window expired. | Yes (execution.status = blocked) |
| executing | The Action proceeded against the production system. | No (internal) |
| completed | The Action finished, with execution.status in {success, failure}. | Yes |
3.2 Valid state transitions
proposed → policy_evaluated
policy_evaluated → executing (allow)
policy_evaluated → denied (deny)
policy_evaluated → awaiting_approval (require-approval)
policy_evaluated → escalated → denied|approved (escalate; out-of-band human routing)
awaiting_approval → approved → executing
awaiting_approval → denied (approver refused OR window expired)
executing → completed
Once a state has been entered, it MUST NOT be exited backwards. There is no executing → awaiting_approval. Re-running an Action after a denial requires a fresh proposed state with a new receipt_id.
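The transition list above can be expressed as a lookup table. This is an illustrative sketch only: TRANSITIONS and is_valid_path are hypothetical names, and the table checks the order of states without checking that the policy decision matches the edge taken (for example, that policy_evaluated → executing happened only on allow).

```python
# Allowed next states, transcribed from the transition list in 3.2.
TRANSITIONS = {
    "proposed": {"policy_evaluated"},
    "policy_evaluated": {"executing", "denied", "awaiting_approval", "escalated"},
    "escalated": {"denied", "approved"},
    "awaiting_approval": {"approved", "denied"},
    "approved": {"executing"},
    "executing": {"completed"},
    "denied": set(),     # terminal: a retry starts a fresh proposed Action
    "completed": set(),  # terminal
}


def is_valid_path(states: list[str]) -> bool:
    """Return True if the sequence starts at proposed and only moves forward."""
    if not states or states[0] != "proposed":
        return False
    return all(nxt in TRANSITIONS.get(cur, set())
               for cur, nxt in zip(states, states[1:]))


print(is_valid_path(["proposed", "policy_evaluated", "executing", "completed"]))
print(is_valid_path(["proposed", "policy_evaluated", "executing", "awaiting_approval"]))
```

Because denied and completed have no outgoing edges, the table also encodes the rule that a retried Action needs a new proposed state rather than a backward transition.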
3.3 Receipts MUST be emitted
A receipt MUST be emitted in these cases:
- The Action reaches completed (regardless of execution.status).
- The Action reaches denied without reaching executing. The receipt records policy.decision accurately (the value that caused the denial) and execution.status = "blocked".
A receipt MUST NOT be emitted while the Action is in transient internal states (proposed, policy_evaluated, awaiting_approval, approved, escalated, executing). The receipt is the final evidence of the Action; intermediate states are runtime concerns.
3.4 Timing constraints
- policy.version MUST be the version that decided. If the policy was updated between policy_evaluated and executing, the receipt MUST still record the version that decided.
- arguments_hash MUST hash the arguments as they were at policy_evaluated. If arguments differ at executing (e.g., the agent modified them after approval), the runtime MUST either re-run policy evaluation or refuse to execute. A receipt with mismatched arguments at execution time is invalid.
- approval.approved_at MUST precede execution.completed_at. A receipt whose approval timestamp equals or follows the completion timestamp is invalid.
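The approval-ordering constraint can be checked directly from a receipt. The sketch below is illustrative: parse_rfc3339 and approval_precedes_completion are hypothetical helpers, and datetime.fromisoformat covers the common RFC 3339 forms rather than every one the RFC permits.

```python
from datetime import datetime


def parse_rfc3339(value: str) -> datetime:
    """Parse a timestamp with a Z suffix or an explicit offset."""
    # fromisoformat accepts a trailing "Z" only on Python 3.11+,
    # so normalize it to an explicit UTC offset first.
    if value.endswith(("Z", "z")):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def approval_precedes_completion(receipt: dict) -> bool:
    """Return False if approval is timestamped at or after completion."""
    approval = receipt.get("approval")
    if approval is None:
        return True  # no approval block, so there is nothing to compare
    approved_at = parse_rfc3339(approval["approved_at"])
    completed_at = parse_rfc3339(receipt["execution"]["completed_at"])
    return approved_at < completed_at


example = {
    "approval": {"approved_at": "2025-01-15T12:29:00+02:00"},
    "execution": {"completed_at": "2025-01-15T10:30:00Z"},
}
print(approval_precedes_completion(example))
```

Comparing timezone-aware datetime objects rather than raw strings matters here, since the two timestamps may carry different offsets.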
3.5 Out-of-band escalation
escalate is the spec’s escape hatch for policies that cannot decide automatically. When a policy returns escalate:
- The Action enters escalated state.
- The runtime routes the decision to a human reviewer via an out-of-band channel (Slack, email, web UI, ITSM ticket). The channel is implementor’s choice.
- The reviewer responds with either approval or denial.
- If approval is granted, the receipt SHALL record the approver and proceed as if require-approval had decided originally. The policy.decision field MUST remain escalate: the receipt records the policy’s actual output, not the after-the-fact resolution.
- If the reviewer denies the Action or no response arrives within the runtime’s window, the Action transitions to denied and a receipt is emitted with execution.status = "blocked".
The runtime MUST document its escalation window (e.g., “24 hours, then auto-deny”) in operator documentation. The receipt records what actually happened, not the policy.
4. Action Receipt requirements
This section describes each field of an Action Receipt in prose. The JSON Schema at docs/schemas/action-receipt-v0.1.json is the normative source for syntactic constraints. This section is the normative source for semantic intent.
A receipt is a JSON object with the following fields. Fields are required unless explicitly marked OPTIONAL. The object MUST NOT contain additional top-level fields beyond those listed; the schema enforces this via additionalProperties: false.
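For orientation, here is an illustrative receipt for an auto-approved refund, built as a Python dictionary and printed as JSON. Only the field names come from this specification; every value (identifiers, timestamps, versions) is made up, and both hash fields are placeholders rather than computed digests.

```python
import json

# All values are invented for illustration; only the field names are from the spec.
receipt = {
    "version": "agentboundary/v0.1",
    "receipt_id": "01890a5d-ac96-774b-bcce-b302099a8057",  # UUID v7-shaped placeholder
    "issued_at": "2025-01-15T10:30:03Z",
    "actor": {"type": "agent", "id": "agent:refund-bot", "display_name": "Refund Bot"},
    "agent": {
        "framework": "langgraph",
        "framework_version": "0.0.0",  # placeholder
        "model": "example-model",      # placeholder
    },
    "tool": {"name": "stripe-mcp", "capability": "stripe.refund"},
    "target": {
        "system": "payments.acme.example",
        "environment": "prod",
        "resource_id": "ch_example_123",
    },
    "arguments_hash": "0" * 64,  # placeholder, not a real digest (see 4.8)
    "policy": {
        "name": "acme.refunds.under-500-auto-approve",
        "version": "3",
        "decision": "allow",
    },
    # No approval block: the decision was allow, not require-approval.
    "execution": {
        "status": "success",
        "completed_at": "2025-01-15T10:30:02Z",
        "result_ref": "re_example_456",
    },
    "receipt_hash": "0" * 64,  # placeholder, not a real digest (see 4.12)
}

print(json.dumps(receipt, indent=2))
```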
4.1 version
A string literal: "agentboundary/v0.1". Any other value indicates a different spec version (forward-compatible) or a malformed receipt (backward-incompatible).
Implementations MUST reject receipts whose version does not match a spec they implement.
4.2 receipt_id
A globally unique identifier for this receipt. MUST be a valid UUID (RFC 4122). UUID v7 (timestamp-prefixed; defined in RFC 9562, which obsoletes RFC 4122) is RECOMMENDED because it permits ordering by issuance without an external timestamp.
receipt_id MUST be regenerated for every Action; an Action retried after a denial generates a fresh receipt_id.
4.3 issued_at
The RFC 3339 timestamp when the receipt was generated. MUST be expressed in UTC (Z suffix) or with an explicit offset.
issued_at SHOULD be close to but distinct from execution.completed_at. Implementations MAY emit the receipt asynchronously after completion; in that case issued_at MAY be later than execution.completed_at.
4.4 actor
The entity that initiated the Action. See §2.2 Actor.
- actor.type ∈ {human, system, agent}
- actor.id MUST be a stable identifier (not a session token)
- actor.display_name is OPTIONAL but RECOMMENDED for audit readability
4.5 agent
The AI software stack that produced the Action. See §2.3 Agent.
- agent.framework and agent.framework_version MUST be present
- agent.model MUST be present and identify the LLM that decided to invoke the tool
- agent.model_version is OPTIONAL but RECOMMENDED for reproducibility (the same model name can refer to different model versions over time)
4.6 tool
The named capability invoked. See §2.4 Tool.
- tool.name MUST be the implementation-level tool identifier
- tool.capability MUST be the dotted-namespace capability identifier (e.g., github.merge)
- tool.version is OPTIONAL but RECOMMENDED
4.7 target
The production system the Action affected. See §2.5 Target.
- target.system MUST be a stable system identifier
- target.environment MUST be prod, staging, or dev
- target.resource_id is OPTIONAL but RECOMMENDED
4.8 arguments_hash
A lowercase hex SHA-256 (64 characters) of the canonicalized Action arguments.
Canonicalization is critical for portability. Two correct implementations MUST produce the same arguments_hash for the same logical arguments. v0.1 RECOMMENDS JSON Canonicalization Scheme (RFC 8785) but does not mandate a single algorithm — the implementation MUST document its canonicalization rules. v0.2 SHALL mandate a single algorithm based on v0.1 deployment experience.
arguments_hash proves the executed arguments match what was evaluated by policy. A receipt where arguments_hash differs from the actual canonicalized arguments at execution time is invalid.
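A minimal sketch of computing arguments_hash follows. Note the assumption: the canonicalize helper uses Python's json.dumps with sorted keys and compact separators, which only approximates RFC 8785. It differs from that scheme in at least number serialization (RFC 8785 follows ECMAScript formatting, so 1.0 becomes 1) and in key ordering for characters outside the Basic Multilingual Plane. The function names and the sample arguments are hypothetical.

```python
import hashlib
import json


def canonicalize(value) -> bytes:
    """Approximate canonical form: sorted keys, no whitespace, UTF-8.

    This is NOT a full RFC 8785 implementation; see the caveats in the lead-in.
    """
    text = json.dumps(value, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False, allow_nan=False)
    return text.encode("utf-8")


def arguments_hash(arguments: dict) -> str:
    """Lowercase hex SHA-256 of the canonicalized arguments."""
    return hashlib.sha256(canonicalize(arguments)).hexdigest()


# Key order in the input does not affect the hash.
a = arguments_hash({"charge_id": "ch_example_123", "amount": 4200})
b = arguments_hash({"amount": 4200, "charge_id": "ch_example_123"})
print(a == b, len(a))
```

Whichever scheme an implementation picks, the point is that two parties given the same logical arguments can reproduce the same 64-character digest, which is why the rules have to be documented.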
4.9 policy
The named, versioned policy that decided. See §2.6 Policy.
- policy.name MUST be a stable, dot-namespaced identifier
- policy.version MUST be the version that decided (even if the policy was updated between decision and execution)
- policy.decision MUST be one of {allow, deny, escalate, require-approval}
4.10 approval (conditional)
REQUIRED when policy.decision == "require-approval" AND the Action proceeded (i.e., execution.status != "blocked").
When present:
- approval.approver.id MUST be a stable identifier for the approver
- approval.approved_at MUST be a valid RFC 3339 timestamp that precedes execution.completed_at
- approval.context is OPTIONAL free-text justification
When NOT present:
- The receipt MUST NOT contain an approval block if policy.decision != "require-approval". (See §7 Open questions for the v0.1 stance on allow + approval.)
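The REQUIRED condition at the top of this section can be sketched as a small check. This is an illustration under stated assumptions: missing_approval_errors is a hypothetical helper, it covers only the case where an approval block is required, and it deliberately leaves the "when NOT present" rule and the escalate case to the implementation's documented position.

```python
def missing_approval_errors(receipt: dict) -> list[str]:
    """List problems with the approval block when one is REQUIRED."""
    decision = receipt["policy"]["decision"]
    status = receipt["execution"]["status"]
    # Approval is only required when require-approval decided AND the Action proceeded.
    if decision != "require-approval" or status == "blocked":
        return []
    approval = receipt.get("approval")
    if approval is None:
        return ["approval block is REQUIRED when a require-approval Action proceeded"]
    errors = []
    if not approval.get("approver", {}).get("id"):
        errors.append("approval.approver.id is missing")
    if "approved_at" not in approval:
        errors.append("approval.approved_at is missing")
    return errors


proceeded_without_approval = {
    "policy": {"decision": "require-approval"},
    "execution": {"status": "success"},
}
print(missing_approval_errors(proceeded_without_approval))
```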
4.11 execution
The outcome.
- execution.status MUST be one of {success, failure, blocked}:
  - success: Action completed and produced the intended effect on the target system
  - failure: Action attempted but the target system rejected, errored, or partially completed
  - blocked: Action did NOT proceed; receipt records the policy/approval state that prevented it
- execution.completed_at MUST be a valid RFC 3339 timestamp
- execution.error_code is OPTIONAL; if present, SHOULD be an implementation-defined error code for failure or blocked cases
- execution.result_ref is OPTIONAL; if present, SHOULD be a system-specific reference to the result (e.g., a Stripe refund ID, a GitHub commit SHA, a claim revision identifier)
4.12 receipt_hash
A lowercase hex SHA-256 of the canonicalized receipt content excluding the receipt_hash field itself.
This is the tamper-evidence anchor. Any modification to any other field invalidates the hash. An auditor verifies a receipt by:
- Computing SHA-256 of the canonicalized receipt with receipt_hash removed
- Comparing against the value in receipt_hash
If the values match, the receipt has not been tampered with since emission. If they differ, the receipt is invalid.
The canonicalization rules MUST be the same as for arguments_hash (§4.8).
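The two verification steps can be sketched as follows. As an assumption, canonicalize uses sorted-key compact JSON, which approximates but does not fully implement RFC 8785; a real verifier would use whatever rules the emitting implementation documents. The function names are hypothetical, and the sample receipt is truncated to a few fields for brevity.

```python
import hashlib
import json


def canonicalize(value) -> bytes:
    """Approximate canonical form (sorted keys, compact); not full RFC 8785."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False, allow_nan=False).encode("utf-8")


def compute_receipt_hash(receipt: dict) -> str:
    """SHA-256 over every field except receipt_hash itself."""
    body = {key: val for key, val in receipt.items() if key != "receipt_hash"}
    return hashlib.sha256(canonicalize(body)).hexdigest()


def verify_receipt(receipt: dict) -> bool:
    """Recompute the hash and compare it with the stored value."""
    return receipt.get("receipt_hash") == compute_receipt_hash(receipt)


# Truncated, made-up receipt used only to exercise the functions.
receipt = {
    "version": "agentboundary/v0.1",
    "policy": {"name": "acme.refunds.under-500-auto-approve", "version": "3", "decision": "allow"},
    "execution": {"status": "success", "completed_at": "2025-01-15T10:30:02Z"},
}
receipt["receipt_hash"] = compute_receipt_hash(receipt)
intact = verify_receipt(receipt)

receipt["policy"]["decision"] = "deny"  # tamper with a field after emission
tampered = verify_receipt(receipt)
print(intact, tampered)
```

As §7.6 notes, this detects modification but not forgery: anyone able to rewrite the stored receipt can also recompute the hash.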
5. Conformance Levels
Implementations claim a numeric conformance level. Each level builds on the previous; an implementation claiming Level 3 MUST also satisfy Levels 1 and 2.
5.1 Level 1 — Logged
The implementation produces an Action Receipt that validates against the v0.1 JSON Schema for every Action that reaches the production action boundary.
Level 1 requires:
- Every Action emits a receipt
- Receipts are syntactically valid (pass JSON Schema validation including format checkers for UUID + RFC 3339)
- Receipts are retained for at least 30 days (implementer’s storage choice)
Level 1 does NOT require: tamper resistance, policy binding, approval evidence, or external verifiability.
5.2 Level 2 — Policy-Bound
Everything in Level 1, plus: every receipt records the named, versioned policy that decided. Approval evidence is present where required.
Level 2 requires:
- policy.name, policy.version, policy.decision accurately record the policy that decided
- The named, versioned policy is independently inspectable (an auditor can look up policy.name + policy.version and read its text)
- When policy.decision == "require-approval" AND the Action proceeded, the receipt includes an approval block
Level 2 does NOT require: receipt hash integrity, replay defense, argument-mutation defense.
5.3 Level 3 — Portable Proof
Everything in Level 2, plus: every receipt is independently verifiable via arguments_hash and receipt_hash.
Level 3 requires:
- arguments_hash correctly hashes the canonicalized arguments at policy-evaluation time
- receipt_hash correctly hashes the canonicalized receipt content excluding receipt_hash
- The canonicalization scheme is documented and reproducible (an auditor can recompute both hashes from the receipt’s other fields plus access to the canonicalization rules)
Level 3 does NOT require: replay defense, argument-mutation defense at execution time, approver chain verification.
5.4 Level 4 — Tamper-Evident
Everything in Level 3, plus: the implementation defends against the receipt-forgery attacks documented in threat-model.md.
Level 4 requires:
- Receipts MUST be rejected when an Action’s arguments at executing differ from arguments at policy_evaluated (mutation defense)
- Approval-bearing receipts MUST be rejected when the same approval token is reused for a different Action (replay defense)
- Receipts MUST be rejected when approval.approver.id is not authorized by policy to approve actions of this capability (unauthorized-approver defense)
- Receipts MUST be rejected when the policy referenced by policy.name + policy.version does not exist in the implementation’s policy store (policy-downgrade defense)
Level 4 is the target for production-grade compliance evidence.
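As one illustration of a Level 4 check, the replay defense could be sketched as a registry that binds each approval token to the first Action that presents it. Everything here is hypothetical: the spec does not define the shape of an approval token, ApprovalTokenRegistry is an invented name, the Action is identified by receipt_id as an assumption, and a production runtime would need a durable, concurrency-safe store rather than an in-memory dictionary.

```python
class ApprovalTokenRegistry:
    """Binds each approval token to the first receipt_id that presents it."""

    def __init__(self) -> None:
        self._bound: dict[str, str] = {}  # token -> receipt_id

    def claim(self, token: str, receipt_id: str) -> bool:
        """Return True if the token is unused or already bound to this receipt_id."""
        owner = self._bound.setdefault(token, receipt_id)
        return owner == receipt_id


registry = ApprovalTokenRegistry()
first = registry.claim("tok-example", "receipt-a")   # first use binds the token
replay = registry.claim("tok-example", "receipt-b")  # reuse for another Action
print(first, replay)
```

A receipt whose claim fails would be rejected under the replay-defense requirement; the other three defenses need their own checks against the arguments, the approver's authorization, and the policy store.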
5.5 Declaring a conformance level
An implementation declares its conformance level in its documentation, in the format:
“<implementation> conforms to AgentBoundary v0.1 at Level N for actions of capability namespace <namespace>.”
An implementation MAY claim different levels for different capability namespaces (e.g., Level 3 for github.*, Level 1 for stripe.* during partial rollout).
Conformance MAY be independently tested via the conformance suite shipped in the AgentBoundary repository at scenarios/. Each scenario file declares a setup, an attempted action, and expected outcomes. Run with npx agentboundary run scenarios/ or uvx agentboundary run scenarios/.
6. Versioning
AgentBoundary uses semantic versioning for the receipt format. The version field in every receipt encodes the spec version it claims to conform to.
- Patch versions (v0.1.x): editorial corrections, clarifications, no semantic change. Implementations claiming v0.1 SHALL accept any v0.1.x receipt.
- Minor versions (v0.2): backward-compatible additions (new OPTIONAL fields, new enum values for non-required fields, new conformance levels). Implementations claiming an older minor SHOULD accept newer minor receipts that don’t use new required fields.
- Major versions (v1.0, v2.0): breaking changes (changed semantics, new required fields, removed fields). Implementations MUST NOT accept receipts of a different major version unless they explicitly support multiple majors.
6.1 Forward and backward compatibility
A v0.1 implementation:
- MUST accept v0.1 receipts (obvious)
- MAY accept v0.2 receipts (forward compatible if no required new fields)
- MUST NOT accept v1.0+ receipts (incompatible major)
- MUST NOT accept “v0.1” receipts that fail JSON Schema validation (a string version literal doesn’t excuse invalid syntax)
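The acceptance rules for a v0.1 implementation can be sketched as a version gate that runs before JSON Schema validation. The assumptions are labeled: the agentboundary/ prefix from §4.1 is taken to apply to every version string, patch releases are assumed to be written as agentboundary/v0.1.x, and accepts_version with its accept_newer_minor flag is a hypothetical interface for the MAY clause.

```python
import re

# Assumed version string shape: agentboundary/v<major>.<minor>[.<patch>]
VERSION_RE = re.compile(r"agentboundary/v(\d+)\.(\d+)(?:\.(\d+))?")


def accepts_version(version: str, accept_newer_minor: bool = False) -> bool:
    """Version gate for a v0.1 implementation; schema validation still applies after."""
    match = VERSION_RE.fullmatch(version)
    if match is None:
        return False  # malformed version string
    major, minor = int(match.group(1)), int(match.group(2))
    if major != 0:
        return False  # MUST NOT accept an incompatible major
    if minor == 1:
        return True   # v0.1 and any v0.1.x patch release
    if minor > 1:
        return accept_newer_minor  # MAY accept newer minors
    return False      # anything older than v0.1 is undefined here


print(accepts_version("agentboundary/v0.1"))
print(accepts_version("agentboundary/v0.2"))
print(accepts_version("agentboundary/v1.0", accept_newer_minor=True))
```

Passing this gate is necessary but not sufficient: a receipt that claims v0.1 and then fails schema validation is still rejected.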
6.2 v0.2 roadmap
The following are explicitly out of scope for v0.1 and will be addressed in v0.2:
- Mandatory canonicalization scheme (RFC 8785 candidate)
- Delegation chains for A2A (agent-to-agent action forwarding)
- Cryptographic receipt signing (beyond SHA-256 hashing)
- Receipt linkage (parent/child receipts for compound actions)
- Standard MCP elicitation → AgentBoundary receipt mapping
- Cyber-insurance evidence-bundle profile
7. Open questions for v0.2
The following design questions are unresolved in v0.1 and will be addressed in v0.2 based on deployment feedback. Implementers SHOULD document their position on each.
7.1 Is approval permitted on allow decisions?
v0.1 currently permits a receipt with policy.decision == "allow" AND an approval block (the JSON Schema does not forbid it; only require-approval requires approval).
Argument for permitting: an operator may manually approve an Action that policy would have auto-approved (defense in depth). The receipt should record the human acknowledgment.
Argument against permitting: allowing optional approval on allow decisions invites confusion — a reader sees approval and infers policy.decision was require-approval. This undermines the receipt’s clarity.
v0.1 position: permitted but discouraged. v0.2 may forbid or formalize.
7.2 What’s the canonical canonicalization?
v0.1 RECOMMENDS RFC 8785 (JSON Canonicalization Scheme) for arguments_hash and receipt_hash but does not mandate it. Different implementations may produce different hashes for the same logical content, making cross-implementation verification harder.
v0.2 candidate: mandate RFC 8785 unless a stronger candidate emerges from deployment.
7.3 Should target.system follow a standard format?
v0.1 says target.system MUST be “specific enough that an auditor can identify the system later” but doesn’t prescribe a format. Should it be a URI, a hostname, or a registry identifier?
v0.2 candidate: require target.system be either a URI or registered in an AgentBoundary system registry (to be specified).
7.4 What about multi-step actions?
A “refund” might involve querying the original charge, computing the refund amount, calling the refund API. v0.1 treats each tool call as a separate Action. Should there be a concept of a “compound action” with a parent receipt?
v0.2 candidate: introduce OPTIONAL parent_receipt_id field for receipt linkage.
7.5 How should A2A delegation be recorded?
When Agent A calls Agent B and Agent B takes a controlled Action, who is the Actor? Agent A (the requester), Agent B (the executor), or both (delegation chain)?
v0.2 candidate: introduce OPTIONAL delegation_chain field listing the chain of agents that participated.
7.6 Should receipts be cryptographically signed (not just hashed)?
receipt_hash provides tamper-evidence but not authenticity. A bad actor with write access to the receipt store could rehash a forged receipt. Public-key signing (Ed25519) would provide authenticity.
v0.2 candidate: OPTIONAL signature field for implementations that support signing. Not REQUIRED because key management is operator-specific.
End of v0.1 spec text. The threat model and OWASP mapping live in threat-model.md and owasp-mapping.md respectively.