AgentBoundary v0.1 Specification

Status: Draft. v0.1 is the first published spec; v0.2 will fold in feedback from the 90-day organic-distribution window. The JSON Schema at docs/schemas/action-receipt-v0.1.json is the normative source for receipt field syntax. This document is the normative source for everything else: lifecycle, conformance, and semantic intent.

License: This specification is licensed under Apache License 2.0. The reference implementation (Python package agentboundary) is also Apache 2.0.

Introduction
Definitions
Controlled Action Lifecycle
Action Receipt requirements
Conformance Levels
Versioning
Open questions for v0.2

1. Introduction

AgentBoundary is an open specification for portable, tamper-evident proof of AI-initiated production actions. Its purpose is to let any party — an internal auditor, an external compliance reviewer, a regulator, an insurer, a customer — verify what an AI agent was allowed to do in production, without trusting the agent’s framework or model provider.

1.1 What this spec defines

A canonical Action Receipt schema (JSON, versioned).
A Controlled Action Lifecycle that runtimes must implement to claim conformance.
Four Conformance Levels (Level 1 → Level 4) describing increasingly tamper-resistant implementations.
A threat model (see threat-model.md) describing the adversaries and attacks AgentBoundary defends against.

1.2 What this spec does NOT define

A specific UI for approval flows (Slack, email, web — implementor’s choice).
A specific policy language (YAML rules, OPA Rego, Cedar, custom DSL — all valid).
The transport between agent and gateway (HTTP, gRPC, message queue, in-process — all valid).
A specific cryptographic algorithm for arguments_hash or receipt_hash beyond requiring SHA-256.
A retention or storage scheme for receipts beyond requiring they be independently verifiable.

1.3 Who this spec is for

Two audiences:

Engineers building agent runtimes, governance toolkits, or compliance bundles. This spec tells you what your system must do to produce receipts that downstream consumers can rely on.
Security, compliance, audit, and risk owners evaluating agent governance tools. This spec tells you what to demand from any tool claiming to “govern” agent actions, and how to verify the claim independently.

1.4 Relationship to other agent governance work

AgentBoundary is complementary to, not competitive with:

Microsoft Agent Governance Toolkit (AGT) — provides a governance runtime; AgentBoundary defines the portable proof format that runtimes (AGT or otherwise) can emit.
LangSmith Fleet / Gateway — provides observability and runtime middleware; AgentBoundary’s receipts can be persisted in LangSmith and verified by anyone with the schema.
Anthropic Managed Agents permission_policy — provides in-Claude permission flow; AgentBoundary defines the cross-platform receipt that survives outside Claude.
OWASP LLM Top 10 — defines risks; AgentBoundary’s conformance levels map to specific OWASP risks (see owasp-mapping.md).
MCP elicitation — defines a tool-call approval interaction; AgentBoundary defines the artifact that interaction should produce.

AgentBoundary is about the artifact, not about the runtime. Any runtime that emits compliant Action Receipts contributes to a shared evidence base auditors can rely on.

1.5 Document conventions

The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this document are to be interpreted as described in RFC 2119 and RFC 8174 when, and only when, they appear in all capitals.
Code examples are illustrative; the canonical JSON Schema at docs/schemas/action-receipt-v0.1.json governs syntax.

2. Definitions

This section defines the terms used throughout the specification. These definitions are normative.

2.1 Action

An Action is a request to mutate a production system, originated by an AI agent or a human-via-AI workflow. Actions are the unit of governance.

Examples of actions:

Merging a pull request via the GitHub API
Refunding a charge via the Stripe API
Updating a customer record via the Salesforce API
Triggering an Airflow DAG run
Modifying a Spring service entity
Sending a customer-facing email

A read-only operation (e.g., listing PRs, querying a database, retrieving a customer record) is NOT an Action under this spec — read-only operations do not require Action Receipts.

2.2 Actor

The Actor is the entity that initiated the Action. Three values are defined:

human — a person initiated the Action (e.g., a developer typing in Claude Code; the receipt records the human’s identity).
system — an automated process initiated the Action (e.g., a scheduled job, a webhook handler).
agent — an AI agent initiated the Action autonomously (no immediate human in the loop at the moment of initiation; the receipt records the agent’s identity).

The Actor is the entity to whom causation flows back. For an agent run that a human triggered, the Actor SHOULD be agent if the agent ran autonomously after the trigger, or human if every Action was individually authorized. Implementations MUST document their convention.

2.3 Agent

The Agent is the AI software stack that produced the Action. Recorded as {framework, framework_version, model, model_version?}. The framework is the agent runtime (e.g., claude-code, spring-ai, openai-agents-sdk, langgraph). The model is the underlying LLM (e.g., claude-opus-4-7, gpt-5.5, gemini-2.5-pro).

For multi-model agents (e.g., a router model + an executor model), the receipt records the model that decided to invoke the tool, not every model in the call chain. The full call chain MAY be recorded out-of-band; the receipt is concerned with attribution at the action boundary.

2.4 Tool

The Tool is the named capability the Agent invoked to perform the Action. Recorded as {name, version?, capability}.

name is the implementation-level name (e.g., github-mcp, stripe-mcp, claims-service).
capability is the dotted-namespace identifier of WHAT the tool did (e.g., github.merge, stripe.refund, spring.service.mutation). Capabilities are the unit policies operate on.

A given name MAY support multiple capability values (e.g., github-mcp supports github.merge, github.comment, github.close). Each Action emits one receipt with one capability.

2.5 Target

The Target is the production system the Action affects. Recorded as {system, environment, resource_id?}.

system is a stable identifier for the affected system (a hostname, a service name, a registered identifier). MUST be specific enough that an auditor can identify the system later (i.e., not just "api").
environment is one of prod, staging, dev. AgentBoundary is primarily concerned with prod; staging and dev receipts are useful for testing the pipeline but SHOULD be retained separately from prod receipts.
resource_id is OPTIONAL but RECOMMENDED. If present, it identifies the specific resource within the system (a PR number, a charge ID, a customer record ID).

2.6 Policy

A Policy is the named, versioned rule that decided whether the Action was allowed. Recorded as {name, version, decision}.

name is a stable, dot-namespaced identifier (e.g., acme.refunds.under-500-auto-approve). Policies are first-class objects in the runtime; the spec does not prescribe HOW they are written or stored, only that they have a stable name.
version is a string. Policies change over time; receipts capture the version that decided. Auditors MUST be able to look up the named-and-versioned policy and inspect its text.
decision is one of:
- allow — the policy allowed the Action without further checks. The Action proceeded.
- deny — the policy refused the Action. The Action did not proceed. Receipt MUST still be emitted (denial is itself evidence).
- escalate — the policy could not decide; the Action was escalated for human review out-of-band. Receipt MUST still be emitted, with execution.status typically blocked if no follow-up occurred within the runtime’s window.
- require-approval — the policy required human approval before proceeding. If the Action did proceed, the receipt MUST include an approval block. If approval was denied or timed out, the receipt’s policy.decision SHOULD still be require-approval and execution.status SHOULD be blocked.

2.7 Approval

When policy.decision == "require-approval" AND the Action proceeded, the receipt MUST include an Approval block: {approver, approved_at, context?}.

approver records who approved. MUST include id (a stable identifier for the approver). SHOULD include display_name and role for audit readability.
approved_at is the RFC 3339 timestamp when the approval was granted. MUST precede execution.completed_at.
context is OPTIONAL free-text justification the approver provided.

The Approval block is the spec’s tamper-resistance hinge. Without it, an Action requiring approval has no evidence that approval was granted — and the receipt is invalid.

2.8 Receipt

An Action Receipt is the JSON object that documents one Action. It MUST conform to the schema at docs/schemas/action-receipt-v0.1.json. It MUST be emitted for every Action that reached the production action boundary, including denied and escalated Actions.

Receipts are content-addressable via receipt_hash (SHA-256 over the canonicalized receipt content, excluding receipt_hash itself; see §4.12). Two correct implementations using the same canonicalization scheme MUST produce identical receipt_hash values for the same receipt content.

2.9 Capability

A Capability is a dotted-namespace identifier for the kind of Action a Tool can perform. The first segment of the namespace SHOULD identify the system being acted on (e.g., github.*, stripe.*, spring.service.*). Capabilities are not enumerated by the spec; implementations introduce capabilities as needed.

Capability identifiers MUST be lowercase ASCII, dot-separated, and stable across versions of the same Tool. Renaming a capability is a breaking change for any policy referencing it.
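The syntactic rule above can be checked mechanically. The following is a minimal sketch, not part of the reference implementation: the names CAPABILITY_RE, is_valid_capability, and capability_system are hypothetical, and both the per-segment character set (digits, hyphens, underscores) and the requirement of at least one dot are assumptions that go beyond what this section states.

```python
import re

# Assumption: each segment starts with a lowercase letter and may contain
# lowercase letters, digits, hyphens, and underscores. The spec itself only
# requires lowercase ASCII and dot separation.
CAPABILITY_RE = re.compile(r"[a-z][a-z0-9_-]*(?:\.[a-z][a-z0-9_-]*)+")


def is_valid_capability(capability: str) -> bool:
    """Return True if the identifier is lowercase ASCII and dot-separated."""
    return CAPABILITY_RE.fullmatch(capability) is not None


def capability_system(capability: str) -> str:
    """Return the first segment, which SHOULD identify the system acted on."""
    return capability.split(".", 1)[0]
```

A policy engine could run a check like this when a Tool registers its capabilities, so that malformed identifiers are caught before any policy references them.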

3. Controlled Action Lifecycle

Every controlled Action passes through a state machine. Implementations MUST traverse this machine in order; they MAY add internal states but MUST surface the canonical states in the receipt.

3.1 The seven canonical states

State	Description	Emits receipt?
proposed	Agent has decided to invoke a Tool capability that targets a production system. The Action has been formed but not yet evaluated by policy.	No (internal)
policy_evaluated	A named, versioned Policy has produced a `decision` of `allow`, `deny`, `escalate`, or `require-approval`.	No (internal)
awaiting_approval	Policy returned `require-approval`. The Action is paused, awaiting an approver.	No (internal)
approved	An approver granted approval. Approver identity, timestamp, and context are recorded.	No (internal)
denied	Either Policy returned `deny`, OR an approver refused, OR the approval window expired.	Yes (`execution.status = blocked`)
executing	The Action proceeded against the production system.	No (internal)
completed	The Action finished, with `execution.status` in `{success, failure}`.	Yes

3.2 Valid state transitions

proposed → policy_evaluated
policy_evaluated → executing                  (allow)
policy_evaluated → denied                     (deny)
policy_evaluated → awaiting_approval          (require-approval)
policy_evaluated → escalated → denied|approved (escalate; out-of-band human routing)
awaiting_approval → approved → executing
awaiting_approval → denied                    (approver refused OR window expired)
executing → completed

State transitions are one-way: once an Action has left a state, it MUST NOT return to it. There is no executing → awaiting_approval. Re-running an Action after a denial requires a fresh proposed state with a new receipt_id.
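One way to enforce the transitions in §3.2 is a lookup table keyed by the current state. This is an illustrative sketch only; TRANSITIONS, IllegalTransition, and advance are hypothetical names, and a real runtime would attach this check to its own Action object.

```python
# Allowed next states for each lifecycle state, transcribed from §3.2.
TRANSITIONS: dict[str, set[str]] = {
    "proposed": {"policy_evaluated"},
    "policy_evaluated": {"executing", "denied", "awaiting_approval", "escalated"},
    "escalated": {"denied", "approved"},
    "awaiting_approval": {"approved", "denied"},
    "approved": {"executing"},
    "executing": {"completed"},
    "denied": set(),     # terminal: a retry starts a fresh proposed Action
    "completed": set(),  # terminal
}


class IllegalTransition(Exception):
    """Raised when a runtime attempts a transition that §3.2 does not list."""


def advance(current: str, nxt: str) -> str:
    """Return the next state if the transition is allowed, otherwise raise."""
    if nxt not in TRANSITIONS.get(current, set()):
        raise IllegalTransition(f"{current} -> {nxt} is not permitted")
    return nxt
```

Because the table contains no edge that points back to an earlier state, a backward move such as executing → awaiting_approval fails by construction.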

3.3 Receipts MUST be emitted

A receipt MUST be emitted in these cases:

The Action reaches completed (regardless of execution.status).
The Action reaches denied without reaching executing. The receipt records policy.decision accurately (the value that caused the denial) and execution.status = "blocked".

A receipt MUST NOT be emitted while the Action is in transient internal states (proposed, policy_evaluated, awaiting_approval, approved, escalated, executing). The receipt is the final evidence of the Action; intermediate states are runtime concerns.

3.4 Timing constraints

policy.version MUST be the version that decided. If the policy was updated between policy_evaluated and executing, the receipt MUST still record the version that decided.
arguments_hash MUST hash the arguments as they were at policy_evaluated. If arguments differ at executing (e.g., the agent modified them after approval), the runtime MUST either re-run policy evaluation or refuse to execute. A receipt with mismatched arguments at execution time is invalid.
approval.approved_at MUST precede execution.completed_at. A receipt whose approval timestamp is equal to or later than its completion timestamp is invalid.
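The arguments_hash constraint can be enforced with a guard that runs immediately before execution. The sketch below takes the "refuse to execute" branch; a runtime could instead re-run policy evaluation. The helper names are hypothetical, and the canonicalization shown (sorted keys, compact separators) is a simplified stand-in, not a full RFC 8785 implementation.

```python
import hashlib
import json


def arguments_hash(arguments: dict) -> str:
    """SHA-256 over a simplified canonical form (assumption: not full RFC 8785)."""
    canonical = json.dumps(
        arguments, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ArgumentsChanged(Exception):
    """Raised when arguments at executing differ from those at policy_evaluated."""


def guard_execution(evaluated_hash: str, arguments_at_execution: dict) -> None:
    """Refuse to execute if the arguments no longer match what policy evaluated."""
    if arguments_hash(arguments_at_execution) != evaluated_hash:
        raise ArgumentsChanged("arguments changed after policy evaluation")
```

The runtime stores the hash at policy_evaluated and calls guard_execution just before entering executing, so a post-approval edit of the arguments cannot reach the target system.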

3.5 Out-of-band escalation

escalate is the spec’s escape hatch for policies that cannot decide automatically. When a policy returns escalate:

The Action enters escalated state.
The runtime routes the decision to a human reviewer via an out-of-band channel (Slack, email, web UI, ITSM ticket). The channel is implementor’s choice.
The reviewer responds with either approval or denial.
If approval is granted, the receipt SHALL record the approver and proceed as if require-approval had decided originally. The policy.decision field MUST remain escalate — the receipt records the policy’s actual output, not the after-the-fact resolution.
If the reviewer denies the Action or no response arrives within the runtime’s window, the Action transitions to denied and a receipt is emitted with execution.status = "blocked".

The runtime MUST document its escalation window (e.g., “24 hours, then auto-deny”) in operator documentation. The receipt records what actually happened, not the policy.
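The escalation rules above can be sketched as a small resolution function. Everything here is hypothetical: the 24-hour ESCALATION_WINDOW mirrors the example in the text, the reviewer_decision values "approve" and "deny" are assumed labels, and the handling of an approval that arrives after the window is left out because this section does not specify it.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical window, matching the "24 hours, then auto-deny" example.
ESCALATION_WINDOW = timedelta(hours=24)


def resolve_escalation(
    escalated_at: datetime, reviewer_decision: Optional[str], now: datetime
) -> dict:
    """Return the recorded policy decision and the next lifecycle state."""
    timed_out = reviewer_decision is None and now - escalated_at >= ESCALATION_WINDOW
    if reviewer_decision == "approve":
        next_state = "approved"
    elif reviewer_decision == "deny" or timed_out:
        next_state = "denied"
    else:
        next_state = "escalated"  # still waiting inside the window
    # policy.decision stays "escalate" no matter how the review resolved.
    return {"policy_decision": "escalate", "next_state": next_state}
```

Note that the returned policy_decision never changes: the receipt keeps the policy's actual output even when a reviewer later approves.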

4. Action Receipt requirements

This section describes each field of an Action Receipt in prose. The JSON Schema at docs/schemas/action-receipt-v0.1.json is the normative source for syntactic constraints. This section is the normative source for semantic intent.

A receipt is a JSON object with the following fields. Fields are required unless explicitly marked OPTIONAL. The object MUST NOT contain additional top-level fields beyond those listed; the schema enforces this via additionalProperties: false.
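For orientation before the field-by-field descriptions, here is a sketch of what a receipt could look like, built as a Python dictionary and serialized to JSON. Every value is invented for illustration (identifiers, timestamps, policy name, version strings), the two hashes are placeholders rather than real digests, and the JSON Schema remains the authority on exact field syntax.

```python
import json

PLACEHOLDER_HASH = "0" * 64  # stands in for a real lowercase hex SHA-256 digest

# Illustrative receipt for an approved refund; all values are hypothetical.
receipt = {
    "version": "agentboundary/v0.1",
    "receipt_id": "018f3c2e-7b1a-7c3d-9e4f-2a6b8c0d1e2f",  # UUID v7-shaped example
    "issued_at": "2025-01-15T10:32:07Z",
    "actor": {"type": "agent", "id": "agent:refund-bot", "display_name": "Refund Bot"},
    "agent": {"framework": "langgraph", "framework_version": "0.0.0", "model": "claude-opus-4-7"},
    "tool": {"name": "stripe-mcp", "capability": "stripe.refund"},
    "target": {"system": "stripe.acme-prod", "environment": "prod", "resource_id": "ch_example"},
    "arguments_hash": PLACEHOLDER_HASH,
    "policy": {
        "name": "acme.refunds.require-approval-over-500",
        "version": "3",
        "decision": "require-approval",
    },
    "approval": {
        "approver": {"id": "user:jdoe", "display_name": "J. Doe", "role": "finance-lead"},
        "approved_at": "2025-01-15T10:31:40Z",
        "context": "Customer was double-charged.",
    },
    "execution": {
        "status": "success",
        "completed_at": "2025-01-15T10:32:05Z",
        "result_ref": "re_example",
    },
    "receipt_hash": PLACEHOLDER_HASH,
}

receipt_json = json.dumps(receipt, indent=2)
```

In this example approved_at precedes completed_at, and issued_at falls shortly after completion, which matches the timing guidance in §3.4 and §4.3.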

4.1 `version`

A string literal: "agentboundary/v0.1". Any other value indicates a different spec version (forward-compatible) or a malformed receipt (backward-incompatible).

Implementations MUST reject receipts whose version does not match a spec they implement.

4.2 `receipt_id`

A globally unique identifier for this receipt. MUST be a valid UUID (RFC 4122). UUID v7 (timestamp-prefixed) is RECOMMENDED because it permits ordering by issuance without an external timestamp.

receipt_id MUST be regenerated for every Action; an Action retried after a denial generates a fresh receipt_id.

4.3 `issued_at`

The RFC 3339 timestamp when the receipt was generated. MUST be expressed in UTC (Z suffix) or with an explicit offset.

issued_at SHOULD be close to but distinct from execution.completed_at. Implementations MAY emit the receipt asynchronously after completion; in that case issued_at MAY be later than execution.completed_at.

4.4 `actor`

The entity that initiated the Action. See §2.2 Actor.

actor.type ∈ {human, system, agent}
actor.id MUST be a stable identifier (not a session token)
actor.display_name is OPTIONAL but RECOMMENDED for audit readability

4.5 `agent`

The AI software stack that produced the Action. See §2.3 Agent.

agent.framework and agent.framework_version MUST be present
agent.model MUST be present and identify the LLM that decided to invoke the tool
agent.model_version is OPTIONAL but RECOMMENDED for reproducibility (the same model name can refer to different model versions over time)

4.6 `tool`

The named capability invoked. See §2.4 Tool.

tool.name MUST be the implementation-level tool identifier
tool.capability MUST be the dotted-namespace capability identifier (e.g., github.merge)
tool.version is OPTIONAL but RECOMMENDED

4.7 `target`

The production system the Action affected. See §2.5 Target.

target.system MUST be a stable system identifier
target.environment MUST be prod, staging, or dev
target.resource_id is OPTIONAL but RECOMMENDED

4.8 `arguments_hash`

A lowercase hex SHA-256 (64 characters) of the canonicalized Action arguments.

Canonicalization is critical for portability. Two correct implementations MUST produce the same arguments_hash for the same logical arguments. v0.1 RECOMMENDS JSON Canonicalization Scheme (RFC 8785) but does not mandate a single algorithm — the implementation MUST document its canonicalization rules. v0.2 SHALL mandate a single algorithm based on v0.1 deployment experience.

arguments_hash proves the executed arguments match what was evaluated by policy. A receipt where arguments_hash differs from the actual canonicalized arguments at execution time is invalid.
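To see why canonicalization matters, consider the same arguments supplied with different key orders. The sketch below uses Python's json.dumps with sorted keys and compact separators as a simplified stand-in for canonicalization; it is an assumption for illustration and does not implement all of RFC 8785 (number and string serialization rules in particular differ). The argument names are invented.

```python
import hashlib
import json


def canonicalize(value) -> bytes:
    """Simplified canonical form: sorted keys, no insignificant whitespace, UTF-8."""
    return json.dumps(
        value, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def arguments_hash(arguments: dict) -> str:
    """Lowercase hex SHA-256 (64 characters) of the canonicalized arguments."""
    return hashlib.sha256(canonicalize(arguments)).hexdigest()


# Same logical arguments, different key order: the hashes are identical.
a = arguments_hash({"charge_id": "ch_example", "amount": 4200})
b = arguments_hash({"amount": 4200, "charge_id": "ch_example"})
```

Without the sorting step, the two dictionaries could serialize to different byte strings and produce different hashes, which is the portability failure this section warns about.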

4.9 `policy`

The named, versioned policy that decided. See §2.6 Policy.

policy.name MUST be a stable, dot-namespaced identifier
policy.version MUST be the version that decided (even if the policy was updated between decision and execution)
policy.decision MUST be one of {allow, deny, escalate, require-approval}

4.10 `approval` (conditional)

REQUIRED when policy.decision == "require-approval" AND the Action proceeded (i.e., execution.status != "blocked").

When present:

approval.approver.id MUST be a stable identifier for the approver
approval.approved_at MUST be a valid RFC 3339 timestamp that precedes execution.completed_at
approval.context is OPTIONAL free-text justification

When NOT present:

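The receipt MUST NOT contain an approval block when policy.decision is deny. When policy.decision is escalate and a reviewer approved the Action, the receipt records the approver (§3.5). An approval block on an allow decision is permitted but discouraged in v0.1 (§7.1).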
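The REQUIRED condition and the "when present" rules in this section can be sketched as a validator that returns a list of problems. The function names are hypothetical. The timestamp parsing is a simplification: datetime.fromisoformat only accepts a trailing "Z" from Python 3.11 onward, so the sketch normalizes it, and it is not a complete RFC 3339 parser.

```python
from datetime import datetime


def parse_timestamp(ts: str) -> datetime:
    """Parse a timestamp; normalizes "Z" for Python versions before 3.11."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def approval_errors(receipt: dict) -> list[str]:
    """Check approval presence and ordering for one receipt (illustrative only)."""
    errors = []
    decision = receipt["policy"]["decision"]
    status = receipt["execution"]["status"]
    approval = receipt.get("approval")

    # Approval is REQUIRED when approval was demanded and the Action proceeded.
    if decision == "require-approval" and status != "blocked" and approval is None:
        errors.append("approval block is required")

    if approval is not None:
        if not approval.get("approver", {}).get("id"):
            errors.append("approval.approver.id is missing")
        approved_at = parse_timestamp(approval["approved_at"])
        completed_at = parse_timestamp(receipt["execution"]["completed_at"])
        if not approved_at < completed_at:
            errors.append("approval.approved_at must precede execution.completed_at")
    return errors
```

Returning every problem found, rather than stopping at the first, gives an auditor a complete picture of why a receipt fails.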

4.11 `execution`

The outcome.

execution.status MUST be one of {success, failure, blocked}:
- success — Action completed and produced the intended effect on the target system
- failure — Action attempted but the target system rejected, errored, or partially completed
- blocked — Action did NOT proceed; receipt records the policy/approval state that prevented it
execution.completed_at MUST be a valid RFC 3339 timestamp
execution.error_code is OPTIONAL; if present, SHOULD be an implementation-defined error code for failure or blocked cases
execution.result_ref is OPTIONAL; if present, SHOULD be a system-specific reference to the result (e.g., a Stripe refund ID, a GitHub commit SHA, a claim revision identifier)

4.12 `receipt_hash`

A lowercase hex SHA-256 of the canonicalized receipt content excluding the receipt_hash field itself.

This is the tamper-evidence anchor. Any modification to any other field invalidates the hash. An auditor verifies a receipt by:

Computing SHA-256 of the canonicalized receipt with receipt_hash removed
Comparing against the value in receipt_hash

If the values match, the receipt has not been tampered with since emission. If they differ, the receipt is invalid.

The canonicalization rules MUST be the same as for arguments_hash (§4.8).
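The two-step verification procedure can be sketched as follows. The function names are hypothetical, and the canonicalization (sorted keys, compact separators) is again a simplified stand-in for whatever scheme the emitting implementation documents, not a full RFC 8785 implementation. A verifier must use the emitter's documented rules for the comparison to be meaningful.

```python
import hashlib
import json


def canonicalize(value) -> bytes:
    """Simplified canonical form (assumption): sorted keys, compact, UTF-8."""
    return json.dumps(
        value, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def compute_receipt_hash(receipt: dict) -> str:
    """Hash the receipt content with the receipt_hash field removed."""
    body = {key: value for key, value in receipt.items() if key != "receipt_hash"}
    return hashlib.sha256(canonicalize(body)).hexdigest()


def verify_receipt(receipt: dict) -> bool:
    """Return True if the stored receipt_hash matches the recomputed value."""
    return receipt.get("receipt_hash") == compute_receipt_hash(receipt)
```

An emitter would call compute_receipt_hash as its final step and store the result in receipt_hash; an auditor later calls verify_receipt on the stored object. A match shows the content is unchanged, but a bare hash does not show who produced the receipt, which is why signing appears on the v0.2 roadmap.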

5. Conformance Levels

Implementations claim a numeric conformance level. Each level builds on the previous; an implementation claiming Level 3 MUST also satisfy Levels 1 and 2.

5.1 Level 1 — Logged

The implementation produces an Action Receipt that validates against the v0.1 JSON Schema for every Action that reaches the production action boundary.

Level 1 requires:

Every Action emits a receipt
Receipts are syntactically valid (pass JSON Schema validation including format checkers for UUID + RFC 3339)
Receipts are retained for at least 30 days (implementer’s storage choice)

Level 1 does NOT require: tamper resistance, policy binding, approval evidence, or external verifiability.

5.2 Level 2 — Policy-Bound

Everything in Level 1, plus: every receipt records the named, versioned policy that decided. Approval evidence is present where required.

Level 2 requires:

policy.name, policy.version, policy.decision accurately record the policy that decided
The named, versioned policy is independently inspectable (an auditor can look up policy.name + policy.version and read its text)
When policy.decision == "require-approval" AND the Action proceeded, the receipt includes an approval block

Level 2 does NOT require: receipt hash integrity, replay defense, argument-mutation defense.

5.3 Level 3 — Portable Proof

Everything in Level 2, plus: every receipt is independently verifiable via arguments_hash and receipt_hash.

Level 3 requires:

arguments_hash correctly hashes the canonicalized arguments at policy-evaluation time
receipt_hash correctly hashes the canonicalized receipt content excluding receipt_hash
The canonicalization scheme is documented and reproducible (an auditor can recompute both hashes from the receipt’s other fields plus access to the canonicalization rules)

Level 3 does NOT require: replay defense, argument-mutation defense at execution time, approver chain verification.

5.4 Level 4 — Tamper-Evident

Everything in Level 3, plus: the implementation defends against the receipt-forgery attacks documented in threat-model.md.

Level 4 requires:

Receipts MUST be rejected when an Action’s arguments at executing differ from arguments at policy_evaluated (mutation defense)
Approval-bearing receipts MUST be rejected when the same approval token is reused for a different Action (replay defense)
Receipts MUST be rejected when approval.approver.id is not authorized by policy to approve actions of this capability (unauthorized-approver defense)
Receipts MUST be rejected when the policy referenced by policy.name + policy.version does not exist in the implementation’s policy store (policy-downgrade defense)
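Two of the Level 4 checks, replay defense and unauthorized-approver defense, can be sketched with in-memory state. This is a hypothetical illustration: the class name, the approval_token parameter (the receipt fields described in §4 do not carry a token), and the capability-to-approvers mapping are all assumptions, and a production implementation would need durable, concurrency-safe storage.

```python
class Level4Verifier:
    """Illustrative replay and unauthorized-approver checks (not normative)."""

    def __init__(self, authorized_approvers: dict[str, set[str]]) -> None:
        # Hypothetical policy data: capability -> approver ids allowed to approve it.
        self._authorized = authorized_approvers
        # Remembers which receipt first used each approval token.
        self._token_owner: dict[str, str] = {}

    def rejection_reasons(self, receipt: dict, approval_token: str) -> list[str]:
        """Return the reasons to reject this approval-bearing receipt, if any."""
        reasons = []
        capability = receipt["tool"]["capability"]
        approver_id = receipt["approval"]["approver"]["id"]
        if approver_id not in self._authorized.get(capability, set()):
            reasons.append("unauthorized approver")
        first_use = self._token_owner.setdefault(approval_token, receipt["receipt_id"])
        if first_use != receipt["receipt_id"]:
            reasons.append("approval token replayed")
        return reasons
```

Binding each token to the first receipt_id that presents it means a second Action cannot reuse an approval that was granted for a different one.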

Level 4 is the target for production-grade compliance evidence.

5.5 Declaring a conformance level

An implementation declares its conformance level in its documentation, in the format:

“<implementation> conforms to AgentBoundary v0.1 at Level N for actions of capability namespace <namespace>.”

An implementation MAY claim different levels for different capability namespaces (e.g., Level 3 for github.*, Level 1 for stripe.* during partial rollout).

Conformance MAY be independently tested via the conformance suite shipped in the AgentBoundary repository at scenarios/. Each scenario file declares a setup, an attempted action, and expected outcomes. Run with npx agentboundary run scenarios/ or uvx agentboundary run scenarios/.

6. Versioning

AgentBoundary uses semantic versioning for the receipt format. The version field in every receipt encodes the spec version it claims to conform to.

Patch versions (v0.1.x) — editorial corrections, clarifications, no semantic change. Implementations claiming v0.1 SHALL accept any v0.1.x receipt.
Minor versions (v0.2) — backward-compatible additions (new OPTIONAL fields, new enum values for non-required fields, new conformance levels). Implementations claiming an older minor SHOULD accept newer minor receipts that don’t use new required fields.
Major versions (v1.0, v2.0) — breaking changes (changed semantics, new required fields, removed fields). Implementations MUST NOT accept receipts of a different major version unless they explicitly support multiple majors.

6.1 Forward and backward compatibility

A v0.1 implementation:

MUST accept v0.1 receipts (obvious)
MAY accept v0.2 receipts (forward compatible if no required new fields)
MUST NOT accept v1.0+ receipts (incompatible major)
MUST NOT accept “v0.1” receipts that fail JSON Schema validation (a string version literal doesn’t excuse invalid syntax)
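The acceptance rules for a v0.1 implementation can be sketched as a version gate that runs before schema validation. The names are hypothetical, and the sketch assumes that later versions keep the "agentboundary/vMAJOR.MINOR" literal pattern used by v0.1. The accept_newer_minor flag models the MAY in the list above; passing this gate does not replace JSON Schema validation.

```python
import re

# Assumption: future versions follow the same literal pattern as v0.1.
VERSION_RE = re.compile(r"agentboundary/v(\d+)\.(\d+)")


def accepts_version(version: str, accept_newer_minor: bool = False) -> bool:
    """Decide whether a v0.1 implementation may process a receipt's version."""
    match = VERSION_RE.fullmatch(version)
    if match is None:
        return False  # malformed version literal
    major, minor = int(match.group(1)), int(match.group(2))
    if major != 0:
        return False  # MUST NOT accept v1.0+ receipts
    if minor == 1:
        return True   # MUST accept v0.1 receipts
    if minor > 1:
        return accept_newer_minor  # MAY accept newer minors such as v0.2
    return False
```

Keeping the optional behavior behind an explicit flag makes the implementation's documented stance on newer minor versions easy to audit.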

6.2 v0.2 roadmap

The following are explicitly out of scope for v0.1 and will be addressed in v0.2:

Mandatory canonicalization scheme (RFC 8785 candidate)
Delegation chains for A2A (agent-to-agent action forwarding)
Cryptographic receipt signing (beyond SHA-256 hashing)
Receipt linkage (parent/child receipts for compound actions)
Standard MCP elicitation → AgentBoundary receipt mapping
Cyber-insurance evidence-bundle profile

7. Open questions for v0.2

The following design questions are unresolved in v0.1 and will be addressed in v0.2 based on deployment feedback. Implementers SHOULD document their position on each.

7.1 Is `approval` permitted on `allow` decisions?

v0.1 currently permits a receipt with policy.decision == "allow" AND an approval block (the JSON Schema does not forbid it; only require-approval requires approval).

Argument for permitting: an operator may manually approve an Action that policy would have auto-approved (defense in depth). The receipt should record the human acknowledgment.

Argument against permitting: allowing optional approval on allow decisions invites confusion — a reader sees approval and infers policy.decision was require-approval. This undermines the receipt’s clarity.

v0.1 position: permitted but discouraged. v0.2 may forbid or formalize.

7.2 What’s the canonical canonicalization?

v0.1 RECOMMENDS RFC 8785 (JSON Canonicalization Scheme) for arguments_hash and receipt_hash but does not mandate it. Different implementations may produce different hashes for the same logical content, making cross-implementation verification harder.

v0.2 candidate: mandate RFC 8785 unless a stronger candidate emerges from deployment.

End of v0.1 spec text. The threat model and OWASP mapping live in threat-model.md and owasp-mapping.md respectively.