AgentBoundary v0.1 Specification

Status: Draft. v0.1 is the first published spec; v0.2 will fold in feedback from the 90-day organic-distribution window. The JSON Schema at docs/schemas/action-receipt-v0.1.json is the normative source for receipt field syntax. This document is the normative source for everything else: lifecycle, conformance, and semantic intent.

License: This specification is licensed under Apache License 2.0. The reference implementation (Python package agentboundary) is also Apache 2.0.

Table of Contents

  1. Introduction
  2. Definitions
  3. Controlled Action Lifecycle
  4. Action Receipt requirements
  5. Conformance Levels
  6. Versioning
  7. Open questions for v0.2

1. Introduction

AgentBoundary is an open specification for portable, tamper-evident proof of AI-initiated production actions. Its purpose is to let any party — an internal auditor, an external compliance reviewer, a regulator, an insurer, a customer — verify what an AI agent was allowed to do in production, without trusting the agent’s framework or model provider.

1.1 What this spec defines

1.2 What this spec does NOT define

1.3 Who this spec is for

Two audiences:

  1. Engineers building agent runtimes, governance toolkits, or compliance bundles. This spec tells you what your system must do to produce receipts that downstream consumers can rely on.
  2. Security, compliance, audit, and risk owners evaluating agent governance tools. This spec tells you what to demand from any tool claiming to “govern” agent actions, and how to verify the claim independently.

1.4 Relationship to other agent governance work

AgentBoundary is complementary to, not competitive with:

AgentBoundary is about the artifact, not about the runtime. Any runtime that emits compliant Action Receipts contributes to a shared evidence base auditors can rely on.

1.5 Document conventions

2. Definitions

This section defines the terms used throughout the specification. These definitions are normative.

2.1 Action

An Action is a request to mutate a production system, originated by an AI agent or a human-via-AI workflow. Actions are the unit of governance.

Examples of actions:

A read-only operation (e.g., listing PRs, querying a database, retrieving a customer record) is NOT an Action under this spec — read-only operations do not require Action Receipts.

2.2 Actor

The Actor is the entity that initiated the Action. Three values are defined:

The Actor is the entity to whom causation flows back. For an agent run that a human triggered, the Actor SHOULD be agent if the agent ran autonomously after the trigger, or human if the human individually authorized every Action. Implementations MUST document their convention.

2.3 Agent

The Agent is the AI software stack that produced the Action. Recorded as {framework, framework_version, model, model_version?}. The framework is the agent runtime (e.g., claude-code, spring-ai, openai-agents-sdk, langgraph). The model is the underlying LLM (e.g., claude-opus-4-7, gpt-5.5, gemini-2.5-pro).

For multi-model agents (e.g., a router model + an executor model), the receipt records the model that decided to invoke the tool, not every model in the call chain. The full call chain MAY be recorded out-of-band; the receipt is concerned with attribution at the action boundary.

2.4 Tool

The Tool is the named capability the Agent invoked to perform the Action. Recorded as {name, version?, capability}.

A given name MAY support multiple capability values (e.g., github-mcp supports github.merge, github.comment, github.close). Each Action emits one receipt with one capability.

2.5 Target

The Target is the production system the Action affects. Recorded as {system, environment, resource_id?}.

2.6 Policy

A Policy is the named, versioned rule that decided whether the Action was allowed. Recorded as {name, version, decision}.
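The record shapes from §2.3 through §2.6 can be sketched as plain data classes. This is a minimal illustration, not the reference implementation's types: the class names, the helper to_json_dict, the sample values, and the choice to omit unset optional fields (rather than emit null) are all assumptions. The JSON Schema remains the normative source for field syntax.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class Agent:
    framework: str
    framework_version: str
    model: str
    model_version: Optional[str] = None  # marked optional ("?") in §2.3


@dataclass(frozen=True)
class Tool:
    name: str
    capability: str
    version: Optional[str] = None  # marked optional ("?") in §2.4


@dataclass(frozen=True)
class Target:
    system: str
    environment: str
    resource_id: Optional[str] = None  # marked optional ("?") in §2.5


@dataclass(frozen=True)
class Policy:
    name: str
    version: str
    decision: str  # allow, deny, escalate, or require-approval (§3.1)


def to_json_dict(record) -> dict:
    """Convert a record to a dict, dropping unset optional fields.

    Assumption: omitting an unset optional field is acceptable to the schema.
    """
    return {key: value for key, value in asdict(record).items() if value is not None}


# Illustrative values only; the version string is a placeholder.
agent = Agent(framework="langgraph", framework_version="0.0.0", model="gpt-5.5")
tool = Tool(name="github-mcp", capability="github.merge")
```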

2.7 Approval

When policy.decision == "require-approval" AND the Action proceeded, the receipt MUST include an Approval block: {approver, approved_at, context?}.

The Approval block is the spec’s tamper-resistance hinge. Without it, an Action requiring approval has no evidence that approval was granted — and the receipt is invalid.
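The approval rule above can be checked mechanically. The sketch below assumes the receipt has already been parsed into Python values; the function name approval_is_consistent is hypothetical, and "proceeded" is read as execution.status != "blocked", as in §4.10.

```python
from typing import Optional


def approval_is_consistent(decision: str, execution_status: str,
                           approval: Optional[dict]) -> bool:
    """Return False when an Action that required approval proceeded without evidence."""
    proceeded = execution_status != "blocked"
    if decision == "require-approval" and proceeded:
        if approval is None:
            return False
        # approver and approved_at are required; context is optional ("?").
        return all(approval.get(key) for key in ("approver", "approved_at"))
    # No Approval block is demanded in any other case.
    return True
```

A receipt for which this check returns False is invalid under §2.7, regardless of whether its hashes verify.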

2.8 Receipt

An Action Receipt is the JSON object that documents one Action. It MUST conform to the schema at docs/schemas/action-receipt-v0.1.json. It MUST be emitted for every Action that reached the production action boundary, including denied and escalated Actions.

Receipts are content-addressable via receipt_hash (SHA-256 over the canonicalized receipt content, excluding the receipt_hash field itself; see §4.12). Two correct implementations MUST produce identical receipt_hash values for the same receipt content.

2.9 Capability

A Capability is a dotted-namespace identifier for the kind of Action a Tool can perform. The first segment of the namespace SHOULD identify the system being acted on (e.g., github.*, stripe.*, spring.service.*). Capabilities are not enumerated by the spec; implementations introduce capabilities as needed.

Capability identifiers MUST be lowercase ASCII, dot-separated, and stable across versions of the same Tool. Renaming a capability is a breaking change for any policy referencing it.
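A syntactic check for capability identifiers might look like the following. The spec only requires lowercase ASCII and dot separation, so the exact per-segment character set and the requirement of at least two segments are assumptions of this sketch, as is the function name.

```python
import re

# Assumption: each segment starts with a lowercase letter and may contain
# lowercase letters, digits, "_" or "-"; at least two segments are required.
_CAPABILITY_RE = re.compile(r"[a-z][a-z0-9_-]*(\.[a-z][a-z0-9_-]*)+")


def is_valid_capability(capability: str) -> bool:
    """Check that a capability is lowercase ASCII and dot-separated."""
    return _CAPABILITY_RE.fullmatch(capability) is not None
```

Stability across Tool versions cannot be checked from a single identifier; that part of the requirement is a release-process concern.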

3. Controlled Action Lifecycle

Every controlled Action passes through a state machine. Implementations MUST traverse this machine in order; they MAY add internal states but MUST surface the canonical states in the receipt.

3.1 The seven canonical states

| State | Description | Emits receipt? |
|---|---|---|
| proposed | Agent has decided to invoke a Tool capability that targets a production system. The Action has been formed but not yet evaluated by policy. | No (internal) |
| policy_evaluated | A named, versioned Policy has produced a decision of allow, deny, escalate, or require-approval. | No (internal) |
| awaiting_approval | Policy returned require-approval. The Action is paused, awaiting an approver. | No (internal) |
| approved | An approver granted approval. Approver identity, timestamp, and context are recorded. | No (internal) |
| denied | Either Policy returned deny, OR an approver refused, OR the approval window expired. | Yes (execution.status = blocked) |
| executing | The Action proceeded against the production system. | No (internal) |
| completed | The Action finished, with execution.status in {success, failure}. | Yes |

3.2 Valid state transitions

proposed → policy_evaluated
policy_evaluated → executing                  (allow)
policy_evaluated → denied                     (deny)
policy_evaluated → awaiting_approval          (require-approval)
policy_evaluated → escalated → denied|approved (escalate; out-of-band human routing)
awaiting_approval → approved → executing
awaiting_approval → denied                    (approver refused OR window expired)
executing → completed

Once a state has been entered, it MUST NOT be exited backwards. There is no executing → awaiting_approval. Re-running an Action after a denial requires a fresh proposed state with a new receipt_id.
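The transitions in §3.2 can be encoded as a lookup table and enforced at every step. This is a sketch: the names ALLOWED_TRANSITIONS, InvalidTransition, advance, and is_valid_path are hypothetical, and the escalated state is included because it appears in the transition list above.

```python
ALLOWED_TRANSITIONS = {
    "proposed": {"policy_evaluated"},
    "policy_evaluated": {"executing", "denied", "awaiting_approval", "escalated"},
    "escalated": {"denied", "approved"},
    "awaiting_approval": {"approved", "denied"},
    "approved": {"executing"},
    "executing": {"completed"},
    # Terminal states: a retry starts over at "proposed" with a new receipt_id.
    "denied": set(),
    "completed": set(),
}


class InvalidTransition(Exception):
    """Raised when a transition is not listed in §3.2."""


def advance(current: str, nxt: str) -> str:
    """Return the next state, or raise if the move is not permitted."""
    if nxt not in ALLOWED_TRANSITIONS.get(current, set()):
        raise InvalidTransition(f"{current} -> {nxt} is not permitted")
    return nxt


def is_valid_path(states: list) -> bool:
    """Check that a full sequence starts at 'proposed' and only moves forward."""
    if not states or states[0] != "proposed":
        return False
    current = states[0]
    try:
        for nxt in states[1:]:
            current = advance(current, nxt)
    except InvalidTransition:
        return False
    return True
```

Because no state lists an earlier state as a successor, backward moves such as executing → awaiting_approval are rejected by construction.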

3.3 Receipts MUST be emitted

A receipt MUST be emitted in these cases:

  1. The Action reaches completed (regardless of execution.status).
  2. The Action reaches denied without reaching executing. The receipt records policy.decision accurately (the value that caused the denial) and execution.status = "blocked".

A receipt MUST NOT be emitted while the Action is in transient internal states (proposed, policy_evaluated, awaiting_approval, approved, escalated, executing). The receipt is the final evidence of the Action; intermediate states are runtime concerns.

3.4 Timing constraints

3.5 Out-of-band escalation

escalate is the spec’s escape hatch for policies that cannot decide automatically. When a policy returns escalate:

The runtime MUST document its escalation window (e.g., “24 hours, then auto-deny”) in operator documentation. The receipt records the outcome that actually occurred, not the configured escalation policy.

4. Action Receipt requirements

This section describes each field of an Action Receipt in prose. The JSON Schema at docs/schemas/action-receipt-v0.1.json is the normative source for syntactic constraints. This section is the normative source for semantic intent.

A receipt is a JSON object with the following fields. Fields are required unless explicitly marked OPTIONAL. The object MUST NOT contain additional top-level fields beyond those listed; the schema enforces this via additionalProperties: false.
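As a rough pre-check before full schema validation, the top-level field set can be compared against the fields named in §4.1 through §4.12. This sketch is not a substitute for validating against the JSON Schema; the function name and the message strings are hypothetical, and approval is treated as optional here because it is conditional (§4.10).

```python
REQUIRED_FIELDS = {
    "version", "receipt_id", "issued_at", "actor", "agent", "tool",
    "target", "arguments_hash", "policy", "execution", "receipt_hash",
}
OPTIONAL_FIELDS = {"approval"}  # conditional; see §4.10


def check_top_level_fields(receipt: dict) -> list:
    """Return a list of problems with the receipt's top-level keys (empty if none)."""
    keys = set(receipt)
    problems = []
    for name in sorted(REQUIRED_FIELDS - keys):
        problems.append(f"missing required field: {name}")
    # Mirrors additionalProperties: false at the top level only.
    for name in sorted(keys - REQUIRED_FIELDS - OPTIONAL_FIELDS):
        problems.append(f"unexpected top-level field: {name}")
    return problems
```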

4.1 version

A string literal: "agentboundary/v0.1". Any other value indicates a different spec version (forward-compatible) or a malformed receipt (backward-incompatible).

Implementations MUST reject receipts whose version does not match a spec they implement.
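The rejection rule can be sketched as a guard that runs before any other processing. SUPPORTED_VERSIONS, UnsupportedVersion, and require_supported_version are hypothetical names; an implementation that supports more than one spec version would list each one.

```python
SUPPORTED_VERSIONS = frozenset({"agentboundary/v0.1"})


class UnsupportedVersion(ValueError):
    """Raised for receipts whose version this implementation does not implement."""


def require_supported_version(receipt: dict) -> None:
    """Reject the receipt unless its version field is exactly a supported literal."""
    version = receipt.get("version")
    if not isinstance(version, str) or version not in SUPPORTED_VERSIONS:
        raise UnsupportedVersion(f"unsupported receipt version: {version!r}")
```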

4.2 receipt_id

A globally unique identifier for this receipt. MUST be a valid UUID (RFC 4122). UUID v7 (timestamp-prefixed) is RECOMMENDED because it permits ordering by issuance without an external timestamp.

receipt_id MUST be regenerated for every Action; an Action retried after a denial generates a fresh receipt_id.
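One way to generate and validate receipt_id values with the Python standard library is sketched below. It assumes uuid.uuid7 is used where the running Python version provides it (recent releases only) and falls back to random UUID v4 otherwise, which is still a valid UUID but loses the ordering property. The function names are hypothetical.

```python
import uuid


def new_receipt_id() -> str:
    """Return a fresh UUID string; call once per Action, including retries."""
    # uuid.uuid7 exists only in newer Python releases; fall back to uuid4.
    generator = getattr(uuid, "uuid7", uuid.uuid4)
    return str(generator())


def is_valid_receipt_id(value) -> bool:
    """Accept only the canonical hyphenated form of a UUID."""
    try:
        # uuid.UUID is lenient (braces, urn: prefix), so compare the round trip.
        return str(uuid.UUID(value)) == value.lower()
    except (ValueError, AttributeError, TypeError):
        return False
```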

4.3 issued_at

The RFC 3339 timestamp when the receipt was generated. MUST be expressed in UTC (Z suffix) or with an explicit offset.

issued_at SHOULD be close to but distinct from execution.completed_at. Implementations MAY emit the receipt asynchronously after completion; in that case issued_at MAY be later than execution.completed_at.
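A minimal sketch of producing and checking issued_at follows. The function names are hypothetical, and datetime.fromisoformat is not a complete RFC 3339 parser, so the check below is an approximation that mainly confirms an explicit offset is present.

```python
from datetime import datetime, timezone


def now_rfc3339_utc() -> str:
    """Return the current time in UTC with a Z suffix, e.g. for issued_at."""
    stamp = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
    return stamp.replace("+00:00", "Z")


def has_explicit_offset(timestamp: str) -> bool:
    """Return True if the timestamp parses and carries Z or a numeric offset."""
    # Older Python versions do not accept a trailing "Z", so normalise it first.
    normalised = timestamp[:-1] + "+00:00" if timestamp.endswith("Z") else timestamp
    try:
        parsed = datetime.fromisoformat(normalised)
    except ValueError:
        return False
    return parsed.tzinfo is not None
```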

4.4 actor

The entity that initiated the Action. See §2.2 Actor.

4.5 agent

The AI software stack that produced the Action. See §2.3 Agent.

4.6 tool

The named capability invoked. See §2.4 Tool.

4.7 target

The production system the Action affected. See §2.5 Target.

4.8 arguments_hash

A lowercase hex SHA-256 (64 characters) of the canonicalized Action arguments.

Canonicalization is critical for portability. Two correct implementations MUST produce the same arguments_hash for the same logical arguments. v0.1 RECOMMENDS JSON Canonicalization Scheme (RFC 8785) but does not mandate a single algorithm — the implementation MUST document its canonicalization rules. v0.2 SHALL mandate a single algorithm based on v0.1 deployment experience.

arguments_hash proves the executed arguments match what was evaluated by policy. A receipt where arguments_hash differs from the actual canonicalized arguments at execution time is invalid.
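The hash computation can be sketched with the standard library. Note the loud assumption: json.dumps with sorted keys and compact separators only approximates RFC 8785. It agrees for simple objects of strings, integers, booleans, and null, but differs in number formatting and some key-ordering edge cases, so a production implementation would use a real JCS library and document that choice. The names canonicalize and arguments_hash (as a function) are hypothetical.

```python
import hashlib
import json


def canonicalize(value) -> bytes:
    """Approximate canonical form: sorted keys, no extra whitespace, UTF-8.

    Assumption: this is NOT a full RFC 8785 implementation.
    """
    text = json.dumps(value, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False, allow_nan=False)
    return text.encode("utf-8")


def arguments_hash(arguments: dict) -> str:
    """Return the lowercase hex SHA-256 (64 characters) of the canonical arguments."""
    return hashlib.sha256(canonicalize(arguments)).hexdigest()
```

Computing the hash once at policy evaluation and again immediately before execution, then comparing the two, is one way to detect arguments that changed in between.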

4.9 policy

The named, versioned policy that decided. See §2.6 Policy.

4.10 approval (conditional)

REQUIRED when policy.decision == "require-approval" AND the Action proceeded (i.e., execution.status != "blocked").

When present:

When NOT present:

4.11 execution

The outcome.

4.12 receipt_hash

A lowercase hex SHA-256 of the canonicalized receipt content excluding the receipt_hash field itself.

This is the tamper-evidence anchor. Any modification to any other field invalidates the hash. An auditor verifies a receipt by:

  1. Computing SHA-256 of the canonicalized receipt with receipt_hash removed
  2. Comparing against the value in receipt_hash

If the values match, the receipt has not been tampered with since emission. If they differ, the receipt is invalid.

The canonicalization rules MUST be the same as for arguments_hash (§4.8).
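The two verification steps above can be sketched as follows. As with arguments_hash, the canonical form here is a standard-library approximation of RFC 8785 and is an assumption of this example, as are the function names.

```python
import hashlib
import json


def _canonical_bytes(value) -> bytes:
    # Assumption: approximates RFC 8785 (sorted keys, compact, UTF-8) only.
    return json.dumps(value, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False, allow_nan=False).encode("utf-8")


def compute_receipt_hash(receipt: dict) -> str:
    """Step 1: hash the canonicalized receipt with receipt_hash removed."""
    body = {key: value for key, value in receipt.items() if key != "receipt_hash"}
    return hashlib.sha256(_canonical_bytes(body)).hexdigest()


def verify_receipt_hash(receipt: dict) -> bool:
    """Step 2: compare the recomputed value against the stored receipt_hash."""
    claimed = receipt.get("receipt_hash")
    return isinstance(claimed, str) and claimed == compute_receipt_hash(receipt)
```

A verifier needs to know which canonicalization rules the emitter documented; with different rules, an untampered receipt would fail this check.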

5. Conformance Levels

Implementations claim a numeric conformance level. Each level builds on the previous; an implementation claiming Level 3 MUST also satisfy Levels 1 and 2.

5.1 Level 1 — Logged

The implementation produces an Action Receipt that validates against the v0.1 JSON Schema for every Action that reaches the production action boundary.

Level 1 requires:

Level 1 does NOT require: tamper resistance, policy binding, approval evidence, or external verifiability.

5.2 Level 2 — Policy-Bound

Everything in Level 1, plus: every receipt records the named, versioned policy that decided. Approval evidence is present where required.

Level 2 requires:

Level 2 does NOT require: receipt hash integrity, replay defense, argument-mutation defense.

5.3 Level 3 — Portable Proof

Everything in Level 2, plus: every receipt is independently verifiable via arguments_hash and receipt_hash.

Level 3 requires:

Level 3 does NOT require: replay defense, argument-mutation defense at execution time, approver chain verification.

5.4 Level 4 — Tamper-Evident

Everything in Level 3, plus: the implementation defends against the receipt-forgery attacks documented in threat-model.md.

Level 4 requires:

Level 4 is the target for production-grade compliance evidence.

5.5 Declaring a conformance level

An implementation declares its conformance level in its documentation, in the format:

“<implementation> conforms to AgentBoundary v0.1 at Level N for actions of capability namespace <namespace>.”

An implementation MAY claim different levels for different capability namespaces (e.g., Level 3 for github.*, Level 1 for stripe.* during partial rollout).
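Per-namespace declarations can be kept as data and rendered into the declaration sentence. The sketch below is illustrative: DECLARED_LEVELS, declared_level, conformance_statement, and the implementation name ExampleRuntime are hypothetical, and matching a capability by its longest declared namespace prefix is an assumption, not something the spec prescribes.

```python
from typing import Dict, Optional

# Example declaration taken from the partial-rollout scenario above.
DECLARED_LEVELS: Dict[str, int] = {"github.*": 3, "stripe.*": 1}


def declared_level(capability: str,
                   declared: Dict[str, int] = DECLARED_LEVELS) -> Optional[int]:
    """Return the level claimed for a capability, or None if no namespace matches."""
    best_prefix, best_level = "", None
    for namespace, level in declared.items():
        # "github.*" matches any capability starting with "github."
        prefix = namespace[:-1] if namespace.endswith(".*") else namespace
        if capability.startswith(prefix) and len(prefix) > len(best_prefix):
            best_prefix, best_level = prefix, level
    return best_level


def conformance_statement(implementation: str, level: int, namespace: str) -> str:
    """Render the declaration sentence in the format given in §5.5."""
    return (f"{implementation} conforms to AgentBoundary v0.1 at Level {level} "
            f"for actions of capability namespace {namespace}.")
```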

Conformance MAY be independently tested via the conformance suite shipped in the AgentBoundary repository at scenarios/. Each scenario file declares a setup, an attempted action, and expected outcomes. Run with npx agentboundary run scenarios/ or uvx agentboundary run scenarios/.

6. Versioning

AgentBoundary uses semantic versioning for the receipt format. The version field in every receipt encodes the spec version it claims to conform to.

6.1 Forward and backward compatibility

A v0.1 implementation:

6.2 v0.2 roadmap

The following are explicitly out of scope for v0.1 and will be addressed in v0.2:

7. Open questions for v0.2

The following design questions are unresolved in v0.1 and will be addressed in v0.2 based on deployment feedback. Implementers SHOULD document their position on each.

7.1 Is approval permitted on allow decisions?

v0.1 currently permits a receipt with policy.decision == "allow" AND an approval block (the JSON Schema does not forbid it; only require-approval requires approval).

Argument for permitting: an operator may manually approve an Action that policy would have auto-approved (defense in depth). The receipt should record the human acknowledgment.

Argument against permitting: allowing optional approval on allow decisions invites confusion — a reader sees approval and infers policy.decision was require-approval. This undermines the receipt’s clarity.

v0.1 position: permitted but discouraged. v0.2 may forbid or formalize.

7.2 What’s the canonical canonicalization?

v0.1 RECOMMENDS RFC 8785 (JSON Canonicalization Scheme) for arguments_hash and receipt_hash but does not mandate it. Different implementations may produce different hashes for the same logical content, making cross-implementation verification harder.

v0.2 candidate: mandate RFC 8785 unless a stronger candidate emerges from deployment.

7.3 Should target.system follow a standard format?

v0.1 says target.system MUST be “stable enough that an auditor can identify the system later” but doesn’t prescribe a format. URI? hostname? a registry identifier?

v0.2 candidate: require target.system be either a URI or registered in an AgentBoundary system registry (to be specified).

7.4 What about multi-step actions?

A “refund” might involve querying the original charge, computing the refund amount, calling the refund API. v0.1 treats each tool call as a separate Action. Should there be a concept of a “compound action” with a parent receipt?

v0.2 candidate: introduce OPTIONAL parent_receipt_id field for receipt linkage.

7.5 How should A2A delegation be recorded?

When Agent A calls Agent B and Agent B takes a controlled Action, who is the Actor? Agent A (the requester), Agent B (the executor), or both (delegation chain)?

v0.2 candidate: introduce OPTIONAL delegation_chain field listing the chain of agents that participated.

7.6 Should receipts be cryptographically signed (not just hashed)?

receipt_hash provides tamper-evidence but not authenticity. A bad actor with write access to the receipt store could rehash a forged receipt. Public-key signing (Ed25519) would provide authenticity.

v0.2 candidate: OPTIONAL signature field for implementations that support signing. Not REQUIRED because key management is operator-specific.
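The gap described in this question can be demonstrated in a few lines. The sketch below uses a standard-library approximation of the canonical form (an assumption) and placeholder receipt content. It shows that a naive edit is caught by the hash, while an edit followed by a rehash is not, which is what a signature would address.

```python
import hashlib
import json


def receipt_hash(receipt: dict) -> str:
    """SHA-256 over an approximate canonical form, excluding receipt_hash itself."""
    body = {key: value for key, value in receipt.items() if key != "receipt_hash"}
    data = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


# Placeholder receipt content for illustration only.
original = {"receipt_id": "example-id", "policy": {"decision": "deny"}}
original["receipt_hash"] = receipt_hash(original)

# A forger with write access changes the decision but keeps the old hash.
forged = dict(original, policy={"decision": "allow"})
stale_hash_detected = receipt_hash(forged) != forged["receipt_hash"]

# The forger then recomputes the hash; verification can no longer tell.
forged["receipt_hash"] = receipt_hash(forged)
rehash_passes = receipt_hash(forged) == forged["receipt_hash"]
```

Both flags end up True: the hash detects modification without recomputation, but anyone able to rewrite the stored hash can produce a receipt that verifies.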


End of v0.1 spec text. The threat model and OWASP mapping live in threat-model.md and owasp-mapping.md respectively.