The Agent-Stack Governance Checklist: Authority, Review, Provenance, and Handoffs

The last wave of agent-stack excitement has been about connection: agents talking to tools, tools exposing context, agents discovering other agents, frameworks routing work across specialists.

Good. That layer matters.

But connection is not the operating model. It is the substrate the operating model has to survive.

In the prior piece, "Protocols are saturating; the operator's gap is governance, not transport," the point was simple: protocol compatibility is becoming table stakes. The next question is less dramatic and more useful:

When an agent stack can connect, who decides what it is allowed to do?

That question is where most demos get thin.

A protocol can standardize exchange. It can make tools discoverable. It can define how messages, tasks, artifacts, or context move. It can make two systems less bespoke to integrate. Those are real gains.

What it cannot do by itself is decide the production policy around the work.

Before an agent calls a tool, delegates to another agent, hands off context, or ships an output, a team needs four gates: authority, review, provenance, and handoff context.

If those gates are not designed explicitly, the system still works. That is the problem.

It works as an ungoverned workflow.

A handoff can lose approval state. A tool can mutate the right system under the wrong authority. A receiving agent can act on stale context because the delegation sounded confident. None of those failures require the protocol to be broken. They happen when the operating layer is missing.

Protocols make exchange easier. Governance decides whether the exchange is safe enough to trust.

The checklist answer

A production agent stack needs four explicit gates:

Authority: define who or what can approve each action by role, environment, data class, and risk.
Review: place human or policy checks before handoffs, tool calls, side effects, and customer-visible output.
Provenance: preserve the agent, model, tool, source, approval, and artifact trail well enough to reconstruct the work.
Handoff context: specify what state, constraints, sources, and authority move to the receiving agent, and what must stay behind.

This is the operating layer. MCP, A2A, and SDK handoff primitives can expose the surfaces, but the team still has to decide the policy that governs them.

The useful split: protocol layer vs operating layer

I separate the agent stack into two layers.

The protocol layer answers connection questions:

Can this agent discover that capability?
Can it exchange messages or context with that server?
Can it invoke a tool through a standard interface?
Can one agent delegate work to another agent?
Can artifacts and task state move in a recognizable shape?

The operating layer answers accountability questions:

Who granted permission for this action?
Which actions require human approval or policy review?
What inputs, tool calls, sources, and decisions are recorded?
What context is allowed to travel during delegation?
What fails closed when the receiving agent lacks the right state?

The first layer is getting better fast. MCP's architecture documentation separates protocol/data concerns from transport concerns and describes primitives like tools, resources, prompts, notifications, and lifecycle management. A2A focuses on agent interoperability, discovery, collaboration, task delegation, context exchange, and secure information exchange. Agent frameworks now treat tools, handoffs, guardrails, tracing, and sessions as first-class runtime concepts.

That is progress.

The mistake is treating that progress as if it also settled the operating layer.

It did not.

Gate 1: Authority

Authority is the first gate because everything else inherits from it.

The question is not "can this agent call the tool?" The question is "under what authority can this agent call the tool, in this environment, for this task, with this level of confidence?"

That sounds heavier than a demo because it is. Production work is heavier than a demo.

For every tool, server, agent, or handoff path, the stack needs an authority policy that is specific enough to be enforced:

Which roles can request this action?
Which agents can execute it directly?
Which environments are allowed: local, staging, production, customer-facing?
Which data classes can be read, written, exported, or summarized?
Which actions are reversible, and which require a higher approval tier?
What confidence or evidence threshold is required before execution?

MCP's tools spec is useful here because it describes tools as model-controlled and says applications should make exposed tools clear to users. It also recommends keeping a human in the loop who can deny tool invocations, for trust, safety, and security. That is not a footnote. It is the authority problem showing up inside the tool layer.

If a tool can mutate state, spend money, contact a customer, deploy code, delete data, or write into a system of record, "the model selected the tool" is not an acceptable authority chain.

A better authority chain looks like this:

The agent may discover the tool.
The agent may propose the tool call with arguments and rationale.
The policy layer checks role, task, environment, data class, and risk.
Human or automated review approves, denies, narrows, or replaces the action.
The execution record stores who or what authorized the final call.

The goal is not to slow every action down. The goal is to make the permission boundary explicit before the system touches anything that matters.
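The authority chain above can be sketched as a small policy check. This is a minimal sketch under stated assumptions, not a real library: `POLICY`, `ProposedCall`, `Decision`, `authorize`, and the tool names, roles, and environments are all hypothetical, chosen only to show where the permission boundary sits and what the execution record has to capture.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical policy table: tool name -> rule. Every name here is illustrative.
POLICY = {
    "lookup_invoice": {
        "roles": {"support", "billing"},
        "envs": {"staging", "production"},
        "needs_human": False,
    },
    "issue_credit": {
        "roles": {"billing"},
        "envs": {"production"},
        "needs_human": True,  # state-changing, so a higher approval tier
    },
}


@dataclass(frozen=True)
class ProposedCall:
    agent: str
    role: str
    tool: str
    environment: str
    rationale: str


@dataclass(frozen=True)
class Decision:
    allowed: bool
    authorized_by: Optional[str]  # who or what authorized the final call
    reason: str


def authorize(call: ProposedCall, human_approver: Optional[str] = None) -> Decision:
    """Check a proposed tool call against the policy table before execution."""
    rule = POLICY.get(call.tool)
    if rule is None:
        # A tool with no policy is denied rather than allowed by default.
        return Decision(False, None, "no policy for tool; failing closed")
    if call.role not in rule["roles"]:
        return Decision(False, None, f"role {call.role!r} is not permitted to use {call.tool}")
    if call.environment not in rule["envs"]:
        return Decision(False, None, f"{call.tool} is not allowed in {call.environment}")
    if rule["needs_human"]:
        if human_approver is None:
            return Decision(False, None, "human approval required")
        return Decision(True, f"human:{human_approver}", "approved by human")
    return Decision(True, "policy:auto", "approved by policy")
```

The design choice worth noticing is that `authorize` never returns an allowed decision without naming an authorizer. "The model selected the tool" cannot appear in the `authorized_by` field because nothing in the function can put it there.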

Gate 2: Review

Review is where teams usually over-compress the design.

They say "we have guardrails" as if guardrails were one layer. They are not.

A workflow can need input review, tool-call review, handoff review, artifact review, and final-output review. Those are different positions in the chain. Putting a review gate at the end does not govern the middle.

The OpenAI Agents SDK docs make this concrete: guardrails can run on inputs, outputs, and tool invocations. Tool guardrails can run before or after a function tool executes. Handoffs are represented as tools to the LLM. Input and output guardrails have boundary-specific behavior. In other words, the location of a guardrail changes what it can actually protect.

That matters for real agent stacks.

If Agent A hands work to Agent B, and Agent B calls a production tool, the review design needs to answer:

Was the handoff itself reviewed?
Did the receiving agent inherit the same authority or a narrower one?
Did the tool call require separate approval?
Was the final artifact reviewed before it left the system?
Can the workflow stop safely if any review step is missing?

A weak review design says: "human in the loop."

A stronger design says:

Human approval is required before external side effects.
Automated policy checks run before every tool invocation.
Output review is mandatory before customer-visible artifacts.
Handoff review is required when context crosses trust boundaries.
Missing policy, stale context, or unverified source state fails closed.

That is less slogan-friendly. It is also closer to how the system will break.
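Here is a minimal sketch of review as a set of positioned gates rather than a single layer. It is not the OpenAI Agents SDK API or any other framework's API: `Position`, `ReviewBlocked`, `run_gate`, and `tool_call_check` are hypothetical names, and the payload keys are assumptions made for illustration. The point it shows is that a position with no registered check blocks the workflow instead of passing silently.

```python
from enum import Enum
from typing import Callable, Dict


class Position(Enum):
    """Places in the chain where a review gate can sit."""
    INPUT = "input"
    HANDOFF = "handoff"
    TOOL_CALL = "tool_call"
    FINAL_OUTPUT = "final_output"


class ReviewBlocked(Exception):
    """Raised when a gate denies a step or when no check is registered."""


Check = Callable[[dict], bool]


def run_gate(checks: Dict[Position, Check], position: Position, payload: dict) -> None:
    """Run the check registered for a position; a missing check fails closed."""
    check = checks.get(position)
    if check is None:
        raise ReviewBlocked(f"no review registered for {position.value}; failing closed")
    if not check(payload):
        raise ReviewBlocked(f"review denied at {position.value}")


def tool_call_check(payload: dict) -> bool:
    """External side effects need a recorded human approval (assumed payload keys)."""
    if payload.get("side_effect") == "external":
        return payload.get("human_approved") is True
    return True


# Only the tool-call position is governed here, so a handoff would be blocked.
checks: Dict[Position, Check] = {Position.TOOL_CALL: tool_call_check}
```

With this registry, `run_gate(checks, Position.TOOL_CALL, {"side_effect": "none"})` returns quietly, while the same call at `Position.HANDOFF` raises `ReviewBlocked` because nobody designed that gate.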

Gate 3: Provenance

Provenance is the difference between an impressive output and an auditable workflow.

In a single-agent toy flow, you can sometimes reconstruct what happened from the chat transcript. In a multi-agent stack, that assumption collapses quickly.

One agent retrieved context. Another transformed it. A third called a tool. A fourth summarized the result. A human approved one step but not another. A policy guardrail replaced a tool output. A handoff filter removed part of the history before the receiving agent saw it.

If the system only preserves the final answer, the operator has lost the work.

The minimum provenance record should answer:

Which agent performed each step?
Which model, tool, server, or external system was used?
Which inputs and source artifacts influenced the decision?
Which context was passed forward, filtered, or withheld?
Which review gate approved or blocked the action?
Which output was produced, modified, rejected, or shipped?

This is where agent tracing, sessions, artifacts, task state, and structured logs become more than debugging conveniences. They become the audit surface.
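One way to make the minimum provenance record concrete is a per-step structure that answers each of the questions above. This is a sketch under assumptions, not a standard schema: `ProvenanceStep`, `ProvenanceLog`, and every field name are hypothetical, and a real system would write to durable storage rather than an in-memory list.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class ProvenanceStep:
    agent: str                    # which agent performed the step
    action: str                   # e.g. "retrieve", "tool_call", "handoff"
    system: str                   # model, tool, server, or external system used
    inputs: List[str]             # identifiers of source artifacts
    context_forwarded: List[str]  # what was passed forward
    context_withheld: List[str]   # what was filtered out or held back
    gate: Optional[str]           # review gate that decided, if any
    outcome: str                  # "produced", "modified", "rejected", "blocked", "shipped"
    at: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())


class ProvenanceLog:
    """Append-only, in-memory stand-in for a durable audit store."""

    def __init__(self) -> None:
        self._steps: List[ProvenanceStep] = []

    def record(self, step: ProvenanceStep) -> None:
        self._steps.append(step)

    def chain(self) -> List[str]:
        """Compact view of the delegation chain, in recorded order."""
        return [f"{s.agent}:{s.action}:{s.outcome}" for s in self._steps]

    def to_jsonl(self) -> str:
        """One JSON object per step, suitable for structured log shipping."""
        return "\n".join(json.dumps(asdict(s), sort_keys=True) for s in self._steps)
```

Blocked and rejected steps are recorded the same way as successful ones. That is deliberate: an audit trail that only contains what shipped cannot explain what was stopped.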

NIST's AI Risk Management Framework is helpful without turning this into compliance theater. It treats governance as a cross-cutting, continuous function and says documentation can improve transparency, human review, and accountability. That is the right mental model. Governance is not one pre-launch checklist. It is the record of how the system keeps behaving after launch.

The practical rule: if you cannot explain the delegation chain after the fact, you did not govern the delegation chain at runtime.

Gate 4: Handoff context

Handoffs look clean in architecture diagrams. They are messy in production.

A handoff is not just "send task to specialist agent." It is a decision about memory, state, authority, constraints, and accountability.

What does the receiving agent see?

The whole conversation?
Only the last user request?
The sources used by the prior agent?
The assumptions the prior agent made?
The constraints that should not be violated?
The approval state of the work so far?
The tools the prior agent used or was forbidden to use?

The OpenAI Agents SDK handoffs docs are a useful example because they expose input filters and history mapping controls. That means handoff context is not a vague concept. It is an implementation surface.

A bad handoff passes too much or too little.

Too much context creates leakage: irrelevant memory, stale assumptions, hidden source material, or permissions that should not cross the boundary.

Too little context creates hallucinated continuity: the receiving agent acts as if it understands the work, but it is missing the decision trail that made the next action safe.

A good handoff contract specifies:

Task: what the receiving agent is being asked to do.
State: what has already happened and what is still uncertain.
Sources: which evidence may be relied on.
Constraints: what must not be changed, exposed, or assumed.
Authority: what the receiving agent may do without new approval.
Return shape: what artifact, decision, or evidence must come back.

That contract is not bureaucracy. It is how specialization avoids becoming uncontrolled context drift.
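The six fields of the handoff contract map naturally onto a typed structure. The sketch below is illustrative only: `HandoffContract` and `build_handoff` are hypothetical names, not part of any SDK, and authority is modeled as a plain set of action names. It enforces one rule from the contract, that a handoff may narrow authority but never widen it.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class HandoffContract:
    task: str                     # what the receiving agent is asked to do
    state: str                    # what has happened and what is still uncertain
    sources: Tuple[str, ...]      # evidence that may be relied on
    constraints: Tuple[str, ...]  # what must not be changed, exposed, or assumed
    authority: FrozenSet[str]     # actions allowed without new approval
    return_shape: str             # artifact, decision, or evidence that must come back


def build_handoff(
    sender_authority: FrozenSet[str],
    requested_authority: FrozenSet[str],
    **fields,
) -> HandoffContract:
    """Build a contract; the receiver may get a subset of the sender's authority, never more."""
    escalation = requested_authority - sender_authority
    if escalation:
        raise PermissionError(f"handoff would widen authority: {sorted(escalation)}")
    # The dataclass constructor raises TypeError if any contract field is omitted.
    return HandoffContract(authority=requested_authority, **fields)
```

Because every field is required, a handoff that forgets to state its constraints or return shape fails at construction time instead of producing a receiving agent that guesses.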

A handoff is not complete when the next agent gets the task. It is complete when the next agent inherits the state needed to act safely.

If a platform cannot define these four gates explicitly, it has compatibility without accountability.

The checklist I would use before exposing a new agent workflow

Before adopting a new MCP server, exposing a new tool, registering an A2A-capable agent, or adding an SDK handoff, I would run this checklist.

Authority

What actions can this component take?
Which actions are read-only, reversible, state-changing, external, or destructive?
Who or what is allowed to authorize each class of action?
Does authority change by environment, data class, customer, cost, or risk?
Can authority be narrowed at runtime, or is it all-or-nothing once connected?

Review

Where can the workflow pause before side effects?
Which tool calls need human approval?
Which checks are automated, and what do they actually inspect?
Does review happen before handoff, before tool execution, before final output, or all three?
What happens when review state is missing, ambiguous, stale, or contradictory?

Provenance

Can we reconstruct the full delegation chain?
Are tool arguments, outputs, sources, and approvals logged durably?
Are replaced, blocked, or skipped actions visible after the fact?
Can we tell which context moved between agents and which context was filtered out?
Can the record support debugging, audit, and user-facing explanation without exposing private internals?

Handoff context

What exact state does the receiving agent get?
What state is intentionally withheld?
Does the receiving agent inherit authority or receive a narrower authority scope?
Are assumptions, constraints, and unresolved questions passed explicitly?
Is the return artifact structured enough for the next gate to verify?

Failure behavior

What fails closed?
What falls back to read-only mode?
What requires escalation to a human?
What gets retried, and what must never be retried automatically?
What evidence is captured when the workflow stops?

If a team cannot answer those questions, the problem is not that they chose the wrong protocol. The problem is that they do not yet have an operating model.
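The failure-behavior questions in the checklist can be written down as data instead of being left to whatever the runtime happens to do. The sketch below is an assumption-heavy illustration: `Response`, `FAILURE_POLICY`, `respond`, and the condition names are hypothetical. It encodes two rules: unknown conditions fail closed, and side-effecting steps are never retried automatically.

```python
from enum import Enum


class Response(Enum):
    FAIL_CLOSED = "fail_closed"
    READ_ONLY = "read_only"
    ESCALATE = "escalate"
    RETRY = "retry"


# Hypothetical mapping from failure condition to intended behavior.
FAILURE_POLICY = {
    "missing_policy": Response.FAIL_CLOSED,
    "stale_context": Response.FAIL_CLOSED,
    "write_backend_down": Response.READ_ONLY,
    "contradictory_review": Response.ESCALATE,
    "timeout": Response.RETRY,
}


def respond(condition: str, side_effecting: bool) -> Response:
    """Pick a failure response; anything unlisted fails closed."""
    response = FAILURE_POLICY.get(condition, Response.FAIL_CLOSED)
    if response is Response.RETRY and side_effecting:
        # A retried side effect can double-charge or double-send, so hand it to a human.
        return Response.ESCALATE
    return response
```

For example, `respond("timeout", side_effecting=False)` yields `Response.RETRY`, while the same timeout on a side-effecting step yields `Response.ESCALATE`.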

A small example

Imagine a support-resolution workflow.

A triage agent reads the customer message. It delegates to a billing agent. The billing agent can query invoices and propose a credit. A policy agent checks eligibility. A communications agent drafts the customer reply.

The demo version says: four agents collaborated and produced an answer.

The production version asks harder questions:

Did the triage agent have authority to route billing data into the workflow?
Could the billing agent issue the credit, or only propose it?
Did the policy check run before the credit was offered?
Did the communications agent see only the approved resolution or the full billing history?
Can the team reconstruct which agent proposed the credit and which gate approved it?
Did the system fail closed if policy data was stale?

That is the difference between agent choreography and agent operations.

The first makes a good demo. The second can survive customers.
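A compressed sketch of the production version of that workflow might look like the following. Everything here is hypothetical: `CreditProposal`, `policy_gate`, `draft_reply`, the one-hour freshness window, and the credit limit are assumptions made for illustration, not a description of any real billing system. It shows three of the questions above as code: the billing agent only proposes, stale policy data fails closed, and the communications step sees only an approved resolution.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_POLICY_AGE_S = 3600  # assumed freshness window for policy data


@dataclass
class CreditProposal:
    customer_id: str
    amount: float
    proposed_by: str
    approved_by: Optional[str] = None  # set only by the policy gate


def policy_gate(proposal: CreditProposal, policy_age_s: int, limit: float, audit: List[str]) -> bool:
    """Approve or block a proposed credit, recording the decision either way."""
    if policy_age_s > MAX_POLICY_AGE_S:
        audit.append("policy-agent: blocked (stale policy data)")
        return False
    if proposal.amount > limit:
        audit.append("policy-agent: blocked (over limit)")
        return False
    proposal.approved_by = "policy-agent"
    audit.append(f"policy-agent: approved credit proposed by {proposal.proposed_by}")
    return True


def draft_reply(proposal: CreditProposal) -> str:
    """Communications step: receives the resolution only, never the billing history."""
    if proposal.approved_by is None:
        raise PermissionError("no approved resolution to communicate")
    return f"A credit of ${proposal.amount:.2f} has been approved for your account."
```

The `audit` list is what lets the team answer, after the fact, which agent proposed the credit and which gate approved or blocked it.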

The operator advantage

The point is not to be anti-protocol. That would be lazy.

Use the protocols. Use the SDKs. Use the handoff primitives. Use the tracing and guardrail surfaces. The stack is getting better, and pretending otherwise is just another form of hype.

But do not confuse interoperability with accountability.

The teams that win with agents will not be the teams with the longest connector list. They will be the teams that can look at an agent workflow and answer, with evidence:

who was allowed to act,
where review happened,
what provenance survived,
what context crossed the handoff,
and what failed closed before damage.

That is the checklist.

Transport is necessary. The operating advantage is knowing what can safely ride on it.