Protocols Make Agent Exchange Legible. Governance Still Lives Above the Protocol.

Agent protocols are doing something useful: they are making exchange legible.

That matters. A system where every agent integration is a one-off pile of custom prompts, hidden assumptions, and bespoke API glue does not scale. If agents are going to discover one another, hand off tasks, expose capabilities, exchange artifacts, stream status, or call tools, the wire format cannot be vibes.

This is why the current protocol work is worth taking seriously. Agent2Agent (A2A) defines a common shape for agent discovery, task execution, messages, artifacts, task states, and transport bindings. The Model Context Protocol (MCP) defines a host/client/server pattern for connecting AI applications to external context and capabilities like resources, prompts, and tools.

That is real progress.

It is also not governance.

The mistake is treating interoperability as readiness. Once two agents can talk, or once a model can invoke a tool through a standard interface, the hard question does not disappear. It gets sharper:

Who is allowed to do what, on whose behalf, with which context, under what review path, and with what evidence left behind?

A protocol can make the exchange legible. It cannot decide the operating model for you.

The protocol layer and the governance layer are different layers

The cleanest way to think about this is to separate two sets of questions.

Protocols make the handoff visible. The governance layer decides whether the handoff should happen, under what controls, and with what evidence.

The protocol layer asks:

How does one participant describe its capabilities?
How are tasks, messages, artifacts, and states represented?
How does an AI application discover and invoke tools or retrieve context?
What transport, schema, and metadata make the interaction machine-readable?

The governance layer asks:

Who has authority to approve an action?
Which actions need human review before execution?
What context may be used, retained, or passed forward?
Which sources are trusted enough to influence a decision?
What happens when an agent needs more input, more permission, or escalation?
Who is accountable when the handoff chain produces a bad result?

These are not competing layers. They need each other.

Without protocol structure, agent systems become brittle. Without governance structure, they become ambiguous. The first problem is engineering drag. The second is operational risk.

The agent stack needs both, but they should not be confused.

What protocols actually give you

A2A-style protocols are useful because they give agents a shared way to present and exchange work. The official A2A specification describes pieces like Agent Cards, task objects, task states, messages, parts, artifacts, streaming events, push notifications, and transport bindings. In plain language: an agent can advertise what it does, another system can send it work, and both sides can reason about the status and output in a more standardized way.

That is a much better substrate than every team inventing a private handoff contract from scratch.

MCP is useful on a different axis. It standardizes how AI applications connect to external context and capabilities. Hosts, clients, and servers create a structure where applications can consume resources, prompts, and tools. Tools can query databases, call APIs, or perform computations. Resources can expose context. Prompts can package reusable workflows.

Again: useful substrate.

The practical value is not that these protocols magically make agents safe. The value is that they reduce the amount of custom integration required before a team can even ask the real operating questions.

Protocol adoption turns agent exchange from hidden glue into a visible surface.

That is the point.

The primitive can be standardized while the decision remains local. Most agent protocol conversations go wrong by losing that distinction.

What protocols do not decide

A task state can say an agent needs input or authorization. It does not decide who inside your company is authorized to provide it.

A tool schema can describe a database write operation. It does not decide whether the model should be allowed to call it without review.

A resource can expose context. It does not decide whether that context can be retained in memory, passed to another agent, summarized into a customer record, or used in a later decision.

An Agent Card can describe capabilities and security metadata. It does not decide whether the upstream agent should trust the downstream agent for a regulated workflow.

This is where teams get sloppy. They see a standard interface and mentally promote it into a standard operating model.

But the protocol is not your approval policy. It is not your provenance model. It is not your escalation path. It is not your audit trail. It is not your least-privilege design.

The MCP tools documentation is clear enough about this boundary: tools are designed to be model-controlled, but the protocol does not mandate a specific user interaction model. The documentation also recommends keeping a human in the loop who can deny tool invocations, backed by clear UI indicators and confirmation prompts.

That is the right posture. The protocol can expose the primitive. The product and operating layer still have to decide the control.
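One way to picture that split is a thin gate the host application puts in front of every tool invocation. This is a minimal sketch, not anything MCP defines: `ToolCall`, `gated_invoke`, and the `confirm` callback are hypothetical names, and in a real product `confirm` would be a UI prompt rather than a function argument.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class ToolCall:
    """Hypothetical record of a tool invocation the model has requested."""
    tool_name: str
    arguments: dict[str, Any]


def gated_invoke(
    call: ToolCall,
    run_tool: Callable[[ToolCall], Any],
    confirm: Callable[[ToolCall], bool],
) -> Any:
    """Run the tool only if the confirm callback approves the call."""
    if not confirm(call):
        # The denial is a product decision; the protocol only exposed the tool.
        raise PermissionError(f"{call.tool_name} denied by reviewer")
    return run_tool(call)
```

The protocol delivers the request. Whether `confirm` exists, who answers it, and what it shows them is entirely the product's call.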

Security guidance makes the same point from another angle. MCP's own security material calls out risks like confused deputy problems and token passthrough. OWASP's LLM application guidance names categories like prompt injection, sensitive information disclosure, supply chain risk, excessive agency, and system prompt leakage.

Those are not arguments against protocols. They are arguments against confusing connectivity with accountability.

The operator checklist before adopting an agent protocol stack

If a team is evaluating an agent stack, I would not start with, "Does it support A2A?" or "Does it expose MCP tools?"

I would start with these questions.

The protocol decision is downstream of the operating decision. First define authority, review, context, provenance, escalation, and accountability.

1. What authority is being transferred?

Every handoff transfers something: a request, context, state, judgment, tool access, or permission to continue.

Be specific. Is the downstream agent allowed to recommend, draft, retrieve, mutate, approve, spend, notify, delete, deploy, or commit? Those are not the same action class.

If the protocol says a task is in an authorization-required state, who resolves that state? The original user? A human reviewer? A service account policy? A supervisor agent? A role-based approval rule?

The answer belongs above the protocol.
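To make "be specific" concrete, here is a rough sketch of a delegation record that names the action classes a handoff transfers and who resolves an authorization-required state. Everything in it is an assumption for illustration: the `ActionClass` members are a subset of the verbs above, and `Delegation`, its fields, and the identifiers in the example are hypothetical, not part of A2A or MCP.

```python
from dataclasses import dataclass
from enum import Enum


class ActionClass(Enum):
    RECOMMEND = "recommend"
    DRAFT = "draft"
    RETRIEVE = "retrieve"
    MUTATE = "mutate"
    APPROVE = "approve"
    SPEND = "spend"


@dataclass(frozen=True)
class Delegation:
    """Hypothetical record of the authority one handoff transfers."""
    on_behalf_of: str
    delegate: str
    allowed: frozenset[ActionClass]
    resolver: str  # who clears an authorization-required state

    def permits(self, action: ActionClass) -> bool:
        # Anything not explicitly granted is not transferred.
        return action in self.allowed


# Illustrative grant: the downstream agent may look things up and draft,
# but may not mutate, approve, or spend.
grant = Delegation(
    on_behalf_of="user:alice",
    delegate="agent:invoice-drafter",
    allowed=frozenset({ActionClass.RETRIEVE, ActionClass.DRAFT}),
    resolver="role:finance-approver",
)
```

The useful property is that the resolver is written down before the task ever reaches an authorization-required state, instead of being improvised when it does.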

2. Which tool calls require review?

A tool catalog is not a control model.

A read-only lookup, a customer-visible email, a production database write, and a payment action should not share the same approval path just because they are all exposed as callable tools.

Teams need an action taxonomy before they need a larger tool list. Which tools are safe for autonomous use? Which require confirmation? Which require dual review? Which should never be model-invoked directly?

If you cannot answer that, the protocol has only made the dangerous part easier to reach.
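An action taxonomy can start as something this small. The sketch below assumes four review tiers matching the questions above; the tool names and the `required_review` helper are hypothetical. The one design choice worth copying is the default: a tool nobody has classified lands in the strictest tier.

```python
from enum import IntEnum


class Review(IntEnum):
    """Review tiers, ordered from least to most restrictive."""
    AUTONOMOUS = 0
    CONFIRM = 1
    DUAL_REVIEW = 2
    NEVER_MODEL_INVOKED = 3


# Hypothetical tool names mapped to the tier each one requires.
REVIEW_POLICY = {
    "lookup_order": Review.AUTONOMOUS,
    "send_customer_email": Review.CONFIRM,
    "write_prod_db": Review.DUAL_REVIEW,
    "issue_payment": Review.NEVER_MODEL_INVOKED,
}


def required_review(tool_name: str) -> Review:
    """Return the review tier for a tool; unclassified tools get the strictest."""
    return REVIEW_POLICY.get(tool_name, Review.NEVER_MODEL_INVOKED)
```

Adding a tool to the catalog then forces a classification decision, instead of silently inheriting whatever the most permissive path happens to be.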

3. What provenance travels with the work?

Agent systems need source discipline. When one agent passes an artifact to another, the receiving side should know where the instruction came from, which sources supported the claim, what context was used, and what assumptions were introduced.

Otherwise, a bad upstream interpretation becomes downstream confidence.

This is especially important with retrieval, browser actions, customer data, internal docs, and generated summaries. A clean handoff should carry enough provenance for the next actor to verify the work instead of inheriting it blindly.
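As a sketch of what "enough provenance" might look like, the envelope below carries the four things named above alongside the artifact. The `Provenance` and `Handoff` types and the `verification_gaps` helper are hypothetical; a real system would likely carry this in whatever metadata fields its protocol already allows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provenance:
    """Hypothetical provenance record attached to a handed-off artifact."""
    instruction_origin: str
    sources: tuple[str, ...] = ()
    context_used: tuple[str, ...] = ()
    assumptions: tuple[str, ...] = ()


@dataclass(frozen=True)
class Handoff:
    artifact: str
    provenance: Provenance


def verification_gaps(handoff: Handoff) -> list[str]:
    """List what the receiver cannot check before relying on the artifact."""
    gaps = []
    if not handoff.provenance.instruction_origin:
        gaps.append("instruction origin missing")
    if not handoff.provenance.sources:
        gaps.append("no supporting sources")
    return gaps
```

A receiver that gets a non-empty list back has a reason to verify or push back, rather than a reason to trust.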

4. What memory is allowed to persist?

The most under-discussed governance surface is memory custody.

It is one thing for an agent to use context during a task. It is another for that context to become durable memory, future personalization, training data, or cross-agent state.

Before connecting agents through shared protocols, decide what can be retained, where it lives, who can read it, how it expires, and which workflows are forbidden from writing durable memory at all.

Without that policy, every successful handoff becomes a possible data boundary failure.
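A retention policy of that kind can be expressed as data the system actually consults. This is a minimal sketch under my own assumptions: the workflow names, the `RetentionRule` shape, and the 30-day figure are all made up for illustration, and it covers only two of the questions above (whether durable writes are allowed and how they expire).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class RetentionRule:
    """Hypothetical per-workflow rule for durable memory."""
    durable_allowed: bool
    ttl: Optional[timedelta] = None  # None means no expiry is configured


# Hypothetical workflow classes and their rules.
RETENTION = {
    "support_triage": RetentionRule(durable_allowed=True, ttl=timedelta(days=30)),
    "payroll_review": RetentionRule(durable_allowed=False),
}


def may_persist(workflow: str) -> bool:
    """Workflows without a rule may not write durable memory."""
    rule = RETENTION.get(workflow)
    return rule is not None and rule.durable_allowed


def is_expired(workflow: str, written_at: datetime, now: datetime) -> bool:
    """Treat memory as expired if the workflow should never have stored it."""
    rule = RETENTION.get(workflow)
    if rule is None or not rule.durable_allowed:
        return True
    return rule.ttl is not None and now - written_at > rule.ttl
```

Who can read the memory and where it lives would need their own fields; the point is only that the answer sits in the system rather than in a document.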

5. What happens when the handoff is incomplete?

Real work does not move in perfect packets.

An agent may need missing context. A tool may require credentials. A downstream system may reject a request. A human may need to approve a risky action. A source may be ambiguous.

The protocol may represent states like needing input or authorization. Your operating layer has to define the playbook: pause, ask, escalate, retry, route to a human, downgrade the action, or fail closed.

The difference between a demo and a production system is often what happens at the interruption point.
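The playbook can be sketched as a lookup with a deliberate default. The `Interruption` and `Response` names below are my own labels, not A2A task states, and the retry limit is an arbitrary assumption. Anything the playbook does not cover fails closed.

```python
from enum import Enum


class Interruption(Enum):
    INPUT_NEEDED = "input_needed"
    AUTH_NEEDED = "auth_needed"
    DOWNSTREAM_REJECTED = "downstream_rejected"
    AMBIGUOUS_SOURCE = "ambiguous_source"


class Response(Enum):
    ASK_REQUESTER = "ask_requester"
    ROUTE_TO_HUMAN = "route_to_human"
    RETRY = "retry"
    FAIL_CLOSED = "fail_closed"


# Hypothetical playbook; AMBIGUOUS_SOURCE is left out on purpose
# to show the fail-closed default.
PLAYBOOK = {
    Interruption.INPUT_NEEDED: Response.ASK_REQUESTER,
    Interruption.AUTH_NEEDED: Response.ROUTE_TO_HUMAN,
    Interruption.DOWNSTREAM_REJECTED: Response.RETRY,
}
MAX_RETRIES = 2  # assumed limit before a human takes over


def respond(interruption: Interruption, attempts: int = 0) -> Response:
    """Pick the response for an interruption, failing closed when unlisted."""
    response = PLAYBOOK.get(interruption, Response.FAIL_CLOSED)
    if response is Response.RETRY and attempts >= MAX_RETRIES:
        return Response.ROUTE_TO_HUMAN
    return response
```

The interesting decisions are the ones the table forces: what counts as retryable, when retries stop, and what happens to the cases nobody anticipated.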

6. Who owns the final decision?

Multi-agent systems make responsibility easy to blur.

The planner suggested it. The tool agent executed it. The reviewer summarized it. The memory agent retained it. The orchestration layer routed it.

That chain can sound sophisticated while making accountability disappear.

Before adopting any protocol stack, identify the accountable owner for each workflow class. Not just the component. The person, team, policy, or system-of-record that owns the decision boundary.

If no one owns the boundary, the protocol will not create ownership for you.
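One blunt way to enforce that is an ownership registry that refuses to resolve unowned workflows. The sketch below is illustrative only: the workflow classes, owner identifiers, and `UnownedWorkflowError` are hypothetical.

```python
# Hypothetical registry mapping each workflow class to its accountable owner.
OWNERS = {
    "refund_under_100": "team:support-ops",
    "prod_schema_change": "person:db-oncall-lead",
}


class UnownedWorkflowError(LookupError):
    """Raised when a workflow class has no accountable owner on record."""


def accountable_owner(workflow_class: str) -> str:
    """Return the owner of a workflow class, or refuse if none is recorded."""
    owner = OWNERS.get(workflow_class)
    if not owner:
        raise UnownedWorkflowError(
            f"no accountable owner for {workflow_class!r}"
        )
    return owner
```

If the orchestration layer calls something like this before routing work, an unowned workflow becomes a deployment error instead of a post-incident discovery.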

The next bottleneck is not communication. It is accountable operation.

I am bullish on protocol work because it makes the agent stack less imaginary.

A2A and MCP move the conversation away from hand-wired demos and toward shared interfaces for agents, tools, context, tasks, and artifacts. That is necessary infrastructure. Serious systems need legible exchange.

But the more legible the exchange becomes, the more obvious the missing layer gets.

The next year of agent systems will not be won by teams that merely connect the most agents to the most tools. It will be won by teams that can answer the governance questions without turning every workflow into a human bottleneck.

That means designing authority, review, provenance, memory, escalation, and accountability as product architecture. Not policy PDFs floating somewhere outside the system. Not prompt instructions pretending to be controls. Not a dashboard that shows activity without ownership.

The protocol should make the handoff visible.

The operating layer should make the handoff trustworthy.

That is the standard worth building toward.

Sources

Agent2Agent Protocol specification: https://a2a-protocol.org/latest/specification/
Model Context Protocol architecture: https://modelcontextprotocol.io/docs/learn/architecture.md
MCP tools specification: https://modelcontextprotocol.io/specification/2025-11-25/server/tools.md
MCP security best practices: https://modelcontextprotocol.io/docs/tutorials/security/security_best_practices.md
OWASP Top 10 for LLM Applications 2025: https://genai.owasp.org/resource/owasp-top-10-for-llm-applications-2025/