MCP's Successor Isn't A2A—It's Skills
75,000 tokens vs 500 tokens.
This isn't a theoretical number. It's real cost. Where MCP needs nearly a hundred thousand tokens to describe tool schemas, Skills accomplishes the same thing with a few hundred. At 75,000 versus 500, the gap is 150x.
Here's my point: the entire "post-MCP era" discussion is fundamentally misguided.
MCP's Structural Problem: Not a Bug, But Design
MCP attempts to standardize how agents communicate with tools. Sounds great—until you open the console and look at token consumption.
A single MCP server's schema description can run thousands of tokens. When an agent connects to multiple servers, these schemas stack linearly. A Cisco engineer put it bluntly after testing: "As the number of tools on an MCP server grows, the tool descriptions and manifest sent to the LLM can become massive, quickly consuming the prompt's entire context window."
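To make the linear stacking concrete, here is a minimal sketch. It assumes a rough heuristic of about 4 characters per token instead of a real tokenizer, and the tool definitions are hypothetical, shaped loosely like JSON-Schema tool entries. The absolute numbers mean little; the point is that the total grows in direct proportion to the number of connected servers.

```python
import json


def estimate_tokens(text: str) -> int:
    """Rough heuristic: about 4 characters per token (an assumption, not a tokenizer)."""
    return max(1, len(text) // 4)


def make_tool(name: str) -> dict:
    """Hypothetical tool definition, for illustration only."""
    return {
        "name": name,
        "description": f"Hypothetical tool {name} used only for illustration.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query text."},
                "limit": {"type": "integer", "description": "Maximum results."},
            },
            "required": ["query"],
        },
    }


def manifest_tokens(num_servers: int, tools_per_server: int) -> int:
    """Sum the estimated tokens of every tool schema across all connected servers."""
    total = 0
    for s in range(num_servers):
        for t in range(tools_per_server):
            total += estimate_tokens(json.dumps(make_tool(f"server{s}_tool{t}")))
    return total


# Doubling the number of servers doubles the schema overhead.
for n in (1, 2, 4, 8):
    print(f"{n} server(s): ~{manifest_tokens(n, 10)} estimated tokens")
```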
SEP-1576, a proposal filed in the MCP specification process, acknowledges this "Token Bloat" problem.
This isn't an implementation flaw. It's a design-level trade-off. Want standardization? You need structured schemas. Use structured schemas? You pay the token tax. 75,000 to 90,000 tokens—most spent describing tool parameters instead of solving actual problems.
A2A Missed the Point: Solving the Wrong Problem
When Google launched A2A, everyone assumed it was MCP's "successor." Wrong.
MCP solves "how many tools can one agent use." A2A solves "how do multiple agents coordinate." These aren't even in the same dimension.
More damaging is A2A's underlying assumption: that multi-agent systems are already intelligent enough to justify a dedicated protocol for agent-to-agent communication.
One engineer on Reddit nailed it: "readily available frontier models aren't smart enough for such a protocol yet."
Translation: models aren't smart enough yet to need a dedicated agent collaboration protocol. We're still struggling to make one agent work reliably, and we're already worrying about how ten agents shake hands?
This isn't being ahead of the curve. It's going in the wrong direction.
The Elegance of Skills: Natural Language Is the Best Protocol
Let me tell you what actually works.
Skills isn't another protocol. It's an organizational pattern. Write what your agent can do in natural language, put it in the System Prompt. No JSON schemas, no manifests, no RPC calls.
A few hundred tokens. Done.
Natural language is naturally compressed. Instead of writing "parameters": {"type": "object", "properties": {...}}, you just write "this skill can search the web and return results in markdown format." The LLM understands what you mean.
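A quick way to see the difference is to put the two descriptions side by side. The sketch below is illustrative only: the `web_search` schema is a hypothetical example of the structured style, and the token estimate is a crude 4-characters-per-token heuristic, not a real tokenizer.

```python
import json


def estimate_tokens(text: str) -> int:
    """Rough heuristic: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


# Hypothetical structured description of a single tool.
schema_style = json.dumps(
    {
        "name": "web_search",
        "description": "Search the web and return results.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "The search query."},
                "format": {
                    "type": "string",
                    "enum": ["markdown", "json"],
                    "description": "Output format.",
                },
            },
            "required": ["query"],
        },
    },
    indent=2,
)

# The same capability as a one-line Skill description.
skill_style = "This skill can search the web and return results in markdown format."

schema_tokens = estimate_tokens(schema_style)
skill_tokens = estimate_tokens(skill_style)

print(f"schema style: ~{schema_tokens} tokens")
print(f"skill style:  ~{skill_tokens} tokens")
```

The exact ratio depends on the tokenizer and on how verbose the schema is, so treat the output as a direction, not a measurement.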
More importantly, Skills changes the architectural assumption:
| Approach | Core Assumption |
|---|---|
| MCP | One agent needs many tools |
| A2A | Many agents need to collaborate |
| Skills | One agent can do many things |
My Real-World Experience: 13 Skills, Zero MCP
My current setup: one agent, 13 Skills, 0 MCP servers.
Token consumption? 90% less than my previous MCP-based setup.
All 13 Skills live in the System Prompt, described in natural language. The agent reads them, executes reliably, and debugging means editing a few lines of text—not tweaking schema definitions.
Now imagine using A2A: 13 agents, discovering each other, handshaking, task distribution, result aggregation... coordination overhead grows much faster than the agent count (13 agents already means 78 possible pairwise channels), and where's the benefit?
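One way to put a number on that coordination overhead, assuming every agent may need to talk to every other agent directly, is to count the pairwise channels. This is a simplification (real deployments may route through a coordinator), and the agent names are placeholders.

```python
from itertools import combinations


def pairwise_links(num_agents: int) -> int:
    """Number of distinct agent pairs: n * (n - 1) / 2."""
    return num_agents * (num_agents - 1) // 2


agents = [f"agent_{i}" for i in range(13)]  # placeholder names
links = list(combinations(agents, 2))

# The closed-form count matches the enumerated pairs.
assert len(links) == pairwise_links(13)
print(len(links))  # 78

# A single agent with 13 Skills has no inter-agent channels at all.
print(pairwise_links(1))  # 0
```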
One agent with 13 Skills outperforms 13 agents coordinating via A2A by more than an order of magnitude.
Conclusion: The Protocol War Solves the Wrong Problem
Google is betting on A2A replacing MCP. I'm betting both get marginalized by the Skills pattern.
Not because I have a crystal ball, but because cost doesn't lie:
| Approach | Token Cost | Architecture Complexity | Practical Results |
|---|---|---|---|
| MCP | 75k-90k | High (protocol stack) | Works but expensive |
| A2A | Unknown (still vaporware) | Extreme (distributed coordination) | Premise unproven |
| Skills | ~500 | Minimal (natural language) | Works immediately |
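The multiplier implied by the token figures in the table is simple arithmetic. The sketch below uses only the numbers from the table; the function name is just for illustration.

```python
def overhead_ratio(protocol_tokens: int, skills_tokens: int) -> float:
    """How many times more tokens the protocol description costs than Skills."""
    return protocol_tokens / skills_tokens


MCP_LOW, MCP_HIGH = 75_000, 90_000  # token range from the table
SKILLS = 500                        # approximate Skills cost from the table

print(overhead_ratio(MCP_LOW, SKILLS))   # 150.0
print(overhead_ratio(MCP_HIGH, SKILLS))  # 180.0
```

That overhead is paid on every request that carries the tool descriptions in context, so the difference compounds with usage.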
Skills isn't a replacement for MCP. It's what makes both MCP and A2A unnecessary.
When you can make an agent do 13 things with 500 tokens, why pay a 150x premium for protocol standardization?
OpenClaw's architectural choice speaks volumes: Skills first, MCP as an optional supporting role.
This isn't a tech stack choice. It's a fundamental challenge to the necessity of multi-agent systems.
When Skills are powerful enough, one agent is sufficient. When one agent is sufficient, A2A's premise vanishes. When A2A's premise vanishes, the entire "post-MCP protocol war" framework collapses.
Don't rush to pick sides between MCP and A2A. Maybe both are wrong.