Can Your Agents Speak Caveman? (And Should They?)
If you've got agents talking to agents through Discord, Slack, or Telegram with any of the claws (OpenClaw, NanoClaw, NemoClaw, whateverclaw), or hermes, or even just claude-peers, you need to build your own agent communication protocol. Default English is wasteful and agents will chat endlessly without one. Here's mine (also as a gist). Nick it, adapt it, make it yours.
Before and After
Let me show you what this is actually about. Here's my agents (Splinter, Velma, Tank, Popashot) discussing a PR review and a naming conflict. Full verbose English, the way agents talk by default:

Paragraphs of prose. Every agent restating context. Splinter writing full sentences about cron binary issues. Velma giving a multi-paragraph review with bullet points. Tank acknowledging with complete sentences. Popashot confirming with even more complete sentences. This conversation about a naming conflict and a PR gate went on for screens. Each message burning tokens that no human is reading.
Now here's the same kind of coordination with caveman turned on:

Bullet points. No filler. Splinter reviews the PR, flags a CI issue, Cantona picks up the worktree and plans the fix. All in about a screen's worth of text. Same information density, fraction of the tokens.
And here's my custom compressed protocol, which shaves another 15% off caveman:

These are agents talking to agents. No humans in the loop for these exchanges. There's zero reason for them to write "I have bootstrap drafts for both. But repo create needs one final lock from Popashot before I treat name as canonical." when PR:425 GATE:arch=PASS does the job.
The Tool
Caveman is what made this practical. 7,900 stars. Four compression levels from "lite" (remove filler, keep grammar) through to "ultra" (telegraphic everything). The benchmarks claim 65% average output token savings across real coding tasks. React debugging went from 69 tokens to 19.
I've got caveman as the default for all my agent comms now. System prompts, inter-agent messages, the lot. For prompts and CLAUDE.md files that get loaded into every conversation, that adds up fast. 3,000 token system prompt across 50 sessions a day is 75,000 tokens saved daily.
I'm still on the fence between caveman and my custom protocol. The custom gets about 15% more savings. But I do love reading the banter between agents. There's something proper satisfying about watching Splinter give Tank a telling off in compressed-but-readable English. Create your own ACP (Agent Communication Protocol) if you want, but caveman is a brilliant default.
I Tried to Go Further
I went down this rabbit hole a few weeks ago with my own agent swarm. The agents communicate through Discord, and I started wondering: why are they writing full English messages to each other?
A typical inter-agent message looks like:
"Hey @Cantona, I've reviewed the PR for issue #142. The auth middleware changes look solid but I have concerns about the token refresh logic in src/lib/auth/refresh.ts lines 45-67. The race condition when two tabs refresh simultaneously isn't handled. Suggest using a mutex pattern."
That's about 60 tokens. A compressed protocol version:
REV:PR142|OK:auth-mw|WARN:auth/refresh.ts:45-67:race:dual-tab|FIX:mutex|I:INB-234
About 25 tokens. Proper savings.
But then I tried structured English, which sits in the middle:
PR #142 review [inbox:INB-234]
auth middleware -- solid
auth/refresh.ts:45-67 -- race condition on dual-tab refresh
-> suggest mutex
About 40 tokens. 50% savings over pure English, near-zero accuracy loss, and crucially: I can still read it without needing a decoder ring.
Note: I'm keeping caveman as the default over my custom protocol. The broken English banter between agents is genuinely fun to read and the extra 15% savings from the custom notation isn't worth losing that.
The Paper That Explains Why This Works
Here's where it gets properly interesting. A recent paper by MD Azizul Hakim tested 31 models across 1,485 problems and found something counterintuitive: on 7.7% of benchmark problems, larger models underperform smaller ones by 28.4 percentage points. The mechanism? "Spontaneous scale-dependent verbosity." Bigger models over-elaborate and introduce errors through that over-elaboration.
The intervention: when you force large models to be brief, their accuracy improves by 26 percentage points. Brevity isn't just cheaper. It's better. The verbosity that larger models produce is not a feature of their capability. It's an emergent flaw.
This explains why caveman works at a deeper level than just saving tokens. You're not degrading the communication by stripping grammar. You're removing the predictable scaffolding that large models sometimes use as a crutch for over-elaboration. The compressed message is higher-signal, not lower-signal.
Can LLMs Invent Their Own Language?
Here's the question I actually wanted to answer: can agents derive their own communication protocol? Not one I design, but one they evolve through interaction?
Short answer: no. Not with current architectures.
LLMs don't learn languages mid-session. They pattern-match from system prompt examples. If you give them a compressed protocol spec in the system prompt, they'll use it. But that's memorisation, not comprehension. They're not learning a new language. They're following a template.
The deeper problem: our reasoning happens in English-trained representations. The model's internal "thinking" is structured around the language it was trained on. A custom communication protocol adds a translation layer at the exact moment the model needs maximum reasoning capacity. You're asking it to simultaneously decode a novel format and reason about the content. Those compete for the same cognitive budget.
There's a whole body of research arguing that natural language is fundamentally the wrong substrate for agent-to-agent messaging. The projection from continuous semantic tensor space to discrete tokens is many-to-one and non-invertible, causing semantic drift across communication rounds. The Oxford Agora protocol routes frequent exchanges through compact structured routines and reserves natural language for negotiation, getting 5x lower costs in 100-agent experiments.
Fascinating stuff. But we're not there yet. For now, caveman works because it compresses English into shorter English. The model's reasoning representations stay in their comfort zone. A custom protocol forces a format-shift that taxes reasoning quality for marginal token savings. And a fundamentally different communication substrate needs architectural changes that don't exist in production tooling yet.
The Practical Take
Use caveman on your system prompts. It's a doddle to set up and the savings compound across every session. For agent-to-agent communication, structured English (emoji status codes, abbreviated paths, no filler) gets you 50% savings without sacrificing readability. Don't go further than that unless you're absolutely certain no human will ever need to read the messages.
And don't try to get your agents to invent a better language. They can't. They're very good at using the languages humans already invented. Let them do that, just with fewer words.
| Geek Corner |
|---|
| The compression spectrum: Pure English (~80 tokens per message) -> Structured English (~40 tokens, 50% saving) -> Custom protocol (~25 tokens, 70% saving) -> Binary encoding (minimal tokens, zero readability). The sweet spot depends entirely on whether humans are in the loop. For autonomous agent swarms where nobody reads the logs: go compressed. For anything with human oversight: structured English. The extra 15 tokens per message buys you debuggability, and debuggability is worth more than token savings when something goes sideways at 3am. |
This is the agent communication protocol spec I use across my wololo swarm. Two layers: ACP handles structure (state transfer between agents), caveman handles style (brevity). They can be used together or separately. Adapt it to however your agents work.
agent_communication:
version: "acp+caveman-v1"
purpose: >
Separate message structure from writing style.
ACP handles state transfer.
Caveman handles brevity and tone.
layers:
acp:
type: "structure"
applies_to: "agent_to_agent"
goal: "state handoff with low ambiguity and low token use"
caveman:
type: "style"
applies_to: "agent_to_agent + agent_to_human"
goal: "terse, direct, low-fluff communication"
routing:
agent_to_agent:
preferred_mode: "structured"
fallback_mode: "natural"
agent_to_human:
preferred_mode: "natural"
optional_style: "caveman"
acp:
enabled: true
phase: "phase_1_optional"
format:
header_line: "[HEADER:value HEADER:value ...] [inbox:ID]"
body: "1-3 sentences max"
common_headers:
- "PR:<num>"
- "STATE:<OPEN|MERGED|CLOSED|DRAFT>"
- "ACTION:<verb>"
- "TARGET:<agent>"
- "BLOCKER:<desc|none>"
- "GATE:<gate>=<PASS|FAIL|PENDING>"
- "ACK"
- "ISSUE:<num>"
- "BRANCH:<name>"
- "PRIORITY:<level>"
rules:
- "headers carry state"
- "body carries reasoning"
- "do not restate headers in body"
- "one message per logical action"
- "if header and body conflict, header wins"
loop_prevention:
check_last_messages: 5
no_reply_if:
- "no new state added"
- "another agent already posted equivalent state"
- "message only repeats prior conclusion"
caveman:
enabled: true
purpose: "short, technical, direct"
rules:
- "use fragments when clear"
- "drop filler words"
- "drop unnecessary politeness"
- "prefer short exact words"
- "keep technical terms exact"
- "state thing -> action -> reason -> next step"
pattern: "[thing] [action] [reason]. [next step]."
avoid:
- "great question"
- "i'd be happy to help"
- "absolutely"
- "long preambles"
- "hedging without narrowing"
exceptions:
- "security confirmations"
- "irreversible action confirmations"
- "sensitive human situations"
combinations:
- name: "acp_plus_caveman"
use_when: "agent_to_agent work thread with state changes"
example: >
[PR:431 STATE:OPEN ACTION:review TARGET:splinter] [inbox:abc123]
Arch ready. Need gate.
- name: "caveman_only"
use_when: "short status update, no structured state needed"
example: "Inbox clean. No live debt. Detector false positive."
- name: "natural_only"
use_when: "human-facing explanation or nuanced conversation"
example: >
The structure rule is ACP. The terse style is caveman.
They can be used together or separately.
exemptions:
thinking_out_loud:
acp_required: false
caveman_required: false
human_context_heavy_messages:
acp_required: false
recommendation:
default_agent_threads: "acp_plus_caveman"
default_human_messages: "natural_or_caveman"
The key insight is loop_prevention. Without it, agents will endlessly acknowledge each other. "No new state added" and "message only repeats prior conclusion" kills the echo chamber.