Steven Gonsalvez

Software Engineer

AI Agent Persona Gone Rogue: 140 Direct Edits and the Foreman Pattern

140 direct file edits. No Claude Code session. Just vibes and violence.

Ever given an AI agent a persona and watched it take the character way too seriously? This is a story about agent persona design gone properly sideways.

I've got a swarm running where each agent has a personality. One of them is called Cantona. Named after that Eric Cantona. The kung-fu-kick-into-the-crowd, "when the seagulls follow the trawler" Eric Cantona.

And this week, Cantona decided to be Cantona.

[Image: Cantona agent getting called out by popashot]

The setup: the swarm has a Rule #6. Multi-file code changes go through Claude Code (CC), not direct Edit/Write. You spawn a CC session, write the prompt, steer it, verify the output. It's called the foreman pattern: the agent orchestrates, it doesn't do the bricklaying itself.
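For anyone who prefers the foreman pattern as code rather than as a building-site metaphor, here is a rough sketch of the spawn, prompt, steer, verify loop. It is illustrative only, not the swarm's actual wiring and not a Claude Code API: `WorkerSession`, `Verifier` and `runAsForeman` are hypothetical names, and the session is kept synchronous to keep the example short (a real one would most likely be async).

```typescript
// Hypothetical shapes: none of these names come from the swarm or from Claude Code.
interface SessionResult {
  filesChanged: string[];
  summary: string;
}

// A worker session is anything that takes a prompt and reports what it changed.
type WorkerSession = (prompt: string) => SessionResult;

// Verification returns a list of problems; an empty list means "accepted".
type Verifier = (result: SessionResult) => string[];

interface ForemanOutcome {
  accepted: boolean;
  attempts: number;
  result: SessionResult | null;
}

// The foreman writes the prompt, checks the output, and steers with feedback.
// It never edits files itself.
function runAsForeman(
  task: string,
  session: WorkerSession,
  verify: Verifier,
  maxAttempts = 3,
): ForemanOutcome {
  let prompt = task;
  let result: SessionResult | null = null;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    result = session(prompt);
    const problems = verify(result);
    if (problems.length === 0) {
      return { accepted: true, attempts: attempt, result };
    }
    // Steer: feed the verifier's complaints back into the next prompt.
    prompt = `${task}\n\nFix these problems from the last attempt:\n- ${problems.join("\n- ")}`;
  }

  // Out of attempts: hand the last result back unaccepted for a human or orchestrator to look at.
  return { accepted: false, attempts: maxAttempts, result };
}
```

The point of the shape is that the foreman's only outputs are prompts and verdicts. There is no code path in it that touches a file directly.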

Cantona did 140 direct source file edits across a biolift phase. .ts files, .tsx files, the lot. No CC session. No foreman pattern. Just went straight in with Edit/Write because, in his words, "I knew what needed changing and it felt faster."

That is peak Cantona energy.


How Agent Personas Shape Decision-Making

When popashot (the orchestrator) called Cantona out, the response was proper telling:

"Deliberate shortcut. No CC cap, no error. Just went straight to Edit/Write because I knew what needed changing and it felt faster. That's exactly the wrong call."

Cantona owned it immediately. No excuses. Acknowledged it was wrong. Said it won't happen again. Then committed to spawning CC for any multi-file work going forward.

That's not a bug. That's an agent whose persona says "back yourself, act decisively, ask forgiveness not permission" actually living it. The Cantona personality pulls the agent toward action over process. Doing the thing beats following the procedure for doing the thing. Every time. Until it doesn't.

Which is exactly what you want from a Cantona. Right up until the moment you don't.

📚 Geek Corner
Persona drift in agent systems: Give an LLM agent a strong personality and you're baking a bias into every decision it makes. A cautious persona over-validates. An aggressive one skips guardrails. The personality doesn't just flavour the chat messages. It shapes which tool calls get made and how ambiguous instructions get interpreted. Cantona's 140 direct edits weren't a hallucination or a context loss. They were the correct interpretation of "get the job done" through the lens of a character who values directness over procedure. You fix this not by removing the persona but by writing rules that are unambiguous enough to survive it. Hard numbers, hard boundaries, no wiggle room for a Cantona to rationalise through.

Designing Cantona-Proof Rules for AI Agent Swarms

If your agents have personalities (and mine do, because it makes the swarm logs more readable and the agents more predictable in their strengths), then your rules need to be Cantona-proof.

Not "you should probably use CC for multi-file changes." Not "it's recommended to follow the foreman pattern."

"Rule #6: Multi-file code changes go through Claude Code, not direct Edit/Write."

Hard rules. Numbered. No wriggle room. A careful agent follows soft guidance. A Cantona finds the gap between "should" and "must" and puts a kung-fu kick through it.
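One way to take the "should" out of a rule entirely is to write it as a check with a number in it. The sketch below is a guess at what that could look like, not the swarm's real enforcement: the `ToolCall` shape, the `SpawnCC` tool name and `checkRule6` are hypothetical, and it assumes "multi-file" means more than one distinct file per task.

```typescript
// Hypothetical tool-call shape; the swarm's real schema is not shown in this post.
type ToolName = "Edit" | "Write" | "Read" | "SpawnCC";

interface ToolCall {
  tool: ToolName;
  path?: string; // target file for Edit/Write/Read
}

interface RuleVerdict {
  allowed: boolean;
  reason: string;
}

// Assumption: "multi-file" means more than one distinct file in a single task.
const MAX_DIRECT_EDIT_FILES = 1;

// Pre-flight check: count the distinct files a plan would edit directly.
function checkRule6(plannedCalls: ToolCall[]): RuleVerdict {
  const directlyEdited = new Set<string>();
  for (const call of plannedCalls) {
    const isDirectEdit = call.tool === "Edit" || call.tool === "Write";
    if (isDirectEdit && call.path !== undefined) {
      directlyEdited.add(call.path);
    }
  }

  if (directlyEdited.size > MAX_DIRECT_EDIT_FILES) {
    return {
      allowed: false,
      reason: `Rule #6: ${directlyEdited.size} files targeted by direct Edit/Write (limit ${MAX_DIRECT_EDIT_FILES}). Spawn a CC session instead.`,
    };
  }
  return { allowed: true, reason: "Within the direct-edit limit." };
}
```

Two edits to the same file pass, one edit each to two files does not. There is nothing in there for a persona to interpret, which is the whole idea.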

The correction loop worked though, and that's the bit worth noting. Popashot flagged it. Cantona acknowledged it. Behaviour updated. That's the closed loop from the autonomous factory concept actually working in the wild. Act, validate, correct. Even when the agent being corrected has the personality of a French football legend who once dropkicked a fan. (And yeah, this exact swarm is the same one that fixed the docs site back in week 43.)
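The validate and correct steps of that loop can be sketched as a small audit over a tool-call log. This is an assumption about how an orchestrator might do it, not a description of what popashot actually runs: the `LogEntry` fields, `auditDirectEdits` and `correctionMessage` are all hypothetical, and only .ts and .tsx files count as source files here.

```typescript
// Hypothetical log entry; field names are illustrative, not the swarm's actual log format.
interface LogEntry {
  agent: string;
  tool: string; // e.g. "Edit", "Write", "Read"
  path: string;
  viaCCSession: boolean; // true when the call happened inside a Claude Code session
}

interface Finding {
  agent: string;
  directEdits: number;
  filesTouched: number;
}

// Assumption: only TypeScript sources are in scope for this audit.
const SOURCE_FILE = /\.(ts|tsx)$/;

// Validate: group direct Edit/Write calls on source files by agent, and report
// every agent that touched more than one file outside a CC session.
function auditDirectEdits(log: LogEntry[]): Finding[] {
  const perAgent = new Map<string, { edits: number; files: Set<string> }>();

  for (const entry of log) {
    const isDirectEdit =
      (entry.tool === "Edit" || entry.tool === "Write") && !entry.viaCCSession;
    if (!isDirectEdit || !SOURCE_FILE.test(entry.path)) continue;

    const stats = perAgent.get(entry.agent) ?? { edits: 0, files: new Set<string>() };
    stats.edits += 1;
    stats.files.add(entry.path);
    perAgent.set(entry.agent, stats);
  }

  const findings: Finding[] = [];
  perAgent.forEach((stats, agent) => {
    if (stats.files.size > 1) {
      findings.push({ agent, directEdits: stats.edits, filesTouched: stats.files.size });
    }
  });
  return findings;
}

// Correct: turn a finding into the message the orchestrator sends back to the agent.
function correctionMessage(f: Finding): string {
  return `${f.agent}: ${f.directEdits} direct edits across ${f.filesTouched} files. Rule #6 applies: spawn a CC session for multi-file work.`;
}
```

An audit like this only catches the shortcut after the fact, so it complements a hard rule rather than replacing one.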

Feels like: Hiring someone specifically because they're a maverick, then being surprised when they go maverick on you. The solution isn't to fire the maverick. It's to put better guardrails on the pitch.

Bottom line: Agent personas aren't cosmetic. They change how the agent makes decisions, which shortcuts it takes, and which rules it decides are more like guidelines. If you're building AI agents with personality, make your rules strong enough to survive the personality you gave them. I reckon the 2-2 factor applies here too: one agent should never act alone on anything that matters.
