Steven Gonsalvez

Long read

July 6, 2026|20 min read

The Token Optimisation Playbook

The full playbook for cheap, sharp agent work. Get observability in first (statusline, burndown, OTEL), compress what reaches the model (rtk, headroom), route who does the work (premium delegates, cheap executes, advisor escalation, model-per-task), then work the long tail: caveman, the right browser validator, scoped scans, fewer MCPs, handover, and a memory system. The capstone of the Token Saving series.

Tool

Browser-Harness: Preconfigure Navigation So Your Agent Stops Rewriting It

Browser-harness is a coordinate/CDP-driven browser harness for AI agents. Preconfigure the site navigation once and the agent reuses the structure instead of regenerating scripts every run, which keeps token usage down on repeated browser work.

Jun 18, 2026|1 min read

Tool

Oracle: Let a Cheap Model Consult a Premium One on Demand

Oracle is a CLI and Claude Code skill for getting a second opinion from a stronger model. A lesser model drives, escalates the hard calls to a premium one, and adversarial review goes through the same door. Pay premium only for the 10% that needs it.

Jun 18, 2026|1 min read

Tool

Ponytail: Lazy-Senior-Dev Mode That Writes Less Code

Ponytail is a Claude Code plugin that forces the laziest solution that works. YAGNI, stdlib first, one line over fifty. Less code generated means fewer output tokens, smaller diffs, and less to read back later.

Jun 18, 2026|1 min read

Tool

rtk + Headroom: Trim at the Shell, Compress on the Wire

Two token trimmers that stack: rtk is a Rust CLI proxy that filters dev-command output at the shell (60-90% off ls, grep, git, tests), and Headroom is a local compression proxy that compresses tool output, logs, files and RAG chunks on the wire, with CacheAligner to protect your prompt cache. Real numbers from two days of use.

Jun 18, 2026|3 min read

Tool

Paste Images into Claude Code Over SSH with a Raycast Script

A small Raycast Script Command that copies a clipboard image to a remote machine via Tailscale SSH and puts the bare file path back on your clipboard — so you can paste images into Claude Code sessions running on a remote box.

Jun 17, 2026|2 min read

Blog

Opus vs GPT on Real Ops, Part 2: One Drove, One Was Driven

Opus 4.8 and GPT-5.5 investigate the same anonymous signup failure. Zero human nudges versus three, and a root cause one character wide. Summary post with the full interactive side-by-side linked.

Jun 4, 2026|3 min read

Tool

Progressive Subagents: Score the PR Before You Spawn Eight Agents

Subagents are token guzzlers. Eight in parallel on a PR feels clever and bills like a freelance crew. The fix is a signalling layer that decides how many to spawn and in what order. Part of the token-saving series.

May 17, 2026|2 min read

Blog

The Underappreciation and Rebirth of Warp

Why I keep coming back to Warp, why the tech nerds gave it a bad rep, and why open-sourcing it has just made it the best agentic-era terminal on the market. A walk through Warp Drive, blocks, the new vertical tabs, agent primitives and the notification inbox.

May 11, 2026|15 min read

Blog

Opus vs GPT on Real Ops: Same Brain Food, Different Brains

Opus 4.7, GPT-5.5 and Hermes go head-to-head on a real shotclubhouse incident. Same prompt, same knowledge graph, same MCPs. Causal vs predictive incident response, why the model is the variable not the harness, and what zero-touch ops actually needs.

May 7, 2026|10 min read

Blog | AI-Augmented Development

"Use Claude Code for FREE" is a Trap

Why free AI coding via Nvidia NIM and OpenRouter is a trap. The Cheap-Intelligent-Fast trilemma, 40 RPM rate limits, Opus 4.7 vs GPT-5.5 vs MiniMax M2.7 benchmarks, and why your first AI coding experience should not be the free one.

Apr 26, 2026|18 min read

Blog | AI-Augmented Development

Your Coding Agent's Best Feature Isn't the Code

Why Claude Code beats Codex, Copilot, and every other coding agent in 2026. The developer experience of terminal AI coding agents matters more than the model. Statusline, /insights, hooks, and the features that make the 8th hour feel like the 1st.

Apr 19, 2026|15 min read

Banter

Claude Is Setting Up Hermes Which Is Setting Up NanoClaw and Nobody Is Writing Features

Apr 18, 2026|4 min read

Banter

Anthropic Pulled the Plug on Third-Party Harnesses. Here's What I'm Running Now.

Apr 13, 2026|3 min read

Tool

Token Optimisation 101: Stop Burning Money on AI Coding Agents

How to stop getting rate-limited after an hour on Claude Code, Codex, or Copilot. Context window mechanics, the /effort command, model routing, kicking off new conversations, and the silent token drains most people miss.

Apr 12, 2026|12 min read

Tool

The AI Design Stack: Three Skills and a Workflow That Stops the Slop

Three Claude Code skills for AI-generated UI that doesn't look like slop. UI/UX Pro Max for styles, Impeccable for anti-patterns, Google Stitch DESIGN.md for design tokens. Full workflow for brand-consistent agent output.

Apr 9, 2026|4 min read

Banter

Gemma 4 Is Running On Phones Now and I Don't Think People Realise How Mental That Is

Apr 6, 2026|5 min read

Tool

expect-cli: The Validate Step My Agent Loop Was Missing

expect-cli reads your git diff, generates a test plan via AI, and executes it in a real browser with Playwright. Extracts cookies from your local Chrome/Firefox for authenticated testing. The validate step for agent loops.

Apr 5, 2026|4 min read

Blog | Browser Tools for AI Agents

Browser Tools for AI Agents Part 1: Playwright, Puppeteer, and Why Your Agent Picked Playwright

Playwright for AI agents explained 2026. Why Playwright beat Puppeteer for browser automation, how accessibility trees slash token costs, dev-browser for coding agents, Patchright and Scrapling for anti-bot bypass.

Apr 2, 2026|29 min read

Blog | Browser Tools for AI Agents

Browser Tools for AI Agents Part 2: The Framework Wars (browser-use, Stagehand, Skyvern)

AI browser frameworks compared 2026. browser-use vs Stagehand vs Skyvern: DOM-first vs vision-first architecture, LLM token costs per step, caching strategies, and the expect testing tool for coding agent validation loops.

Apr 2, 2026|15 min read

Blog | Browser Tools for AI Agents

Browser Tools for AI Agents Part 3: Managed Infrastructure and When DIY Stops Making Sense

Managed browser infrastructure for AI agents 2026. Firecrawl vs Browserbase vs Steel vs Bright Data vs Browserless pricing and features compared. Self-hosted vs managed cost analysis and when DIY stops making sense.

Apr 2, 2026|16 min read

Blog | Browser Tools for AI Agents

Browser Tools for AI Agents Part 4: Skip the Browser, Save 80% on Tokens

Save 80% on LLM tokens with content extraction 2026. markdown.new, Jina Reader, Trafilatura compared. Why feeding raw HTML to AI agents wastes tokens and how HTML-to-markdown conversion fixes your context window budget.

Apr 2, 2026|14 min read

Anthropic Just Killed Third-Party Harnesses and I'm Properly Gutted

Can Your Agents Speak Caveman? (And Should They?)

AI Alignment Just Got a Psychological Dimension and It's Properly Unsettling
