Browser Tools for AI Agents Part 3: Managed Infrastructure and When DIY Stops Making Sense
There's a moment in every engineer's life where they're two weeks into building something, surrounded by Docker configs and cron jobs and a spreadsheet tracking proxy rotation, and they think: "Someone must sell this as a service." And someone does. Several someones, actually. The question is whether paying them makes you clever or lazy.
I've been on both sides of that line. Ran my own headless Chrome fleet on Hetzner for a couple of months, complete with a homegrown retry queue and a Grafana dashboard I was unreasonably proud of. Worked a treat until Cloudflare changed their fingerprinting and the entire pipeline went sideways. Was looking at managed options by the end of the week.
This is Part 3 of the browser tools series. Same framing throughout: how do you give your coding agents the right browser infrastructure for a closed loop of research, build, and validate? Not consumer browsing, not manual QA. Agent-driven validation of what you're shipping. Part 1 covered the low-level tools. Part 2 went deep on frameworks and SDKs. Now we're talking about the services that will run Chrome for you, for money, and whether that money is well spent.
The Five Contenders
Five platforms worth taking seriously in this space, each with a different angle on the same problem. I'll go through them one at a time, then we'll do the maths on when DIY actually stops being the sensible choice.
Firecrawl: Content Ingestion for the LLM Age
Firecrawl is the one your AI agent probably already knows about. 103,000 GitHub stars. Apache 2.0 licence. It positions itself as "the web data API for AI" and honestly that's a fair description.
What Firecrawl does differently from rolling your own Playwright scraper is handle the entire pipeline from URL to LLM-ready markdown. You give it a URL, it renders the JavaScript, strips the boilerplate, and hands you back clean structured content that won't burn half your context window on nav bars and cookie banners. P95 latency of 3.4 seconds, claims 96% web coverage. In my experience the coverage number is roughly accurate for English-language content, drops a fair bit for sites with aggressive anti-bot or heavy client-side rendering behind auth walls.
The pricing uses a credit system and this is where it gets a bit sneaky. The $16/month Hobby plan gives you 3,000 credits. One credit per page for basic scraping. Sounds decent until you turn on "Enhanced Mode" for anti-bot sites, which costs 5 credits a pop, and then add JSON structured output, which doubles it again. So your 3,000-credit plan is suddenly 300 pages if you need the good stuff. The $83/month Standard plan with 100,000 credits is where it starts making economic sense for anything beyond a weekend project.
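To make the credit arithmetic concrete, here's a tiny sketch using the per-feature costs quoted above. The credit figures are as described in this post; check Firecrawl's current pricing page before relying on them.

```python
def pages_per_plan(credits: int, enhanced: bool = False, json_output: bool = False) -> int:
    """Rough pages-per-month for a Firecrawl credit bundle.

    Assumed credit costs, per the figures quoted above:
    basic scrape = 1 credit, Enhanced Mode = 5 credits,
    and JSON structured output doubles whatever you're paying.
    """
    cost = 5 if enhanced else 1
    if json_output:
        cost *= 2
    return credits // cost

# Hobby plan: 3,000 credits
print(pages_per_plan(3_000))                                    # 3000 basic pages
print(pages_per_plan(3_000, enhanced=True, json_output=True))   # 300 pages
# Standard plan: 100,000 credits with everything switched on
print(pages_per_plan(100_000, enhanced=True, json_output=True)) # 10000 pages
```

Ten credits a page turns a generous-sounding bundle into a few hundred pages surprisingly quickly.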
There's also an /extract endpoint that uses a separate token-based billing model. Completely independent from the credit system. Two billing models in one product. Classic SaaS move.
| Geek Corner |
|---|
| The self-hosted caveat: Firecrawl's open-source version on GitHub is a subset, not a mirror. You get the basic scraping engine but you lose their proprietary "Fire-engine" that handles the hard anti-bot stuff. No proxy rotation. No browser sandbox. No agent mode. No dashboard. You're essentially getting a slightly fancier Playwright wrapper. It's useful for simple sites and saves you writing your own HTML-to-markdown converter, but if you're self-hosting because you thought you'd get the cloud features for free, you'll be disappointed. You also need to bring your own LLM API key for the extraction features, which adds another cost nobody mentions on the landing page. |
Feels like: Ordering a flat white at a cafe. Yes, you could make one at home with a kettle and an AeroPress. But theirs has the fancy milk art and they've already cleaned up.
Bottom line: Firecrawl is brilliant for bulk content ingestion where you need markdown or structured JSON at the other end. Less suitable as a general-purpose browser automation tool. If your agent needs to click buttons, fill forms, navigate multi-step workflows, you'll outgrow it quickly.
Browserbase: The Stagehand Company
Browserbase is interesting because it's really two products wearing a trenchcoat. The first product is the remote browser infrastructure. Spin up Chrome instances in the cloud via API, connect with Playwright or Puppeteer, do your thing. The second product is Stagehand, their open-source AI browser automation framework with 50k+ GitHub stars and half a million weekly downloads. Stagehand gives you natural-language browser control with three atomic primitives (act, extract, observe) and an agent mode that chains them together.
The strategy is obvious and honestly quite smart: give away the framework, charge for the infrastructure. Stagehand works locally with your own browser, but the moment you need concurrent sessions, stealth mode, session recording, or proxy rotation, you're pointed at Browserbase's cloud. It's the Red Hat model adapted for browser automation.
Pricing starts free with 1 browser hour (which is really just enough to verify it works). The $20/month Developer plan gets you 100 hours. The $99/month Startup plan gets 500 hours, 100 concurrent browsers, 5GB of proxy data. Overage is $0.10 per browser hour. That's roughly $0.0017 per minute of browser time once you're past the included hours.
One thing I appreciate about Browserbase is that CAPTCHA solving is included free on all plans. No per-solve charges. With Steel and Browserless, CAPTCHA solving is either extra or limited, which adds up fast on sites that throw CAPTCHAs every third page load.
Feels like: Renting a really nice workshop instead of building one in your garage. The tools are there, the ventilation works, someone else sweeps up. You just have to accept it's not yours.
Bottom line: If you're building an AI agent that needs to interact with the web in complex ways and you don't want to manage infrastructure, Browserbase with Stagehand is the most ergonomic option going. The pricing is fair below 500 hours. Above that, start doing your sums.
Steel Browser: The Self-Hostable Underdog
Steel is what you reach for when you look at Browserbase's pricing page and think "I could run that myself." Because you actually can. Apache 2.0 licence, 6.8k stars, and a Docker one-liner that genuinely works:
```bash
docker run -p 3000:3000 -p 9223:9223 ghcr.io/steel-dev/steel-browser
```
That gives you a browser API server with session management, cookie persistence, anti-detection plugins, Chrome extension support, and a debugging UI. On your own hardware. No monthly bill beyond whatever you're paying for the VPS.
The cloud offering starts with a free tier of 100 browser hours per month, which is notably more generous than Browserbase's 1 hour. The $29/month Starter and $99/month Developer plans include credits at decreasing per-hour rates ($0.10 down to $0.05/hour at the Pro tier). CAPTCHA solving is billed separately at $3-3.50 per thousand solves depending on your plan.
| Geek Corner |
|---|
| Steel vs Browserbase, the honest comparison: Steel's self-hosted offering gives you more raw capability for less money if you're comfortable managing Docker containers. Session management, proxy support, anti-detection, all there. What you lose compared to Browserbase is the tighter integration with agentic frameworks (CrewAI, LangChain, MCP), the free CAPTCHA solving, and the session recording/replay features. Steel also has a thinner community ecosystem. If your use case is "I need browsers in the cloud and I don't want to manage servers," Browserbase wins. If your use case is "I need browsers on my own infrastructure and I'm fine being the ops team," Steel wins. The features are roughly at parity. The difference is who maintains the boxes. |
Feels like: A flat-pack kitchen from IKEA. All the pieces are there, the instructions mostly make sense, and the end result is perfectly functional. But you're assembling it yourself on a Sunday afternoon, and there will be leftover screws that worry you.
Bottom line: Steel is the best self-hosted browser API available right now. If you've got a VPS and docker-compose skills, it's genuinely hard to justify paying Browserbase's monthly fees for equivalent functionality. The trade-off is your time, which, depending on your situation, might actually be the more expensive resource.
Bright Data: The Enterprise Nuclear Option
Right, let's talk about the 800-pound gorilla. Bright Data has been in the proxy game since before "AI agent" was a phrase anyone used. Their pitch for the Agent Browser is simple: 400 million IPs in 195 countries, unlimited concurrent sessions, automatic fingerprint rotation, autonomous CAPTCHA solving, and every anti-detection trick invented in the last decade. All piped through a real GUI browser (not headless) that makes bot detection systems think your scraper is a person in São Paulo using Firefox on their lunch break.
The catch is the price. Pay-as-you-go is $8 per GB of data transferred. Not per page. Per gigabyte. The starter plan is $499/month for 71GB, working out to about $7/GB. Enterprise is $1,999/month for 399GB at $5/GB. If you're doing image-heavy scraping or pulling PDFs, those gigabytes go fast.
For context, a typical news article page is roughly 2-3MB. At $8/GB, that's about 333-500 pages per gigabyte, or somewhere around 40-60 pages per dollar. At $499/month you're looking at somewhere between 23,000 and 35,000 pages depending on content weight. An equivalent Hetzner VPS running Playwright costs under four euros a month and can do the same volume in a day, assuming the sites don't block you. Which is the entire point of Bright Data. The sites don't block you when you're routing through 400 million residential IPs.
| Geek Corner |
|---|
| Why Bright Data's pricing is per-GB, not per-page: It's because the actual cost to Bright Data isn't the browser session, it's the proxy bandwidth. Residential IP bandwidth is expensive. Those IPs come from real devices on real ISPs with real data caps. The per-GB model reflects the underlying cost structure. It also means your bill is wildly unpredictable if you're scraping sites with varying page weights. A JavaScript-heavy SPA that pulls 15MB per page will cost you 5x more than a lightweight blog, even though the information extracted might be identical. Plan accordingly, or use their Scraper API which has per-request pricing instead. |
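The bandwidth arithmetic above generalises into a one-liner. A quick sketch: the page weights are illustrative, and I'm using decimal gigabytes (1 GB = 1000 MB) to keep the numbers round.

```python
def pages_for_bandwidth(gb: float, page_mb: float) -> int:
    """How many pages fit in a bandwidth allowance (decimal units)."""
    return int(gb * 1000 / page_mb)

# The $499/month plan buys 71GB; a news page weighs roughly 2-3MB
print(pages_for_bandwidth(71, 3))   # 23666 heavy pages
print(pages_for_bandwidth(71, 2))   # 35500 lighter pages
# A 15MB JavaScript-heavy SPA page, for comparison
print(pages_for_bandwidth(71, 15))  # 4733 pages for the same money
```

Same budget, five times fewer pages, purely down to page weight. That's the unpredictability the Geek Corner is on about.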
Feels like: Hiring a private military contractor to get your parcel through customs. Wildly overkill for most situations, but when the situation actually calls for it, nothing else will do.
Bottom line: Bright Data makes sense when you're scraping sites that actively fight back with sophisticated bot detection, when you need geo-specific data from 195 countries, or when your business depends on data that justifies a four-figure monthly bill. For scraping a few thousand blog posts into your RAG pipeline, you're setting fire to money.
Browserless: The Flexible Middle Ground
Browserless sits in an interesting niche. It's the most "infrastructure-y" of the bunch, less opinionated about what you're doing with the browser and more focused on just giving you Chrome-as-a-service that works reliably. 12.9k GitHub stars. Docker image that spins up in seconds. Works with both Puppeteer and Playwright out of the box.
The cloud pricing uses "units" where one unit equals 30 seconds of browser time. The free tier gives you 1,000 units (roughly 8.3 hours of browser time). The $25/month Prototyping plan gets 20k units. The $140/month Starter plan gets 180k units with 40 concurrent browsers. The $350/month Scale plan pushes to 500k units and 100 concurrent sessions.
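The unit model converts to hours easily enough. A small helper using the plan figures above (the unit definition is as quoted in this post):

```python
UNIT_SECONDS = 30  # one Browserless unit = 30 seconds of browser time

def units_to_hours(units: int) -> float:
    return units * UNIT_SECONDS / 3600

print(units_to_hours(1_000))    # free tier: ~8.3 hours
print(units_to_hours(20_000))   # $25 Prototyping plan: ~167 hours
print(units_to_hours(180_000))  # $140 Starter plan: 1500 hours
```

Worth noting that 1,500 hours for $140 is one of the cheaper per-hour rates in this roundup, if raw browser time is all you need.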
What makes Browserless compelling beyond just "Chrome in the cloud" is BrowserQL, their query language for stealth automation. It's available on the cloud and enterprise tiers and handles CAPTCHA solving, anti-detection evasion, and fingerprint management through a declarative syntax. Think of it as a higher-level abstraction over the usual Playwright/Puppeteer API, specifically designed for sites that don't want to be scraped.
The V2 release added session persistence (cookies and cache surviving between sessions), session replay for debugging, Chrome extension loading, and hybrid automations that let you stream live browser sessions during script execution. That last one is proper useful for debugging agents that go off the rails.
The self-hosted story is where it gets complicated. You can run the Docker image free for non-commercial use. For commercial use, you need a licence. And the code is SSPL-1.0, the same licence MongoDB famously switched to. SSPL is source-available rather than OSI-approved open source, and it prevents you from offering Browserless as a managed service to others. For internal use it's fine. For building a product on top of it, read the licence carefully.
Feels like: A Swiss Army knife. Not the best at any single thing, but competent at everything and fits in your pocket.
Bottom line: Browserless is the right choice if you want flexibility across use cases (scraping, testing, PDF generation, screenshots) without committing to one vendor's agent framework. The self-hosted option is genuinely viable if you accept the SSPL constraints. The cloud pricing is competitive at medium scale.
The Maths: When Does DIY Stop Winning?
Right then. The bit everyone actually wants to know. I've been running the numbers for a while and here's a rough break-even table. It assumes a Hetzner CX31 (4 vCPU, 8GB RAM, roughly 7 EUR/month) running Playwright in Docker with a basic retry queue, plus your engineer time, which is very much not free, to maintain it.
| Monthly Pages | DIY Cost (Hetzner) | Browserbase ($99) | Steel Self-Hosted | Browserless ($140) | Bright Data ($499) |
|---|---|---|---|---|---|
| 1,000 | ~7 EUR + time | $99 (way overkill) | ~7 EUR + time | $140 (way overkill) | $499 (madness) |
| 10,000 | ~7 EUR + time | $99 | ~7 EUR + time | $140 | $499 |
| 50,000 | ~14 EUR + time | $99 | ~14 EUR + time | $140 | $499 |
| 100,000 | ~21 EUR + time | $198 (overage) | ~21 EUR + time | $140 | $499 |
| 500,000 | ~42 EUR + time | ~$990 | ~42 EUR + time | $350 | $999 |
That "time" column is doing a lot of heavy lifting. If you value your time at zero (student project, learning exercise, you genuinely enjoy debugging Chrome crashes), DIY wins at every scale. The VPS costs are absurdly cheap.
But the moment you factor in the 2am Cloudflare rotations, the Chrome memory leaks that crash your container every 48 hours, the proxy rotation you'll need to build yourself, the CAPTCHA solving integration, the session management, the retry logic for flaky pages... the maths shifts quickly.
For simple, friendly sites? DIY wins below 50,000 pages per month and it's not even close. That 7 EUR Hetzner box will chew through cooperative websites all day long.
For anti-bot sites? Managed wins from page one. Your Hetzner Playwright setup will get blocked on the first request to any site running Cloudflare Bot Management, DataDome, or PerimeterX. You'll spend three days building fingerprint rotation, another two on proxy integration, and then they'll change their detection and you're back to square one. The managed services have entire teams solving this problem full-time. You do not.
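If you want to plug in your own numbers, here's the shape of the calculation I use. Everything here is an assumption on my part, not a measurement: the maintenance hours, the engineer rate, and the ~5 seconds of browser time per page. The Browserbase figures ($99 Startup plan, 500 included hours, $0.10/hour overage) are as quoted earlier.

```python
def diy_monthly_cost(vps_eur: float = 7.0, maint_hours: float = 4.0,
                     rate_eur: float = 80.0) -> float:
    """VPS rent plus the engineer time spent babysitting it (my assumptions)."""
    return vps_eur + maint_hours * rate_eur

def browserbase_monthly_cost(pages: int, secs_per_page: float = 5.0,
                             plan_usd: float = 99.0,
                             included_hours: float = 500.0,
                             overage_per_hour: float = 0.10) -> float:
    """Startup plan plus per-hour overage, assuming ~5s of browser time a page."""
    hours = pages * secs_per_page / 3600
    overage = max(0.0, hours - included_hours) * overage_per_hour
    return plan_usd + overage

# 100,000 pages/month at ~5s each is ~139 browser hours: inside the plan
print(browserbase_monthly_cost(100_000))      # 99.0
# DIY with your time valued at zero is just the VPS
print(diy_monthly_cost(maint_hours=0))        # 7.0
# DIY with 4 hours/month of babysitting at 80 EUR/hour
print(diy_monthly_cost())                     # 327.0
```

The last line is the uncomfortable one: at any realistic hourly rate, a few hours of maintenance a month dwarfs the VPS bill, which is exactly why the "time" column in the table above does so much heavy lifting.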
The Honest Crossover
Here's the framework I use now.
If the site serves content without fighting back, run Playwright on a cheap VPS and don't overthink it. Most documentation sites, blogs, public APIs with HTML endpoints, government data portals, these all scrape trivially. You don't need a service for this. You need a cron job.
If the site mildly fights back (basic rate limiting, simple CAPTCHAs), Firecrawl or Browserless gets you past it without building infrastructure. Firecrawl if you just want the content as markdown. Browserless if you need to interact with the page first.
If you need browser automation at scale with good developer ergonomics and you don't want to manage servers, Browserbase plus Stagehand is the strongest package. The framework is genuinely good and the infrastructure is solid.
If you want the same capability but on your own terms and your own hardware, Steel Browser gives you that. Docker, your VPS, your rules.
If the site actively fights back with sophisticated bot detection and you need data from it badly enough to justify the spend, Bright Data is the only game in town that consistently wins against top-tier anti-bot systems. That 400M IP network exists for a reason.
The Comparison Grid
| | Firecrawl | Browserbase | Steel | Bright Data | Browserless |
|---|---|---|---|---|---|
| What it is | Web content API for LLMs | Remote browser infra | Self-hostable browser API | Enterprise proxy + browser | Chrome-as-a-service |
| GitHub stars | 103k | N/A (Stagehand: 50k+) | 6.8k | N/A (CLI only) | 12.9k |
| Licence | Apache 2.0 (limited self-host) | Proprietary | Apache 2.0 | Proprietary | SSPL-1.0 |
| Self-hosted? | Partial (no Fire-engine) | No | Yes, full | No | Yes, with licence |
| Free tier | 500 lifetime credits | 1 browser hour | 100 hours/month | ~$25 trial credits | 1k units |
| Cheapest paid | $16/month | $20/month | $29/month | $499/month | $25/month |
| Anti-bot | Cloud only (Fire-engine) | Stealth mode + proxies | Stealth plugins | 400M+ residential IPs | BrowserQL |
| CAPTCHA solving | Not included | Free on all plans | $3-3.50 per 1k solves | Included | 10 units per solve |
| Best for | Content ingestion | Agent automation | Self-hosted infra | Heavily protected sites | Flexible automation |
| Not great for | Multi-step interactions | Budget-conscious at scale | Framework integrations | Budget-conscious anything | Building a competing SaaS |
What's Next
Part 4 will cover the other side of this coin: tools that extract structured content without running a browser at all. Readability algorithms, LLM-based extraction, and the surprisingly effective approach of just asking the search engine for the data instead of scraping the source. Sometimes the best browser is no browser.
If you missed the earlier parts, Part 1 covers the low-level tools and Part 2 digs into the frameworks and SDKs. And if you want to see some of these tools in action, the vibe coding arc post has examples of agents actually using browser tools to ship code.