Steven Gonsalvez

Software Engineer


markdown.new + Jina Reader: Stop Feeding Your LLM Raw HTML

Two tools for converting web pages to clean markdown. markdown.new runs on Cloudflare's edge; Jina Reader works by prefixing any URL with r.jina.ai/. Both can cut token usage by 80% or more.


Same problem, two fixes

Raw HTML is a token furnace. You feed a web page into your context window and 80% of the budget goes on <div> tags and inline styles that add absolutely nothing. Rubbish way to spend your tokens.
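
To get a feel for how much of a page is markup, here is a rough sketch using only Python's standard library. It counts characters rather than tokens, which is only a proxy, and the `TextOnly` class, `markup_share` function and `sample` snippet are made up for illustration.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect visible text, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0  # depth inside script/style elements

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def markup_share(html: str) -> float:
    """Fraction of characters that are NOT visible text (a crude token proxy)."""
    if not html:
        return 0.0
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so indentation is not counted as content.
    text = " ".join("".join(parser.parts).split())
    return 1 - len(text) / len(html)


# Hypothetical snippet: two wrapper divs and an inline style around one sentence.
sample = (
    '<div class="card" style="margin:0;padding:12px">'
    '<div class="inner"><p>Hello, world.</p></div></div>'
)
print(f"{markup_share(sample):.0%} of characters are markup")
```

Real pages vary a lot, so treat the number as a ballpark rather than a measurement of what your tokenizer will see.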

markdown.new runs on Cloudflare's edge. Paste a URL, get clean markdown back in under a second. No signup, 500 requests a day free. Their own numbers: 16,180 tokens of HTML down to 3,150 as markdown. That's roughly five times as much content per context window. Proper mint.
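
The arithmetic behind those figures is quick to check. The two token counts are the ones quoted above; the `savings` helper is just a name picked for this sketch.

```python
# Token counts quoted from markdown.new's own comparison.
html_tokens = 16_180
markdown_tokens = 3_150


def savings(before: int, after: int) -> tuple:
    """Return (fraction of tokens saved, how many times as much content fits)."""
    return 1 - after / before, before / after


saved, multiplier = savings(html_tokens, markdown_tokens)
print(f"saved {saved:.1%}, fits {multiplier:.1f}x the content")
# prints: saved 80.5%, fits 5.1x the content
```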

Jina Reader is even simpler. Stick r.jina.ai/ in front of any URL and it does the conversion. r.jina.ai/https://example.com gives you markdown straight back. Handles PDFs too, does image captioning, and you can target specific CSS selectors if you only want part of a page.
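
As a minimal sketch of the prefix trick in Python, using only the standard library: `jina_reader_url` and `fetch_markdown` are illustrative names, and the `X-Target-Selector` header used for CSS targeting is an assumption here, so check it against Jina's current docs before relying on it.

```python
from typing import Optional
from urllib.request import Request, urlopen

JINA_PREFIX = "https://r.jina.ai/"


def jina_reader_url(target: str) -> str:
    """Put the Jina Reader prefix in front of a full URL (scheme included)."""
    if not target.startswith(("http://", "https://")):
        raise ValueError("expected a full URL such as https://example.com")
    return JINA_PREFIX + target


def fetch_markdown(target: str, selector: Optional[str] = None,
                   timeout: float = 30.0) -> str:
    """Fetch a page through Jina Reader and return the response body as text."""
    headers = {"User-Agent": "markdown-fetch-sketch"}
    if selector:
        # Assumption: the CSS selector is passed via this request header.
        headers["X-Target-Selector"] = selector
    request = Request(jina_reader_url(target), headers=headers)
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Building the URL needs no network access.
print(jina_reader_url("https://example.com"))
# prints: https://r.jina.ai/https://example.com
```

Calling `fetch_markdown("https://example.com")` would make the actual request; it is left out of the example so the block runs offline.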

I reach for Jina when I need the extras like PDF parsing or JSON extraction. For quick page grabs where speed matters, markdown.new is the one. Either way, stop chucking raw HTML at your models.
