Deepgram: $200 Free STT That Makes Voice Coding Actually Work
Speech-to-text API with a generous free tier that turns voice into a viable input for coding agents. Fast enough you forget it is there.
Why This Matters for Coding Agents
Voice input for coding used to be a gimmick. Whisper was slow. Commercial options cost a fortune. The latency between speaking and text appearing was long enough to break your train of thought.
Deepgram changed the maths. Their Nova-3 model does real-time streaming transcription fast enough that the text appears as you speak, not after. And the free tier gives you $200 in credit, which is roughly 12,000 minutes (about 200 hours) of transcription, depending on the model you pick. That's a lot of talking before you pay a penny.
The Vibe Coding Angle
Wire Deepgram into any STT tool (justspeaktoit, a custom script, whatever) and suddenly voice is a real input method for your coding agents. "Refactor the auth middleware to use the new token format" spoken out loud, transcribed in under 200ms, piped into Claude Code. No typing. No context switch.
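The hand-off from transcript to agent can be sketched in a few lines of Python. This is a rough sketch, not how justspeaktoit or any other tool does it: it assumes the transcript is already a string and that the agent is a CLI taking the prompt as its last argument. The helper names send_to_agent and clean_transcript are made up for illustration, and the default command assumes Claude Code's non-interactive "claude -p" mode is on your PATH.

```python
import subprocess


def clean_transcript(text: str) -> str:
    """Collapse runs of whitespace so the prompt arrives as one tidy line."""
    return " ".join(text.split())


def send_to_agent(transcript: str, agent_cmd=("claude", "-p")):
    """Pass a spoken instruction to a CLI agent and return its stdout.

    agent_cmd is an assumption: any command that accepts the prompt as its
    final argument will do. Returns None when nothing was said.
    """
    prompt = clean_transcript(transcript)
    if not prompt:
        return None  # silence or an empty transcript: do not call the agent
    result = subprocess.run(
        [*agent_cmd, prompt],
        capture_output=True,
        text=True,
        check=True,  # raise if the agent exits with an error
    )
    return result.stdout
```

Passing the prompt as a list element rather than through a shell string means quotes or semicolons in the transcript cannot be read as shell syntax.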
The accuracy on technical speech is surprisingly good. It handles "refactor," "middleware," "useState," "async await" without flinching. Not perfect on obscure library names, but proper solid on the vocabulary you actually use while coding.
Getting Started
# Sign up at deepgram.com, grab your API key
# $200 free credit, no card required
# Quick test with curl
curl -X POST "https://api.deepgram.com/v1/listen" \
  -H "Authorization: Token YOUR_KEY" \
  -H "Content-Type: audio/wav" \
  --data-binary @audio.wav
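If you would rather script it than curl it, the same request can be sketched in Python using only the standard library. Treat this as a minimal sketch under a few assumptions: the function names are invented for illustration, the model query parameter is set to nova-3 explicitly (the curl example above leaves it out), and the transcript is read from the first channel's first alternative in the JSON response.

```python
import json
import urllib.request

DEEPGRAM_URL = "https://api.deepgram.com/v1/listen"


def build_request(api_key: str, audio: bytes, model: str = "nova-3") -> urllib.request.Request:
    """Build the same POST as the curl example, plus an explicit model choice."""
    return urllib.request.Request(
        f"{DEEPGRAM_URL}?model={model}",
        data=audio,
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "audio/wav",
        },
        method="POST",
    )


def extract_transcript(response: dict) -> str:
    """Return the top transcript for the first channel of the JSON response."""
    return response["results"]["channels"][0]["alternatives"][0]["transcript"]


def transcribe_file(path: str, api_key: str) -> str:
    """Send a WAV file to Deepgram and return the transcript (network call)."""
    with open(path, "rb") as f:
        request = build_request(api_key, f.read())
    with urllib.request.urlopen(request, timeout=30) as reply:
        return extract_transcript(json.load(reply))
```

Calling transcribe_file("audio.wav", your_key) should return the spoken text as a plain string. Keep the key in an environment variable rather than in the script.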
Or just install justspeaktoit, which wraps Deepgram in a macOS menu bar app. Press the hotkey, speak, text appears. Sorted.