Modern podcasts hold hours of knowledge, but only if you can search them. At @TranscriptedAI, we’ve converted 6,000+ podcast episodes into a RAG-ready knowledge base—turning 15,000 hours of spoken content into data that AI can search and cite in seconds. Here’s our pipeline: 🧵
Step 1: Transcription 🎙️→📝 We use Deepgram's Nova-3 model for accurate, punctuated text with speaker diarization. Why @DeepgramAI? High accuracy, lower costs, built-in diarization, and simple webhook callbacks for async processing.
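The async flow above can be sketched roughly like this. A minimal sketch against Deepgram's pre-recorded audio endpoint; the helper names and error handling here are illustrative, not our production code:

```python
import json
import urllib.request

# Deepgram's pre-recorded audio endpoint.
DEEPGRAM_URL = "https://api.deepgram.com/v1/listen"

def build_request(callback_url: str) -> dict:
    """Assemble query parameters for an async transcription job."""
    return {
        "model": "nova-3",         # accurate, punctuated transcripts
        "diarize": "true",         # label speakers: Speaker 0, Speaker 1, ...
        "punctuate": "true",
        "callback": callback_url,  # Deepgram POSTs results here when done
    }

def submit_episode(audio_url: str, callback_url: str, api_key: str) -> str:
    """Kick off transcription for one episode; returns the request id
    we later use to correlate the webhook callback."""
    qs = "&".join(f"{k}={v}" for k, v in build_request(callback_url).items())
    req = urllib.request.Request(
        f"{DEEPGRAM_URL}?{qs}",
        data=json.dumps({"url": audio_url}).encode(),
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["request_id"]
```

Because the job runs async, the submit call returns immediately and the transcript arrives later at our callback endpoint.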
Our callback system is bulletproof: • Validates & persists raw payloads to Cloud Storage • Uses distributed locks for exactly-once processing • Handles race conditions and prevents double-processing • Stores ALL data before any mutations run No lost information, ever. 🔒
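The persist-then-lock pattern above, sketched with an abstract store. Assumption: in production the set-if-absent step would be an atomic primitive like Redis `SET ... NX EX` or a Firestore transaction; a plain dict stands in here to show the ordering:

```python
import uuid

class CallbackProcessor:
    """Exactly-once webhook handling (sketch).

    `store` is any mapping with atomic set-if-absent semantics; a dict
    stands in for a distributed lock backend (hypothetical wiring).
    """

    def __init__(self, store: dict):
        self.store = store

    def try_acquire(self, request_id: str) -> bool:
        """True iff this worker won the lock for this callback."""
        token = str(uuid.uuid4())
        # dict.setdefault models the distributed equivalent:
        # SET lock:{request_id} {token} NX EX {ttl}
        return self.store.setdefault(f"lock:{request_id}", token) == token

    def handle(self, request_id: str, payload: dict, persist, process) -> str:
        # 1. Persist the raw payload BEFORE any mutation runs,
        #    so a crash mid-processing never loses data.
        persist(request_id, payload)
        # 2. Only the lock winner processes; duplicate deliveries
        #    (webhook retries, races) exit quietly.
        if not self.try_acquire(request_id):
            return "duplicate"
        process(payload)
        return "processed"
```

Note the ordering: raw storage happens even for duplicates, so replays are always reconstructable from Cloud Storage.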
Step 2: Speaker Identification Deepgram gives us “Speaker 0, Speaker 1…” We send structured prompts to Gemini 2.0 Flash to map those labels to real names, like “Joe Rogan” or the episode’s guests. We only need a portion of the transcript—Gemini handles imperfect diarization beautifully.
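The mapping step looks roughly like this. A sketch of the prompt-and-parse shape; the prompt wording and the fence-stripping heuristic are illustrative assumptions, and the actual Gemini API call is elided:

```python
import json

def build_speaker_prompt(excerpt: str, episode_title: str) -> str:
    """Structured prompt asking the model to resolve diarized labels.
    (Hypothetical wording; the production prompt differs.)"""
    return (
        f"Episode: {episode_title}\n"
        "Below is a diarized transcript excerpt. Map each generic label "
        "(Speaker 0, Speaker 1, ...) to the speaker's real name.\n"
        'Respond with JSON only, e.g. {"Speaker 0": "Joe Rogan"}.\n\n'
        f"{excerpt}"
    )

def parse_speaker_map(model_reply: str) -> dict:
    """Parse the model's JSON, tolerating markdown code fences
    that LLMs sometimes wrap around their output."""
    cleaned = model_reply.strip().strip("`")
    if cleaned.startswith("json"):
        cleaned = cleaned[4:]
    return json.loads(cleaned)

def apply_speaker_map(transcript: str, mapping: dict) -> str:
    """Rewrite generic labels to real names throughout the transcript."""
    for label, name in mapping.items():
        transcript = transcript.replace(label, name)
    return transcript
```

Only a short excerpt goes into the prompt; the returned mapping is then applied across the full transcript.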