Speech-to-Text (STT)
AI software that converts spoken audio into written text; it serves as the "ears" of a voice AI agent. Streaming STT, which transcribes audio while the speaker is still talking instead of waiting for the recording to end, is required for low-latency voice agents. Leading STT options in 2026: Deepgram for English, AssemblyAI for batch transcription, and Whisper for self-hosted deployments.
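The difference between batch and streaming STT can be illustrated with a minimal sketch. Everything here is hypothetical: `fake_decode` is a stand-in for a real speech model (each "audio chunk" is just the bytes of one word), and `Transcript`, `batch_stt`, and `streaming_stt` are illustrative names, not the API of Deepgram, AssemblyAI, or Whisper. The point is only the shape of the output: batch returns one result after all audio has arrived, while streaming emits interim results chunk by chunk.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class Transcript:
    text: str
    is_final: bool  # False for interim (partial) results


def fake_decode(chunk: bytes) -> str:
    """Hypothetical stand-in for a real STT model: each chunk is simply
    the UTF-8 bytes of one word, so no audio decoding happens here."""
    return chunk.decode("utf-8")


def batch_stt(chunks: Iterable[bytes]) -> Transcript:
    """Batch mode: consume all the audio, then return one transcript."""
    words = [fake_decode(c) for c in chunks]
    return Transcript(" ".join(words), is_final=True)


def streaming_stt(chunks: Iterable[bytes]) -> Iterator[Transcript]:
    """Streaming mode: yield an interim transcript after every chunk,
    then a final transcript once the audio stream ends."""
    words: List[str] = []
    for chunk in chunks:
        words.append(fake_decode(chunk))
        yield Transcript(" ".join(words), is_final=False)
    yield Transcript(" ".join(words), is_final=True)


if __name__ == "__main__":
    audio = [b"book", b"a", b"call"]
    for result in streaming_stt(audio):
        label = "final" if result.is_final else "interim"
        print(f"[{label}] {result.text}")
```

In a voice agent, the interim results are what allow downstream logic to start working before the caller finishes speaking, which is why streaming matters for latency.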