Speech-to-Text (STT)

AI software that converts spoken audio into written text: the "ears" of a voice AI agent. Streaming STT, which returns partial transcripts while the speaker is still talking rather than waiting for the full recording, is required for low-latency voice agents. Leading STT options in 2026: Deepgram for English, AssemblyAI for batch transcription, and Whisper for self-hosted deployments.
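To make the streaming versus batch distinction concrete, here is a minimal sketch in Python. It does not call any real STT provider: `fake_decode`, `batch_transcribe`, `stream_transcribe`, and the `Transcript` type are hypothetical names invented for illustration, and the "audio" chunks are plain text bytes standing in for audio frames. The point is only the shape of the output: a batch call returns one final transcript, while a streaming call yields growing partial transcripts before the final one.

```python
from typing import Iterable, Iterator, NamedTuple


class Transcript(NamedTuple):
    text: str
    is_final: bool  # False for interim (partial) results


def fake_decode(chunk: bytes) -> str:
    # Assumption: stand-in for a real speech model. Each "audio" chunk
    # here is simply UTF-8 text, so decoding it gives one word.
    return chunk.decode("utf-8")


def batch_transcribe(chunks: Iterable[bytes]) -> Transcript:
    # Batch: consume all audio first, then return a single final transcript.
    words = [fake_decode(c) for c in chunks]
    return Transcript(" ".join(words), True)


def stream_transcribe(chunks: Iterable[bytes]) -> Iterator[Transcript]:
    # Streaming: emit a partial transcript after every chunk, so a voice
    # agent can start reacting before the speaker has finished.
    words = []
    for chunk in chunks:
        words.append(fake_decode(chunk))
        yield Transcript(" ".join(words), False)
    # Once the audio ends, emit the final transcript.
    yield Transcript(" ".join(words), True)


audio = [b"book", b"a", b"table", b"for", b"two"]
results = list(stream_transcribe(audio))
for r in results:
    print("final  " if r.is_final else "partial", r.text)
```

In this sketch the first usable text is available after one chunk in the streaming case, whereas the batch case produces nothing until all five chunks have been read. Real streaming APIs differ in detail (for example, they may revise earlier words in later partials), so treat this as an illustration of the idea rather than any vendor's interface.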
