What is Text-to-Speech (TTS)?
Definition
Text-to-Speech (TTS) is an AI technology that converts written text into natural-sounding spoken audio. It is the inverse of speech-to-text and is used for accessibility, content creation, and voice interfaces.
How It Works
Modern TTS uses deep learning models that generate human-like speech from text input. The models handle pronunciation, intonation, pacing, and emotion to produce natural-sounding audio.
TTS systems can generate speech in multiple languages and voices, with some systems able to clone specific voices from small audio samples.
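As a rough illustration of how pacing and intonation can be steered from the text side, the sketch below wraps plain text in SSML (Speech Synthesis Markup Language, a W3C standard that many TTS engines accept). The helper name `build_ssml` and its default values are hypothetical, the sentence splitting is deliberately naive, and which SSML elements a given engine actually honors varies, so treat this as an assumption-laden example rather than any particular product's API.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text: str, rate: str = "medium", pitch: str = "medium",
               pause_ms: int = 300) -> str:
    """Wrap plain text in SSML with prosody hints and pauses between sentences.

    Hypothetical helper for illustration only; not tied to a specific TTS engine.
    """
    # Naive split on periods; real systems use proper text normalization.
    sentences = [s.strip() for s in text.split(".") if s.strip()]

    # <break> inserts a pause; <s> marks a sentence boundary.
    pause = f'<break time="{pause_ms}ms"/>'
    body = pause.join(f"<s>{escape(s)}.</s>" for s in sentences)

    # <prosody> carries the pacing (rate) and intonation (pitch) hints.
    return (
        '<speak version="1.1" xmlns="http://www.w3.org/2001/10/synthesis" '
        'xml:lang="en-US">'
        f"<prosody rate={quoteattr(rate)} pitch={quoteattr(pitch)}>"
        f"{body}</prosody></speak>"
    )


if __name__ == "__main__":
    print(build_ssml("Here is your summary. Two action items remain.", rate="slow"))
```

The resulting string would then be passed to whichever TTS engine is in use. Modern neural models infer much of the prosody on their own, so markup like this is typically an optional override rather than a requirement.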
Why It Matters
TTS makes content accessible to users who prefer or require audio. It also powers voice assistants, audiobook generation, and content narration.
While meeting AI tools primarily use speech-to-text, TTS is relevant for features like reading back summaries or providing audio versions of meeting notes.