What is Speaker Diarization?

Definition
Speaker diarization is the process of splitting an audio recording into segments and labeling each segment with the speaker who produced it. It answers the question "who spoke when?"

How It Works

Speaker diarization analyzes voice characteristics such as pitch, tone, and speaking patterns to build a distinct voice profile for each participant. These profiles are then used to attribute each speech segment to the speaker who produced it.

The process typically involves four stages: voice activity detection (finding when someone is speaking), feature extraction (measuring voice characteristics in each speech segment), clustering (grouping segments that sound alike), and labeling (assigning each group a consistent speaker label, such as "Speaker 1").
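
The four stages above can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not how any particular product works: the function names, the thresholds (energy_floor, zcr_gap), and the use of zero-crossing rate as the only voice feature are all hypothetical choices made for the example. Production systems generally rely on learned speaker embeddings and more robust clustering.

```python
import math


def tone(freq_hz, seconds, sample_rate, amplitude=0.5):
    """Synthetic stand-in for a voice: a pure sine tone (assumption for the demo)."""
    n = int(seconds * sample_rate)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]


def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs that change sign (a crude pitch proxy)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / (len(frame) - 1)


def diarize(samples, sample_rate, frame_ms=100, energy_floor=1e-4, zcr_gap=0.05):
    """Return a list of (start_s, end_s, label) tuples. Thresholds are hypothetical."""
    frame_len = max(2, int(sample_rate * frame_ms / 1000))
    centroids = []  # one (mean_zcr, frame_count) pair per discovered speaker
    segments = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]

        # 1. Voice activity detection: skip frames that are too quiet.
        if frame_energy(frame) < energy_floor:
            continue

        # 2. Feature extraction: one number per frame in this toy version.
        zcr = zero_crossing_rate(frame)

        # 3. Clustering: join the nearest known speaker, or start a new one.
        best = None
        if centroids:
            best = min(range(len(centroids)),
                       key=lambda i: abs(zcr - centroids[i][0]))
            if abs(zcr - centroids[best][0]) > zcr_gap:
                best = None
        if best is None:
            centroids.append((zcr, 1))
            best = len(centroids) - 1
        else:
            mean, count = centroids[best]
            centroids[best] = ((mean * count + zcr) / (count + 1), count + 1)

        # 4. Labeling: anonymous labels; merge back-to-back frames of one speaker.
        label = f"Speaker {best + 1}"
        t0 = start / sample_rate
        t1 = (start + frame_len) / sample_rate
        if segments and segments[-1][2] == label and segments[-1][1] == t0:
            segments[-1] = (segments[-1][0], t1, label)
        else:
            segments.append((t0, t1, label))
    return segments


# Demo: a low "voice", half a second of silence, then a higher "voice".
RATE = 8000
audio = tone(120, 1.0, RATE) + [0.0] * (RATE // 2) + tone(400, 1.0, RATE)
segments = diarize(audio, RATE)
for start_s, end_s, label in segments:
    print(f"{start_s:.1f}s to {end_s:.1f}s: {label}")
```

On this synthetic signal the sketch reports two segments: 0.0 to 1.0 seconds for Speaker 1 and 1.5 to 2.5 seconds for Speaker 2. Note that the labels are anonymous. Mapping "Speaker 1" to a real name requires extra information, such as an enrolled voice sample or meeting metadata.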

Why It Matters

Speaker diarization is essential for meaningful meeting transcriptions. Knowing who said what transforms a generic transcript into an actionable meeting record.

It enables features like attributed action items, speaker-specific analytics, and accurate meeting minutes where each contribution is properly credited.
