Meet Scribe: The world's most accurate transcription model

Thursday, March 6, 2025 at 5:00 pm (GMT)

About 45 minutes

About this event

Scribe, our first Speech to Text model, is the world’s most accurate transcription model. Built for the unpredictability of real-world audio, Scribe transcribes speech in 99 languages and provides word-level timestamps, speaker diarization, and audio-event tagging, all delivered in a structured response for seamless integration.

Want to learn more about how you can use Scribe for your business? Register now for an overview from some of the researchers and engineers behind the new model.

Scribe is engineered for precision. In FLEURS and Common Voice benchmark tests across 99 languages, it consistently outperforms leading models such as Gemini 2.0 Flash and Whisper Large V3. Whether the task is meeting summaries, movie subtitles, or even song lyrics, Scribe delivers the highest accuracy rate in Italian (98.6%), English (96.73%), and 97 other languages.

Developers can integrate Scribe today via our Speech to Text API to generate structured JSON transcripts with speaker diarization, word-level timestamps, and non-speech event markers (e.g., laughter). A low-latency version for real-time applications will be released soon.
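
As a rough illustration of what working with such a structured transcript might look like, here is a minimal Python sketch that groups word-level entries into speaker turns and pulls out non-speech events. The response shape is an assumption made for this example: the field names (`words`, `type`, `speaker_id`, `start`, `end`) and the sample payload are hypothetical and not taken from this page, so check the API reference for the actual schema before relying on them.

```python
import json
from itertools import groupby

# Hypothetical transcript payload. Every field name and value here is an
# assumption for illustration, not the documented API response.
SAMPLE_RESPONSE = json.loads("""
{
  "text": "Hello there. (laughter) Hi!",
  "words": [
    {"text": "Hello", "type": "word", "start": 0.0, "end": 0.42, "speaker_id": "speaker_0"},
    {"text": "there.", "type": "word", "start": 0.48, "end": 0.9, "speaker_id": "speaker_0"},
    {"text": "(laughter)", "type": "audio_event", "start": 1.0, "end": 1.6, "speaker_id": "speaker_1"},
    {"text": "Hi!", "type": "word", "start": 1.7, "end": 2.05, "speaker_id": "speaker_1"}
  ]
}
""")


def speaker_turns(response):
    """Merge consecutive spoken words from the same speaker into one turn."""
    spoken = [w for w in response["words"] if w["type"] == "word"]
    turns = []
    for speaker, group in groupby(spoken, key=lambda w: w["speaker_id"]):
        group = list(group)
        turns.append({
            "speaker": speaker,
            "start": group[0]["start"],   # timestamp of the turn's first word
            "end": group[-1]["end"],      # timestamp of the turn's last word
            "text": " ".join(w["text"] for w in group),
        })
    return turns


def audio_events(response):
    """Return (label, start, end) for each non-speech event marker."""
    return [
        (w["text"], w["start"], w["end"])
        for w in response["words"]
        if w["type"] == "audio_event"
    ]


for turn in speaker_turns(SAMPLE_RESPONSE):
    print(f'[{turn["start"]:.2f}-{turn["end"]:.2f}] {turn["speaker"]}: {turn["text"]}')
for label, start, end in audio_events(SAMPLE_RESPONSE):
    print(f"event {label} at {start:.2f}-{end:.2f}")
```

Turn-level grouping like this is a common first step for subtitles or meeting summaries, since the word-level timestamps carry over to the start and end of each turn.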

Hosted by

Team member

Louis Jordan

ElevenLabs

ElevenLabs is an AI audio research and deployment company. As voice becomes a primary interface for using technology, we build AI that makes these interactions effortless and human.
