.

Technology

Whisper

Whisper: OpenAI's robust, open-source ASR model for multilingual speech recognition, translation, and language identification.

Whisper is OpenAI's general-purpose Automatic Speech Recognition (ASR) model, trained on a massive, diverse dataset for high-accuracy performance. It functions as a powerful multitasking system: handling multilingual transcription, direct speech translation, and language identification. The architecture processes audio in a sliding 30-second window, performing autoregressive predictions. Developers can select from six distinct model sizes to optimize for specific speed versus accuracy tradeoffs: this is the go-to solution for reliable, large-scale audio processing.

https://github.com/openai/whisper
40 projects · 28 cities

Related technologies

Recent Talks & Demos

Showing 21-40 of 40

Members-Only

Sign in to see who built these projects