Whisper

Whisper: OpenAI's robust, open-source ASR model for multilingual speech recognition, translation, and language identification.

Whisper is OpenAI's general-purpose automatic speech recognition (ASR) model, trained on a large, diverse audio dataset. It is a multitask system that handles multilingual transcription, speech translation into English, and language identification. The architecture processes audio in a sliding 30-second window and produces text through autoregressive prediction. Developers can choose among six model sizes to trade speed against accuracy, which makes Whisper a common choice for large-scale audio processing.
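The 30-second windowing and the transcribe/translate tasks described above can be sketched roughly as follows. This is an illustrative sketch, not Whisper's internal code: `window_bounds` and `transcribe` are hypothetical helper names, and the fixed-stride windows are a simplification (the real decoder advances the window using the timestamps it predicts). The `whisper.load_model` and `model.transcribe` calls assume the `openai-whisper` package from the repository linked below is installed.

```python
from typing import Iterator, Tuple

SAMPLE_RATE = 16_000   # Whisper works on 16 kHz mono audio
WINDOW_SECONDS = 30    # length of one processing window
WINDOW_SAMPLES = SAMPLE_RATE * WINDOW_SECONDS  # 480,000 samples per window


def window_bounds(n_samples: int, window: int = WINDOW_SAMPLES) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) sample indices for consecutive fixed-size windows.

    Simplification: a fixed stride is assumed here; actual Whisper decoding
    may advance by less than a full window, based on predicted timestamps.
    """
    for start in range(0, n_samples, window):
        yield start, min(start + window, n_samples)


def transcribe(path: str, model_name: str = "base", translate: bool = False) -> dict:
    """Transcribe a file, or translate its speech to English (hypothetical wrapper)."""
    import whisper  # lazy import, so window_bounds works without the package

    model = whisper.load_model(model_name)  # e.g. "tiny", "base", "small", "medium", "large"
    task = "translate" if translate else "transcribe"
    # The result is a dict that includes "text", "segments", and the detected "language".
    return model.transcribe(path, task=task)


if __name__ == "__main__":
    # A 70-second clip splits into two full windows plus a 10-second remainder.
    for start, end in window_bounds(70 * SAMPLE_RATE):
        print(f"{start / SAMPLE_RATE:.0f}s -> {end / SAMPLE_RATE:.0f}s")
```

Smaller models run faster at some cost in accuracy, so `model_name` is the knob for the speed versus accuracy tradeoff mentioned above.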

https://github.com/openai/whisper

40 projects · 28 cities

Related technologies

OpenAI Whisper (14) · FastAPI (159) · GPT-4 (678) · Ollama (82) · OpenAI API (500) · FFmpeg (20) · Next (197) · PostgreSQL (144) · Gemini (254) · LangChain (439) · LiveKit (14) · Llama-2 (337) · React (260) · BERT (186) · BLOOM (116) · Claude-3 (443) · GPT-3 (390) · GPT-4o (72)

Recent Talks & Demos

AI Speech Input for Windows Apps

OpenAI Whisper Azure Speech Service

ORION: ROS 2 ESP32 Robot

ROS 2 Jazzy micro-ROS

Rafiki AI Tutor

Llama Whisper Large-Scale AutoGrading

Amazon Bedrock Whisper Turbo

LuchaCoach: Local AI Meeting Coach

OpenAI GPT-4 OpenAI Whisper API

Aura: Local AI Gaming Companion

Llama 3 OpenAI Whisper

Real-Time Voice Agents

llama OpenAI API

Podcast Localization using LLMs

Singapore Apr 25

WhisperX GPT-4o

Transcriber R&D project

San Francisco Feb 27

Instantcasts: Fast Whisper Transcripts

San Francisco Jan 29

Real-Time LLaMA Voice Assistant

Medellín Dec 5

Automated Video Editing with LLMs

Amsterdam Nov 12

Whisper Clipboard

Amsterdam Sep 25

YT shorts finder

Vision models Whisper

Bengaluru Aug 8

GPT-4 OpenAI API

GGML ONNX Runtime

Whisper/VAD Multi-Model Segmentation

Whisper WebRTC-VAD

Capsule Transcriber: ML Transcription

Los Angeles Mar 19

Whisper WebRTC-VAD

FrogBase: Audiovisual Knowledge Search

San Francisco Aug 9