Whisper

Whisper: OpenAI's robust, open-source ASR model for multilingual speech recognition, translation, and language identification.

Whisper is OpenAI's general-purpose automatic speech recognition (ASR) model, trained on a large, diverse audio dataset. It is a multitasking model: a single network handles multilingual transcription, speech translation into English, and language identification. The architecture processes audio in a sliding 30-second window and makes autoregressive predictions for each window. Developers can choose among six model sizes to trade speed against accuracy, which makes Whisper a common choice for reliable, large-scale audio processing.
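
The windowing and the model-size choice described above can be sketched in Python. This is a minimal illustration, not the library's internals: `iter_windows` and `transcribe_file` are hypothetical helper names, the fixed 30-second stride is a simplifying assumption (the library itself decides how far to advance between windows), and `transcribe_file` assumes the `openai-whisper` package and `ffmpeg` are installed.

```python
from typing import Iterator, Tuple

SAMPLE_RATE = 16_000   # Whisper works on 16 kHz mono audio
WINDOW_SECONDS = 30    # window length named in the description above


def iter_windows(
    num_samples: int,
    sample_rate: int = SAMPLE_RATE,
    window_seconds: int = WINDOW_SECONDS,
) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) sample indices of consecutive fixed-size windows.

    Assumption: a fixed stride equal to the window length. This only
    illustrates how long audio is split into 30-second pieces.
    """
    window = sample_rate * window_seconds
    for start in range(0, num_samples, window):
        yield start, min(start + window, num_samples)


def transcribe_file(path: str, model_name: str = "turbo") -> str:
    """Transcribe an audio file with the openai-whisper package.

    `model_name` selects one of the model sizes; smaller models are
    faster, larger ones are generally more accurate.
    """
    import whisper  # lazy import so iter_windows works without the package

    model = whisper.load_model(model_name)
    result = model.transcribe(path)
    return result["text"]
```

For example, a 75-second clip at 16 kHz yields three windows from `iter_windows`, the last one shorter than 30 seconds. Calling `transcribe_file("audio.mp3", "tiny")` would favor speed over accuracy, where `audio.mp3` stands in for any local audio file.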

https://github.com/openai/whisper
40 projects · 28 cities

Recent Talks & Demos

ibaAgent: Agentic Time-Series Analysis
Nürnberg Apr 22
LangGraph OpenAI GPT
Shablon: Programmatic Video Templates
Dublin Apr 9
Python FastAPI
SetForMoney: AI Expense Tracking
Upstate NY Mar 10
GPT-5 Whisper
Mebot AI: Multimodal Digital Twins
Portland Mar 5
OpenAI API Gemini
Fluo: Adaptive Multi-Agent Language Learning
Toronto Feb 26
FastAPI SQLAlchemy
Reachy Mini: OpenClaw Brain
New York City Feb 17
OpenClaw Honcho
Public Speaking AI Agent
Tiruchirappalli Jan 31
React TypeScript
LLM Nutrition Pipeline Architecture
Toronto Jan 29
Codex Whisper
VibeCoding Workflow Demo
Tokyo Jan 15
Claude Code Gemini CLI
Zulu.cash: Private Local AI Agent
Houston Dec 9
Whisper Ollama
Xing Xing: Efficient AI Music
Toronto Dec 3
PyTorch Demucs
Tally: Ambient AI Continuous Memory
London Nov 25
Whisper Gemini
LiveKit Voice Agent Orchestration
San Francisco Nov 20
LiveKit Claude Agent SDK
CityPulse: Multi-Modal Video Understanding
New York City Nov 17
Ollama FastAPI
Zettaware: Language Model Motion Planning
Seattle Nov 12
Claude-3 Sonnet
Readback: AI ATC Training
Berlin Nov 12
GPT-5 Whisper
CityPulseNYC: Multi-Modal RAG
New York City Nov 6
Ollama LLaMA 3B
Rekognition Twelve Labs NASCAR Indexing
Seattle Oct 22
Twelve Labs Amazon S3
Local Transcription
Valencia Oct 22
ChatGPT bash
Second Brain Voice Agent API
Poland Oct 14
LangChain Python
AI Speech Input for Windows Apps
Raleigh Sep 30
OpenAI Whisper Azure Speech Service
ORION: ROS 2 ESP32 Robot
Bogotá Sep 25
ROS 2 Jazzy micro-ROS
Rafiki AI Tutor
Nairobi Sep 25
GPT-4 Whisper
Llama Whisper Large-Scale AutoGrading
Dubai Aug 23
Amazon Bedrock Whisper Turbo