Technology

Mozilla DeepSpeech

An open-source speech-to-text engine based on Baidu’s Deep Speech research and implemented via TensorFlow.

DeepSpeech delivers a production-ready STT framework that transforms audio into text using a streamlined end-to-end architecture. By bypassing traditional phonetic-to-word mappings in favor of a single recurrent neural network (RNN), the engine achieves high accuracy across diverse acoustic environments. It supports real-time transcription on hardware ranging from high-end NVIDIA GPUs to low-power Raspberry Pi 4 devices. Developers can leverage pre-trained English models or train custom datasets to handle specific vocabularies and accents (e.g., medical terminology or regional dialects).

https://github.com/mozilla/DeepSpeech

1 project · 1 city

Related technologies

Amazon Polly 1 Amazon Transcribe 2 Azure Speech to Text 1 BERT 179 CMU Sphinx 2 eSpeak 1 Festival 1 Google Cloud Speech-to-Text 2 Google Cloud Text-to-Speech 1 GPT-3 191 GPT-4 528 IBM Watson Speech to Text 2 IBM Watson Text to Speech 1 Kaldi 2 Keras 74 Microsoft Azure Text-to-Speech 1 ONNX 82 OpenAI Whisper 10

Recent Talks & Demos

Showing 1-1 of 1

Members-Only

UzbekVoice

Tashkent Oct 31

Google Cloud Speech-to-Text Google Cloud Text-to-Speech