Technology

Speech

Speech technology connects human voice and machines in both directions: Automatic Speech Recognition (ASR) transcribes spoken words into text, while Text-to-Speech (TTS) generates natural-sounding audio from text for systems like Amazon Alexa and Google Assistant.

Speech technology is a core area of AI that supports human-machine communication through two primary functions. Automatic Speech Recognition (ASR) analyzes acoustic signals, using deep learning models to convert spoken words into text, a process critical for dictation and data capture. Conversely, Text-to-Speech (TTS) synthesis generates human-like audio from written text, with modern neural models achieving high fidelity and natural intonation. Applications are widespread: the technology drives voice assistants (Siri, Google Assistant), streamlines customer service via Interactive Voice Response (IVR), and can speed up data input, with dictation reported to be up to three times faster than typing. The global market has been projected to exceed $19 billion by 2025, reflecting broad adoption across sectors.
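The two functions described above are often chained into a single voice turn: audio in, text in the middle, audio out. The following is a minimal sketch of that flow, not a real speech stack. The names `SpeechRecognizer`, `SpeechSynthesizer`, `EchoRecognizer`, `BytesSynthesizer`, and `voice_turn` are hypothetical, and the two stand-in classes treat UTF-8 bytes as "audio" so the example runs without any model or external service.

```python
from typing import Callable, Protocol


class SpeechRecognizer(Protocol):
    """Hypothetical ASR interface: audio bytes in, transcript out."""

    def transcribe(self, audio: bytes) -> str: ...


class SpeechSynthesizer(Protocol):
    """Hypothetical TTS interface: text in, audio bytes out."""

    def synthesize(self, text: str) -> bytes: ...


class EchoRecognizer:
    """Stand-in ASR (assumption): treats the audio bytes as UTF-8 text."""

    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8").strip().lower()


class BytesSynthesizer:
    """Stand-in TTS (assumption): returns UTF-8 bytes instead of real audio."""

    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


def voice_turn(
    audio: bytes,
    asr: SpeechRecognizer,
    respond: Callable[[str], str],
    tts: SpeechSynthesizer,
) -> bytes:
    """Run one turn: recognize speech, build a text reply, synthesize it."""
    transcript = asr.transcribe(audio)
    reply = respond(transcript)
    return tts.synthesize(reply)


if __name__ == "__main__":
    out = voice_turn(
        b"  What time is it ",
        EchoRecognizer(),
        lambda text: f"You said: {text}",
        BytesSynthesizer(),
    )
    print(out)
```

In a real system the stand-ins would be replaced by an actual ASR model and TTS engine behind the same two methods, which keeps the turn logic independent of any particular vendor.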

https://www.techtarget.com/whatis/definition/speech-technology

31 projects · 33 cities

Related technologies

GPT-4 (678) · OpenAI API (500) · FastAPI (159) · Deepgram (18) · LangChain (439) · Llama-2 (337) · OpenAI Whisper (14) · TypeScript (259) · Android (15) · BERT (186) · BLOOM (116) · Cloud Speech-to-Text API (3) · ElevenLabs (51) · Flask (32) · Flutter (27) · Gemini (254) · GPT-3 (390) · MCP (201)

Recent Talks & Demos


ibaAgent: Agentic Time-Series Analysis

Nürnberg Apr 22

LangGraph OpenAI GPT

GitHub API Claw

qode.world: Multi-lingual AI Interviewer

Ho Chi Minh City Mar 7

Kubernetes (K8s) Node

Public Speaking AI Agent

Tiruchirappalli Jan 31

React TypeScript

Kuralit: Intent-Driven Mobile Interface

Tiruchirappalli Jan 31

Nihongo Convo: AI Conversation Practice

Nashville Jan 29

gpt-4o-mini gpt-4o-mini-tts

Mori Solution: Construction RAG Pipeline

Nashville Jan 29

Real Estate Voice AI Interface

Valencia Jan 29

OpenAI API Node

Active Story v2

FastAPI Claude API

Muse: Playful Voice Coding

Next TypeScript

Zulu.cash: Private Local AI Agent

Podcastify AI: Spaced Repetition Lessons

Flask OpenAI API

Emotional AI: iMessage to Voice

Claude Code Cartesia Sonic-3

Deepgram OpenAI ElevenLabs Production

LiveKit Voice AI Coach

Los Angeles Nov 20

Youz: AI Landmark Guide

Nürnberg Nov 20

Secure AI Health Assistant with EHR

OpenAI API FastAPI

EchoKit Voice AI on ESP32

LlamaEdge EchoKit

Gemma 3n: Offline Android RAG

Android MediaPipe GenAI

AI Speech Input for Windows Apps

OpenAI Whisper Azure Speech Service

Rafiki AI Tutor

OpenAI Real-Time Voice Agents

Fort Wayne Aug 19

GPT-4 OpenAI Realtime API

LLM Dialogue: Memory and TTS Demo