Nova Sonic: Real-time Voice Assistants
This talk covers building real-time, bidirectional voice applications using Amazon’s Nova Sonic foundation model, with code samples and implementation guidance.
Description:
Explore real-time, bidirectional speech interaction using cutting-edge foundation models. Learn how to create fluid, natural-sounding voice applications that respond intelligently in real time.
This talk is for developers and architects interested in building conversational AI applications with state-of-the-art voice technology. Code samples and implementation guidelines will be provided to help you get started with your own voice-enabled projects.
Technical Prerequisites:
- Programming experience
- Basic understanding of audio processing concepts
- Basic understanding of foundation models
Implements Amazon Nova Sonic bidirectional streaming for real-time speech processing via Python.
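As a rough illustration of what "bidirectional streaming" means in practice, here is a minimal sketch of the pattern: one task sends audio chunks upstream while another receives responses at the same time. It does not call the Nova Sonic API or any AWS SDK. The `fake_model` coroutine, the queue names, and the 320-byte chunk size (assuming 10 ms of 16 kHz, 16-bit mono PCM) are all hypothetical stand-ins chosen for illustration, not details from the talk.

```python
import asyncio

# Assumption: 16 kHz, 16-bit mono PCM, so 10 ms of audio is 320 bytes.
CHUNK_BYTES = 320


def chunk_audio(pcm: bytes, size: int = CHUNK_BYTES) -> list[bytes]:
    """Split raw PCM bytes into fixed-size chunks for streaming."""
    return [pcm[i:i + size] for i in range(0, len(pcm), size)]


async def fake_model(uplink: asyncio.Queue, downlink: asyncio.Queue) -> None:
    """Hypothetical stand-in for a speech model: answers each chunk as it arrives."""
    while True:
        chunk = await uplink.get()
        if chunk is None:  # end-of-input sentinel
            await downlink.put(None)
            return
        await downlink.put(f"received {len(chunk)} bytes")


async def send_audio(chunks: list[bytes], uplink: asyncio.Queue) -> None:
    """Push chunks upstream, yielding between chunks so receiving can interleave."""
    for chunk in chunks:
        await uplink.put(chunk)
        await asyncio.sleep(0)
    await uplink.put(None)


async def receive_events(downlink: asyncio.Queue) -> list[str]:
    """Collect responses until the model signals end-of-stream."""
    events = []
    while True:
        event = await downlink.get()
        if event is None:
            return events
        events.append(event)


async def run_session(pcm: bytes) -> list[str]:
    """Run the sender, the stand-in model, and the receiver concurrently."""
    uplink: asyncio.Queue = asyncio.Queue()
    downlink: asyncio.Queue = asyncio.Queue()
    _, _, events = await asyncio.gather(
        fake_model(uplink, downlink),
        send_audio(chunk_audio(pcm), uplink),
        receive_events(downlink),
    )
    return events


if __name__ == "__main__":
    # 800 bytes of silence stands in for microphone input.
    print(asyncio.run(run_session(bytes(800))))
```

In a real application the two queues would be replaced by the model's streaming connection and the silence by microphone input. The point of the sketch is only that sending and receiving run as separate concurrent tasks, which lets responses arrive while audio is still being sent.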
Related projects
Talking in Real Time: Voice Agents for Live Conversations
Miami
A walkthrough of building a low‑latency, customizable voice agent for real‑time meetings and call‑center use, including integration demos…
The Rise of Visual Agents: Speaking the Future of Business Intelligence
Medellín
Explore how voice and visual AI combine to create Visual Agents that generate dynamic visualizations, insights, and actions…
A deep dive on voice AI and voice agents
Dublin
An in-depth exploration of ElevenLabs’ voice synthesis technology, covering its core features, integration methods, and practical implementation in…
Voicebots
Los Angeles
Learn how to create, customize, and share voice‑enabled GPTs, explore practical use cases, and get feedback on prompt…
Voice AI Agent Architecture: Streaming Deepgram → OpenAI → ElevenLabs in Production
Bogotá
A live technical walkthrough of building a production voice AI agent, detailing orchestration of Deepgram, OpenAI, and ElevenLabs…
Vocal Docs - A smart document editor you can talk to
Seattle
This talk explores a document editor that uses speech for both dictation and editing, demonstrating how language models…