Cartesia Sonic-3 Projects

Technology

Cartesia Sonic-3

Cartesia Sonic-3 is a real-time, streaming Text-to-Speech (TTS) model that delivers human-like voice with a 90-millisecond model latency, using State Space Models (SSMs) for speed and emotional range.

This is the Sonic-3 model: the industry's fastest, most expressive voice AI. We built it on State Space Models (SSMs), a major architectural shift from slower Transformer models, to achieve ultra-low latency. You get a model latency of just 90 milliseconds and a total end-to-end response time of 190 milliseconds, which is essential for fluid, real-time conversation. Sonic-3 captures the full spectrum of human speech, including genuine emotion and laughter, and supports 42 languages for global deployment. Backed by $100 million in funding from partners like NVIDIA and Kleiner Perkins, it is the engine powering millions of monthly voice interactions for major enterprises like ServiceNow.
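To make the SSM idea concrete, here is a minimal sketch of a generic diagonal linear state space recurrence. It is a textbook toy, not Sonic-3's actual architecture, and the function name `ssm_stream` and the parameters `a`, `b`, `c` are hypothetical. It shows why this family of models suits streaming: each output depends only on a fixed-size state and the newest input, so per-step work stays constant no matter how long the stream runs.

```python
from typing import Iterable, Iterator, List


def ssm_stream(
    inputs: Iterable[float],
    a: List[float],
    b: List[float],
    c: List[float],
) -> Iterator[float]:
    """Toy diagonal linear SSM (illustrative only, not Cartesia's model).

    state[t] = a * state[t-1] + b * u[t]   (elementwise)
    y[t]     = sum(c * state[t])
    """
    state = [0.0] * len(a)  # fixed-size state, independent of stream length
    for u in inputs:
        # Update the state from the previous state and the newest input only.
        state = [a_i * h_i + b_i * u for a_i, h_i, b_i in zip(a, state, b)]
        # Emit one output immediately; no need to revisit earlier inputs.
        yield sum(c_i * h_i for c_i, h_i in zip(c, state))


# Feed a single impulse and watch the response decay over time.
outputs = list(ssm_stream([1.0, 0.0, 0.0], a=[0.5, 0.25], b=[1.0, 1.0], c=[1.0, 2.0]))
print(outputs)  # [3.0, 1.0, 0.375]
```

Because the generator yields as soon as each input arrives, it mirrors the streaming pattern described above. A Transformer, by contrast, attends over the growing history of earlier tokens at every step.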

https://cartesia.ai
1 project · 1 city
