Gemini Flash Lite
A high-efficiency multimodal model built for sub-second latency and massive throughput.
Gemini 2.0 Flash Lite is a compact model optimized for speed and cost-effective scaling. It supports a 1M-token context window, enough to process roughly 700,000 words or an hour of video in a single prompt. It reduces operational costs by 40 percent compared to the 1.5 Flash generation while maintaining strong performance on core benchmarks such as MMLU. Developers use it for real-time applications, such as voice assistants and live transcription, where low-latency responses are critical.
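As a rough illustration of the context-window figures above, the sketch below estimates whether a piece of text fits in a single prompt. It assumes the ratio implied by the description (700,000 words per 1,000,000 tokens) and counts words by whitespace; real tokenizers vary by language and content, so treat the result as a ballpark. The function names and the `reserved_output_tokens` parameter are hypothetical and not part of any Gemini API.

```python
# Back-of-envelope check against the 1M-token context window.
# Assumption: 700,000 words correspond to 1,000,000 tokens, as stated in the
# description. Actual token counts depend on the tokenizer and the content.

CONTEXT_WINDOW_TOKENS = 1_000_000
WORDS_PER_WINDOW = 700_000


def estimate_tokens(text: str) -> int:
    """Estimate token count from a whitespace word count (rounded up)."""
    words = len(text.split())
    # Integer ceiling division avoids floating-point rounding surprises.
    return (words * CONTEXT_WINDOW_TOKENS + WORDS_PER_WINDOW - 1) // WORDS_PER_WINDOW


def fits_in_context(text: str, reserved_output_tokens: int = 0) -> bool:
    """Return True if the estimated prompt plus reserved output fits the window."""
    return estimate_tokens(text) + reserved_output_tokens <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    sample = "word " * 500_000
    print(estimate_tokens(sample), fits_in_context(sample))
```

For anything beyond a quick estimate, a real token counter from the model provider is the more reliable choice.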