Gemini Flash Lite

A high-efficiency multimodal model built for sub-second latency and massive throughput.

Gemini 2.0 Flash-Lite is a compact model optimized for speed and cost-effective scaling. It supports a 1M-token context window, enough to process roughly 700,000 words or about an hour of video in a single prompt. The model reduces operational costs by 40 percent compared to the 1.5 Flash generation while maintaining high performance on core benchmarks such as MMLU. Developers use it for real-time applications, such as voice assistants and live transcription, where low-latency responses are critical.
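The context-window figures above imply a rough conversion of about 1.43 tokens per word (1,000,000 tokens for 700,000 words). The sketch below uses that ratio to estimate whether a piece of text is likely to fit in a single prompt. It is a back-of-the-envelope check, not the model's real tokenizer: the helper names `estimate_tokens` and `fits_in_context` are hypothetical, the whitespace word count is a simplification, and the default output reserve of 8,192 tokens is an assumption.

```python
# Figures taken from the description above: a 1M-token window holds
# roughly 700,000 words. Everything else here is an illustrative assumption.
CONTEXT_WINDOW_TOKENS = 1_000_000
WORDS_PER_WINDOW = 700_000


def estimate_tokens(text: str) -> int:
    """Estimate the token count from a whitespace word count (hypothetical helper).

    Uses integer ceiling division so the result never under-counts
    because of floating-point rounding.
    """
    words = len(text.split())
    return -(-words * CONTEXT_WINDOW_TOKENS // WORDS_PER_WINDOW)


def fits_in_context(text: str, reserved_for_output: int = 8_192) -> bool:
    """Return True if the estimated prompt size plus an output reserve fits.

    reserved_for_output is an assumed budget for the model's reply,
    not a documented limit.
    """
    return estimate_tokens(text) + reserved_for_output <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    sample = "transcribe this meeting and list the action items " * 1_000
    print(estimate_tokens(sample), fits_in_context(sample))
```

For a precise count before sending a large prompt, a real tokenizer or the API's own token-counting facility would be the better choice; this estimate is only useful for quick sizing.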

https://ai.google.dev/gemini-api/docs/models/gemini#gemini-2.0-flash-lite
