Technology

LPU

The Language Processing Unit (LPU) is Groq's specialized processor: a custom-built chip architected for deterministic, high-speed, low-latency inference on Large Language Models (LLMs).

The LPU is a purpose-built accelerator designed to resolve the compute and memory-bandwidth bottlenecks that plague LLM inference on traditional hardware. Unlike general-purpose GPUs, the LPU uses a software-defined, single-core architecture with hundreds of megabytes of integrated SRAM that serves as primary weight storage rather than cache. This design enables deterministic performance and fast sequential processing, a critical factor for language models, which generate tokens one at a time. The result is industry-leading speed and low latency: benchmarks show the LPU Inference Engine achieving throughputs of up to 241 tokens per second, making real-time, large-scale generative AI applications practical to deploy.
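To put the throughput figure in perspective, a back-of-the-envelope calculation shows what 241 tokens per second means for response latency. This is a minimal sketch: the decode rate comes from the benchmark cited above, while the response lengths are illustrative assumptions, not Groq measurements.

```python
# Rough latency estimate from a published throughput figure.
# 241 tokens/sec is the benchmark number cited above; the response
# lengths below are illustrative assumptions, not Groq benchmarks.

def generation_time(num_tokens: int, tokens_per_second: float = 241.0) -> float:
    """Seconds to stream num_tokens at a steady decode rate."""
    return num_tokens / tokens_per_second

for n in (100, 500, 1000):
    print(f"{n:>5} tokens -> {generation_time(n):.2f} s")
```

At this rate a 500-token response streams in roughly two seconds, which is why the text describes real-time generative AI applications as practical at this throughput.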

https://groq.com/