Technology
decoder-only
This architecture is the workhorse of generative Large Language Models (LLMs), specializing in autoregressive text generation via causal (masked) multi-head self-attention.
The decoder-only model is a streamlined Transformer variant that drops the encoder of the original encoder-decoder design and focuses exclusively on sequence generation (Source 1.3, 1.5). It predicts the next token from all preceding tokens in the sequence, then appends that token and repeats, a process called autoregression (Source 1.8). The core mechanism is a stack of decoder blocks, each using masked self-attention so that a position cannot ‘look ahead’ at future tokens, which preserves causality (Source 1.2, 1.9). This design powers widely used models such as OpenAI's GPT series (GPT-3, GPT-4) and Meta's Llama family (Llama-2, Llama-3), making it the standard choice for tasks that require fluent, context-aware text generation (Source 1.3, 1.6).
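The two ideas above, causal masking and autoregressive generation, can be illustrated with a minimal sketch in plain Python. It makes several simplifying assumptions: a single attention head with no learned projections, greedy decoding (the text does not specify a decoding strategy), and a toy scoring function standing in for a real model. The names `causal_attention_weights`, `generate`, and `toy_scores` are hypothetical and chosen only for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention_weights(queries, keys):
    """Return a T x T matrix of attention weights under a causal mask.

    Row i holds the weights position i assigns to positions 0..T-1.
    Positions j > i are masked with -inf before the softmax, so they
    receive exactly zero weight (no looking ahead).
    """
    d = len(queries[0])
    weights = []
    for i, q in enumerate(queries):
        scores = []
        for j, k in enumerate(keys):
            if j > i:
                scores.append(float("-inf"))  # future token: masked out
            else:
                dot = sum(a * b for a, b in zip(q, k))
                scores.append(dot / math.sqrt(d))  # scaled dot product
        weights.append(softmax(scores))
    return weights


def generate(next_token_scores, prompt, max_new_tokens):
    """Greedy autoregressive loop: score, pick the best token, append, repeat."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        scores = next_token_scores(tokens)  # depends only on tokens so far
        next_id = max(range(len(scores)), key=scores.__getitem__)
        tokens.append(next_id)
    return tokens


def toy_scores(tokens):
    """Stand-in for a model (assumption): favors (last token + 1) mod 5."""
    scores = [0.0] * 5
    scores[(tokens[-1] + 1) % 5] = 1.0
    return scores


# Three positions with 2-dimensional vectors, used as both queries and keys.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = causal_attention_weights(x, x)
generated = generate(toy_scores, [0], 3)
```

In `weights`, every entry above the diagonal is zero, so the first position attends only to itself, while the last position can attend to all three. Real decoder blocks add learned query, key, and value projections, multiple heads, and feed-forward layers on top of this masking pattern.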