Agent Attention Projects

Technology

Agent Attention

Agent Attention is a novel Transformer attention mechanism that integrates Softmax and linear attention via agent tokens, achieving the expressiveness of the former at the linear computational cost of the latter.

This technology directly addresses the efficiency-expressiveness trade-off in Transformers. Agent Attention introduces a small set of agent tokens (A) alongside the conventional attention triplet (Q, K, V). The agents first aggregate global information from the keys and values, then broadcast it back to the query tokens, effectively combining the expressive power of Softmax attention with the low computational cost of linear attention. The result is an efficient, plug-in module: it delivers up to a 2.2x acceleration in image generation for models like Stable Diffusion while preserving high image quality across diverse vision tasks.
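The two-stage computation described above can be sketched in a few lines. This is a minimal pure-Python illustration under stated assumptions, not the repository's implementation: the function name `agent_attention`, the helper names, and the toy dimensions (N=6 tokens, d=4 channels, n=2 agents) are chosen for the example.

```python
import math
import random

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def softmax_rows(M):
    """Row-wise softmax with max-subtraction for numerical stability."""
    result = []
    for row in M:
        m = max(row)
        exps = [math.exp(x - m) for x in row]
        s = sum(exps)
        result.append([e / s for e in exps])
    return result

def agent_attention(Q, K, V, A):
    """Two-stage attention through n agent tokens (n << N).

    Stage 1 (aggregation): agents act as queries over (K, V).
    Stage 2 (broadcast): the original queries attend to the agents.
    Cost is O(N*n*d) rather than the O(N*N*d) of full Softmax attention.
    """
    d = len(Q[0])
    scale = 1.0 / math.sqrt(d)
    # Stage 1: softmax(A K^T / sqrt(d)) V  ->  n x d agent summaries
    scores_agg = [[scale * sum(ai * ki for ai, ki in zip(a, k)) for k in K]
                  for a in A]
    V_agents = matmul(softmax_rows(scores_agg), V)
    # Stage 2: softmax(Q A^T / sqrt(d)) V_agents  ->  N x d output
    scores_bc = [[scale * sum(qi * ai for qi, ai in zip(q, a)) for a in A]
                 for q in Q]
    return matmul(softmax_rows(scores_bc), V_agents)

# Tiny demo with random features.
random.seed(0)
N, d, n = 6, 4, 2
Q = [[random.gauss(0, 1) for _ in range(d)] for _ in range(N)]
K = [[random.gauss(0, 1) for _ in range(d)] for _ in range(N)]
V = [[random.gauss(0, 1) for _ in range(d)] for _ in range(N)]
A = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
out = agent_attention(Q, K, V, A)
```

Because both stages are Softmax attention, each output row is a convex combination of convex combinations of the rows of V, so outputs stay within the range of the values; the linear cost comes purely from the score matrices being N x n and n x N instead of N x N.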

https://github.com/LeapLabTHU/Agent-Attention