Qwen 3 Next 80B
Qwen 3 Next 80B is Alibaba's ultra-efficient, 80-billion-parameter sparse Mixture-of-Experts (MoE) model: it activates only 3B parameters per token, delivering roughly 10x higher throughput on long-context tasks.
Qwen 3 Next 80B is a high-performance, next-generation foundation model from the Qwen team. It employs a sparse Mixture-of-Experts (MoE) architecture: of its 80 billion total parameters, only 3 billion are active per token, yielding substantial efficiency and cost savings. Key innovations include Hybrid Attention and Multi-Token Prediction (MTP), and the model natively supports an ultra-long context window of 262,144 tokens. It is engineered for complex reasoning and agentic workflows, and it significantly outperforms predecessors such as Qwen3-32B in throughput at context lengths above 32K tokens.
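To illustrate how a sparse MoE layer activates only a small fraction of its parameters per token, here is a minimal NumPy sketch of top-k expert routing. This is not the actual Qwen implementation; the expert count, top-k value, and hidden size are assumptions chosen purely for illustration.

```python
import numpy as np

# Illustrative sparse MoE routing (NOT the actual Qwen 3 Next code).
# Assumed toy dimensions: 512 experts, top-10 routed per token, hidden size 64.
rng = np.random.default_rng(0)
n_experts, top_k, d = 512, 10, 64

x = rng.standard_normal(d)                    # one token's hidden state
router_w = rng.standard_normal((n_experts, d))

logits = router_w @ x                         # router score for each expert
active = np.argsort(logits)[-top_k:]          # only the top-k experts run
weights = np.exp(logits[active] - logits[active].max())
weights /= weights.sum()                      # softmax over the chosen experts

# Each expert is a small linear map here; only the selected ones compute,
# so most of the layer's parameters stay idle for this token.
expert_w = rng.standard_normal((n_experts, d, d))
y = sum(w * (expert_w[e] @ x) for w, e in zip(weights, active))

print(len(active), y.shape)                   # 10 (64,)
```

The key property shown is that the forward pass touches only `top_k / n_experts` of the expert parameters, which is how an 80B-parameter model can run with only ~3B parameters active per token.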