Qwen 3 Next 80B
Qwen 3 Next 80B is Alibaba's ultra-efficient, 80-billion-parameter sparse Mixture-of-Experts (MoE) model: it activates only 3B parameters per token, delivering roughly 10x higher throughput on long-context tasks.
Qwen 3 Next 80B is a high-performance, next-generation foundation model from the Qwen team. It employs a sparse Mixture-of-Experts (MoE) architecture: of its 80 billion total parameters, only 3 billion are active per token, yielding substantial efficiency and cost savings. Key innovations include Hybrid Attention and Multi-Token Prediction (MTP), and the model natively supports an ultra-long context window of 262,144 tokens. It is engineered for complex reasoning and agentic workflows, and it significantly outperforms predecessors such as Qwen3-32B in throughput at context lengths above 32K tokens.
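To illustrate how a sparse MoE layer activates only a small fraction of its parameters per token, here is a minimal NumPy sketch of top-k expert routing. This is not the actual Qwen implementation; the expert count, top-k value, and hidden size are assumptions chosen purely for illustration.

```python
import numpy as np

# Illustrative sparse MoE routing (NOT the actual Qwen 3 Next code).
# Assumed toy dimensions: 512 experts, top-10 routed per token, hidden size 64.
rng = np.random.default_rng(0)
n_experts, top_k, d = 512, 10, 64

x = rng.standard_normal(d)                    # one token's hidden state
router_w = rng.standard_normal((n_experts, d))

logits = router_w @ x                         # router score for each expert
active = np.argsort(logits)[-top_k:]          # only the top-k experts run
weights = np.exp(logits[active] - logits[active].max())
weights /= weights.sum()                      # softmax over the chosen experts

# Each expert is a small linear map here; only the selected ones compute,
# so most of the layer's parameters stay idle for this token.
expert_w = rng.standard_normal((n_experts, d, d))
y = sum(w * (expert_w[e] @ x) for w, e in zip(weights, active))

print(len(active), y.shape)                   # 10 (64,)
```

The key property shown is that the forward pass touches only `top_k / n_experts` of the expert parameters, which is how an 80B-parameter model can run with only ~3B parameters active per token.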