Mixtral 8x22B
A 141B parameter sparse mixture-of-experts model delivering high-throughput performance and a 64k token context window.
Mistral AI's Mixtral 8x22B sets a new standard for open-weight efficiency with a Sparse Mixture-of-Experts (SMoE) architecture: of its 141B total parameters, only about 39B are active per token at inference time, which keeps compute costs low without sacrificing reasoning capability. It supports a 64k token context window, well suited to long-document analysis, and outperforms Llama 2 70B on key benchmarks such as MMLU and GSM8K. Released under the Apache 2.0 license, it provides enterprise-grade multilingual support across English, French, Italian, German, and Spanish.
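To make the sparse-activation idea concrete, here is a minimal sketch of top-2 expert routing in PyTorch. It is an illustrative simplification, not Mistral's actual implementation: the class name SparseMoELayer, the layer sizes, and the SiLU feed-forward experts are assumptions chosen for brevity. The point it demonstrates is that each token's forward pass only touches the parameters of the experts it is routed to, which is why active parameters (39B) can be far fewer than total parameters (141B).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseMoELayer(nn.Module):
    """Toy sparse mixture-of-experts layer (illustrative, not Mixtral's code):
    a router scores all experts per token, but only the top-k experts run."""

    def __init__(self, dim: int, hidden_dim: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden_dim), nn.SiLU(), nn.Linear(hidden_dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, dim). Score every expert, keep only the top-k per token.
        logits = self.router(x)                             # (tokens, num_experts)
        weights, indices = logits.topk(self.top_k, dim=-1)  # (tokens, top_k)
        weights = F.softmax(weights, dim=-1)                 # normalize over chosen experts

        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            # Which tokens selected expert e, and in which top-k slot.
            token_idx, slot = (indices == e).nonzero(as_tuple=True)
            if token_idx.numel() == 0:
                continue
            # Only these tokens touch expert e's parameters in this pass.
            out[token_idx] += weights[token_idx, slot].unsqueeze(-1) * expert(x[token_idx])
        return out


if __name__ == "__main__":
    layer = SparseMoELayer(dim=64, hidden_dim=256)
    tokens = torch.randn(10, 64)
    print(layer(tokens).shape)  # torch.Size([10, 64])
```

With 8 experts and top-2 routing, each token exercises roughly a quarter of the expert parameters per layer, which is the same mechanism that lets Mixtral 8x22B serve at the cost profile of a much smaller dense model.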
1 project · 1 city
Related technologies
Recent Talks & Demos