Mixtral 8x7B
Mixtral 8x7B is a Sparse Mixture of Experts (SMoE) model: it outperforms Llama 2 70B on most benchmarks with roughly 6x faster inference, and matches or outperforms GPT-3.5.
Mixtral 8x7B (Mistral AI) is a high-performance, open-weight Large Language Model (LLM) built on a Sparse Mixture of Experts (SMoE) architecture. Each MoE layer contains eight distinct experts (roughly 7B parameters per expert path, hence the name), and a router activates only two of them per token. Because attention and other non-expert parameters are shared across experts, the model totals 46.7B parameters but uses only about 12.9B active parameters per token, optimizing for speed and cost. The model matches or exceeds GPT-3.5 performance on most benchmarks, handles a 32k-token context window, and demonstrates strong multilingual capabilities (English, French, Italian, German, Spanish) along with strong code generation.
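
To make the routing idea concrete, below is a minimal, illustrative Python sketch of top-2 expert routing in a single MoE layer. The layer sizes, the ReLU expert MLPs, and the random weights are stand-ins chosen for readability; they are not Mixtral's actual configuration (which uses SwiGLU experts, a 4096-dimensional hidden state, and a router trained end to end).

# Toy sketch of top-2 sparse Mixture-of-Experts routing (not Mixtral's real code).
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 8   # Mixtral has 8 experts per MoE layer
TOP_K = 2         # only 2 experts are active per token
HIDDEN = 16       # illustrative hidden size (Mixtral's is 4096)

# Each "expert" here is a tiny feed-forward network: two weight matrices.
experts = [
    (rng.standard_normal((HIDDEN, 4 * HIDDEN)) * 0.02,
     rng.standard_normal((4 * HIDDEN, HIDDEN)) * 0.02)
    for _ in range(NUM_EXPERTS)
]
# The router is a single linear layer producing one logit per expert.
router_w = rng.standard_normal((HIDDEN, NUM_EXPERTS)) * 0.02

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def moe_layer(tokens):
    """tokens: (n_tokens, HIDDEN) -> (n_tokens, HIDDEN)."""
    logits = tokens @ router_w                      # (n_tokens, NUM_EXPERTS)
    top2 = np.argsort(logits, axis=-1)[:, -TOP_K:]  # indices of the 2 highest-scoring experts
    out = np.zeros_like(tokens)
    for i, tok in enumerate(tokens):
        chosen = top2[i]
        # Renormalize the router scores over the selected experts only.
        weights = softmax(logits[i, chosen])
        for w, idx in zip(weights, chosen):
            w1, w2 = experts[idx]
            hidden = np.maximum(tok @ w1, 0.0)      # ReLU stand-in for Mixtral's SwiGLU
            out[i] += w * (hidden @ w2)             # weighted sum of the 2 expert outputs
    return out

print(moe_layer(rng.standard_normal((4, HIDDEN))).shape)  # (4, 16)

Only the two selected expert MLPs run for each token, which is why the compute cost tracks the ~12.9B active parameters rather than the 46.7B total.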