Date: 2025-12-03

French artificial intelligence (AI) start-up Mistral has unveiled its latest family of open-weight models, Mistral 3. The launch includes a large frontier model with multimodal and multilingual capabilities, as well as nine smaller, fully customizable models that can run offline. The move comes as Mistral seeks to compete with Silicon Valley's closed-source frontier models.

Customization focus
Mistral's approach to AI model customization

Guillaume Lample, co-founder and chief scientist at Mistral, said many enterprise use cases can be handled by smaller models if they are fine-tuned. He added that while large closed-source models may perform better out of the box, the real gains come from customization. "In many cases, you can actually match or even out-perform closed source models," he said.

Model capabilities
Mistral Large 3: A competitive frontier model

Mistral Large 3 matches some of the key features of larger closed-source AI models such as OpenAI's GPT-4o and Google's Gemini 2. It also competes with several open-weight rivals. Large 3 is one of the first open frontier models to offer multimodal and multilingual capabilities in a single package, making it comparable to Meta's Llama 3 and Alibaba's Qwen3-Omni.

Model design
Mistral Large 3's architecture and applications

Mistral Large 3 features a "granular Mixture of Experts" architecture with 675B total parameters, of which 41B are active. This allows efficient reasoning across a 256K context window, delivering both speed and capability. The model can analyze long documents, write code, create content, and serve as an AI assistant for complex enterprise tasks such as workflow automation.
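To put the two parameter counts in perspective, the short Python sketch below works out what share of the model is active at once. It uses only the 41B and 675B figures quoted above; the function name is illustrative, and the article does not describe the number of experts or how routing works, so nothing here models that.

```python
# Parameter counts as reported in the article, in billions.
ACTIVE_PARAMS_B = 41
TOTAL_PARAMS_B = 675


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of total parameters that are active (hypothetical helper)."""
    if total_b <= 0 or not 0 <= active_b <= total_b:
        raise ValueError("expected 0 <= active_b <= total_b and total_b > 0")
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Active share of parameters: {fraction:.1%}")  # prints 6.1%
```

In other words, roughly one parameter in sixteen is in use at a time, which is the sense in which a Mixture of Experts design can pair a large total size with lower compute per step.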

Model suite
Mistral 3: A suite of smaller models

The Mistral 3 suite includes nine high-performance dense models: three sizes (14B, 8B, and 3B parameters), each available in three variants. The variants are Base (the pre-trained foundation model), Instruct (chat-optimized for conversation and assistant-style workflows), and Reasoning (optimized for complex logic and analytical tasks). All variants support vision, handle context windows of 128K to 256K, and work across languages.
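The count of nine follows from pairing every size with every variant. A minimal Python sketch of that grid is below; the labels it builds (for example "14B Base") are illustrative only and are not official model names.

```python
from itertools import product

# Sizes and variants as listed in the article.
SIZES = ["14B", "8B", "3B"]
VARIANTS = ["Base", "Instruct", "Reasoning"]

# Pair each size with each variant; the labels are hypothetical, not official names.
lineup = [f"{size} {variant}" for size, variant in product(SIZES, VARIANTS)]

for label in lineup:
    print(label)
print(f"Total models: {len(lineup)}")  # 3 sizes x 3 variants = 9
```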