Mistral's largest model: a 675B-total / 41B-active granular mixture-of-experts (MoE) architecture, plus a 2.5B-parameter vision encoder for native multimodality. 256K context window. Trained on 3,000 H200 GPUs.
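
A 41B-active slice of a 675B-total model follows from top-k expert routing: each token is sent to only a few experts, so only their weights run. The sketch below illustrates the general top-k MoE pattern; the expert count, k, and dimensions are illustrative placeholders, not Mistral's actual configuration.

```python
# Minimal top-k mixture-of-experts layer (illustrative, not Mistral's design).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                       # x: (tokens, d_model)
        logits = self.router(x)                 # (tokens, n_experts)
        weights, idx = logits.topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)    # normalize over chosen experts
        out = torch.zeros_like(x)
        # Only the k routed experts run per token, so active parameters
        # per token are a small fraction of the total parameter count.
        for e, expert in enumerate(self.experts):
            rows, slot = (idx == e).nonzero(as_tuple=True)
            if rows.numel() == 0:
                continue
            out[rows] += weights[rows, slot, None] * expert(x[rows])
        return out

x = torch.randn(4, 512)
print(TopKMoE()(x).shape)  # torch.Size([4, 512])
```

"Granular" MoE designs push this further by using many small experts rather than a few large ones, which gives the router finer control over which parameters activate.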

Scores 73.11% on MMLU-Pro and 93.60% on MATH-500, competitive with frontier models across reasoning, coding, and multilingual tasks. Released under the Apache 2.0 license.

Model Details

Architecture         Mixture-of-experts (MoE)
Total parameters     675B
Active parameters    41B
Context window       256,000 tokens
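
The ratio of the two parameter figures gives the per-token sparsity. A small sketch of that arithmetic, with hypothetical field names (not an official API):

```python
# Spec figures from the table above; field names are illustrative only.
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelSpec:
    total_params: float = 675e9    # 675B total (MoE)
    active_params: float = 41e9    # 41B active per token
    context_window: int = 256_000  # tokens

    @property
    def active_fraction(self) -> float:
        return self.active_params / self.total_params

spec = ModelSpec()
print(f"{spec.active_fraction:.1%} of parameters active per token")  # 6.1%
```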
Tags: moe, open-weight, frontier, multimodal