Mistral Small 4 | Lab Index

119B total / 6.5B active MoE (128 experts, top-4 routing). First Mistral model to unify instruct, reasoning, multimodal, and agentic coding in one architecture. 256K context. 3x throughput vs Mistral Small 3 with 40% latency reduction. Configurable reasoning effort.
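
To illustrate what "128 experts, top-4 routing" means for a single token, here is a minimal sketch. It assumes a router that scores every expert, keeps the four highest scores, and normalizes them with a softmax over the selected scores only. The function name `route`, the random stand-in logits, and the normalization choice are illustrative assumptions, not Mistral's published implementation. Only the expert count, top-k value, and parameter counts come from the description above.

```python
import math
import random

NUM_EXPERTS = 128  # from the model description
TOP_K = 4          # from the model description


def route(logits, k=TOP_K):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the selected logits only, so the k weights sum to 1.
    # (Assumption: the real router may normalize differently.)
    peak = max(logits[i] for i in top)
    exps = [math.exp(logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


random.seed(0)
# Stand-in for the output of a learned router layer (hypothetical values).
router_logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
selected = route(router_logits)

# Share of parameters used per token, from the 6.5B active / 119B total figures.
active_fraction = 6.5 / 119
print(selected)
print(f"active share: {active_fraction:.1%}")
```

Only the selected experts run for a given token, which is why the active parameter count is a small share of the total.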

Artificial Analysis (AA) Intelligence Index: 12 (non-reasoning). License: Apache 2.0.

May 31 2026: NVIDIA-collaboration NVFP4 quantized variant shipped on HuggingFace (Mistral-Small-4-119B-2603-NVFP4) for faster Blackwell inference at near-BF16 quality.
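
As a rough sense of why a 4-bit variant matters at this size, the sketch below estimates weight storage for 119B parameters. It assumes BF16 uses 16 bits per weight and that the NVFP4 layout costs about 4.5 bits per weight (4-bit values plus one 8-bit scale per 16-weight block). That layout figure is an assumption about the format, and the estimate ignores any tensors left unquantized, activations, and KV cache, so the actual checkpoint size may differ.

```python
TOTAL_PARAMS = 119e9  # total parameter count from the description


def weight_gigabytes(params, bits_per_weight):
    """Approximate weight storage in GB (10^9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


bf16_gb = weight_gigabytes(TOTAL_PARAMS, 16)
# Assumption: 4-bit values plus one 8-bit scale shared by each 16-weight block.
nvfp4_gb = weight_gigabytes(TOTAL_PARAMS, 4 + 8 / 16)

print(f"BF16:  ~{bf16_gb:.0f} GB")
print(f"NVFP4: ~{nvfp4_gb:.0f} GB")
```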

HuggingFace | Artificial Analysis | OpenRouter

Model Details

Architecture: MoE
Parameters: 119B
Active params: 6.5B
Context window: 256,000 tokens
AA Intelligence: 12

Tags: moe, open-weight, reasoning, multimodal, agentic

