Nemotron 3 Ultra is the largest model in the Nemotron 3 family (Nano / Super / Ultra). It was released on HuggingFace on June 4, 2026, after Jensen Huang pre-announced it at Computex Taipei on June 1. It has 550B total parameters with 55B active per token (~90% sparsity) and was trained on 20T text tokens in NVFP4 on Blackwell.
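
The ratios implied by these headline figures are easy to check. The following is a minimal arithmetic sketch that uses only the numbers stated above; the variable names are illustrative and do not come from any NVIDIA tooling.

```python
# Figures quoted in the text above.
TOTAL_PARAMS = 550e9    # total parameters
ACTIVE_PARAMS = 55e9    # parameters active per token
TRAIN_TOKENS = 20e12    # pretraining text tokens

# Fraction of the model that participates in each token's forward pass.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
sparsity = 1.0 - active_fraction

# Training tokens seen per parameter, counted against total and active sizes.
tokens_per_total_param = TRAIN_TOKENS / TOTAL_PARAMS
tokens_per_active_param = TRAIN_TOKENS / ACTIVE_PARAMS

print(f"active fraction: {active_fraction:.0%}, sparsity: {sparsity:.0%}")
print(f"tokens per total parameter:  {tokens_per_total_param:.1f}")
print(f"tokens per active parameter: {tokens_per_active_param:.1f}")
```

The 10% active fraction is what the "~90% sparsity" figure refers to: per-token compute scales with the 55B active parameters, while memory still has to hold all 550B.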

Architecture: hybrid Mamba-2 / Attention Mixture-of-Experts with LatentMoE (a hardware-aware expert design with a 2,048-dim latent compression), 108 total layers, 512 experts per layer with top-22 routing, grouped-query attention (GQA) with 64 query heads and 2 KV heads, and Multi-Token Prediction (MTP) with 2 shared-weight heads for native speculative decoding. The context window reaches 1M tokens after a long-context extension phase. Post-training combines SFT, multi-environment RLVR, and Multi-teacher On-Policy Distillation (MOPD), with explicit reasoning-budget control.
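
Two of the mechanisms named above, top-22 expert routing and MTP-based speculative decoding, can be illustrated with a short sketch. This is not NVIDIA's implementation. It assumes gate weights are a softmax over the selected router logits, that the 2 MTP heads each draft one future token, and that drafted tokens are accepted independently with a fixed probability. The function names `route_token` and `expected_tokens_per_step` are hypothetical.

```python
import math
import random

NUM_EXPERTS = 512  # experts per MoE layer (from the text)
TOP_K = 22         # experts activated per token (from the text)


def route_token(router_logits, top_k=TOP_K):
    """Return {expert_index: gate_weight} for the top_k highest-scoring experts.

    Assumption: gate weights are a softmax over the selected logits only.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:top_k]
    peak = max(router_logits[i] for i in chosen)  # subtract max for stability
    exps = {i: math.exp(router_logits[i] - peak) for i in chosen}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


def expected_tokens_per_step(accept_prob, num_draft=2):
    """Expected tokens emitted per verification step.

    Assumption: each drafted token is accepted independently with probability
    accept_prob, acceptance stops at the first rejection, and the main model
    always contributes one token. This gives 1 + p + p^2 + ... + p^num_draft.
    """
    return sum(accept_prob ** i for i in range(num_draft + 1))


rng = random.Random(0)  # fixed seed so the toy logits are reproducible
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
gates = route_token(logits)

print("experts used:", len(gates), "of", NUM_EXPERTS)
print("gate weights sum to:", round(sum(gates.values()), 6))
print("expected tokens per step at 80% acceptance:",
      round(expected_tokens_per_step(0.8), 2))
```

Under these assumptions each token touches 22 of 512 experts (about 4.3% of the experts in a layer), and two drafted tokens can at best triple the number of tokens produced per verification step.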

Throughput: 5.9× / 4.8× / 1.6× higher inference throughput than GLM-5.1-754B-A40B, Kimi-K2.6-1T-A32B, and Qwen-3.5-397B-A17B, respectively, in the 8K-input / 64K-output setting, at on-par accuracy across agentic and reasoning benchmarks. Artificial Analysis (AA) frames Ultra as the leading US open-weights model on its composite index at launch.

Headline benchmarks (BF16, post-trained): MMLU-Pro 86.8, GPQA (no tools) 87.0, LiveCodeBench v6 89.0, SWE-Bench Verified 71.9, Terminal-Bench 2.1 56.4, RULER @ 1M 94.7, and an AA Intelligence Index v4.1 score of 38, served at 300+ tokens/s.

Released as four checkpoints under the Linux Foundation OpenMDW-1.1 license: Base-BF16 (pretrained-only), BF16 (post-trained), NVFP4 (quantized for faster inference), and GenRM (the generative reward model used during RL). Distribution targets: HuggingFace, ModelScope, OpenRouter, and build.nvidia.com.
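
For a rough sense of how the BF16 and NVFP4 checkpoints differ in size, here is a back-of-the-envelope sketch. It assumes every one of the 550B parameters is stored in the named format, and that NVFP4 costs 4 bits per value plus one 8-bit scale per 16-value block. Real quantized checkpoints usually keep some tensors in higher precision, so these are approximations and not published file sizes. The helper `weight_gigabytes` is hypothetical.

```python
TOTAL_PARAMS = 550e9  # total parameters (from the text)

BF16_BITS = 16.0
# Assumption: 4-bit values plus one 8-bit scale shared by each 16-value block.
NVFP4_BITS = 4.0 + 8.0 / 16.0


def weight_gigabytes(num_params, bits_per_param):
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8.0 / 1e9


bf16_gb = weight_gigabytes(TOTAL_PARAMS, BF16_BITS)
nvfp4_gb = weight_gigabytes(TOTAL_PARAMS, NVFP4_BITS)

print(f"BF16 weights:  ~{bf16_gb:,.0f} GB")
print(f"NVFP4 weights: ~{nvfp4_gb:,.0f} GB")
print(f"ratio: ~{bf16_gb / nvfp4_gb:.2f}x smaller")
```

The estimate covers weights only. KV cache, Mamba state, and activations add to the serving footprint and depend on batch size and context length.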

Companion datasets shipped on HuggingFace on June 4-5, 2026: Nemotron-Pretraining-Code-v3 (173B tokens of fresh code with a September 2025 cutoff), Nemotron-Pretraining-Legal-v1, Nemotron-Pretraining-Specialized-v1.2 (factual recall + moral scenarios), Nemotron-Posttraining-v3, Nemotron-SFT-SWE-v3, Nemotron-RL-Ultra-Training-Blends, Nemotron-RL-Science-v1, Nemotron-RL-Multichallenge-v1, Nemotron-RL-CFBench-v1, Nemotron-RL-SysBench-v1, Nemotron-RL-InverseIFEval-v1, Nemotron-RL-Instruction-Following-Structured-Outputs-v2, plus Nemotron-Personas-Vietnam and Nemotron-Personas-El-Salvador.

Model Details

Architecture: MoE
Parameters: 550B
Active params: 55B
Context window: 1,000,000 tokens
AA Intelligence Index: 38
License: OpenMDW-1.1

Benchmark Scores

Benchmark             Score   Mode
MMLU-Pro              86.8
GPQA (no tools)       87.0
LiveCodeBench v6      89.0
SWE-Bench Verified    71.9
Terminal-Bench 2.1    56.4
RULER @ 1M            94.7

Variants

Name                                    Parameters   Notes
Nemotron 3 Ultra 550B-A55B BF16         550B         Post-trained flagship; BF16 weights
Nemotron 3 Ultra 550B-A55B NVFP4        550B         NVFP4-quantized for higher inference throughput on Blackwell
Nemotron 3 Ultra 550B-A55B Base BF16    550B         Pretrained-only base checkpoint
Nemotron 3 Ultra 550B-A55B GenRM        550B         Generative reward model used during RL post-training

Tags: frontier, open-weight, moe, reasoning, agentic, hybrid-architecture