Llama-Nemotron (Nano/Super/Ultra)
Family of reasoning models derived from Meta Llama via Neural Architecture Search (NAS): Ultra (253B, from Llama 3.1 405B with skip attention, variable FFN, and FFN fusion), Super (49B, from Llama 3.3 70B), and Nano (8B). First open models with a dynamic reasoning toggle (on/off at inference time).
Ultra with reasoning ON: MATH-500 97.0, GPQA 76.0, AIME25 72.5, LiveCodeBench 66.3. Outperforms DeepSeek-R1 on GPQA with less than half the parameters. Super fits on a single H100 80GB GPU. v1.5 adds RPO, RLVR, and iterative DPO for enhanced agentic capabilities.
Paper (arXiv) · HuggingFace (Ultra 253B) · HuggingFace (Super 49B) · Artificial Analysis (Ultra) · Artificial Analysis (Super) · OpenRouter (Ultra) · OpenRouter (Super)
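The reasoning toggle mentioned above is controlled through the system prompt rather than a dedicated API flag: per NVIDIA's model cards, the system message "detailed thinking on" / "detailed thinking off" switches the mode per request. A minimal sketch of building such a request payload (the model identifier and the sampling settings are assumptions drawn from the public model card, not from this entry):

```python
# Sketch: per-request reasoning toggle for Llama-Nemotron models.
# Assumption: the mode is set via the system prompt ("detailed thinking
# on"/"detailed thinking off"), as described in NVIDIA's model cards.
# The model name is illustrative; match it to your provider's listing.

def build_request(user_prompt: str, reasoning: bool) -> dict:
    """Build an OpenAI-style chat-completion payload with the
    reasoning mode expressed as a system prompt."""
    mode = "on" if reasoning else "off"
    return {
        "model": "nvidia/llama-3.1-nemotron-ultra-253b-v1",
        "messages": [
            {"role": "system", "content": f"detailed thinking {mode}"},
            {"role": "user", "content": user_prompt},
        ],
        # Assumed sampling defaults: the model card suggests
        # temperature 0.6 / top_p 0.95 for reasoning-on and greedy
        # decoding for reasoning-off; adjust to taste.
        "temperature": 0.6 if reasoning else 0.0,
        "top_p": 0.95 if reasoning else 1.0,
    }

req = build_request("Prove that sqrt(2) is irrational.", reasoning=True)
```

The payload can be sent unchanged to any OpenAI-compatible endpoint (e.g. the OpenRouter listings linked above); only the system message differs between the two modes.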
Model Details
Architecture: Dense
Parameters: 253B (Ultra)
Context window: 128,000 tokens
Variants
| Name | Parameters | Notes |
|---|---|---|
| Llama-3.1-Nemotron-Ultra-253B | 253B | From Llama 3.1 405B; skip attention, variable FFN, FFN fusion |
| Llama-3.3-Nemotron-Super-49B | 49B | From Llama 3.3 70B |
| Llama-3.1-Nemotron-Nano-8B | 8B | — |
Paper
arXiv: 2505.00949