Command A+ | Lab Index

Cohere's first fully Apache-2.0-licensed open model (the prior Command A was CC-BY-NC). 218B total / 25B active MoE with vision-language input, 128K context (64K output), 48 languages (up from 23), agentic tool use, and native citation generation via embedded grounding-span tags that link every factual claim to its source document.

Two infrastructure stories sit alongside the model. First, the W4A4 variant runs on one NVIDIA B200 or two H100s with negligible quality loss relative to BF16: Cohere quantizes only the MoE experts to 4-bit and keeps the attention pathways at full precision, then recovers the quality lost to quantization via quantization-aware distillation (a KL-divergence distillation from the BF16 teacher). Second, the model is positioned for sovereign critical-infrastructure deployment and is the first artifact under the combined Cohere × Aleph Alpha banner following their April 2026 merger.
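To make the distillation objective concrete, here is a minimal sketch of a per-token KL-divergence loss between a teacher's and a student's output distributions. It is not Cohere's training code: the function names, the toy logits, the temperature parameter, and the choice of the forward direction KL(teacher || student) are all assumptions made for illustration.

```python
import math

def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over one vocabulary-sized logit vector."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_sum_exp for x in scaled]

def kl_divergence(teacher_logits, student_logits, temperature=1.0):
    """KL(teacher || student) for a single token position (assumed direction)."""
    log_p = log_softmax(teacher_logits, temperature)  # BF16 teacher
    log_q = log_softmax(student_logits, temperature)  # quantized student
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

def distill_loss(teacher_batch, student_batch, temperature=1.0):
    """Mean per-token KL across a batch of logit vectors."""
    losses = [
        kl_divergence(t, s, temperature)
        for t, s in zip(teacher_batch, student_batch)
    ]
    return sum(losses) / len(losses)

# Toy logits (hypothetical values): two token positions, vocabulary of three.
teacher = [[2.0, 1.0, 0.1], [0.5, 0.5, 3.0]]
student_same = [[2.0, 1.0, 0.1], [0.5, 0.5, 3.0]]
student_off = [[1.0, 2.0, 0.1], [0.5, 3.0, 0.5]]

identical_loss = distill_loss(teacher, student_same)  # student matches teacher
drifted_loss = distill_loss(teacher, student_off)     # student has drifted
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, which is the signal a quantization-aware distillation run would minimize while the weights are held to the 4-bit format.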

Self-reported benchmarks: τ²-Bench Telecom 85% (up from 37% on Command A Reasoning), Terminal-Bench Hard 25% (up from 3%), agentic QA +20% over Command A Reasoning, MMMU 75.1, MMMU-Pro 63, MathVista 80.6. Artificial Analysis (AA) Intelligence Index v4.1: 29. Output speed: 270 tok/s.

Announcement (Cohere blog) | HuggingFace (BF16) | HuggingFace (FP8) | HuggingFace (W4A4) | Artificial Analysis | VentureBeat coverage

Model Details

Architecture MoE

Parameters 218B

Active params 25B

Context window 128,000

AA Intelligence 29

License Apache 2.0

Languages 48

Benchmark Scores

Benchmark	Score	Mode
τ²-Bench Telecom	85%	—
Terminal-Bench Hard	25%	—
MMMU	75.1	—
MMMU-Pro	63	—
MathVista	80.6	—

Variants

Name	Parameters	Notes
command-a-plus-05-2026-bf16	—	16-bit; runs on 4× B200 or 8× H100
command-a-plus-05-2026-fp8	—	8-bit; runs on 2× B200 or 4× H100
command-a-plus-05-2026-w4a4	—	4-bit (recommended); runs on 1× B200 or 2× H100; produced via Quantization-Aware Distillation from BF16 teacher
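As a rough sanity check on the GPU counts in the table, the weight footprint of each variant can be estimated from the 218B total parameter count and the bytes per parameter. This is a back-of-the-envelope sketch: the per-GPU memory figures (about 180 GB for a B200, 80 GB for an H100) are assumptions not stated in the source, the estimate ignores KV cache and activation memory, and the W4A4 figure is a lower bound because the attention pathways stay at higher precision.

```python
TOTAL_PARAMS = 218e9  # total parameters, from the model details

# Bytes per parameter for each variant's weight format.
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "w4a4": 0.5}

# Assumed usable memory per GPU in GB (not given in the source).
GPU_MEMORY_GB = {"B200": 180, "H100": 80}

# GPU counts as listed in the Variants table.
LISTED_CONFIGS = {
    "bf16": {"B200": 4, "H100": 8},
    "fp8": {"B200": 2, "H100": 4},
    "w4a4": {"B200": 1, "H100": 2},
}

def weight_footprint_gb(variant):
    """Approximate weight memory in GB, treating every parameter uniformly."""
    return TOTAL_PARAMS * BYTES_PER_PARAM[variant] / 1e9

def headroom_gb(variant, gpu):
    """Memory left for KV cache and activations under the listed GPU count."""
    total_memory = LISTED_CONFIGS[variant][gpu] * GPU_MEMORY_GB[gpu]
    return total_memory - weight_footprint_gb(variant)

for variant, gpus in LISTED_CONFIGS.items():
    for gpu, count in gpus.items():
        print(
            f"{variant}: ~{weight_footprint_gb(variant):.0f} GB weights, "
            f"{count}x {gpu} leaves ~{headroom_gb(variant, gpu):.0f} GB"
        )
```

Under these assumptions every listed configuration holds the weights with room to spare, and each halving of precision halves the GPU count, which matches the pattern in the table.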

frontier, open-weight, moe, multimodal, reasoning, agentic, enterprise, multilingual
