Command A+
modelCohere's first fully Apache-2.0-licensed open model (the prior Command A was CC-BY-NC). 218B total / 25B active MoE with vision-language input, 128K context (64K output), 48 languages (up from 23), agentic tool use, and native citation generation via embedded grounding-span tags that link every factual claim to its source document.
Two infrastructure stories sit alongside the model. First, the W4A4 variant runs on one NVIDIA B200 or two H100s at negligible quality loss from BF16 — Cohere only quantizes the MoE experts to 4-bit while keeping attention pathways at full precision, recovered via quantization-aware distillation (a KL-divergence distill from the BF16 teacher). Second, the model is positioned for sovereign critical infrastructure deployment, the first artifact under the combined Cohere × Aleph Alpha banner following their April 2026 merger.
Self-reported benchmarks: τ²-Bench Telecom 85% (up from 37% on Command A Reasoning), Terminal-Bench Hard 25% (up from 3%), agentic QA +20% over Command A Reasoning, MMMU 75.1, MMMU-Pro 63, MathVista 80.6. AA Intelligence Index v4.0: 37 (#29/87 models). 270 tok/s output speed.
Model Details
Benchmark Scores
| Benchmark | Score | Mode |
|---|---|---|
| τ²-Bench Telecom | 85% | — |
| Terminal-Bench Hard | 25% | — |
| MMMU | 75.1 | — |
| MMMU-Pro | 63 | — |
| MathVista | 80.6 | — |
Variants
| Name | Parameters | Notes |
|---|---|---|
| command-a-plus-05-2026-bf16 | — | 16-bit; runs on 4× B200 or 8× H100 |
| command-a-plus-05-2026-fp8 | — | 8-bit; runs on 2× B200 or 4× H100 |
| command-a-plus-05-2026-w4a4 | — | 4-bit (recommended); runs on 1× B200 or 2× H100; produced via Quantization-Aware Distillation from BF16 teacher |