First audio LLM to unlock test-time compute scaling via Chain-of-Thought reasoning. 33B parameters. Surpasses Gemini 2.5 Pro on audio understanding benchmarks.

Model Details

Architecture: Dense
Parameters: 33B

Variants

Name | Parameters | Notes
Step-Audio-R1 | | Released Nov 27, 2025.
Step-Audio-R1.1 | | Released Jan 14, 2026. Dual-Brain Architecture for real-time spoken dialogue.
Step-Audio-R1.1 (Realtime) | | Top-ranked speech-to-speech model on Artificial Analysis's Big Bench Audio (96.4%, May 2026), ahead of xAI's Grok Voice Agent.

Paper

audio, reasoning, open-weight

Notes

arXiv submission Nov 19, 2025. Model weights released Nov 27, 2025.