Step-Audio-R1 | Lab Index

First audio LLM to unlock test-time compute scaling via Chain-of-Thought reasoning. 33B parameters. Surpasses Gemini 2.5 Pro on audio understanding benchmarks.

Paper (arXiv) GitHub HuggingFace

Model Details

Architecture DENSE

Parameters 33B

Variants

Name	Parameters	Notes
Step-Audio-R1	—	Released Nov 27, 2025
Step-Audio-R1.1	—	Released Jan 14, 2026. Dual-Brain Architecture for real-time spoken dialogue.
Step-Audio-R1.1 (Realtime)	—	Top-ranked speech-to-speech model on Artificial Analysis's Big Bench Audio (96.4%, May 2026), ahead of xAI's Grok Voice Agent.

Paper

arXiv HTML

audio, reasoning, open-weight

Notes

arXiv submission Nov 19, 2025. Model weights released Nov 27, 2025.