Inspur (corporate)
YuanLab AI is the AI research division of Inspur (SZ: 000977), one of the world's largest server manufacturers. Originally the Inspur AI Research Institute, it rebranded as YuanLab around 2024–2025 to reflect its shift toward open-source, enterprise-grade foundation models. Unlike VC-backed startups, YuanLab is funded directly by Inspur's corporate capital, and it has a "home field advantage" in hardware optimization, since it builds the servers many other AI companies train on.
The Yuan model series traces a clear arc from massive dense models to efficient MoE:
- Yuan 1.0 (2021, 245B dense): the world's largest single-language Chinese model at launch.
- Yuan 2.0 (2023): introduced Localized Filtering-based Attention (LFA) across 2B–102B sizes.
- Yuan 2.0-M32 (2024): marked the MoE pivot with an Attention Router, achieving a 3.8% accuracy gain over classical expert routing.
- Yuan 3.0 Ultra (March 2026, 1.01T total / 68.8B active): the current flagship, pruned from 1.5T parameters using Layer-Adaptive Expert Pruning (LAEP), improving training efficiency by 49% and inference speed by 33%.
- Yuan 3.0 Flash (Dec 2025, 40B MoE): introduced the Reflection Inhibition Reward Mechanism (RIRM) to prevent overthinking, cutting inference costs by ~50%.
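To make the routing idea concrete: a classical MoE router scores each expert with an independent dot product against the token, while an attention-style router (as in Yuan 2.0-M32's Attention Router) lets the experts' embeddings interact before the top-k selection. The sketch below is an illustrative simplification under assumed shapes and names (`attention_router`, `expert_keys`, `expert_values` are mine), not the published formulation:

```python
import numpy as np

def attention_router(token, expert_keys, expert_values, k=2):
    """Illustrative attention-style MoE router (a simplified sketch,
    NOT the exact Yuan 2.0-M32 method): expert scores are adjusted by
    a shared context vector so they reflect inter-expert correlation,
    rather than being independent dot products."""
    d = token.shape[0]
    # direct token-expert affinities, scaled like attention logits
    logits = expert_keys @ token / np.sqrt(d)
    # softmax over experts, then mix their value vectors into one context
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    context = expert_values.T @ weights
    # final score: direct affinity plus agreement with the shared context
    scores = logits + expert_values @ context
    # pick the k highest-scoring experts and renormalize their gates
    top_k = np.argsort(scores)[-k:][::-1]
    gate = np.exp(scores[top_k] - scores[top_k].max())
    gate /= gate.sum()
    return top_k, gate

rng = np.random.default_rng(0)
n_experts, d_model = 32, 64
keys = rng.normal(size=(n_experts, d_model))
vals = rng.normal(size=(n_experts, d_model))
tok = rng.normal(size=d_model)
idx, gate = attention_router(tok, keys, vals, k=2)
```

With 32 experts and 2 active (the M32 configuration), only the two selected experts' FFNs run per token, which is why a 40B-class MoE can match a much smaller dense model's inference cost.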
People
- Shaohua Wu (Shawn Wu) — Lead Researcher