Zhibin Gou
Core Researcher (R1 math reasoning, GRPO clipping strategy) — DeepSeek
Affiliations
- DeepSeek: Core Researcher (R1 math reasoning, GRPO clipping strategy)
- Formerly: Tsinghua University (MSc, advisor Yujiu Yang); Microsoft Research Asia (intern, ToRA)