DeepSeek
DeepSeek is a privately held AI lab and a dominant force in the open-source AI community. It was founded in 2023 by Liang Wenfeng, who also founded and leads the quantitative hedge fund High-Flyer (幻方量化). High-Flyer, one of China's largest quant funds with over $8B in assets under management, provides DeepSeek's funding and compute infrastructure. The company reportedly operates a cluster of ~10,000 NVIDIA A100 GPUs acquired before US export restrictions took full effect.
DeepSeek triggered a global sensation in January 2025 — the "DeepSeek moment" — when DeepSeek-R1 demonstrated reasoning performance rivaling OpenAI's o1 at a fraction of the training cost. The release briefly wiped nearly $1 trillion from US tech stocks and forced a reassessment of assumptions about AI scaling costs worldwide.
The lab has pioneered several key innovations: Multi-head Latent Attention (MLA) for memory-efficient inference, the DeepSeekMoE architecture with fine-grained expert segmentation, FP8 mixed-precision training at scale, Multi-Token Prediction, and GRPO (Group Relative Policy Optimization) for reinforcement learning on reasoning tasks. DeepSeek is notable for its "open-science" approach, publishing detailed technical reports and open-sourcing low-level training infrastructure including DeepGEMM, FlashMLA, DeepEP, and 3FS (Fire-Flyer File System).
DeepSeek's R1 research was published in Nature in 2025, and the company has become a central case study in debates over AI efficiency, US-China technology competition, and the viability of open-weight frontier models.
People
- Liang Wenfeng (梁文锋) — Founder (co-founder & CEO of High-Flyer, one of China's top quant funds; MEng Zhejiang University (2010); BEng Zhejiang University (2007))
- Chong Ruan — Co-founder (departed early 2025; joined DeepRoute.ai Jan 2026)
- Daya Guo — Former Core Researcher (first author of R1, proposed GRPO; joined ByteDance Seed April 2026) (formerly Sun Yat-sen University (PhD))
- Xiao Bi — Core Researcher (DeepSeek LLM, V2, V3, R1, DeepSeek-Math)
- Deli Chen — Core Researcher (R1 reasoning) (formerly Tencent WeChat AI (2021-2023); Peking University (MS))
- Zhenda Xie — Core Researcher (architecture)
- Wangding Zeng — Core Researcher (MLA architecture)
- Zhihong Shao (邵智宏) — Core Researcher (R1, V2, DeepSeek-Math, DeepSeek-Prover, Math-Shepherd) (formerly Tsinghua University (PhD 2019-2024, advisor Minlie Huang); BS Beihang University; MSRA intern (ToRA))
- Zhibin Gou — Core Researcher (R1 math reasoning, GRPO clipping strategy) (formerly Tsinghua University (MSc, advisor Yujiu Yang); Microsoft Research Asia (intern, ToRA))
- Qihao Zhu — Core Researcher
News
- 2026-05-11 DeepSeek First External Funding Round Reportedly Near Close at $45–50B Valuation, Led by China's 'Big Fund III' — SCMP
- 2026-05-06 DeepSeek in Talks for First-Ever Outside Round at $45B; Tencent + Big Fund III in Lead Group — TechCrunch
- 2026-04-24 DeepSeek-V4 Released: 1.6T/49B MoE, First Frontier Model Trained Entirely on Huawei Ascend 950PR, MIT License — DeepSeek
- 2026-04-22 Tencent and Alibaba in Talks to Invest in DeepSeek at $20B+ Valuation — First External Funding — Bloomberg
- 2026-04-16 DeepSeek R1 Lead Author Daya Guo Joins ByteDance Seed Amid Intensifying AI Talent War — SCMP
- 2026-04-16 DeepSeek V4 Imminent — 1T-Parameter MoE to Run Solely on Huawei Ascend 950PR Chips — Dataconomy
- 2026-03-28 DeepSeek Before V4: Culture, Organization, and Liang Wenfeng's Unique Goals (English summary) — LatePost (晚点)
- 2026-03-24 DeepSeek's Latest Job Postings Highlight Pivot to Agentic AI — Bloomberg