DeepSeek Before V4: Culture, Organization, and Liang Wenfeng's Unique Goals
Key Departures
Several core DeepSeek members left between late 2025 and early 2026:
- Wang Bingxuan — core author of DeepSeek LLM (the first-generation model), hired away by Tencent's Yao Shunyu in late 2025.
- Wei Haoran — core author of the DeepSeek-OCR series, departed around Chinese New Year 2026 and is likely joining a major tech company.
- Guo Daya — core author of DeepSeek-R1, recently departed and is likely joining a major tech company.
- Ruan Chong — a member since the Huanfang (High-Flyer) era and a core contributor to Janus-Pro and other multimodal work. Left in late 2025 and joined the autonomous driving startup DeepRoute.ai in January 2026.
Despite these departures, the article emphasizes that far more people chose to stay, and no team has left as a group.
V4 Status
As of writing (~March 2026), DeepSeek V4 has not yet been officially released. A small-parameter version was given to open-source framework communities around January 2026 for adaptation work. The optimistic timeline was a mid-February release (around Chinese New Year), but the article reports V4 may launch in April 2026.
Compensation and Valuation Challenges
DeepSeek has never raised external funding and has no established company valuation. As competitors like MiniMax and Zhipu (Z.ai) go public with rising stock prices, and Moonshot AI (Kimi) and StepFun prepare IPOs, DeepSeek employees are questioning the value of their equity/option agreements. Liang Wenfeng has recently begun working to establish a company valuation and provide more certainty to team members.
When Liang did briefly meet investors in 2023, he proposed an unusual term: a return cap for investors, similar to OpenAI's arrangement with Microsoft. No institution invested.
The "No Overtime" Culture
DeepSeek is described as the only core AI lab globally without an overwork culture. While engineers at Google, OpenAI, xAI, and ByteDance routinely work 70–80 hours per week:
- Most DeepSeek employees leave the office between 6 and 7 PM on weekdays.
- There is no clock-in system and no morning attendance tracking.
- There are no explicit performance reviews or deadlines (DDLs).
- Liang Wenfeng believes a person can produce high-quality output for at most 6–8 hours per day, and that fatigued decision-making wastes precious GPU resources.
The company provides free after-work benefits such as sports classes and gym/venue reimbursements.
Organization: Flat, Cross-functional, ~200 People
- DeepSeek has no second-in-command. In the research team, there are only two levels: Liang Wenfeng and everyone else.
- The research team numbers more than 100 people and functions like a large academic lab. Researchers (mostly born around 2000) call the 1985-born Liang "Boss Liang" (梁老板), though his role is closer to that of an academic advisor.
- Three main groups — base model architecture (a few dozen people), infrastructure (a few dozen), and data (a few dozen) — work in tight, cross-functional collaboration with blurred team boundaries.
- Liang personally attends every team's meetings. Most weekly meetings are open to members from other teams.
- The total organization has grown past the size of Huanfang (High-Flyer), making it the largest organization Liang has ever managed.
Hiring Profile
Before 2025, DeepSeek almost never hired experienced professionals, preferring new graduates and converted interns. An analysis of 172 researchers who contributed to three generations of models (LLM, V2, V3/R1) found that over 70% held only bachelor's or master's degrees, and over 70% were under 30 years old.
Research Focus Since R1
Rather than capitalizing on V3/R1's viral success with flashy releases, DeepSeek continued along three lines:
- Efficiency optimization — squeezing maximum intelligence per unit of GPU compute. This includes the open-source week infrastructure releases (inference kernels, communication libraries, matrix multiplication libraries, data processing frameworks), NSA (Native Sparse Attention), DSA (Dynamic Sparse Attention), and even replacing CUDA/Triton with the Peking University-developed TileLang at the operator level.
- Architecture innovations — mHC (Manifold-Constrained Hyper-Connections) for training stability, and Engram for building long-term memory outside the model. mHC is widely expected to be used in V4.
- "Non-mainstream" explorations — DeepSeek-OCR (converting text to images before feeding to the model), continuous learning, autonomous learning, and consultations with neuroscience/brain science advisors to explore mechanisms closer to the human brain.
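To make the efficiency line above concrete: the core idea behind sparse-attention approaches such as NSA and DSA is that each query attends to only a small, dynamically selected subset of keys instead of all of them, cutting compute and memory. The sketch below is a generic toy illustration of block-sparse attention, not DeepSeek's actual algorithm; the function name, block size, and top-k selection rule are all illustrative assumptions.

```python
# Toy block-sparse attention: keys are grouped into blocks, each query
# attends only to its top-k blocks by pooled similarity.
# Generic illustration of the sparse-attention idea, NOT NSA/DSA itself.
import numpy as np

def block_sparse_attention(q, k, v, block=4, topk=2):
    """q, k, v: (T, d) arrays. Each query attends to `topk` key blocks."""
    T, d = k.shape
    nb = T // block
    # Cheap block summaries: mean-pool each key block.
    k_blocks = k[: nb * block].reshape(nb, block, d).mean(axis=1)  # (nb, d)
    out = np.zeros_like(q)
    for i, qi in enumerate(q):
        # Score blocks against the query and keep the best `topk`.
        keep = np.argsort(k_blocks @ qi)[-topk:]
        idx = np.concatenate(
            [np.arange(b * block, (b + 1) * block) for b in keep]
        )
        # Ordinary softmax attention restricted to the selected keys.
        s = k[idx] @ qi / np.sqrt(d)
        w = np.exp(s - s.max())
        w /= w.sum()
        out[i] = w @ v[idx]
    return out

rng = np.random.default_rng(0)
q = rng.standard_normal((16, 8))
k = rng.standard_normal((16, 8))
v = rng.standard_normal((16, 8))
o = block_sparse_attention(q, k, v)
print(o.shape)  # (16, 8)
```

With `block=4` and `topk=2` over 16 tokens, each query touches 8 of 16 keys; real systems pick the selection rule and block structure so that skipped blocks cost no compute at all, which is where the GPU savings come from.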
What DeepSeek Is NOT Doing
- Not investing heavily in multimodal generation — Liang believes it is not on the main path to intelligence.
- Not aggressively pursuing Agent products — though V3.2 strengthened agent capabilities, DeepSeek's iteration frequency has been lower than competitors. Since early 2025, Zhipu updated 5 versions, MiniMax 4 versions, and Kimi 3 versions focused on Agent/coding. DeepSeek-V3.2 ranks only #12 in OpenRouter's OpenClaw token consumption.
- Not building consumer products beyond a basic chatbot — though the article notes DeepSeek has recently begun hiring for Agent product roles, specifically seeking people familiar with Claude Code, OpenClaw, and Manus.
Liang Wenfeng's Unique AGI Goals
Beyond pushing the intelligence ceiling, Liang prioritizes two things most labs do not:
- Building on domestic (Chinese) chip ecosystems — adapting models for domestic GPUs, using Chinese-originated open-source tools (TileLang over Triton), and designing data formats for next-generation domestic chips.
- "Original innovation" — pursuing directions that big companies or other startups won't try: the Janus unified multimodal series, the Prover formal verification series, OCR research, continuous learning, and brain-inspired approaches.
Outlook
The article concludes that V4, when released, will likely be the strongest open-source model but will not be overwhelmingly dominant, as "strong" has become increasingly context-dependent across different use cases. DeepSeek faces the tension between Liang's emphasis on original exploration and the industry's pressure to simply "stay the strongest."
A person close to DeepSeek says: "Those who stay still have some idealism." Another observer notes: "Only when more companies like DeepSeek appear will Chinese technology have a chance to go from 'replication' to leading."