LongCat-Flash-Chat
model, paper
560B-parameter MoE model that activates ~27B parameters per token. Meituan's foundational LLM, with PID-controller-based dynamic expert allocation and a "Zero-computation Experts" mechanism. 128K context; 100+ tokens/sec on H800.
Outputs 2
LongCat-Flash-Chat
model
Architecture MoE
Parameters 560B
Active params 27B
Context window 128,000 tokens
LongCat-Flash Technical Report
paper
Details the Zero-computation Experts mechanism and PID-controller routing.
arXiv: 2509.01322
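
The interaction between zero-computation experts and feedback-controlled routing can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the method from the technical report: the expert counts, Gaussian router scores, target value, gain `ki`, and the names `route` and `simulate` are all hypothetical, and an integral-only controller stands in for the full PID controller. Each token picks its top-k experts; picks that land on a zero-computation expert cost nothing, and the controller nudges a shared bias on those experts until the average number of real experts per token tracks the target.

```python
import random


def route(scores, bias, k):
    """Return the indices of the top-k experts ranked by score + bias."""
    biased = [s + b for s, b in zip(scores, bias)]
    return sorted(range(len(biased)), key=lambda i: biased[i], reverse=True)[:k]


def simulate(n_real=8, n_zero=4, k=4, target_real=2.5,
             steps=100, batch=200, ki=0.1, seed=0):
    """Toy loop: steer the average number of real experts used per token.

    Experts 0..n_real-1 are ordinary FFN experts (they cost compute).
    Experts n_real.. are zero-computation experts (output = input, no cost).
    All values here are illustrative assumptions, not the report's settings.
    """
    rng = random.Random(seed)
    zero_bias = 0.0   # shared routing bias for the zero-computation experts
    avg_real = 0.0
    for _ in range(steps):
        real_count = 0
        for _ in range(batch):
            # Random scores stand in for a learned router's outputs.
            scores = [rng.gauss(0.0, 1.0) for _ in range(n_real + n_zero)]
            bias = [0.0] * n_real + [zero_bias] * n_zero
            chosen = route(scores, bias, k)
            real_count += sum(1 for i in chosen if i < n_real)
        avg_real = real_count / batch
        # Integral-only update: if tokens use too many real experts,
        # raise the bias so zero-computation experts win more often.
        error = avg_real - target_real
        zero_bias += ki * error
    return avg_real, zero_bias


if __name__ == "__main__":
    avg_real, zero_bias = simulate()
    print(f"avg real experts per token: {avg_real:.2f}, zero-expert bias: {zero_bias:.3f}")
```

With no bias, a token in this toy setup would use about 4 × 8/12 ≈ 2.67 real experts on average, so the controller settles on a small positive bias to reach the 2.5 target. The same idea, applied at scale, is one way a model could hold its average activated parameter count near a budget while letting individual tokens use more or less compute.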