Kimi Linear
model · paper
Hybrid linear attention architecture (KDA + MLA, i.e. Kimi Delta Attention interleaved with Multi-head Latent Attention). 3B-active / 48B-total MoE model with a 75% KV-cache reduction and up to 6x decoding throughput at 1M context.
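The 75% KV-cache figure follows from the hybrid layer ratio: KDA (linear-attention) layers keep no KV cache, so with the 3:1 KDA:MLA interleaving reported for Kimi Linear, only one layer in four stores a cache. A minimal sketch of that arithmetic, assuming uniform per-layer cache cost (the function name is illustrative, not from the paper):

```python
# Hedged sketch: KV-cache saving from interleaving linear-attention (KDA)
# layers, which store no KV cache, with full-attention (MLA) layers.
def kv_cache_reduction(kda_per_mla: int) -> float:
    """Fraction of KV cache saved when only 1 of every
    (kda_per_mla + 1) layers stores a KV cache."""
    caching_fraction = 1 / (kda_per_mla + 1)
    return 1 - caching_fraction

# A 3:1 KDA:MLA ratio leaves 1/4 of the layers caching,
# eliminating 75% of the KV cache.
print(kv_cache_reduction(3))  # 0.75
```

The same formula shows why pushing the ratio higher (e.g. 7:1) trades further cache savings against having fewer full-attention layers.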
Outputs (2)
Kimi Linear Model
model
Architecture: MoE
Parameters: 48B
Active params: 3B
Kimi Linear: Hybrid Linear Attention Architecture
paper
arXiv: 2510.26692