LongCat-Flash-Lite

Smaller, cost-efficient LongCat variant based on the "Scaling Embeddings Outperforms Scaling Experts" research. Designed for high-throughput production use cases.

Paper (arXiv) · HuggingFace · Artificial Analysis

Outputs 2

LongCat-Flash-Lite

model

HuggingFace

Architecture: MoE

AA Intelligence: 17

Scaling Embeddings Outperforms Scaling Experts in Language Models

paper 2026-01-29

Paper (arXiv)

arXiv HTML

moe · efficiency · open-weight

Related