Compute-efficient bilingual (Korean/English) LLMs: dense Transformers at 2.1B (Nano), 9.8B (Essence), and 32.5B (Flag), all trained on 3T tokens in a two-stage pretraining scheme using depth up-scaling, pruning, and distillation. EMNLP 2025 Oral.

Flag 32.5B base model scores: MMLU 77.68, KMMLU 62.10, HumanEval 51.22. Licensing: CC-BY-NC-4.0 for Nano; custom license for Essence and Flag.

Model Details

Architecture: dense Transformer
Parameters: 32.5B (largest variant)

Variants

Name            Parameters
Kanana Nano     2.1B
Kanana Essence  9.8B
Kanana Flag     32.5B

Paper

arXiv: 2502.18934

Venue: EMNLP 2025

Tags: open-weight, multilingual
