Kanana
Compute-efficient bilingual (Korean/English) LLMs. Dense Transformers at 2.1B (Nano), 9.8B (Essence), and 32.5B (Flag), all trained on 3T tokens in two stages. Built with depth up-scaling, pruning, and distillation. EMNLP 2025 Oral.
Flag 32.5B base scores: MMLU 77.68, KMMLU 62.10, HumanEval 51.22. Licensing: CC-BY-NC-4.0 (Nano); custom license (Essence, Flag).
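The distillation step pairs a smaller student with a larger teacher model. A minimal sketch of the standard soft-target knowledge-distillation loss (Hinton-style, with temperature softening) — the logits here are illustrative toy values, not from Kanana, and the actual Kanana training recipe is detailed in the paper:

```python
import math

def softmax(logits, T=1.0):
    """Temperature-softened softmax over a list of logits."""
    z = [x / T for x in logits]
    m = max(z)  # subtract max for numerical stability
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on T-softened distributions, scaled by T^2
    as is conventional in soft-target knowledge distillation."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return T * T * kl

# Toy 4-token vocabulary; the student roughly tracks the teacher.
teacher = [2.0, 1.0, 0.1, -1.0]
student = [1.5, 1.2, 0.3, -0.8]
loss = distillation_loss(student, teacher)
```

The loss is zero when the student's distribution matches the teacher's exactly and grows as the two diverge; in practice it is combined with the usual cross-entropy on ground-truth tokens.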
Model Details
Architecture: Dense Transformer
Parameters: 2.1B–32.5B
Variants
| Name | Parameters | Notes |
|---|---|---|
| Kanana Nano | 2.1B | CC-BY-NC-4.0 |
| Kanana Essence | 9.8B | Custom license |
| Kanana Flag | 32.5B | Custom license |
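Depth up-scaling grows a trained model by stacking two partially overlapping copies of its layer stack. A minimal list-based sketch of the SOLAR-style recipe — the layer counts and overlap below are illustrative assumptions, not Kanana's actual configuration, which is specified in the paper:

```python
def depth_upscale(layers, m):
    """SOLAR-style depth up-scaling: concatenate two copies of the
    layer stack, dropping the last m layers of the first copy and the
    first m layers of the second to smooth the seam."""
    return layers[:-m] + layers[m:]

# Toy example: a 32-layer stack up-scaled with an overlap cut of 8
# yields a 48-layer stack (24 + 24), which is then further pretrained.
base = list(range(32))          # layer indices stand in for layer weights
upscaled = depth_upscale(base, 8)
```

After up-scaling, continued pretraining lets the duplicated layers specialize, which is typically cheaper than training the larger model from scratch.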
Paper
arXiv: 2502.18934
Venue: EMNLP 2025