HyperCLOVA X THINK
The first reasoning-focused LLM in the HyperCLOVA X family. The exact parameter count is not disclosed (analysts estimate ~32B based on SEED distillation). Pre-trained on ~6 trillion Korean/English tokens using a Peri-LN Transformer with μP scaling for consistent hyperparameter transfer across scales; see the sketch below.
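Peri-LN places a normalization layer at both the input and the output of each sub-layer, rather than only before it (Pre-LN) or only after the residual (Post-LN). A minimal PyTorch sketch of one such block follows; the hidden sizes, GELU MLP, and the choice of LayerNorm over RMSNorm are illustrative assumptions, not the disclosed HyperCLOVA X THINK configuration.

```python
import torch
import torch.nn as nn

class PeriLNBlock(nn.Module):
    """One Transformer block with Peri-LN: x + LN(f(LN(x))) per sub-layer."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        # Normalization at BOTH ends of each sub-layer is what
        # distinguishes Peri-LN from Pre-LN (input only).
        self.attn_in_norm = nn.LayerNorm(d_model)
        self.attn_out_norm = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.mlp_in_norm = nn.LayerNorm(d_model)
        self.mlp_out_norm = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Attention sub-layer: normalize input, attend, normalize output,
        # then add the residual.
        h = self.attn_in_norm(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        x = x + self.attn_out_norm(attn_out)
        # MLP sub-layer with the same in/out normalization pattern.
        x = x + self.mlp_out_norm(self.mlp(self.mlp_in_norm(x)))
        return x
```

The output-side normalization keeps hidden-state magnitudes from growing with depth, which is the stability property that motivates Peri-LN at scale.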
A three-stage curriculum extends the context window to 128K tokens. Post-training combines SFT with RLVR (reinforcement learning from verifiable rewards). The vision variant pairs a SigLIP-2 encoder (512×512 per grid) with the LLaVA-1.5-HD framework and a C-Abstractor connector, supporting images of up to ~1.57M pixels. It matches or exceeds GPT-4.1 on KCSAT STEM and is competitive on KMMLU, CSAT, KoBALT-700, HAERAE-1.0, and KoBigBench.
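For intuition on the vision budget: at 512×512 per grid, six grids cover 6 × 512 × 512 = 1,572,864 ≈ 1.57M pixels. The sketch below estimates the LLM-side visual token count under a LLaVA-1.5-HD-style tiling; the patch size of 16 and the C-Abstractor compression ratio are assumptions for illustration, not disclosed values.

```python
import math

def visual_tokens(image_w: int, image_h: int,
                  grid: int = 512, patch: int = 16,
                  abstractor_ratio: int = 4) -> int:
    """Estimate visual tokens for one image under a LLaVA-HD-style tiling."""
    # LLaVA-1.5-HD tiles the image into fixed-size grids.
    n_grids = math.ceil(image_w / grid) * math.ceil(image_h / grid)
    # A SigLIP-2-style encoder yields (grid/patch)^2 patch embeddings per grid.
    patches_per_grid = (grid // patch) ** 2  # 32 * 32 = 1024
    # The C-Abstractor pools patch embeddings into fewer connector tokens.
    tokens_per_grid = patches_per_grid // abstractor_ratio
    return n_grids * tokens_per_grid

# A 1024x1536 image tiles into 2 * 3 = 6 grids of 512x512
# (6 * 512 * 512 = 1,572,864 pixels, the ~1.57M-pixel budget above).
print(visual_tokens(1024, 1536))  # 6 grids * 256 tokens/grid = 1536
```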
Model Details
Architecture: Dense
Context window: 128,000 tokens
Paper: arXiv:2506.22403 (https://arxiv.org/abs/2506.22403)