LLM-jp-4
Latest generation with MoE and "thinking" variants. The flagship is a 32B-parameter MoE with 3.8B active parameters (128 routed experts, top-8 routing; 32 layers, 2560 hidden size, 40 attention heads) and a 65,536-token context window, trained on 11.7T tokens with llm-jp-tokenizer v4.0. Apache 2.0.
The family also includes a dense 8B model (9B actual parameters) with base and thinking variants. MT-Bench JA: 7.57-7.82; MT-Bench EN: 7.70-7.86, evaluated with GPT-4 as judge.
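For orientation, here is a minimal sketch of loading one of these checkpoints with Hugging Face `transformers`. The repository ID is an assumption based on the variant names listed below; check the llm-jp organization on the Hub for the actual paths.

```python
# Minimal sketch: loading an LLM-jp-4 checkpoint with transformers.
# The repo ID below is hypothetical, inferred from the variants table.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "llm-jp/llm-jp-4-32B-A3B"  # assumed repo ID, verify on the Hub

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # only 3.8B params are active per token
    device_map="auto",
)

messages = [{"role": "user", "content": "日本の首都はどこですか？"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```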
Model Details
| Attribute | Value |
|---|---|
| Architecture | MoE |
| Total parameters | 32B |
| Active parameters | 3.8B |
| Context window | 65,536 tokens |
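As a sanity check on these figures, the split between always-active (shared) parameters and routed expert parameters can be backed out from the total, the active count, and the top-8-of-128 routing. The sketch below is a back-of-the-envelope estimate assuming uniform expert sizes and a negligible router, not an official breakdown.

```python
# Back-of-the-envelope MoE parameter accounting. Only the 32B total,
# 3.8B active, and 128-expert/top-8 routing come from the model card;
# uniform expert sizes and a negligible router are assumptions.

TOTAL_PARAMS = 32e9
ACTIVE_PARAMS = 3.8e9
N_EXPERTS = 128
TOP_K = 8

# Shared parameters (embeddings, attention, norms) are active for every
# token; expert parameters are active at rate top_k / n_experts. Solve
#   active = shared + (top_k / n_experts) * (total - shared)
# for the shared share:
ratio = TOP_K / N_EXPERTS
shared = (ACTIVE_PARAMS - TOTAL_PARAMS * ratio) / (1 - ratio)
expert_total = TOTAL_PARAMS - shared

print(f"shared (always-active) params ~ {shared / 1e9:.2f}B")    # ~1.92B
print(f"expert params (all 128)       ~ {expert_total / 1e9:.2f}B")  # ~30.08B
print(f"expert params used per token  ~ {expert_total * ratio / 1e9:.2f}B")  # ~1.88B
```

Under these assumptions the two figures are consistent: roughly 1.9B shared parameters plus 8 of 128 experts (about 1.9B more) gives the stated 3.8B active.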
Variants
| Name | Parameters | Notes |
|---|---|---|
| llm-jp-4-8B | 8B (9B actual) | Dense; base and thinking variants |
| llm-jp-4-32B-A3B | 32B total, 3.8B active | MoE; 128 routed experts, top-8 routing |