Sarashina2
SB Intuitions' first frontier Japanese LLM series and the foundation of all later Sarashina releases. The series comprises 7B, 13B, and 70B dense Llama-2-style Transformers with RoPE, SwiGLU, and a 102,400-token SentencePiece unigram vocabulary (no Japanese pre-tokenization). The 70B variant has 80 layers, a hidden dimension of 8192, and 64 attention heads. All variants were trained from scratch on 2.1T tokens: roughly 1T of Japanese (Common Crawl cleaned with CCNet and HojiChar) plus English from SlimPajama-627B (with books3 removed for copyright reasons). Released in August 2024 under the MIT license.
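The architecture numbers above can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes an FFN intermediate size of 28,672 and grouped-query attention with 8 KV heads, both borrowed from Llama-2-70B (which this series is stated to follow); the actual Sarashina2 config may differ.

```python
# Rough parameter count for Sarashina2-70B from the stated architecture.
VOCAB = 102_400
HIDDEN = 8_192
LAYERS = 80
Q_HEADS = 64
KV_HEADS = 8                    # assumed (Llama-2-70B-style GQA)
HEAD_DIM = HIDDEN // Q_HEADS    # 128
FFN_INTERMEDIATE = 28_672       # assumed (Llama-2-70B value)

# Token embedding plus an untied output projection.
embed = 2 * VOCAB * HIDDEN

# Attention: Wq and Wo are full-width; Wk and Wv project to the KV width.
kv_width = KV_HEADS * HEAD_DIM
attn = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * kv_width

# SwiGLU FFN uses three weight matrices (gate, up, down).
ffn = 3 * HIDDEN * FFN_INTERMEDIATE

total = embed + LAYERS * (attn + ffn)
print(f"{total / 1e9:.1f}B parameters")  # lands close to the advertised 70B
```

Under these assumptions the total comes out just over 70B, consistent with the model's name.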
Sarashina2-70B is competitive with the top Japanese LLMs on the Swallow leaderboard and excels at Japan-specific QA such as the abc-multiple-choice and AI King quiz sets. Its tokenizer is notably token-efficient for Japanese text relative to other LLMs. The series consists of base models only (no instruction tuning) and served directly as the foundation for Sarashina2-8x70B (a sparse-upcycled MoE) as well as later instruction-tuned releases.
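Token efficiency is commonly quantified as characters per token: the more characters a tokenizer packs into each token, the fewer tokens (and the less compute and context budget) a given Japanese text costs. A minimal sketch of the metric, using a toy character-level tokenizer as a baseline; the Hugging Face repository ID in the comment is an assumption, not confirmed by this page.

```python
def chars_per_token(text: str, tokens: list) -> float:
    """Characters of input per produced token; higher = more efficient."""
    return len(text) / len(tokens)

# Toy baseline: a character-level tokenizer emits one token per character,
# so its efficiency is exactly 1.0 by construction.
text = "吾輩は猫である。名前はまだ無い。"
char_tokens = list(text)
baseline = chars_per_token(text, char_tokens)
print(baseline)  # 1.0

# To measure a real subword tokenizer, e.g. via Hugging Face transformers
# (repo ID below is assumed -- check the actual model card):
#   from transformers import AutoTokenizer
#   tok = AutoTokenizer.from_pretrained("sbintuitions/sarashina2-7b")
#   print(chars_per_token(text, tok.tokenize(text)))
```

A subword vocabulary trained directly on raw Japanese (as here, with no pre-tokenization) should score well above this character-level baseline.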
Model Details
Variants
| Name | Parameters | Notes |
|---|---|---|
| Sarashina2-7B | 7B | — |
| Sarashina2-13B | 13B | — |
| Sarashina2-70B | 70B | — |