JAIS 2
Second-generation JAIS. 8B and 70B models trained from scratch (not adapted) on 2.6T tokens of Arabic, English, and code using Cerebras CS-2 systems on Condor Galaxy clusters. Custom Arabic-centric vocabulary (150K tokens), RoPE positional embeddings, Squared-ReLU activation, muP. Post-trained via SFT (20M+ instruction pairs) → DPO → GRPO.
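The Squared-ReLU activation mentioned above is a simple variant of ReLU that squares the positive part of the input; a minimal NumPy sketch (illustrative only, not the JAIS 2 implementation):

```python
import numpy as np

def squared_relu(x: np.ndarray) -> np.ndarray:
    """Squared-ReLU: max(0, x)^2.

    Zero for negative inputs, like ReLU, but grows
    quadratically for positive inputs.
    """
    return np.maximum(0.0, x) ** 2

# Example: negative inputs map to 0, positive inputs are squared
squared_relu(np.array([-2.0, -0.5, 0.0, 1.0, 3.0]))
# → array([0., 0., 0., 1., 9.])
```

Compared with plain ReLU, the squared form is smooth at the origin from the right and is used in some recent transformer feed-forward blocks.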
AraGen-12-24 overall score: 70.71% (outperforming Qwen2.5-72B and Llama-3.3-70B on Arabic generative tasks). Inference throughput: ~2,000 tokens/sec on Cerebras hardware.
Model Details
Architecture: dense
Parameters: 70B (flagship)
Context window: 8,192 tokens
Variants
| Name | Parameters | Notes |
|---|---|---|
| JAIS-2-8B | 8B | — |
| JAIS-2-70B | 70B | — |