Arabic-centric open LLM developed by the JAIS consortium (G42/Inception, MBZUAI, and Cerebras), presented at launch as the best open-source Arabic model. The original 13B model is a GPT-3-style decoder with ALiBi positional encoding and SwiGLU activations, trained on 395B tokens (116B Arabic + 279B English).

Later scaled to 30B (trained on 1.63T tokens) and 70B (adapted from Llama 2 using 370B Arabic tokens, the largest Arabic dataset used for an LLM at the time). Named after Jebel Jais, the UAE's highest mountain.
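
The ALiBi positions and SwiGLU activation named above can be illustrated with a small sketch. This is a generic, pure-Python rendering of the two techniques as they are commonly described, not the JAIS implementation; the function names (`alibi_slopes`, `alibi_bias`, `swiglu_gate`) and the head count of 8 are assumptions chosen for illustration, since the card does not list the model's attention configuration.

```python
import math


def alibi_slopes(num_heads: int) -> list[float]:
    """Per-head ALiBi slopes: a geometric sequence starting at 2^(-8/n).

    Assumption: this sketch handles only power-of-two head counts.
    """
    if num_heads <= 0 or num_heads & (num_heads - 1):
        raise ValueError("this sketch only handles power-of-two head counts")
    start = 2.0 ** (-8.0 / num_heads)
    return [start ** (h + 1) for h in range(num_heads)]


def alibi_bias(seq_len: int, slope: float) -> list[list[float]]:
    """Additive attention bias for one head, combined with a causal mask.

    Entry [i][j] is -slope * (i - j) for keys at or before the query,
    and -inf for future keys (j > i), which a decoder must not attend to.
    """
    return [
        [-slope * (i - j) if j <= i else -math.inf for j in range(seq_len)]
        for i in range(seq_len)
    ]


def silu(x: float) -> float:
    """SiLU (Swish) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def swiglu_gate(gate: list[float], up: list[float]) -> list[float]:
    """SwiGLU gating step: SiLU(gate) multiplied elementwise by up.

    `gate` and `up` stand for the outputs of two separate linear
    projections of the same input; the projections are omitted here.
    """
    return [silu(g) * u for g, u in zip(gate, up)]


if __name__ == "__main__":
    slopes = alibi_slopes(8)  # 8 heads is illustrative, not the JAIS config
    print(slopes[0], slopes[-1])
    print(alibi_bias(3, slopes[0]))
    print(swiglu_gate([0.0, 1.0], [3.0, 2.0]))
```

With 8 heads the slopes run from 1/2 down to 1/256, so each head penalizes distant tokens at a different rate. Because the bias depends only on the distance between query and key, no learned position embedding is required.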

Model Details

Architecture     Dense
Parameters       30B
Training tokens  1.63T

Variants

Name      Parameters  Notes
JAIS-13B  13B
JAIS-30B  30B         Trained from scratch
JAIS-70B  70B         Adapted from Llama 2 (not trained from scratch)

open-weight, multilingual

Related