Second generation of open-source large language models, released in 7B and 13B parameter sizes and trained from scratch on 2.6 trillion tokens. Matches or outperforms other open-source models of similar size on benchmarks including MMLU, CMMLU, GSM8K, and HumanEval, with particular strength in vertical domains such as medicine and law.

Outputs (3)

Baichuan 2: Open Large-scale Language Models

paper

Technical report describing the training of the Baichuan 2 model series on 2.6 trillion tokens, with evaluations on public benchmarks.

arXiv: 2309.10305

Baichuan2-7B

model

7-billion-parameter variant of the Baichuan 2 series, released in both base and chat-aligned versions.

Architecture: dense
Parameters: 7B

Baichuan2-13B

model

13-billion-parameter variant of the Baichuan 2 series, released in both base and chat-aligned versions (see the loading sketch below).

Architecture: dense
Parameters: 13B
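
Both sizes are dense causal language models, so the usual transformers loading path should apply. A minimal sketch, assuming the chat weights are published on the Hugging Face Hub under baichuan-inc/Baichuan2-7B-Chat and that the repo ships custom modeling code requiring trust_remote_code (both are assumptions, not confirmed by this entry):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hub repo ID; the 13B chat variant would follow the same
# naming pattern (e.g. baichuan-inc/Baichuan2-13B-Chat).
repo_id = "baichuan-inc/Baichuan2-7B-Chat"

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,  # half precision so a 7B model fits on one GPU
    device_map="auto",
    trust_remote_code=True,     # assumption: repo provides custom modeling code
)

prompt = "Summarize the strengths of Baichuan 2 in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Swapping repo_id for the base or 13B variants would leave the rest of the snippet unchanged.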
Tags: open-weight, nlp