The first open-source large language model developed from scratch for medical applications. Trained on 20 trillion tokens of both general and medical-specific data using a hybrid tokenizer, a curriculum-based training strategy with progressive data complexity, and adaptive gradient clipping. The 14B Instruct variant surpasses Qwen2.5-72B-Instruct on medical benchmarks. Released alongside the Baichuan-M1-preview deep-thinking model.
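The report names adaptive gradient clipping as part of the training recipe but this entry does not spell out the formulation. As a rough illustration only, the sketch below shows one common adaptive variant (clipping each parameter's gradient against that parameter's own norm, in the style of Brock et al.'s AGC); the exact rule, `clip_factor`, and `eps` used for Baichuan-M1 are assumptions here, not published details.

```python
import torch


def adaptive_grad_clip(parameters, clip_factor=0.01, eps=1e-3):
    """Illustrative adaptive gradient clipping (assumed variant, not
    Baichuan-M1's published method): rescale each parameter's gradient
    so its norm never exceeds clip_factor times the parameter's norm."""
    for p in parameters:
        if p.grad is None:
            continue
        # Floor the parameter norm so freshly initialized (near-zero)
        # weights still receive a usable gradient budget.
        param_norm = p.detach().norm().clamp_min(eps)
        grad_norm = p.grad.detach().norm()
        max_norm = clip_factor * param_norm
        if grad_norm > max_norm:
            p.grad.mul_(max_norm / (grad_norm + 1e-6))
```

In a standard training loop this would run between `loss.backward()` and `optimizer.step()`, replacing or complementing a fixed global-norm clip.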

Outputs (3)

Baichuan-M1: Pushing the Medical Capability of Large Language Models

paper

Technical report on the Baichuan-M1 model series, detailing the from-scratch medical training approach and evaluation results.

arXiv: 2502.12671

Baichuan-M1-14B

model

14.5-billion-parameter medical-enhanced model available in Base and Instruct versions.

Architecture: Dense
Parameters: 14.5B

Baichuan-M1-preview

model

Deep-thinking model with reasoning capabilities across language, vision, and search. Surpassed o1-preview on several benchmarks, including mathematics and coding tasks.

open-weight, biology, nlp