PCL's flagship Chinese-focused autoregressive language model series: a 200B-parameter dense Transformer alongside a 7B sibling, trained end-to-end on China's Computing Network using the MindSpore framework on Huawei Ascend 910 hardware (Cloud Brain II). Both variants emphasize core Chinese capabilities while supporting English and additional languages; the released checkpoint was trained on ~1.5T tokens.

The 200B checkpoint is split into 3,456 model shards for distributed training and inference on Ascend; fine-tuning requires 144 Ascend 910 cards. Weights are distributed through OpenI behind an application gate and are not available on HuggingFace.
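To put the shard and card counts in perspective, here is a rough back-of-envelope sketch of the weight footprint. The parameter, shard, and card counts come from this card; the 2-bytes-per-parameter figure (fp16/bf16 storage) and the even split across shards and cards are assumptions made for illustration, not published details of the checkpoint.

```python
# Back-of-envelope sizing. Constants marked ASSUMPTION are not from the card.
PARAMS = 200e9          # parameter count, from the card
NUM_SHARDS = 3456       # model shards, from the card
FINETUNE_CARDS = 144    # Ascend 910 cards needed for fine-tuning, from the card
BYTES_PER_PARAM = 2     # ASSUMPTION: weights stored in fp16/bf16


def weight_bytes_per_part(params: float, bytes_per_param: int, parts: int) -> float:
    """Bytes of weights held by each of `parts` equal slices (ASSUMPTION: even split)."""
    return params * bytes_per_param / parts


# Total weight size in GB (1 GB = 1e9 bytes).
total_gb = PARAMS * BYTES_PER_PARAM / 1e9

# Approximate size of one shard in MB if all shards are equal.
shard_mb = weight_bytes_per_part(PARAMS, BYTES_PER_PARAM, NUM_SHARDS) / 1e6

# Weights per card in GB if the model is spread evenly over the fine-tuning cards.
per_card_gb = weight_bytes_per_part(PARAMS, BYTES_PER_PARAM, FINETUNE_CARDS) / 1e9

# Shards per card if the shards were regrouped evenly onto the fine-tuning cards.
shards_per_card = NUM_SHARDS // FINETUNE_CARDS

print(f"total weights:   {total_gb:.0f} GB")
print(f"per shard:       {shard_mb:.1f} MB")
print(f"per card:        {per_card_gb:.2f} GB")
print(f"shards per card: {shards_per_card}")
```

Under these assumptions the weights alone are only a lower bound on per-card memory: fine-tuning also has to hold gradients, optimizer state, and activations, which is consistent with needing many cards even though the per-card weight slice is small.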

A separate multilingual variant, mPengC.mind (7B), applies multilingual incremental learning on top of the 7B base and is released openly on HuggingFace in both NPU and GPU builds.

Model Details

Architecture: Dense
Parameters: 200B
Training tokens: ~1.5T
Training hardware: Ascend 910 (Cloud Brain II)

Variants

Name                            Parameters   Notes
PengCheng-Mind-200B             200B         Distributed via OpenI application gate; 3,456 model shards
PengCheng-Mind-7B               7B
mPengC.mind (multilingual 7B)   7B           Multilingual incremental-learning variant on the 7B base; HuggingFace NPU and GPU builds

Tags: open-weight, nlp, multilingual

Related