CPM-2
A large-scale, cost-effective series of pre-trained language models (up to 198B parameters) that introduced a Mixture-of-Experts (MoE) variant and bilingual (Chinese and English) capabilities. It achieved state-of-the-art results on both Chinese and English tasks while maintaining computational efficiency.
By Zhengyan Zhang, Yuxian Gu, Xu Han et al. of the Department of Computer Science and Technology, Tsinghua University, and BAAI. The CPM lineage was later stewarded by OpenBMB (founded 2022), which inherited Tsinghua THUNLP's open-source releases, including the descendant CPM-Ant, CPM-Bee, and MiniCPM families.
Model Details
Architecture MoE
Parameters 198B