Z.ai's most powerful model family. GLM-5 is a 744B-total-parameter (44B active) MoE model with 256 experts, Hybrid Attention, and Multi-Token Prediction, and the first frontier-scale model trained entirely on 100,000 Huawei Ascend 910B chips. GLM-5-Turbo is the fast variant optimized for the OpenClaw agent ecosystem.

GLM-5: From Vibe Coding to Agentic Engineering

model

744B total parameters (44B active) via MoE with 256 experts, plus Hybrid Attention and Multi-Token Prediction. First frontier-scale model trained entirely on 100,000 Huawei Ascend 910B chips (zero American hardware).

Architecture: MoE
Parameters: 744B
Active params: 44B
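The active-parameter figure above reflects how MoE works: a router activates only a few of the 256 experts per token, so roughly 44B of the 744B parameters do work on any given forward pass. A minimal sketch of top-k expert routing, with toy dimensions and k chosen purely for illustration (GLM-5's actual router configuration is not specified here):

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Route one token through the top-k of n experts (illustrative only).

    x:       (d,) token hidden state
    gate_w:  (d, n_experts) router weights
    experts: list of (w1, w2) per-expert FFN weight pairs
    k:       experts activated per token (toy value, not GLM-5's)
    """
    logits = x @ gate_w                      # one router score per expert
    top = np.argsort(logits)[-k:]            # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                 # softmax over selected experts only
    out = np.zeros_like(x)
    for w, i in zip(weights, top):
        w1, w2 = experts[i]
        out += w * (np.maximum(x @ w1, 0) @ w2)  # ReLU FFN, gate-weighted
    return out

rng = np.random.default_rng(0)
d, n_experts, d_ff = 16, 8, 32               # toy sizes, nowhere near model scale
gate_w = rng.normal(size=(d, n_experts))
experts = [(rng.normal(size=(d, d_ff)), rng.normal(size=(d_ff, d)))
           for _ in range(n_experts)]
x = rng.normal(size=d)
y = moe_forward(x, gate_w, experts, k=2)
print(y.shape)  # only 2 of 8 experts did any work for this token
```

With k=2 of 8 experts here, about a quarter of the FFN parameters run per token; the same sparsity principle is what lets a 744B model activate only 44B parameters.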

GLM-5 Technical Report

paper

arXiv: 2602.15763

GLM-5-Turbo

model

Specialized "fast" version optimized for the OpenClaw agent ecosystem, focusing on continuous task execution and tool reliability.

Architecture: MoE
Tags: moe, agentic, coding, frontier, efficiency
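On the client side, the "continuous task execution and tool reliability" focus typically pairs with agent loops that retry transient tool failures instead of aborting the task. A minimal sketch of such a retry wrapper; the `run_tool_with_retry` helper and `flaky_search` tool are hypothetical illustrations, not part of the OpenClaw API:

```python
import time

def run_tool_with_retry(tool, args, max_retries=3, backoff=0.1):
    """Retry a flaky tool call with exponential backoff (hypothetical helper)."""
    for attempt in range(max_retries + 1):
        try:
            return tool(**args)
        except RuntimeError:
            if attempt == max_retries:
                raise  # give up after the final attempt
            time.sleep(backoff * (2 ** attempt))  # wait longer each retry

# Demo: a made-up tool that fails twice, then succeeds.
calls = {"n": 0}
def flaky_search(query):
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return f"results for {query!r}"

print(run_tool_with_retry(flaky_search, {"query": "GLM-5"}))
# → results for 'GLM-5'
```

Exponential backoff keeps retries cheap for quick recoveries while avoiding hammering a failing tool, which is the usual trade-off in long-running agent loops.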