Wu Dao 2.0
model, paper
Landmark 1.75-trillion-parameter MoE model, at the time the largest in the world, outscaling GPT-3.
Outputs 2
Wu Dao 2.0
model
Landmark 1.75-trillion-parameter MoE model, at the time the largest in the world, outscaling GPT-3.
Architecture MoE
Parameters 1.75T
Wu Dao 2.0: A Roadmap to Cognitive Intelligence
paper
Roadmap paper for the Wu Dao 2.0 cognitive intelligence program.