Experimental diffusion language model from Inclusion AI that explores an alternative to autoregressive generation. LLaDA-MoE extends the approach with a Mixture-of-Experts architecture.

Outputs 2

LLaDA

model

Diffusion language model from Inclusion AI.

arXiv: 2502.09992

LLaDA-MoE

model

Mixture-of-Experts variant of LLaDA.

Architecture: MoE
generation, architecture, research

Related