Cola DLM | Lab Index

"Continuous Latent Diffusion Language Model." A hierarchical latent diffusion LM that replaces token-level autoregression with continuous-latent prior transport: a Text-VAE encodes tokens into continuous latents, then a block-causal DiT prior under flow matching generates new latents that the VAE decodes back to text. ~2B params; scales to ~2000 EFLOPs of pretraining.
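As a rough illustration of the two ingredients named above, block-causal attention over latents and a flow-matching prior, here is a minimal NumPy sketch. It is not the released implementation: the function names, the block size, the linear interpolation path, and the Euler sampler are all assumptions chosen for clarity, and the toy velocity function stands in for the DiT prior.

```python
import numpy as np


def block_causal_mask(num_latents: int, block_size: int) -> np.ndarray:
    """Return a boolean mask where mask[i, j] is True if latent i may attend to latent j.

    Latents attend freely within their own block and to all earlier blocks.
    (Hypothetical helper; the block layout is an assumption.)
    """
    block_ids = np.arange(num_latents) // block_size
    return block_ids[:, None] >= block_ids[None, :]


def flow_matching_pair(x0: np.ndarray, x1: np.ndarray, t: float):
    """Build one flow-matching training pair on a linear path (an assumed path choice).

    x0 is a noise sample, x1 is a data latent (e.g. a VAE encoding).
    Returns the interpolated point x_t and the regression target velocity.
    """
    x_t = (1.0 - t) * x0 + t * x1
    v_target = x1 - x0
    return x_t, v_target


def euler_sample(velocity_fn, x0: np.ndarray, num_steps: int = 10) -> np.ndarray:
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = x0.copy()
    dt = 1.0 / num_steps
    for k in range(num_steps):
        x = x + dt * velocity_fn(x, k * dt)
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise = rng.standard_normal((4, 8))   # 4 latents of width 8 (toy sizes)
    target = rng.standard_normal((4, 8))  # stand-in for encoded text latents

    print(block_causal_mask(4, 2).astype(int))

    # Toy "oracle" velocity that already knows the target; a trained prior would
    # predict this from x and t under the block-causal mask.
    generated = euler_sample(lambda x, t: target - noise, noise, num_steps=10)
    print(np.allclose(generated, target))
```

In a real system the generated latents would then be passed to the VAE decoder to produce text; that step is omitted here because the source does not describe the decoder interface.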

Another diffusion-LM bet from a frontier lab, complementing Seed Diffusion's discrete-state approach with a continuous-latent formulation. Open code + weights + checkpoints.

Paper (arXiv) | Project page | HuggingFace

Paper

arXiv HTML

foundational, diffusion, open-weight, research

Related