Cola DLM
Paper: "Continuous Latent Diffusion Language Model." A hierarchical latent diffusion LM that replaces token-level autoregression with continuous-latent prior transport: a Text-VAE encodes tokens into continuous latents, then a block-causal DiT prior trained with flow matching generates new latents, which the VAE decodes back to text. ~2B params; scales to ~2000 EFLOPs of pretraining.
Another diffusion-LM bet from a frontier lab, complementing Seed Diffusion's discrete-state approach with a continuous-latent formulation. Open code + weights + checkpoints.
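
To make the two mechanisms named in the summary concrete, here is a minimal toy sketch in NumPy, not the paper's implementation. It shows (a) a block-causal attention mask, where each latent can see its own block and all earlier blocks, and (b) flow-matching sampling as Euler integration of a velocity field from noise to a latent. Everything here is an assumption for illustration: the function names, the block size, the t=0 (noise) to t=1 (data) convention, and the closed-form toy velocity field that stands in for the learned DiT prior. The Text-VAE encoder and decoder are omitted.

```python
import numpy as np


def block_causal_mask(num_latents: int, block_size: int) -> np.ndarray:
    """Return a boolean mask; entry [i, j] is True if latent i may attend to latent j.

    Latents attend to every position in their own block and in all earlier blocks.
    """
    block_ids = np.arange(num_latents) // block_size
    return block_ids[:, None] >= block_ids[None, :]


def euler_flow_sample(velocity_fn, x0: np.ndarray, num_steps: int = 50) -> np.ndarray:
    """Integrate dx/dt = v(x, t) from t=0 (noise) to t=1 (data) with Euler steps."""
    x = x0.copy()
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays strictly below 1, so the toy field never divides by zero
        x = x + dt * velocity_fn(x, t)
    return x


# Hypothetical stand-in for the learned prior: the velocity of a straight-line
# path x_t = (1 - t) * x0 + t * target, written as a function of the current x.
target = np.array([[1.0, -2.0], [0.5, 3.0]])  # two toy latents of dimension 2


def toy_velocity(x: np.ndarray, t: float) -> np.ndarray:
    return (target - x) / (1.0 - t)


rng = np.random.default_rng(0)
noise = rng.standard_normal(target.shape)
sample = euler_flow_sample(toy_velocity, noise, num_steps=50)
mask = block_causal_mask(num_latents=4, block_size=2)
```

In a real model the velocity field would be a trained network conditioned on earlier blocks through a mask like the one above, and the resulting latents would be passed to the VAE decoder to produce tokens.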