URSA
URSA is a uniform discrete diffusion framework with a metric path for video generation (ICLR 2026). It aims to bridge the gap between discrete and continuous diffusion approaches. Video generation is formulated as iterative global refinement of discrete spatiotemporal tokens, built on two designs: a Linearized Metric Path and Resolution-dependent Timestep Shifting. Together these let the model scale to high-resolution image synthesis and long-duration video generation with fewer inference steps. A single unified model supports multiple video generation tasks through asynchronous timestep scheduling.
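To make the idea of resolution-dependent timestep shifting concrete, here is a minimal sketch. It is not URSA's implementation: the warp `s*t / (1 + (s-1)*t)` is the form commonly used in flow-matching samplers, and the square-root rule in `resolution_shift`, the `base_tokens` default, and all function names are hypothetical choices made for illustration only.

```python
import math
from typing import List


def shift_timestep(t: float, shift: float) -> float:
    """Warp a timestep t in [0, 1] by a shift factor (> 0).

    Assumed form, borrowed from flow-matching samplers; shift == 1 is the identity.
    """
    return shift * t / (1.0 + (shift - 1.0) * t)


def resolution_shift(num_tokens: int, base_tokens: int = 1024) -> float:
    """Hypothetical rule: the shift grows with the square root of the token count."""
    return math.sqrt(num_tokens / base_tokens)


def schedule(num_steps: int, num_tokens: int) -> List[float]:
    """Return num_steps + 1 warped timesteps from 0 to 1 for a given token count."""
    s = resolution_shift(num_tokens)
    return [shift_timestep(i / num_steps, s) for i in range(num_steps + 1)]


if __name__ == "__main__":
    print(schedule(4, 1024))  # uniform spacing at the base token count
    print(schedule(4, 4096))  # spacing tightens toward t = 1 for more tokens
```

With these assumptions, a larger token count leaves the endpoints at 0 and 1 but places the intermediate timesteps closer to 1, so more of the step budget is spent at that end of the trajectory. Which end corresponds to the noisier state depends on the sampler's convention.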
Model Details
Variants
| Name | Parameters | Notes |
|---|---|---|
| URSA-0.6B-IBQ1024 | — | Compact variant for image generation |
| URSA-1.7B-IBQ1024 | — | Full variant |
Paper
arXiv: 2510.24717
Venue: ICLR 2026