Open-source driving world model platform from SenseTime Research with training, inference, and evaluation tools. Features MaskGWM, a generalizable driving world model that uses video mask reconstruction with a scalable Diffusion Transformer (DiT) structure. Supports long-horizon prediction and multi-view generation, validated on nuScenes, OpenDV-2K, and Waymo. The MaskGWM paper was accepted at CVPR 2025.

Outputs 2

OpenDWM Platform

library

GitHub Repository

MaskGWM: A Generalizable Driving World Model with Video Mask Reconstruction

paper

arXiv: 2502.11663

Venue: CVPR 2025

embodied, generation, open-source