Large-scale open synthesis of software engineering (SWE) environments: 45,320 executable Docker environments spanning 12.8K repositories, plus ~13,000 curated trajectories. Built with a multi-agent pipeline on a 64-node cluster at a total cost of about $1.47M ($891K for environments, $576K for trajectories).
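As a quick sanity check on the figures quoted above, the sketch below recomputes the total and derives rough per-unit costs. The per-environment, per-trajectory, and environments-per-repository values are back-of-the-envelope derivations from the rounded numbers in this summary, not figures reported by the authors. The trajectory count is taken as exactly 13,000 even though the source gives it only approximately, and all variable names are illustrative.

```python
# Figures as quoted in the summary (USD). Counts marked "~" in the source
# are treated as exact here, which is an assumption.
ENV_COST = 891_000       # cost of building environments
TRAJ_COST = 576_000      # cost of generating trajectories
NUM_ENVS = 45_320        # executable Docker environments
NUM_REPOS = 12_800       # "12.8K" repositories
NUM_TRAJ = 13_000        # "~13,000" curated trajectories

# Total cost: should round to the quoted ~$1.47M.
total_cost = ENV_COST + TRAJ_COST

# Derived averages (not stated in the source).
cost_per_env = ENV_COST / NUM_ENVS
cost_per_traj = TRAJ_COST / NUM_TRAJ
envs_per_repo = NUM_ENVS / NUM_REPOS

print(f"total cost:           ${total_cost / 1e6:.2f}M")
print(f"cost per environment: ${cost_per_env:.2f}")
print(f"cost per trajectory:  ${cost_per_traj:.2f}")
print(f"environments per repo: {envs_per_repo:.2f}")
```

The two cost components sum to $1,467,000, consistent with the rounded ~$1.47M total.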

Models trained on OpenSWE: OpenSWE-32B (62.4% on SWE-Bench Verified) and OpenSWE-72B (66.0%), state of the art (SOTA) among SFT methods. Training on OpenSWE also yields gains on math and science benchmarks.

Paper

arXiv: 2603.13023

Dataset

GitHub Repository

coding, data, open-source

Related