Large-scale benchmark for evaluating agentic coding models. Includes AutoCodeInstruct, with answers distilled from DeepSeek-V3.
benchmark, coding, agentic

Notes

V2 released in 2026.