Evaluation benchmarks for measuring agentic and coding performance.
benchmark, agentic, coding