MathCoder
Introduces code-integrated mathematical reasoning, where each solution interleaves natural language, code, and execution results. Creates MathCodeInstruct, a dataset of novel math problems with code-based solutions generated via a customized LLM pipeline. Models fine-tuned at 7B/13B/34B/70B on LLaMA-2 and CodeLLaMA bases.
MathCoder-CL-34B achieves 45.2% on MATH, and the best GSM8K score in the MathCoder family is 83.9%, surpassing ChatGPT-3.5 and PaLM-2 on both benchmarks and outperforming GPT-4 on the competition-level MATH dataset. An early, influential demonstration of the "reasoning + code execution" paradigm alongside ToRA. ICLR 2024. By Wang, Ren, Zhou, Lu, Luo, Shi, Zhang, Song, Zhan, Li (Shanghai AI Lab + CUHK MMLab + CityU + Nanjing U).
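The interleaved solution format described above (natural language, then code, then the execution result) can be illustrated with a minimal sketch. This is not MathCoder's implementation: the `Segment` type, the `run_code` and `interleave` helpers, and the sample problem are all hypothetical, and the draft segments are fixed up front, whereas in the actual paradigm a model would write each later step after seeing the execution output.

```python
import contextlib
import io
from dataclasses import dataclass


@dataclass
class Segment:
    kind: str     # "text", "code", or "execution" (hypothetical labels)
    content: str


def run_code(source: str, namespace: dict) -> str:
    """Execute one code segment and return whatever it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        # Illustration only: a real system would run this in a sandbox.
        exec(source, namespace)
    return buffer.getvalue().strip()


def interleave(draft: list[Segment]) -> list[Segment]:
    """Append an execution-result segment after every code segment."""
    namespace: dict = {}  # shared so later code can reuse earlier variables
    solution = []
    for seg in draft:
        solution.append(seg)
        if seg.kind == "code":
            output = run_code(seg.content, namespace)
            solution.append(Segment("execution", output))
    return solution


# Hypothetical problem, chosen only to show the three segment kinds.
draft = [
    Segment("text", "Sum the integers from 1 to 100 that are divisible by 7."),
    Segment("code", "total = sum(n for n in range(1, 101) if n % 7 == 0)\nprint(total)"),
    Segment("text", "The printed value is the answer."),
]

solution = interleave(draft)
for seg in solution:
    print(f"[{seg.kind}] {seg.content}")
```

The shared `namespace` dictionary is a design choice in this sketch: it lets a later code segment build on variables defined by an earlier one, which is one plausible way to model a multi-step solution.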