AgentCPM Series | Lab Index

A series of high-performance, "edge-scale" agents (4B to 8B parameters) designed for long-horizon reasoning, deep research, and mobile GUI automation. AgentCPM demonstrates that compact models can rival much larger systems on complex agentic tasks through reinforcement learning and search strategies.

GitHub | Website

Outputs 3

AgentCPM-GUI: Reinforcement Fine-Tuning for Mobile Agents

paper

Introduces mobile-use agents aligned via reinforcement fine-tuning (GRPO), optimizing for low-latency execution on mobile GUIs.

Paper (arXiv)

arXiv HTML

AgentCPM-Explore: Long-Horizon Deep Exploration

paper 2026-02-06

A 4B-scale agent capable of 100+ rounds of interaction. It reaches frontier-level deep-search performance, matching Claude-4.5-Sonnet on specific benchmarks.

Paper (arXiv) | HuggingFace

arXiv HTML

AgentCPM-Report: Open-Ended Deep Research

paper 2026-02-06

Introduces the "Writing As Reasoning Policy" (WARP) for autonomous long-form report generation, interleaving drafting and deep retrieval.

Paper (arXiv) | HuggingFace

arXiv HTML

agentic | on-device | efficiency | research
