MSR technique that treats a markdown "skill document" as the trainable parameter for frozen LLM agents: the skill file is iteratively optimized against task feedback while the model weights stay fixed. Reports best-or-tied results in 52 of 52 evaluation cells across 6 agent benchmarks, a data point for the "optimize the context, not the weights" school of agent post-training. Paper: May 2026; blog post: June 30.

Paper

agents, post-training, research
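To make the "skill file as the trainable parameter" idea concrete, here is a minimal sketch of what such a loop could look like. It is not the paper's algorithm: the greedy accept-if-better rule, the function names (`optimize_skill_doc`, `make_toy_proposer`, `toy_evaluate`), and the toy hint pool are all assumptions made for illustration. In a real setup, `evaluate` would run the frozen agent on tasks with the document in its context, and `propose_edit` would likely be an LLM call that rewrites the document from task feedback.

```python
import itertools
from typing import Callable, List, Tuple


def optimize_skill_doc(
    skill_doc: str,
    propose_edit: Callable[[str], str],
    evaluate: Callable[[str], float],
    steps: int = 10,
) -> Tuple[str, float]:
    """Greedy search over the skill document; model weights are never touched.

    Assumption: a candidate edit is kept only if it strictly improves the
    task score. This is a stand-in for whatever update rule the paper uses.
    """
    best_doc, best_score = skill_doc, evaluate(skill_doc)
    for _ in range(steps):
        candidate = propose_edit(best_doc)
        score = evaluate(candidate)
        if score > best_score:
            best_doc, best_score = candidate, score
    return best_doc, best_score


# --- Hypothetical toy stand-ins so the sketch runs without an LLM ---
HINTS: List[str] = [
    "Check the tool's return code.",
    "Re-read the task before answering.",
    "Prefer short plans.",
]
USEFUL = {HINTS[0], HINTS[1]}  # pretend only these two help the agent


def toy_evaluate(doc: str) -> float:
    """Pretend task feedback: reward useful hints, lightly penalize length."""
    lines = doc.splitlines()
    hits = sum(1 for hint in USEFUL if hint in lines)
    return hits - 0.01 * len(lines)


def make_toy_proposer(hints: List[str]) -> Callable[[str], str]:
    """Return an editor that appends the next hint from a fixed cycle."""
    pool = itertools.cycle(hints)

    def propose(doc: str) -> str:
        return doc + "\n" + next(pool)

    return propose


best_doc, best_score = optimize_skill_doc(
    "# Skill", make_toy_proposer(HINTS), toy_evaluate, steps=6
)
print(best_doc)
```

In this toy run the loop keeps the two hints that raise the score and rejects the one that only adds length. The only state that changes is the document text, which is the point of the approach.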