Theoretical analysis of emergent abilities in language models through the lens of pre-training loss, providing a unified framework for understanding when and why capabilities appear at scale. Published at NeurIPS 2024.

Paper

arXiv: 2403.15796

Venue: NeurIPS 2024

scaling, theory, research