System for scaling LLM training to over 10,000 GPUs.

Paper

Citations: 24
infrastructure, training, scaling