Highly optimized kernels for Multi-head Latent Attention.

Library

Stars 12.7k
infrastructure, attention