A KVCache-centric disaggregated architecture for LLM serving. Won Best Paper at FAST 2025.

Outputs: 2

Mooncake: KVCache-centric Disaggregated Architecture

paper
Venue: FAST 2025
Citations: 13

Mooncake Dataset & Code

dataset
infrastructure, efficiency