A KVCache-centric disaggregated architecture for LLM serving. Won Best Paper at FAST 2025.

Outputs 2

Mooncake: KVCache-centric Disaggregated Architecture

paper
Venue FAST 2025
Citations 12

Mooncake Dataset & Code

dataset
infrastructure, efficiency