Archive
LLM Infrastructure

Three Features Of Continuous Batching

Hard
200 pts · 0 solves
Modern inference engines layer three optimizations on top of continuous batching: _____(1) (pause a long decode to make room for a short newcomer), _____(2) (share KV cache entries for identical prompt heads), and _____(3) (split a large prefill across several decode steps). Fill in the three blanks. Flag format: CONGRESS{1:[word],2:[word-with-hyphens],3:[word-with-hyphens]}. Example: CONGRESS{1:preemption,2:prefix-reuse,3:chunked-prefill}.
Hint: Three concepts that both vLLM and SGLang advertise.
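
For reference, here is a toy sketch of two of the ideas in the challenge text: splitting a large prefill across steps, and sharing KV for identical prompt heads. It is not the vLLM or SGLang API. The names `Request`, `plan_step`, `shared_prefix_blocks`, and the parameters `token_budget` and `block_size` are hypothetical, and the policy (decodes take one token each, leftover budget goes to pending prefills in order) is an assumption chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Request:
    name: str
    prompt_len: int      # total prompt tokens that need prefill
    prefilled: int = 0   # prompt tokens already processed


def plan_step(decoding: int, prefilling: list, token_budget: int) -> dict:
    """Decide how many prompt tokens each pending prefill gets this step.

    Assumed policy: every running decode uses one token of the budget;
    the remainder is handed out to pending prefills, in order, as chunks.
    """
    remaining = token_budget - decoding
    plan = {}
    for req in prefilling:
        if remaining <= 0:
            break
        todo = req.prompt_len - req.prefilled
        if todo <= 0:
            continue
        chunk = min(todo, remaining)
        plan[req.name] = chunk
        req.prefilled += chunk
        remaining -= chunk
    return plan


def shared_prefix_blocks(a: list, b: list, block_size: int) -> int:
    """Count whole KV blocks that two token sequences could share."""
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return common // block_size


# A 10-token prompt arrives while 4 requests are decoding, budget 8 per step.
req = Request("long", prompt_len=10)
steps = []
while req.prefilled < req.prompt_len:
    steps.append(plan_step(decoding=4, prefilling=[req], token_budget=8))
```

With these numbers the prompt is prefilled over three steps, in chunks of 4, 4, and 2 tokens, while the four decodes keep advancing on every step.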

Archive — no submissions accepted

This challenge is preserved for reference. Play live challenges at /challenges.