Three Features Of Continuous Batching
Archive · Hard
Modern inference engines layer three optimizations on top of continuous batching: _____(1) (pause a long decode for a short newcomer), _____(2) (share KV for identical prompt heads), and _____(3) (split big prefill work across decode steps). Fill in the three blanks. Flag format: CONGRESS{1:[word],2:[word-with-hyphens],3:[word-with-hyphens]}. Example: CONGRESS{1:preemption,2:prefix-reuse,3:chunked-prefill}.
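To check that a candidate flag has the right shape before comparing answers, a minimal Python sketch might look like the following. The function name `parse_flag` and the pattern `FLAG_RE` are hypothetical, and the pattern assumes that "word" means lowercase letters only and that "word-with-hyphens" means two or more lowercase words joined by hyphens; the challenge text does not spell out either rule.

```python
import re

# Assumed reading of the stated format: blank 1 is a single lowercase word,
# blanks 2 and 3 are lowercase words joined by at least one hyphen.
FLAG_RE = re.compile(
    r"CONGRESS\{"
    r"1:(?P<first>[a-z]+),"
    r"2:(?P<second>[a-z]+(?:-[a-z]+)+),"
    r"3:(?P<third>[a-z]+(?:-[a-z]+)+)"
    r"\}"
)


def parse_flag(flag: str):
    """Return the three answers as a tuple, or None if the shape is wrong."""
    match = FLAG_RE.fullmatch(flag.strip())
    return match.groups() if match else None
```

Called on the example flag from the challenge text, `parse_flag` returns the three answers as a tuple; a flag with a missing blank or no hyphen in blank 2 or 3 returns `None`.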
Hint: Three concepts that vLLM and SGLang both advertise.
Archive — no submissions accepted
This challenge is preserved for reference. Play live challenges at /challenges.