Archive
LLM Infrastructure

The Paper's Kernel Library

Medium
150 pts, 0 solves
A 2024 library from NVIDIA and CMU that provides high-performance attention kernels (with paged KV cache and cascade attention) and is used as a backend by vLLM and SGLang. Name it. Flag format: CONGRESS{name}. Example: CONGRESS{flashmla}.
Hint: Paper + 'inference'.

Archive — no submissions accepted

This challenge is preserved for reference. Play live challenges at /challenges.