The Quantization That Zero-Crashes
Archive · Hard
Lin et al. (2023) proposed a weight-only post-training quantization method that protects the salient 1% of weights through activation-aware scaling; it is used in llama.cpp and vLLM. What is the method's three-letter acronym?
Hint: Activation + 'weight' + 'quantization', initials.
Archive — no submissions accepted
This challenge is preserved for reference. Play live challenges at /challenges.