DeepSpeed's CPU Offload Method
Archive | Medium
DeepSpeed's technique for running models larger than GPU memory keeps most of the weights on the CPU and streams layers into the GPU as needed. Its name comes from the broader ZeRO family. What is the technique called?
Hint: The ZeRO family + the task.