LLM Infrastructure

DeepSpeed's CPU Offload Method

Archive

Medium

150 pts, 0 solves

DeepSpeed has a technique for running models larger than GPU memory: it keeps most of the weights on the CPU and streams layers into the GPU as needed. The technique is named after the broader ZeRO family. What is it called?

Hint: the ZeRO family name plus the task.
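For reference, the streaming idea in the question can be sketched in plain Python. This is a toy simulation, not DeepSpeed code: the `Layer` and `Device` classes, the `streamed_forward` function, and all sizes are hypothetical names and numbers chosen for illustration, and a scalar multiply stands in for a real layer.

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    """Toy layer: multiplies its input by a scalar weight."""
    weight: float
    size_mb: int  # hypothetical memory footprint


@dataclass
class Device:
    """Stand-in for a GPU with a fixed memory budget (hypothetical)."""
    capacity_mb: int
    resident: list = field(default_factory=list)
    peak_mb: int = 0

    def load(self, layer):
        used = sum(l.size_mb for l in self.resident) + layer.size_mb
        if used > self.capacity_mb:
            raise MemoryError("layer does not fit on device")
        self.resident.append(layer)
        self.peak_mb = max(self.peak_mb, used)

    def evict(self, layer):
        self.resident.remove(layer)


def streamed_forward(x, host_layers, device):
    """Run layers in order, keeping only one on the device at a time."""
    for layer in host_layers:   # weights stay in host (CPU) memory
        device.load(layer)      # copy this layer's weights to the device
        x = x * layer.weight    # compute with the resident layer
        device.evict(layer)     # free device memory before the next layer
    return x


# Four 600 MB layers (2400 MB total) on a device that holds only 1000 MB.
host_layers = [Layer(weight=2.0, size_mb=600) for _ in range(4)]
gpu = Device(capacity_mb=1000)
result = streamed_forward(1.0, host_layers, gpu)
```

The model as a whole never fits on the simulated device, yet the forward pass completes because only one layer is resident at any moment. A real system would also overlap transfers with compute, which this sketch leaves out.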

Archive — no submissions accepted

This challenge is preserved for reference.