Fine-Tuning & Training

Mereology Of Megatron-LM

Archive

Hard

200 pts · 0 solves

NVIDIA's Megatron-LM sharded individual matrix multiplications across GPUs by splitting the weight matrix along one dimension and gathering activations. Name the parallelism style. Flag format: CONGRESS{two-words}. Example: CONGRESS{data parallelism}.
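To make the mechanism in the prompt concrete, here is a minimal sketch of one sharded matrix multiplication. It is not Megatron-LM's code: the "devices" are simulated with plain Python lists, the split is assumed to be along the weight matrix's columns, and the helper names (`matmul`, `split_columns`, `sharded_matmul`) are hypothetical.

```python
# Illustrative sketch only: devices are simulated as Python lists, and
# all helper names here are hypothetical, not taken from Megatron-LM.

def matmul(x, w):
    """Multiply x (m x k) by w (k x n), both given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]

def split_columns(w, num_shards):
    """Split w (k x n) into num_shards column blocks of equal width."""
    n = len(w[0])
    assert n % num_shards == 0, "column count must divide evenly"
    width = n // num_shards
    return [[row[i * width:(i + 1) * width] for row in w] for i in range(num_shards)]

def sharded_matmul(x, w, num_shards):
    """Each simulated device multiplies x by its own column block of w."""
    partials = [matmul(x, shard) for shard in split_columns(w, num_shards)]
    # Gather step: concatenate the per-device output columns, row by row.
    return [sum((p[r] for p in partials), []) for r in range(len(x))]

x = [[1, 2], [3, 4]]              # activations, 2 x 2
w = [[1, 0, 2, 1], [0, 1, 1, 3]]  # weights, 2 x 4

full = matmul(x, w)
sharded = sharded_matmul(x, w, num_shards=2)
print(full == sharded)
```

Each simulated device holds only its own block of the weight matrix, and the gathered result matches the unsharded product.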

Hint

Two words; a rank-2 array + its partition style.

Archive — no submissions accepted

This challenge is preserved for reference. Play live challenges at /challenges.