Archive
Fine-Tuning & Training

Mereology Of Megatron-LM

Hard
200 pts, 0 solves
NVIDIA's Megatron-LM sharded individual matrix multiplications across GPUs by splitting the weight matrix along one dimension and gathering activations. Name the parallelism style. Flag format: CONGRESS{two-words}. Example: CONGRESS{data parallelism}.
Hint: Two words; a rank-2 array + its partition style.
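
For reference, the splitting scheme the challenge describes can be sketched in a few lines of pure Python. This is only a toy illustration under stated assumptions: the "devices" are simulated as list entries, the weight matrix is split along its column dimension, and the helper names (matmul, split_columns, sharded_matmul) and the example matrices are hypothetical, not part of Megatron-LM's API.

```python
def matmul(a, b):
    """Plain nested-list matrix product a @ b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def split_columns(w, parts):
    """Split weight matrix w into `parts` equal column blocks, one per simulated device."""
    cols = len(w[0])
    assert cols % parts == 0, "column count must divide evenly across devices"
    step = cols // parts
    return [[row[i * step:(i + 1) * step] for row in w] for i in range(parts)]


def sharded_matmul(x, w, parts):
    """Multiply x by each column shard separately, then gather the partial outputs."""
    shards = split_columns(w, parts)
    # Each simulated device multiplies the full input by its own shard of the weights.
    partials = [matmul(x, shard) for shard in shards]
    # Gather step: concatenate the partial activations along the column axis.
    return [sum((p[r] for p in partials), []) for r in range(len(x))]


x = [[1, 2], [3, 4]]              # toy activations (2 x 2)
w = [[1, 0, 2, 1], [0, 1, 1, 3]]  # toy weights (2 x 4), split into two 2 x 2 shards

result = sharded_matmul(x, w, parts=2)
print(result == matmul(x, w))
```

The sharded result matches the unsharded product because each output column depends on only one column of the weight matrix, so the shards can be computed independently and concatenated afterwards.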

Archive — no submissions accepted

This challenge is preserved for reference. Play live challenges at /challenges.