Mereology Of Megatron-LM
Archive · Hard
NVIDIA's Megatron-LM sharded individual matrix multiplications across GPUs by splitting the weight matrix along one dimension and gathering activations. Name the parallelism style. Flag format: CONGRESS{two-words}. Example: CONGRESS{data parallelism}.
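As a rough illustration of the mechanism described above, here is a minimal single-process sketch in plain Python. It assumes the weight matrix is split along its column (output) dimension and that the number of columns divides evenly across shards; the helper names `matmul`, `split_columns`, and `sharded_matmul` are hypothetical and are not Megatron-LM APIs. Each "device" is simulated by one list entry.

```python
def matmul(a, b):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def split_columns(w, num_shards):
    """Split w into contiguous column blocks (assumes an even split)."""
    width = len(w[0]) // num_shards
    return [[row[s * width:(s + 1) * width] for row in w] for s in range(num_shards)]


def sharded_matmul(x, w, num_shards):
    """Simulate one matmul whose weight matrix is sharded across devices."""
    # Each simulated device holds one column block of the weights.
    shards = split_columns(w, num_shards)
    # Every device multiplies the full input by its own block.
    partials = [matmul(x, shard) for shard in shards]
    # Gather step: concatenate the partial activations row by row.
    return [sum((p[i] for p in partials), []) for i in range(len(x))]


x = [[1, 2], [3, 4]]
w = [[1, 0, 2, 1], [0, 1, 1, 3]]
y = sharded_matmul(x, w, num_shards=2)
```

The gathered result `y` matches the unsharded product `matmul(x, w)`, which is the property that lets a single multiplication be spread across several GPUs.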
Hint: Two words; a rank-2 array + its partition style.
Archive — no submissions accepted
This challenge is preserved for reference. Play live challenges at /challenges.