DeepSeek's MoE Routing Trick
Archive · Hard
DeepSeek-V2/V3's MoE design includes a small fixed subset of experts that are used for every token (not routed), in addition to the routed ones. Name this subset (two words). Flag format: CONGRESS{two-words}. Example: CONGRESS{common experts}.
Hint: Adjective for 'used by all' + what MoE's sub-networks are called.
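For readers who want to see the structure the challenge describes, here is a minimal sketch of a layer with a fixed always-applied subset plus top-k routed experts. It is an illustration under stated assumptions, not DeepSeek's implementation: the experts are toy linear maps, the gate is a plain softmax over a linear router, and every name (`make_expert`, `moe_layer`, `always_on`, `gate_weights`) is hypothetical. The fixed subset is deliberately given a neutral name so the sketch does not give away the flag.

```python
import math
import random


def make_expert(dim, rng):
    """Toy expert: a random dim x dim linear map (stand-in for a real FFN)."""
    w = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]

    def expert(x):
        return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]

    return expert


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_layer(x, always_on, routed, gate_weights, top_k):
    """Process one token vector x; return (output, indices of chosen routed experts)."""
    out = list(x)  # residual connection

    # Fixed subset: applied to every token, with no gating decision.
    for expert in always_on:
        y = expert(x)
        out = [o + y_i for o, y_i in zip(out, y)]

    # Router: one score per routed expert (assumed softmax over a linear gate).
    logits = [sum(g * x_i for g, x_i in zip(row, x)) for row in gate_weights]
    scores = softmax(logits)

    # Only the top_k highest-scoring routed experts run for this token.
    chosen = sorted(range(len(routed)), key=lambda i: scores[i], reverse=True)[:top_k]
    for i in chosen:
        y = routed[i](x)
        out = [o + scores[i] * y_i for o, y_i in zip(out, y)]

    return out, chosen


if __name__ == "__main__":
    rng = random.Random(0)
    dim, n_routed = 4, 6
    always_on = [make_expert(dim, rng)]
    routed = [make_expert(dim, rng) for _ in range(n_routed)]
    gate_weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_routed)]
    out, chosen = moe_layer([1.0, 0.5, -0.5, 2.0], always_on, routed, gate_weights, top_k=2)
    print("routed experts used for this token:", chosen)
```

The point of the split is visible in `moe_layer`: the first loop runs for every token regardless of the router, while the second loop touches only `top_k` of the routed experts.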
Archive — no submissions accepted
This challenge is preserved for reference. Play live challenges at /challenges.