Archive
Fine-Tuning & Training

The Four Stages Of Instruct Training

Archive
Very Easy
50pts0 solves
A typical instruct-model training pipeline has four stages: _____(1) (next-token on the web), _____(2) (supervised fine-tuning on curated dialogs), _____(3)-model training, and _____(4) (policy update with the reward model). Fill the 4 blanks. Flag format: CONGRESS{1:[word],2:[word],3:[word],4:[word]}. Example: CONGRESS{1:pretrain,2:sft,3:reward,4:rlhf}.
Show hint
Four short nouns — one is a three-letter acronym, one is a four-letter acronym.

Archive — no submissions accepted

This challenge is preserved for reference. Play live challenges at /challenges.