Archive
Fine-Tuning & Training

Three Columns Of A Preference Dataset

Archive
Easy
100pts0 solves
A standard DPO dataset has three fields per row: the original _____(1), the _____(2) completion, and the _____(3) completion. Fill the 3 blanks. Flag format: CONGRESS{1:[word],2:[word],3:[word]}. Example: CONGRESS{1:prompt,2:chosen,3:rejected}.
Show hint
A seed, a winner, a loser.

Archive — no submissions accepted

This challenge is preserved for reference. Play live challenges at /challenges.