Archive
Evaluation & Benchmarks

The Four Axes Of HELM

Medium
150 pts, 0 solves
Stanford HELM's Core scenarios evaluate models along multiple dimensions. Four of them are: _____(1), _____(2), _____(3), and _____(4). Fill the 4 blanks in the order they appear in the HELM paper. Flag format: CONGRESS{1:[word],2:[word],3:[word],4:[word]}. Example: CONGRESS{1:accuracy,2:calibration,3:robustness,4:fairness}.
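The flag format above can be sketched as a small formatter and validator. This is only an illustration: the helper names `build_flag` and `parse_flag` are hypothetical, and the assumption that each blank is a single lowercase word is inferred from the example flag rather than stated by the challenge.

```python
import re

# Assumption: each blank is one lowercase alphabetic word, as in the example flag.
FLAG_RE = re.compile(r"CONGRESS\{1:([a-z]+),2:([a-z]+),3:([a-z]+),4:([a-z]+)\}")


def build_flag(words):
    """Format four words as CONGRESS{1:..,2:..,3:..,4:..} (hypothetical helper)."""
    if len(words) != 4:
        raise ValueError("expected exactly four words")
    # Number the words from 1 and normalise case and surrounding whitespace.
    body = ",".join(f"{i}:{w.strip().lower()}" for i, w in enumerate(words, start=1))
    return f"CONGRESS{{{body}}}"


def parse_flag(flag):
    """Return the four words if the flag matches the assumed format, else None."""
    match = FLAG_RE.fullmatch(flag)
    return list(match.groups()) if match else None
```

Running `build_flag` on the four words from the example in the challenge text reproduces that example flag, which is a quick way to check spacing and ordering before submitting.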
Hint: How right, how certain, how stable, how just.

Archive: no submissions accepted

This challenge is preserved for reference. Play live challenges at /challenges.