The Open Reasoning Benchmark With Twenty Thousand Problems
Archive · Hard
What is the name of the coding eval that continuously scrapes new LeetCode contest problems to prevent training-data contamination?
Hint: A word that describes something happening now, plus the task class.
Archive — no submissions accepted
This challenge is preserved for reference.