Elo Ratings
Medium · 150 pts · 0 solves
Chatbot Arena ranks LLMs by showing users two anonymous responses to the same prompt and letting them vote for the better one. The rating system, borrowed from chess, is called Elo.
What is the core mechanism of Elo for LLM evaluation?
Flag format: CONGRESS{mechanism_in_snake_case}
Hint
Two models compete head-to-head, and rankings update based on who wins.
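To make the hint concrete, here is a minimal sketch of a textbook Elo update after a single head-to-head vote. The function names, the K-factor of 32, and the 400-point scale are conventional chess defaults chosen for illustration; they are assumptions, not values confirmed for Chatbot Arena.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Probability that A beats B under the standard Elo logistic curve.

    scale=400 is the conventional chess value (an assumption here).
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return new (rating_a, rating_b) after one vote.

    score_a is 1.0 if A won the vote, 0.0 if B won, 0.5 for a tie.
    k=32 is an illustrative K-factor, not taken from the source.
    """
    # The winner gains exactly what the loser gives up, so the total is conserved.
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Two hypothetical models start level; model A wins the user's vote.
    print(elo_update(1000.0, 1000.0, score_a=1.0))
```

With both models at 1000, the expected score is 0.5, so a win moves the ratings to 1016 and 984. An upset win over a higher-rated model moves the ratings further than a win that was already expected.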