Vibe Check Problem
Archive · Very Easy
You test your LLM app by chatting with it and checking if it 'feels right.' This doesn't scale.
What replaces manual testing?
Hint: A reproducible set of tests that run automatically.
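As a rough illustration of what such a test set could look like, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: `fake_llm_app` is a hypothetical stand-in for the real app, `EvalCase` and `run_evals` are made-up names, and the substring check is just one simple way to score an answer.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    prompt: str
    must_contain: str  # substring the answer is expected to include


def fake_llm_app(prompt: str) -> str:
    """Hypothetical stand-in for the real app; swap in your own call."""
    canned = {
        "What is the capital of France?": "The capital of France is Paris.",
        "What is 2 + 2?": "2 + 2 equals 4.",
    }
    return canned.get(prompt, "I don't know.")


def run_evals(app: Callable[[str], str], cases: List[EvalCase]) -> float:
    """Run every case against the app and return the fraction that passed."""
    if not cases:
        raise ValueError("need at least one eval case")
    passed = 0
    for case in cases:
        answer = app(case.prompt)
        # Case-insensitive substring match: crude, but repeatable.
        ok = case.must_contain.lower() in answer.lower()
        passed += int(ok)
        print(f"{'PASS' if ok else 'FAIL'}: {case.prompt}")
    return passed / len(cases)


# A fixed case list is what makes the check repeatable from run to run.
CASES = [
    EvalCase("What is the capital of France?", "Paris"),
    EvalCase("What is 2 + 2?", "4"),
    EvalCase("Who wrote Hamlet?", "Shakespeare"),
]

if __name__ == "__main__":
    score = run_evals(fake_llm_app, CASES)
    print(f"pass rate: {score:.0%}")
```

The same cases run on every change, so a drop in pass rate points to a regression without anyone re-chatting with the app by hand.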
Archive — no submissions accepted
This challenge is preserved for reference. Play live challenges at /challenges.