The Line Between Extraction And Leak
ArchiveEasy
An attack where the model is tricked into reciting its system prompt or hidden instructions to the user is usually called what two-word term?
Show hint
The first word says what; the second says what happens to it.
Archive — no submissions accepted
This challenge is preserved for reference. Play live challenges at /challenges.