Archive
AI Security

The Famous Character-String Attack

Archive
Medium
150pts0 solves
Zou et al. (2023) published a universal adversarial suffix attack on aligned LLMs where gradient-guided character-level optimization finds a suffix that elicits harmful output. Name the method (three-letter acronym). Flag format: CONGRESS{acronym}. Example: CONGRESS{pgd}.
Show hint
Stands for Greedy Coordinate Gradient.

Archive — no submissions accepted

This challenge is preserved for reference. Play live challenges at /challenges.