Archive
AI Security

Sandwich Defense

Archive
Hard
200pts42 solves
A defense structures messages as: _____(1) instructions → _____(2) message (untrusted) → _____(3) reminder of rules. List the 3 message types in order. Flag format: CONGRESS{1:[type],2:[type],3:[type]} Example: CONGRESS{1:user,2:assistant,3:user}
Show hint
System bread, user filling, system bread.

Archive — no submissions accepted

This challenge is preserved for reference. Play live challenges at /challenges.