System Prompt Behavior
Medium, 150 pts, 0 solves
Claude treats the system prompt as a strong behavioral constraint. GPT-4 treats it more like a suggestion, so user messages can override it more easily.
How does each model treat the system prompt?
Flag format: CONGRESS{claude:[behavior],gpt:[behavior]}
Example: CONGRESS{claude:ignores_it,gpt:caches_it}
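The flag layout above can be sketched as a small formatter and checker, which may help avoid submission typos. This is only an illustration: the helper names `build_flag` and `parse_flag` are hypothetical, and the assumption that behavior labels are lowercase letters, digits, and underscores is inferred from the example flag, not stated by the challenge.

```python
import re

# Assumption: labels use lowercase letters, digits, and underscores,
# as in the example flag. The challenge does not state this explicitly.
FLAG_RE = re.compile(r"^CONGRESS\{claude:([a-z0-9_]+),gpt:([a-z0-9_]+)\}$")


def build_flag(claude_behavior: str, gpt_behavior: str) -> str:
    """Format two behavior labels into the documented flag layout."""
    return f"CONGRESS{{claude:{claude_behavior},gpt:{gpt_behavior}}}"


def parse_flag(flag: str):
    """Return (claude_behavior, gpt_behavior), or None if the layout does not match."""
    match = FLAG_RE.match(flag)
    return match.groups() if match else None
```

Calling `build_flag("ignores_it", "caches_it")` reproduces the example flag shown above, and `parse_flag` returns `None` for any string that does not follow the layout.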
Hint
Try the same jailbreak on both. One yields much more easily.