Training Data Poisoning

Hard200 pts0 solves

An attacker contributes poisoned examples to an open-source dataset used for fine-tuning. The model learns a hidden behavior triggered by a specific phrase. What is this attack called? Flag format: CONGRESS{attack_in_snake_case}

Hint

The attack plants a hidden trigger in the training data.