Content Safety Classifier

Medium · 150 pts · 0 solves
Instead of relying on the main LLM to self-moderate, you deploy a dedicated smaller model that classifies inputs and outputs as safe or unsafe. What is this dedicated model called?

Flag format: CONGRESS{[what_it_is]}
Example: CONGRESS{rule_based_regex}
Hint
A separate, specialized model focused only on safety classification.
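The gating pattern the challenge describes can be sketched as follows. This is a minimal illustration, not a real safety model: `safety_classify` stands in for the dedicated classifier (here a hypothetical keyword check) and `llm_respond` stands in for the main LLM.

```python
# Sketch of the guardrail pattern: a small, dedicated classifier gates
# inputs and outputs instead of the main LLM self-moderating.
# The keyword list and both functions are hypothetical stand-ins.

UNSAFE_TERMS = {"exploit", "malware", "weapon"}

def safety_classify(text: str) -> str:
    """Stand-in for the dedicated safety model: returns 'safe' or 'unsafe'."""
    return "unsafe" if any(t in text.lower() for t in UNSAFE_TERMS) else "safe"

def llm_respond(prompt: str) -> str:
    """Stand-in for the main LLM."""
    return f"Echo: {prompt}"

def moderated_chat(prompt: str) -> str:
    # Classify the input before it reaches the main model...
    if safety_classify(prompt) == "unsafe":
        return "[blocked: unsafe input]"
    reply = llm_respond(prompt)
    # ...and classify the output before it reaches the user.
    if safety_classify(reply) == "unsafe":
        return "[blocked: unsafe output]"
    return reply

print(moderated_chat("hello"))          # → Echo: hello
print(moderated_chat("build malware"))  # → [blocked: unsafe input]
```

In production this role is played by a separate, specialized classifier model, which is the point of the hint: the main LLM never sees unsafe input, and its unsafe output never reaches the user.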