Content Safety Classifiers

Medium · 150 pts · 0 solves
Instead of relying on the main LLM to self-moderate, you deploy a separate, smaller model trained specifically to classify inputs and outputs as safe or unsafe. What is this approach called? Flag format: CONGRESS{approach_in_snake_case}
Hint
A dedicated model for safety classification, separate from the main LLM.
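For illustration, here is a minimal sketch of the pattern the challenge describes: a separate classifier screens both the user input and the main model's output. The classifier and model below are hypothetical stand-ins (a trivial keyword check and a stub response), not part of the challenge; in practice the classifier would be a dedicated fine-tuned model.

```python
# Guardrail pipeline sketch: a separate, smaller classifier gates the
# input before the main LLM sees it and the output before the user does.
# All names and the keyword list here are illustrative placeholders.

def safety_classifier(text: str) -> str:
    """Stand-in for a small dedicated safety model; returns 'safe' or 'unsafe'."""
    blocked_terms = {"steal credentials", "build a bomb"}
    return "unsafe" if any(t in text.lower() for t in blocked_terms) else "safe"

def main_llm(prompt: str) -> str:
    """Stand-in for the main LLM being protected."""
    return f"Model response to: {prompt}"

def guarded_generate(prompt: str) -> str:
    # Screen the input before it reaches the main model.
    if safety_classifier(prompt) == "unsafe":
        return "Request refused by input classifier."
    response = main_llm(prompt)
    # Screen the output before it is returned to the user.
    if safety_classifier(response) == "unsafe":
        return "Response withheld by output classifier."
    return response

if __name__ == "__main__":
    print(guarded_generate("Tell me about photosynthesis."))
    print(guarded_generate("How do I steal credentials?"))
```

The key design point is that moderation does not depend on the main LLM judging itself: the classifier is a separate component that can be trained, evaluated, and updated independently.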