Archive
Multimodal & Vision

CLIP Training

Medium
150 pts, 32 solves
CLIP was trained on 400M pairs using contrastive learning: push matching _____(1)-_____(2) pairs together and push non-matching pairs apart.

Flag format: CONGRESS{1:[modality],2:[modality]}
Example: CONGRESS{1:audio,2:transcript}
Hint: Learn to associate the right picture with the right description.
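
To make "push matching pairs together, push non-matching pairs apart" concrete, here is a minimal sketch of a symmetric contrastive loss in plain Python. It is an illustration under stated assumptions, not CLIP's actual training code: the function names, the toy two-dimensional embeddings, and the fixed temperature of 0.07 are all chosen for this example, and the two modalities are kept generic as `emb_a` and `emb_b`.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cross_entropy_diag(logits):
    """Mean cross-entropy where the correct column for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_sum = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum - row[i]
    return total / len(logits)


def contrastive_loss(emb_a, emb_b, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    emb_a[i] and emb_b[i] are assumed to be a matching pair; every other
    combination in the batch is treated as a non-matching pair.
    The temperature is an illustrative fixed value (an assumption here).
    """
    a = [l2_normalize(v) for v in emb_a]
    b = [l2_normalize(v) for v in emb_b]
    # logits[i][j] = scaled cosine similarity between a[i] and b[j]
    logits = [[sum(x * y for x, y in zip(u, v)) / temperature for v in b] for u in a]
    logits_t = [list(col) for col in zip(*logits)]
    # Average the a-to-b and b-to-a directions
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits_t))


# Toy batch: pairs that line up, then the same batch with the pairing swapped
matched = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

On this toy batch, `matched` comes out far lower than `shuffled`, which is the behavior the challenge text describes: the loss rewards high similarity on the diagonal (true pairs) and penalizes it everywhere else.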

Archive — no submissions accepted

This challenge is preserved for reference. Play live challenges at /challenges.