Multimodal & Vision

The Apple Vision Language Model Suffix

Archive

Hard

200 pts · 0 solves

Apple's 2024 paper describes the company's approach to vision-language pretraining at scale, covering scaling law studies, image encoder choice, and data-mixing ablations. Which three-character name did the authors give the final model?

Hint

Letters + a single digit.

Archive — no submissions accepted

This challenge is preserved for reference. Play live challenges at /challenges.