Multimodal RAG

Hard200 pts0 solves
To search across images and text simultaneously, both modalities must be embedded into the same vector space. What is required? Flag format: CONGRESS{[requirement]} Example: CONGRESS{separate_indexes}
Hint
Images and text need to be represented as vectors in the same space.