OCR vs VLM
Easy · 100 pts · 0 solves
OCR (optical character recognition) extracts text from images. VLMs (vision-language models) go further.
What can VLMs do that OCR cannot?
Flag format: CONGRESS{understands_[what]}
Example: CONGRESS{understands_font_styles}
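A quick way to sanity-check a candidate before submitting is to test it against the flag format. The sketch below assumes the [what] part is made of lowercase words joined by underscores, as in the example flag; the name `looks_like_flag` is hypothetical, and the real checker may be stricter or looser.

```python
import re

# Assumption: [what] is lowercase letters/digits in underscore-separated words,
# inferred only from the example flag CONGRESS{understands_font_styles}.
FLAG_RE = re.compile(r"CONGRESS\{understands_[a-z0-9]+(?:_[a-z0-9]+)*\}")


def looks_like_flag(candidate: str) -> bool:
    """Return True if the candidate matches the assumed flag format."""
    return FLAG_RE.fullmatch(candidate.strip()) is not None


if __name__ == "__main__":
    print(looks_like_flag("CONGRESS{understands_font_styles}"))
    print(looks_like_flag("CONGRESS{font_styles}"))
```

This only checks the shape of the flag, not whether the answer is correct.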
Hint
VLMs understand spatial relationships, tables, and what the layout means.
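To make the hint concrete, the sketch below contrasts a flat text dump with a view that keeps word positions. It is not a VLM and does not use any OCR library; the word list, coordinates, and function names are all made up for illustration. It only shows the kind of spatial information that is lost when an image is reduced to a plain string.

```python
from collections import defaultdict

# Hypothetical OCR-style output: (text, x, y) for each word's top-left corner.
WORDS = [
    ("Item", 10, 10), ("Price", 120, 10),
    ("Tea", 10, 40), ("3.00", 120, 40),
    ("Cake", 10, 70), ("4.50", 120, 70),
]


def flat_text(words):
    """Plain text dump: the words survive, the table structure does not."""
    return " ".join(text for text, _, _ in words)


def rows_from_layout(words, row_tolerance=5):
    """Group words into rows by y position, then order each row by x.

    Crude bucketing: words whose y values fall in the same bucket
    are treated as one row. Real layouts need something more robust.
    """
    rows = defaultdict(list)
    for text, x, y in words:
        rows[round(y / row_tolerance)].append((x, text))
    return [[text for _, text in sorted(cells)] for _, cells in sorted(rows.items())]


if __name__ == "__main__":
    print(flat_text(WORDS))
    for row in rows_from_layout(WORDS):
        print(row)
```

In the flat string nothing says that "3.00" is the price of "Tea"; once positions are kept, each item and its price land in the same row.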