KV Cache
Medium150 pts0 solves
During generation, the model recomputes attention for all past tokens at each step. The KV cache stores intermediate results to avoid this.
What does it store?
Flag format: CONGRESS{[what_is_stored]}
Example: CONGRESS{embedding_vectors}
Hint
The K and V in 'KV' refer to attention mechanism components.