Quantization

Easy · 100 pts · 0 solves
A 70B-parameter model in fp16 needs ~140GB of VRAM; with 4-bit quantization, ~35GB. Give both sizes.

Flag format: CONGRESS{fp16:[size],int4:[size]}
Example: CONGRESS{fp16:28GB,int4:7GB}
Hint
Each parameter goes from 2 bytes to 0.5 bytes.
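The hint's arithmetic can be sketched in a few lines of Python (`model_size_gb` is a hypothetical helper, not part of the challenge):

```python
def model_size_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate VRAM for weights alone, in GB."""
    bytes_per_param = bits_per_param / 8      # fp16 = 2 bytes, int4 = 0.5 bytes
    return params_billion * bytes_per_param   # 1B params at 1 byte/param ~ 1 GB

fp16 = model_size_gb(70, 16)  # 70 * 2   = 140
int4 = model_size_gb(70, 4)   # 70 * 0.5 = 35
print(f"fp16: {fp16:.0f}GB, int4: {int4:.0f}GB")
```

This treats 1 billion bytes as 1 GB and counts only the weights; activations, KV cache, and runtime overhead add more on top.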