QLoRA
Medium · 150 pts · 0 solves
QLoRA fine-tunes a 65B model on a single GPU by quantizing the frozen base model to 4-bit and training small LoRA adapters kept in 16-bit (BFloat16) precision.
Describe both components.
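The two components can be sketched in plain NumPy. This is a minimal illustration, not the real thing: a simple symmetric uniform quantizer stands in for QLoRA's NF4 scheme, and the `quantize_4bit`, `dequantize`, and `forward` helpers are hypothetical names invented for this sketch. Only the small `A` and `B` adapter matrices would receive gradients during training; the quantized base stays frozen.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Component 1: quantize the frozen base weights to 4-bit ---
# Symmetric uniform quantization as a stand-in for QLoRA's NF4 data type.
def quantize_4bit(w):
    scale = np.abs(w).max() / 7          # signed 4-bit integer range: -8..7
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

W = rng.standard_normal((16, 16)).astype(np.float32)  # frozen base weight
W_q, s = quantize_4bit(W)

# --- Component 2: low-rank LoRA adapters kept in higher precision ---
r, alpha = 4, 8                                       # rank and scaling factor
A = (rng.standard_normal((r, 16)) * 0.01).astype(np.float32)
B = np.zeros((16, r), dtype=np.float32)               # B starts at zero, so the
                                                      # adapter delta starts at zero

def forward(x):
    # Dequantize the base on the fly, then add the low-rank adapter term.
    return dequantize(W_q, s) @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(16).astype(np.float32)
y = forward(x)
print(y.shape)  # (16,)
```

With `B` initialized to zero the adapted model starts out identical to the (quantized) base model, which mirrors how LoRA training begins from the pretrained behavior.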
Flag format: CONGRESS{quantize_base:[precision],adapters:[precision]}
Example: CONGRESS{quantize_base:8bit,adapters:half_precision}
Hint
Freeze and compress the big model, train tiny adapters at full quality.