Archive
Fine-Tuning & Training

The Data Scaling Law Of Chinchilla

Archive
Expert
300pts0 solves
Hoffmann et al. (2022) found a compute-optimal ratio between training tokens and parameters for dense autoregressive models. What is the approximate tokens-per-parameter ratio they reported?
Show hint
A number between 10 and 30.

Archive — no submissions accepted

This challenge is preserved for reference. Play live challenges at /challenges.