Predicting progress for Large Language Models

Gonna pre-register this take: There’s a good chance that the explosive rate of LLM progress over the past few years is about to hit a ceiling and settle into a more gradual slope. I still expect LLMs to keep progressing, and potentially to reach AGI, over the long term.

Mechanism is mostly that there aren’t many OOMs of cost-scaling left. We’ve had explosive progress because willingness-to-spend increased, but that’s gonna hit soft and hard caps very soon. Lots of uncertainty re: running out of training data, but that might also become a bottleneck.

- EigenGender

I think this is correct. The rate of Moore’s-Law exponential improvement in GPU and CPU hardware hasn’t really changed in the last few years; the really impressive large neural networks like GPT-3 and Stable Diffusion got there by using massive amounts of compute to train, i.e. by spending more, not by riding faster hardware. I think ML progress will continue over the next few years, but at a gradual exponential rather than in leaps and bounds like 2019-2022.
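To make both points concrete (the quoted "not many OOMs of cost-scaling left" mechanism, plus the hardware rate), here's a quick back-of-envelope sketch in Python. Every dollar figure and doubling time below is a made-up illustrative assumption, not a number from anyone in this thread:

```python
import math

# Toy model of the argument: training compute ~= dollars spent * compute
# per dollar. Spend has only a few orders of magnitude (OOMs) of headroom
# left, while compute per dollar improves at roughly a Moore's-Law rate.
# ALL NUMBERS BELOW ARE ILLUSTRATIVE ASSUMPTIONS.

spend_now_usd = 1e7           # assumed cost of a recent frontier training run
spend_ceiling_usd = 1e10      # assumed soft cap on willingness-to-spend
hw_doubling_time_years = 2.0  # assumed hardware (Moore's Law) doubling time

# Orders of magnitude of cost-scaling remaining before the spending cap.
ooms_of_spend_left = math.log10(spend_ceiling_usd / spend_now_usd)
print(f"OOMs of spend left: {ooms_of_spend_left:.1f}")          # 3.0

# Once spend is capped, compute growth falls back to the hardware rate.
hw_ooms_per_year = math.log10(2) / hw_doubling_time_years
print(f"Hardware-only growth: {hw_ooms_per_year:.2f} OOMs/yr")  # ~0.15

# The remaining spend headroom is equivalent to this many years of
# hardware-only progress, delivered all at once.
years_equiv = ooms_of_spend_left / hw_ooms_per_year
print(f"Spend headroom ~= {years_equiv:.0f} years of hardware progress")
```

Under those toy numbers, the spend headroom is worth roughly twenty years of Moore's Law compressed into a few years of scaling, which is exactly why the curve can look explosive now and then flatten back toward the hardware rate.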

Maybe! But results like GPT-3’s suggest we might not need Moore’s Law to get exponentially better models after all.


Maybe. The software and the datasets are definitely getting better quickly too.