The conventional wisdom, well captured recently by Ethan Mollick, is that LLMs are advancing exponentially. A few days ago, in a very popular blog post, Mollick claimed that “the current best estimates of the rate of improvement in Large Language models show capabilities doubling every 5 to 14 months”:
I think increasingly specialized models and analog systems that run them will be increasingly prevalent.
LLMs at their current scales don’t do enough to be worth their enormous cost… And adding more data is increasingly difficult.
That said, according to recent research, the gains from LLMs have always been linear. Emergence was always illusory.
I’d like to read the research you alluded to. What research specifically did you have in mind?
Sure: here’s the article.
https://arxiv.org/abs/2304.15004
The basics are that:
LLM “emergent behavior” has never been consistent; it has always been specific to certain types of testing. For example, taking the SAT showed emergent behavior once the model got above a certain number of parameters, because it went from missing most questions to missing fewer.
They compared the LLM’s emergent behavior against all the other measures where it only grew linearly and found a pattern: emergence was only being displayed in nonlinear metrics. If your metric didn’t have a smooth transition between wrong, less wrong, sorta right, and right, then the LLM would appear emergent without actually being so.
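To make that concrete, here is a toy sketch in Python. It is not code or data from the paper. The linear per-token accuracy curve, the 20-token answer length, and the function names are all made-up assumptions for illustration. It shows how an all-or-nothing score (every token of the answer must be right) can look like a sudden jump even when the underlying per-token accuracy improves by the same amount at every step.

```python
# Toy illustration with assumed numbers: smooth per-token gains can look
# "emergent" when scored with an all-or-nothing (nonlinear) metric.

def per_token_accuracy(scale_step: int, steps: int = 10) -> float:
    """Assumed linear improvement from 0.50 to 0.95 across the scale steps."""
    return 0.50 + 0.45 * scale_step / (steps - 1)

def exact_match(p: float, answer_length: int = 20) -> float:
    """Chance that every token of the answer is right, assuming the tokens
    are independent and each is correct with probability p."""
    return p ** answer_length

def sweep(steps: int = 10, answer_length: int = 20):
    """Return (scale_step, per_token_accuracy, exact_match) for each step."""
    rows = []
    for step in range(steps):
        p = per_token_accuracy(step, steps)
        rows.append((step, p, exact_match(p, answer_length)))
    return rows

if __name__ == "__main__":
    for step, p, em in sweep():
        print(f"scale step {step}: per-token {p:.2f}, exact match {em:.4f}")
```

In this sketch the per-token column rises by the same amount each step, while the exact-match column sits near zero for most of the sweep and then climbs steeply over the last few steps. Scoring the same hypothetical model with the smooth per-token metric would show no jump at all.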