The end of scaling: Even OpenAI's Orion is hardly getting any better
AI models no longer seem to improve much with further scaling – this reportedly also affects OpenAI's Orion and Anthropic's Opus.
(Image: photoschmidt/ Shutterstock.com)
Large language models (LLMs) and today's generative AI models are based on an architecture that has been known for quite some time. Improvements have come primarily from scaling, i.e. from training on ever more data. Some AI experts hoped that ever larger amounts of data could ultimately even lead to Artificial General Intelligence (AGI). Now, however, there appears to be growing concern among the proponents of scaling that it is not that simple after all.
The Insider has previously distinguished itself with well-informed reporting on what is going on at OpenAI. Now the outlet writes that OpenAI's next AI model, Orion, shows no significant improvement over its predecessors. Yet Orion has long been seen as an important milestone on the way to AGI. Anthropic has likewise postponed the release of its most powerful planned model indefinitely: Claude 3.5 Opus was due to appear this year, but mention of it was recently missing when other models were released.
According to Reuters, OpenAI co-founder Ilya Sutskever also believes that scaling has reached a plateau: "The 2010s were the years of scaling." Now, he says, we are back in an age of wonder and discovery, and it is more important than ever to scale the right things. Sutskever left OpenAI this summer to strike out on his own and founded the AI lab Safe Superintelligence.
What comes after scaling?
While some researchers believe the problem lies in a lack of available training data, others believe that scaling is fundamentally the wrong approach. Opinions are similarly divided on whether synthetic data, i.e. data produced by AI models themselves, is a useful addition to training data. According to some studies, there is a risk of model collapse, at least without sufficient curation. Roughly speaking, this means that models trained repeatedly on their own outputs lose diversity until nothing meaningful remains.
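As a rough intuition for this collapse effect, the following toy sketch (not from the article; the distribution, sample size and number of generations are freely chosen assumptions) repeatedly fits a simple Gaussian "model" to samples generated by the previous generation of itself. Because the fitted spread is slightly underestimated on average, the distribution narrows from generation to generation and eventually degenerates:

```python
import numpy as np

# Toy illustration of model collapse: each "generation" is a Gaussian fitted
# to a finite sample drawn from the previous generation. On average the
# estimated spread shrinks each round, so over many generations the
# distribution collapses toward a single point.
rng = np.random.default_rng(42)

mu, sigma = 0.0, 1.0      # generation 0: the "real" data distribution
n_samples = 20            # small synthetic training set per generation

for generation in range(1, 201):
    synthetic = rng.normal(mu, sigma, n_samples)   # data produced by the last model
    mu, sigma = synthetic.mean(), synthetic.std()  # refit the next model on it
    if generation % 40 == 0:
        print(f"generation {generation:3d}: mu={mu:+.3f}, sigma={sigma:.3f}")
```

Real language models are vastly more complex, but the mechanism is analogous: without fresh or carefully curated data, each round of training on the previous model's output loses a little of the original variety.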
Other approaches include model distillation, in which the knowledge of large models is transferred to smaller ones. This process is also meant to make AI more cost-effective and resource-efficient. At the same time, the major AI providers are currently working on specializing their models for individual tasks. This goes hand in hand with the development of AI agents that can then perform such tasks independently.
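To make the distillation idea concrete, here is a minimal sketch (the network sizes, temperature and loss weighting are illustrative assumptions, not details from the article): a small student network is trained to match the softened output distribution of a frozen, larger teacher, in addition to the usual label loss.

```python
import torch
import torch.nn.functional as F

temperature = 2.0   # softens the teacher's probabilities
alpha = 0.5         # weight of the distillation loss vs. the hard-label loss

# Large, frozen "teacher" and small "student" (sizes are purely illustrative)
teacher = torch.nn.Sequential(torch.nn.Linear(784, 1024), torch.nn.ReLU(),
                              torch.nn.Linear(1024, 10)).eval()
student = torch.nn.Sequential(torch.nn.Linear(784, 64), torch.nn.ReLU(),
                              torch.nn.Linear(64, 10))
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

def distillation_step(x, labels):
    with torch.no_grad():                 # the teacher is not updated
        teacher_logits = teacher(x)
    student_logits = student(x)

    # KL divergence between the softened teacher and student distributions
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    hard_loss = F.cross_entropy(student_logits, labels)

    loss = alpha * soft_loss + (1 - alpha) * hard_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Example call with random data standing in for a real training batch:
loss = distillation_step(torch.randn(32, 784), torch.randint(0, 10, (32,)))
```

The student ends up far smaller and cheaper to run than the teacher, which is exactly the cost and resource argument the providers make for distillation.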
(emw)