Study: Large language models are still a long way from human thinking
Until now, some developers hoped that language models would learn skills beyond what they were trained for. This hope has turned out to be a misconception.
Artificial intelligence is not as capable of thinking as initially thought, a study suggests.
According to a research team from the Technical University (TU) of Darmstadt and the University of Bath, scaling large language models (LLMs) does not cause them to develop independent abilities they were not trained for. Instead, the LLMs always fall back on in-context learning.
The researchers investigated the emergent abilities of LLMs, i.e. unforeseen and sudden leaps in the models' performance. Previously, scaling up appeared to lead to better performance: the larger a model became and the more data it was trained on, the more language-based tasks it was able to solve. "On the one hand, this raised the hope that further scaling would make the models even better. On the other hand, however, there was also concern that these capabilities could become dangerous, as the LLMs could take on a life of their own and possibly escape human control," explains the TU. According to the scientists, however, the LLMs remain far removed from human abilities.
Four language model families with a total of 20 models examined
For the study, the team experimented with 20 models and 22 tasks in two settings. The four model families used were GPT, T5, Falcon 2 and LLaMA. The scientists started from two hypotheses: first, that all previously observed emergent abilities are a consequence of in-context learning (ICL), i.e. the ability to perform a task based on a small number of examples; second, that the apparent emergent abilities of instruction-tuned LLMs indicate that instruction tuning triggers implicit in-context learning, rather than being a sign of genuinely new capabilities.
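To make the distinction concrete, the sketch below contrasts a few-shot prompt (in-context learning: the model infers the task purely from solved examples) with a zero-shot instruction prompt (an instruction-tuned model is simply told what to do). The sentiment-labelling task, the example data and the query_llm placeholder are illustrative assumptions and are not taken from the study.

```python
# Minimal sketch: in-context learning (few-shot) vs. plain instruction following.
# Task, example data and the query_llm placeholder are illustrative only.

def build_few_shot_prompt(examples: list[tuple[str, str]], query: str) -> str:
    """In-context learning: the task is conveyed purely through solved examples."""
    lines = [f"Review: {text}\nSentiment: {label}" for text, label in examples]
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

def build_instruction_prompt(query: str) -> str:
    """Instruction prompt: an instruction-tuned model is told the task directly."""
    return (
        "Classify the sentiment of the following review as positive or negative.\n"
        f"Review: {query}\nSentiment:"
    )

def query_llm(prompt: str) -> str:
    """Placeholder for an actual model call (hook up a local model or hosted API)."""
    raise NotImplementedError("Connect the LLM of your choice here.")

if __name__ == "__main__":
    examples = [
        ("The battery died after two days.", "negative"),
        ("Great screen and very fast.", "positive"),
    ]
    print(build_few_shot_prompt(examples, "Setup was painless and it just works."))
    print("---")
    print(build_instruction_prompt("Setup was painless and it just works."))
```

In terms of the study's hypotheses, an instruction-tuned model answering the second prompt would largely be mapping the instruction onto the same kind of implicit in-context learning that the first prompt makes explicit, rather than drawing on a genuinely new reasoning ability.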
The scientists see both hypotheses confirmed by the results of the study. "The distinction between the ability to follow instructions and the inherent ability to solve a problem is a subtle but important one that is relevant to the methods employed in the use of LLMs and the problems they are designed to solve", the authors write. For example, if the AI merely follows the instructions in a prompt without reasoning itself, the results fit the form of the task but are not logically sound. "This is reflected in the well-known phenomenon of 'hallucinating', in which the LLM produces fluid but factually incorrect output."
Following instructions is not a sign of intelligence
"The ability to follow instructions does not imply that one has logic skills and, more importantly, it does not imply the possibility of latent, potentially dangerous abilities," the team said. The scientists assume that their findings apply to all models that are prone to hallucinations or require prompt engineering.
The study indicates that the abilities previously interpreted as emergent are more likely a combination of in-context learning, model memory and linguistic knowledge. This also explains the apparent contradiction that LLM performance is sometimes outstanding and sometimes poor.
Nevertheless, the team does not want to rule out a possible threat from artificial intelligence altogether. "Rather, we show that the alleged emergence of complex thinking skills associated with certain threats is not supported by evidence and that we can control the learning process of LLMs well after all," says TU Darmstadt computer science professor Iryna Gurevych, who led the study together with Harish Tayyar Madabushi from the University of Bath in the UK. The research team's work contributes to a deeper understanding of the capabilities and limitations of LLMs. This demystifies large language models, helps to dispel associated safety concerns and creates a framework for using them more efficiently.
(are)