On the limits of large language models and the unattainability of AGI and ASI

Large language models (LLMs) deliver impressive results, but are they truly capable of reaching or surpassing human intelligence?

Adding Human Capabilities to LLMs (image created by the author with AI)
By Dr. Michael Stal

The rapid development of large language models has sparked an intense debate about their potential to achieve artificial general intelligence and, ultimately, artificial superintelligence.

The Pragmatic Architect – Michael Stal

Prof. Dr. Michael Stal has been working at Siemens Technology since 1991. His research focuses on software architectures for large complex systems (distributed systems, cloud computing, IIoT), embedded systems, and artificial intelligence. He advises business units on software architecture issues and is responsible for the architectural training of senior software architects at Siemens.

While these systems exhibit remarkable capabilities in language processing, reasoning, and knowledge synthesis, fundamental architectural and theoretical limitations suggest they may not bridge the gap to true general intelligence. This analysis explores the core technical hurdles preventing current LLM paradigms from reaching AGI or ASI.

Artificial General Intelligence (AGI) is a hypothetical form of artificial intelligence that matches or surpasses human cognitive abilities across all domains of knowledge and reasoning. Unlike narrow AI systems designed for specific tasks, AGI would possess flexible intelligence capable of learning, understanding, and applying knowledge in any field with the same ease as human intelligence. Key characteristics of AGI include autonomous learning from minimal examples, knowledge transfer across diverse domains, creative problem-solving in novel situations, and the ability to grasp and manipulate abstract concepts with genuine understanding, not just pattern recognition.


Artificial Superintelligence (ASI) goes beyond AGI, representing an intelligence that vastly surpasses human cognitive abilities in all areas, including creativity, general wisdom, and problem-solving. ASI would not merely match human intelligence but exceed it by orders of magnitude, potentially achieving insights and capabilities unimaginable to humans. The distinction between AGI and ASI is crucial, as AGI represents human-level general intelligence, while ASI implies a fundamentally different category of intelligence.

Large language models, in their current form, are statistical systems trained on vast text corpora to predict the most probable next token in a sequence. These models learn to compress and reproduce patterns from their training data, enabling them to generate coherent and contextually relevant responses. However, their underlying mechanism differs fundamentally from the flexible, adaptive intelligence that characterizes AGI.
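
To make the next-token mechanism concrete, the following minimal sketch turns a model's raw vocabulary scores into a probability distribution and picks the most likely continuation. The four-word vocabulary and the logit values are invented for illustration; the sketch shows the principle only, not any particular model's implementation.

```python
import numpy as np

def next_token_distribution(logits: np.ndarray) -> np.ndarray:
    """Turn raw vocabulary scores (logits) into a probability distribution."""
    shifted = logits - logits.max()      # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

# Invented four-word vocabulary and made-up logits for the context
# "The capital of France is ..." (purely illustrative values).
vocab = ["Paris", "London", "Berlin", "banana"]
logits = np.array([6.2, 2.1, 1.8, -3.0])

probs = next_token_distribution(logits)
print(dict(zip(vocab, probs.round(3))))
print("most probable next token:", vocab[int(np.argmax(probs))])
```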


The Transformer architecture, which underpins most current LLMs, introduces several fundamental limitations that constrain their potential for general intelligence. While the attention mechanism is powerful for sequence processing, it operates with fixed weight matrices learned during training. These weights encode statistical relationships between tokens but cannot dynamically adapt to entirely new concepts or domains without retraining. This static nature stands in stark contrast to biological intelligence, which continuously adapts its neural connections based on new experiences.
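
The point about static weights can be illustrated with a single attention head in NumPy. The projection matrices below stand in for parameters that are fixed at training time: the same matrices are applied to every input the model will ever see. This is a simplified sketch of scaled dot-product attention, not a faithful reproduction of any production architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_head, seq_len = 8, 4, 3

# Projection matrices: learned during training, then frozen. No further
# adaptation happens when new inputs arrive.
W_q = rng.normal(size=(d_model, d_head))
W_k = rng.normal(size=(d_model, d_head))
W_v = rng.normal(size=(d_model, d_head))

def attention(x: np.ndarray) -> np.ndarray:
    """Single-head scaled dot-product attention with static weights."""
    q, k, v = x @ W_q, x @ W_k, x @ W_v
    scores = q @ k.T / np.sqrt(d_head)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over positions
    return weights @ v

x = rng.normal(size=(seq_len, d_model))   # embeddings of a three-token sequence
print(attention(x).shape)                 # (3, 4): the same weights for every input
```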

The feedforward processing of Transformers creates another significant limitation. Information flows in one direction through the network's layers, preventing the iterative, cyclical processing characteristic of human cognition. Human thought involves continuous feedback loops where higher-level concepts influence lower-level processing and vice versa. This bidirectional flow allows humans to refine their understanding through reflection and re-conceptualization—abilities still absent in current LLM architectures.
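
The one-directional flow is easy to see in code. In the toy forward pass below, activations move through the layer stack exactly once, and there is no path by which a later layer can send information back to revise an earlier one; layer sizes and weights are arbitrary and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
# Four layers of arbitrary, fixed weights.
layers = [rng.normal(scale=0.5, size=(16, 16)) for _ in range(4)]

def forward(x: np.ndarray) -> np.ndarray:
    """Activations pass through each layer exactly once, bottom to top.
    No feedback loop lets layer 4 revise what layer 1 computed."""
    for weights in layers:
        x = np.tanh(x @ weights)
    return x

print(forward(rng.normal(size=(1, 16))).shape)   # (1, 16)
```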

Furthermore, the tokenization process, which converts continuous human language into discrete tokens, leads to information loss and limits the model's ability to grasp subtle nuances and context-dependent meanings. Human language processing occurs simultaneously across multiple levels, from phonetic and morphological to semantic and pragmatic, with continuous integration across these levels. The bottleneck of tokenization prevents LLMs from accessing this full spectrum of language processing.
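
A toy greedy longest-match tokenizer over a small, invented subword inventory illustrates the bottleneck: whether a word's morphological structure survives segmentation depends entirely on which pieces happen to be in the inventory. Real tokenizers such as BPE or SentencePiece are more elaborate, but the discretization step is the same in kind.

```python
def tokenize(word: str, vocab: set[str]) -> list[str]:
    """Greedy longest-match segmentation into subword pieces."""
    tokens, i = [], 0
    while i < len(word):
        for length in range(len(word) - i, 0, -1):   # try the longest piece first
            piece = word[i:i + length]
            if piece in vocab or length == 1:        # fall back to single characters
                tokens.append(piece)
                i += length
                break
    return tokens

# Invented inventory; real BPE vocabularies contain tens of thousands of pieces.
VOCAB = {"un", "believ", "able", "ing", "re"}
print(tokenize("unbelievable", VOCAB))   # ['un', 'believ', 'able']
print(tokenize("unforeseeable", VOCAB))  # mostly single characters: structure is lost
```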

The objective of predicting the next token, which drives LLM training, imposes fundamental constraints on how these systems understand and process information. This training paradigm optimizes for statistical correlation rather than causal understanding, leading to sophisticated pattern matching rather than genuine comprehension. While this approach enables impressive performance on many language tasks, it fails to cultivate the causal reasoning and world-modeling capabilities essential for general intelligence.
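
In training, this objective is typically realized as a cross-entropy loss on the observed next token, as in the schematic below. The loss rewards assigning high probability to whatever actually followed in the corpus; nothing in it asks whether the model represents why that token followed.

```python
import numpy as np

def next_token_loss(logits: np.ndarray, target_id: int) -> float:
    """Cross-entropy between the predicted distribution and the one token
    that actually followed in the training corpus."""
    shifted = logits - logits.max()
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return float(-log_probs[target_id])

# Made-up scores over a tiny vocabulary. The loss only measures how probable
# the observed continuation was; correlation is rewarded, causal structure is not.
logits = np.array([2.0, 0.5, -1.0, 0.1])
print(next_token_loss(logits, target_id=0))   # low loss: the observed token was likely
```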

The self-supervised training approach used for LLMs relies on static datasets that represent a snapshot of human knowledge at a particular time. This contrasts with human learning, which involves active exploration, hypothesis generation and testing, and the continuous integration of new experiences into existing knowledge. Humans develop understanding through interaction with their environment, forming and refining mental models based on feedback from their actions. LLMs lack this interactive learning capability and cannot develop genuine understanding through experiential learning.
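
The contrast can be sketched schematically. The stand-in classes below are purely illustrative (their names and interfaces are invented for this example): the corpus learner revisits the same frozen snapshot on every pass, while what the interactive learner observes next depends on its own actions.

```python
import random

class CorpusLearner:
    """Sees only a fixed snapshot of (context, next_token) pairs."""
    def __init__(self):
        self.counts = {}
    def update(self, context, next_token):
        key = (context, next_token)
        self.counts[key] = self.counts.get(key, 0) + 1

class InteractiveLearner:
    """Acts, observes the consequence, and revises its world model."""
    def __init__(self):
        self.world_model = {}
    def act(self, observation):
        return random.choice(["probe", "wait"])
    def revise(self, observation, action, feedback):
        self.world_model[(observation, action)] = feedback

# Static corpus training: the same frozen data, pass after pass.
corpus = [("the sky is", "blue"), ("water is", "wet")]
learner = CorpusLearner()
for _ in range(3):
    for context, token in corpus:
        learner.update(context, token)

# Experiential learning: each observation depends on the previous action.
agent, observation = InteractiveLearner(), "state-0"
for step in range(3):
    action = agent.act(observation)
    feedback = f"effect-of-{action}-{step}"   # stands in for the environment's response
    agent.revise(observation, action, feedback)
    observation = feedback

print(learner.counts)
print(agent.world_model)
```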

The scaling hypothesis, which posits that larger models trained on ever-increasing amounts of data will eventually achieve AGI, faces several theoretical challenges. Simply increasing model and dataset size addresses quantity but not the qualitative differences between pattern recognition and understanding. The emergence of new capabilities in larger models often reflects more sophisticated pattern matching rather than fundamental shifts in the nature of intelligence. Without addressing the underlying architectural and training limitations, scaling alone cannot bridge the gap between statistical processing and true intelligence.
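
Empirical scaling laws capture what scaling does buy: test loss on the next-token objective falls roughly as a power law in model size and approaches a floor. The constants in the sketch below are invented to show the shape of such a curve, not measured values; the point is that even a perfectly scaled model is still being optimized for the same statistical objective.

```python
# Illustrative only: test loss written as a power law in parameter count,
# L(N) = L_inf + (N_c / N) ** alpha. The constants are made up.
L_INF, N_C, ALPHA = 1.7, 1e13, 0.08

def loss(n_params: float) -> float:
    return L_INF + (N_C / n_params) ** ALPHA

for n in [1e8, 1e9, 1e10, 1e11, 1e12]:
    print(f"{n:.0e} parameters -> next-token loss {loss(n):.3f}")
# Returns diminish toward the floor L_INF, but the quantity being minimized
# remains next-token prediction error, not understanding.
```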


This article was originally published in German. It was translated with technical assistance and editorially reviewed before publication.