Small language models: where compact AI models beat the big ones
LLM offshoots with fewer parameters are cheaper to run, but they also have clear weaknesses.
That bigger does not automatically mean better is particularly evident in AI: where large language models are as powerful as they are resource-hungry, their smaller counterparts score as frugal specialists for local use. Christian Winkler, cover author of the new iX 4/2025, explains in an interview why it doesn't always have to be a model with tens of billions of parameters and what you can realistically expect from the smaller ones.
Everyone is talking about large language models, but less is said about their smaller variants. When is a model considered a small language model (SLM) and for which fields of application is it suitable?
The boundary is fluid. Some experts describe a model with seven billion parameters as small, but I would rather draw the line at four billion parameters. SLMs are suitable for various applications, for example for summarizing texts or as the generative component of a RAG pipeline. They are less suitable for use as a knowledge base.
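As a minimal sketch of what such a setup can look like, the following snippet uses a small embedding model for retrieval and a small instruction-tuned model as the generator. The model names and the toy documents are illustrative assumptions, not recommendations from the interview.

```python
# Minimal RAG sketch with an SLM as the generative component.
# Model names and the toy document list are illustrative assumptions.
from sentence_transformers import SentenceTransformer, util
from transformers import pipeline

documents = [
    "iX is a German IT magazine published by heise.",
    "Small language models typically have fewer than four billion parameters.",
    "Quantization reduces memory use so models also run on CPUs.",
]

# 1. Retrieval: embed documents and question, pick the best match.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_embeddings = embedder.encode(documents, convert_to_tensor=True)

question = "What counts as a small language model?"
query_embedding = embedder.encode(question, convert_to_tensor=True)
best_doc = documents[int(util.cos_sim(query_embedding, doc_embeddings).argmax())]

# 2. Generation: the SLM answers using only the retrieved context.
generator = pipeline("text-generation", model="Qwen/Qwen2.5-1.5B-Instruct")
prompt = f"Answer using only this context:\n{best_doc}\n\nQuestion: {question}\nAnswer:"
print(generator(prompt, max_new_tokens=80)[0]["generated_text"])
```

The point of the split is that the SLM does not have to act as a knowledge base; the retrieved text supplies the facts, and the model only has to summarize and rephrase them.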
What about the hardware requirements of SLMs? Is a consumer laptop sufficient to run a text or code generator locally and offline?
As soon as you want to work without a GPU, you should quantize the models. With SLMs, a laptop can then generate text faster than you can read it, which is enough for most applications. Code generation works better with medium-sized LLMs, because stored knowledge plays a bigger role there. With a small GPU or a Mac, however, it is still sufficiently fast, for example with Qwen2.5-Coder-7B-Instruct, which by the definition above some would still count as an SLM.
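For illustration, a quantized model in GGUF format can be run locally on a laptop CPU with llama-cpp-python. The file name below is a placeholder; any 4-bit quantized build of Qwen2.5-Coder-7B-Instruct downloaded beforehand would work the same way.

```python
# Sketch: local CPU inference with a quantized SLM via llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="qwen2.5-coder-7b-instruct-q4_k_m.gguf",  # placeholder path to a quantized GGUF file
    n_ctx=4096,    # context window
    n_threads=8,   # CPU threads; tune to the laptop
)

response = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "Write a Python function that reverses a string."}
    ],
    max_tokens=200,
)
print(response["choices"][0]["message"]["content"])
```

Quantizing to 4 bits roughly quarters the memory footprint compared with 16-bit weights, which is what makes a 7B model usable without a dedicated GPU.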
The tendency to invent false information is one of the major weaknesses of generative AI. How do SLMs fare with this problem?
Small models are at a disadvantage when it comes to so-called hallucinations: with fewer parameters, they have simply not absorbed as much knowledge. You therefore have to be particularly careful when prompting and when checking the results. That makes the models better suited to summarizing texts. However, they can also be used to generate particularly creative ideas, where hallucinations are actually desirable.
Christian, thank you very much for your answers! An overview of the possibilities and limitations of small language models can be found in the new iX. We also show what tools are available for running AI locally and take a look at Microsoft's small Phi models. Readers can find all this and many other topics in the April issue, which is available now in the heise store and at newsagents.
In the series "Three questions and answers", iX aims to get to the heart of today's IT challenges – regardless of whether it is the user's view in front of the PC, the manager's view or the everyday life of an administrator. Do you have any suggestions from your day-to-day work or that of your users? Whose tips on which topic would you like to read in a nutshell? Then please write to us or leave a comment in the forum.
(axk)