Why Deepseek R1 does not mean the end of OpenAI and Meta
The Chinese AI model R1 surprises with its performance. But the lack of chips continues to slow down China's AI progress.
The Chinese AI model R1 from Deepseek has shocked the big US companies. However, the Chinese have not won the race for technological supremacy in AI.
For the US public, the release of the Chinese AI model R1 is an unpleasant kind of déjà vu. It recalls the humiliation the US suffered in 1957, when the Soviet Union demonstrated its technical superiority with the launch of Sputnik, the first artificial satellite. R1 is not only on a par with OpenAI's o1; it is also far more efficient, costing only a fraction as much to train, and it is completely open source.
Panic and denial get us nowhere
Initial reactions to the model range from outright panic (shares in AI chip maker Nvidia plummeted) to fatalism (the Chinese have won the AI race, the US export restrictions were pointless) to denial of reality. Tech investor Neal Khosla, for example, claims that the release of R1 was merely a Chinese psychological warfare operation aimed at destabilizing the US economy. A sober look at the technical background, however, shows that Deepseek's R1 alone will not herald the end of Silicon Valley. There are several reasons for this.
Deepseek is technically well done, but not a breakthrough
The paper published on the model has been praised by experts. On a technical level, however, it does not contain any fundamental innovations. All the technologies used in it are already known.
R1 is a reasoning model with a large language model at its core. It breaks a task down into smaller subtasks, which it works through step by step in a chain of thought. The software then searches for the best chain, the most accurate or fastest one, and returns its result as the answer. To learn which chain of thought is best in a given case, the software was shown many examples of "good" chains of thought during training.
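The selection step described above can be sketched in a few lines of Python. Everything here is a toy stand-in, not Deepseek's actual code: generate_chain and score_chain are hypothetical placeholders for the language model and the learned reward.

```python
import random

def generate_chain(task: str, rng: random.Random) -> list[str]:
    """Stand-in for an LLM decomposing a task into reasoning steps."""
    n_steps = rng.randint(2, 4)
    return [f"{task}: step {i + 1}" for i in range(n_steps)]

def score_chain(chain: list[str]) -> float:
    """Hypothetical reward model: here, shorter chains score higher."""
    return 1.0 / len(chain)

def best_of_n(task: str, n: int = 5, seed: int = 0) -> list[str]:
    """Sample n candidate chains of thought and keep the best-scoring one."""
    rng = random.Random(seed)
    chains = [generate_chain(task, rng) for _ in range(n)]
    return max(chains, key=score_chain)

print(best_of_n("add 2 and 3"))
```

In a real reasoning model, the scoring function is itself learned from the "good" examples seen during training; the sketch only shows the sample-and-select structure.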
Tracked AI chips helped with training
The efficiency of the model's training and execution is presumably the result of earlier work in which less efficient large models generate heuristics and training data that can then be used to train smaller models efficiently. Meta took a similar approach with its Llama models.
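That workflow, a large model producing training data that a smaller model then learns from, is the basic idea of distillation. A deliberately minimal Python sketch, with a toy function standing in for the teacher and a lookup table for the student (both are illustrative assumptions, not Deepseek's pipeline):

```python
def teacher(x: int) -> str:
    """Stand-in for an expensive large model producing a worked answer."""
    return f"{x} squared is {x * x}"

# 1. Run the expensive teacher once to generate a synthetic training set.
train_set = {x: teacher(x) for x in range(10)}

# 2. The "student" just memorizes here; in practice a smaller network
#    would be fitted to reproduce the teacher's outputs cheaply.
def student(x: int) -> str:
    return train_set.get(x, "unknown")

print(student(3))
```

The point of the structure is that the teacher's cost is paid once, at data-generation time, while every later query hits only the cheap student.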
This was only possible because the training of the Deepseek models had already begun when the export restrictions for AI chips were still relatively soft. Numerous Chinese companies – including Deepseek – were still stocking up on large quantities of Nvidia's powerful A100 chips at the time.
Sanctions increase pressure to innovate
It cannot be concluded from the publication of R1 that the US export restrictions have achieved nothing. Such measures always take effect with a certain time lag. On the other hand, the model shows that the pressure of sanctions also serves as an accelerator for technical development. The lack of powerful hardware creates an enormous incentive to come up with other solutions.
Openness will not remain
The fact that Deepseek has published the model completely freely was probably a political decision intended to weaken OpenAI. The Chinese government is also banking on rapid AI development within the country producing a surge in productivity. In China, however, there is a phenomenon that characterizes politics there, and with it the population and the economy: Fang-Shou, derived from the two verbs Fang (to loosen) and Shou (to tighten). In other words, a phase of opening and relaxation is always followed by a phase of tightening and increased control, and vice versa.
After initially being deeply suspicious of generative AI and seeking to regulate it heavily, the Chinese government is currently putting more emphasis on dynamism. Former OpenAI developer Miles Brundage argues, however, that in the long term the government has no interest in a kind of AI Wild West in China. Sooner or later, the openness will therefore come to an end.
A repeat of Deepseek's success is unlikely for now
Even though reasoning models require far more computing power than normal language models, the number of AI chips currently available to Deepseek appears sufficient to keep the model online for now. That is an enormous prestige win for the Chinese AI scene and hurts the US companies that want to make money with AI.
But it does not have to stay that way, writes China Talk, an online service specializing in AI in China. If the model proves genuinely useful, demand for it will rise sharply both abroad and at home. Then someone will have to set priorities: should the existing computing capacity be used to train new models, or to run the existing ones at greater scale?
The shortage of powerful AI chips will therefore continue to slow down and hinder the Chinese AI scene for the foreseeable future. If the sanctions remain in place or, as planned, are tightened further, the Chinese AI economy will need one or two fundamental technical breakthroughs to repeat the success of R1. At the moment, nobody knows if and when that will happen.
This article first appeared on t3n.
(olb)