AI Model DeepSeek v4: New Generation with 1.6 Trillion Parameters

The Chinese startup DeepSeek has released a new AI generation. The model features a new architecture and offers a larger context window.

Mobile screen with the DeepSeek logo (Image: Runrun2 / Shutterstock.com)


A year ago, the Chinese AI startup DeepSeek caused a shock in the AI industry: its model DeepSeek-R1 matched the performance of top US models at a significantly lower price, triggering a stock market tremor. As later became known, training DeepSeek-R1 cost less than 300,000 US dollars. Now a new generation, DeepSeek v4, has been released as a preview. The new flagship remains open source and free of charge, and comes in a Pro and a Flash variant.

A shock of that magnitude is unlikely this time. DeepSeek is once again at the forefront of open source, but experts place its performance roughly three to six months behind the absolute top models on the market rather than on par with them. The significant price advantage remains, however. The Pro model is considerably pricier per API call than DeepSeek v3.2, yet still far below what OpenAI and Anthropic charge: OpenAI's GPT-5.5, according to the company's own benchmark data, costs twice as much for comparable coding tasks. The competitive sprint may now turn into a marathon. An overview of Chinese open-source AI shows how the field has developed overall since the DeepSeek shock.

A lot has happened under the hood: V4 is a true generational change, with an entirely new architecture, an eight-times-larger context window, and, according to DeepSeek's documentation, noticeably better coding and math performance.

V3.2 had 685 billion parameters; V4-Pro reaches 1.6 trillion – more than double. The new model can process up to one million tokens of context – meaning very long documents, codebases, or conversations – and requires only a fraction of the computing power of previous DeepSeek models for this. For comparison, V3.2 supported a maximum of 128,000 tokens of context. The predecessor introduced “DeepSeek Sparse Attention” (DSA) as a key innovation – a more efficient attention architecture for long texts. V4 builds on this and combines two new mechanisms.
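DeepSeek has not published the details of the two new mechanisms, but the basic idea behind sparse attention is what makes million-token contexts tractable: instead of scoring every token against every other token, each query attends only to the k best-matching positions. The following is a minimal toy sketch of top-k sparse attention in general, not DeepSeek's actual DSA algorithm; all names and values are illustrative.

```python
import math

def sparse_attention(q, keys, values, k=4):
    """Toy top-k sparse attention for a single query vector.

    Full attention touches every key (prohibitive at 1M tokens);
    here only the k highest-scoring keys are kept. Illustrative
    only -- not DeepSeek's actual DSA implementation.
    """
    # Score every key against the query (dot product).
    scores = [sum(qi * ki for qi, ki in zip(q, key)) for key in keys]
    # Keep only the k highest-scoring positions -- the "sparse" step.
    top = sorted(range(len(scores)), key=lambda i: scores[i])[-k:]
    # Softmax over the selected scores only.
    m = max(scores[i] for i in top)
    w = [math.exp(scores[i] - m) for i in top]
    s = sum(w)
    w = [wi / s for wi in w]
    # Weighted sum of the selected value vectors.
    dim = len(values[0])
    return [sum(wi * values[i][j] for wi, i in zip(w, top)) for j in range(dim)]

# Tiny made-up example: 6 "tokens" with 3-dimensional embeddings.
keys = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0.5, 0.5, 0.5], [0, 0, 0]]
values = [[float(i)] * 3 for i in range(6)]
q = [1.0, 0.0, 0.0]
out = sparse_attention(q, keys, values, k=2)
print(out)  # [1.5, 1.5, 1.5] -- blend of the two best-matching values
```

Production systems implement this with batched GPU kernels rather than Python lists, but the selection step is the core of why the per-query cost no longer scales with the full context length.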


There are apparent weaknesses in general knowledge, where other top models are said to be significantly better. The model's reasoning can now be controlled in three stages instead of two: Non-Think, Think High, and Think Max replace the previous Thinking and Non-Thinking. DeepSeek is evidently betting primarily on developers as customers: in its own presentation of the new model, coding benchmarks, reasoning, and agentic tasks feature prominently. OpenAI is likewise focusing increasingly on developers as a target group and has restructured its ChatGPT plans around the coding tool Codex. The potential savings compared to US models are likely to interest many here.
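In practice, a reasoning stage like this is typically selected per request in the API call body. DeepSeek's actual parameter names are not given here, so the field `reasoning_effort` and its values below are placeholders, not the real API:

```python
import json

# Hypothetical request body. "reasoning_effort" and its values are
# placeholders standing in for the three stages Non-Think / Think High /
# Think Max -- DeepSeek's real parameter names may differ.
request = {
    "model": "deepseek-v4-pro",
    "reasoning_effort": "think_high",
    "messages": [
        {"role": "user", "content": "Refactor this function for readability."}
    ],
}
print(json.dumps(request, indent=2))
```

The point of such a switch is cost control: a cheap Non-Think call for routine completions, the expensive maximum stage only where deep multi-step reasoning pays off.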

DeepSeek-V4-Pro costs 1.74 US dollars per million input tokens and 3.48 US dollars per million output tokens. The Flash variant costs 0.14 US dollars per million input tokens and 0.28 US dollars per million output tokens. The US business outlet Bloomberg reports that DeepSeek is currently facing a capacity bottleneck for the Pro model due to a shortage of computing power. New Huawei Ascend 950 clusters are expected to ease the shortage in the second half of the year; prices might then drop.
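At those rates, the cost of a single call is simple arithmetic. A minimal sketch using the published per-million-token prices from above; the token counts in the example are made up:

```python
# Per-million-token rates in USD, as listed by DeepSeek.
PRICES = {
    "v4-pro":   {"input": 1.74, "output": 3.48},
    "v4-flash": {"input": 0.14, "output": 0.28},
}

def request_cost(model, input_tokens, output_tokens):
    """Cost in USD of one API call at the listed rates."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Hypothetical coding request: 50k tokens of context in, 4k tokens out.
cost = request_cost("v4-pro", 50_000, 4_000)
print(f"${cost:.4f}")  # $0.1009
```

The same request on the Flash variant comes out at well under a cent, which illustrates why the tiered lineup targets cost-sensitive developer workloads.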

(mki)


This article was originally published in German. It was translated with technical assistance and editorially reviewed before publication.