SK Hynix starts series production of HBM4 stacks with over 2 TByte/s

The first HBM4 memory stacks, which far exceed the JEDEC specification, are ready earlier than announced. Nvidia needs them for AI accelerators.

Chip stack with HBM4 from SK Hynix.


(Image: SK Hynix)


SK Hynix of South Korea, which recently became the world's largest DRAM manufacturer, announced series production of its HBM4 stacks last Friday. As recently as May, the company expected to reach this milestone only in October; it is now a few weeks ahead of that schedule.

In contrast to HBM3E, which currently dominates in AI accelerators and is technically the fifth generation of High Bandwidth Memory, HBM4 is expected to be at least twice as fast. The data rate per pin rises only moderately, but the bus width per stack doubles: from 1024 bits with HBM3E to 2048 bits with HBM4. How far the HBM manufacturers Micron, Samsung and SK Hynix could push the per-pin rate had previously been unclear. Only SK Hynix has now made a concrete announcement.

The JEDEC standardization committee specifies 8 Gbit/s per pin for HBM4. According to a statement from SK Hynix, however, "over 10 Gbit/s" can be achieved in mass production; the company does not give a more precise figure. Even at exactly 10 Gbit/s, that works out to over 2.5 TByte/s for the entire chip stack. The first SK Hynix samples six months ago consisted of 12 layers of 24 Gbit each, i.e. 36 gigabytes of capacity per stack.
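The per-stack figures follow directly from the interface width and per-pin rate given above; a minimal back-of-the-envelope sketch (decimal units, as customary for bandwidth, and assuming exactly 10 Gbit/s per pin):

```python
# HBM4 per-stack numbers from the article
PINS_PER_STACK = 2048   # bus width in bits (HBM3E: 1024)
GBIT_PER_PIN = 10       # "over 10 Gbit/s" per pin, per SK Hynix

# Aggregate bandwidth of one stack
gbit_per_s = PINS_PER_STACK * GBIT_PER_PIN  # 20480 Gbit/s
tbyte_per_s = gbit_per_s / 8 / 1000         # bits -> bytes, G -> T

# Capacity of one stack: 12 DRAM layers of 24 Gbit each
LAYERS, GBIT_PER_LAYER = 12, 24
gbyte_per_stack = LAYERS * GBIT_PER_LAYER / 8

print(tbyte_per_s, gbyte_per_stack)  # 2.56 36.0
```

At exactly 10 Gbit/s per pin the stack delivers 2.56 TByte/s, consistent with the "over 2.5 TByte/s" in the text.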

With just four of these stacks, a future GPU could be equipped with 144 gigabytes of RAM that is more than twice as fast as before and, according to SK Hynix, 40 percent more energy efficient. The power and heat budget saved could then be allocated to the GPU itself.
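The 144-gigabyte figure is simple arithmetic on the sampled stack configuration; a short sketch, assuming four stacks of the 36-GByte, 2.56-TByte/s parts described above:

```python
# Per-stack values from the sampled 12-layer configuration
GBYTE_PER_STACK = 36       # 12 layers x 24 Gbit
TBYTE_S_PER_STACK = 2.56   # 2048 pins x 10 Gbit/s

stacks = 4
total_capacity = stacks * GBYTE_PER_STACK      # GByte on the GPU package
total_bandwidth = stacks * TBYTE_S_PER_STACK   # aggregate TByte/s

print(total_capacity, total_bandwidth)  # 144 10.24
```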


The greatest demand for high capacity and, above all, high bandwidth comes from AI applications, where fast GPU memory is essential for large models. According to unconfirmed reports from South Korea, Nvidia in particular is said to have repeatedly pushed DRAM manufacturers for especially high bandwidth for its "Rubin" GPU architecture, expected next year. Expectations are so high that SK Hynix shares briefly rose by over seven percent after the announcement and have continued to climb since.

It also became clear this week that the increased bandwidth, together with more GPU computing power, will allow new approaches to inferencing. Nvidia announced a special design called Rubin CPX for the end of 2026. It is intended to handle both mixture-of-experts models and contexts of different lengths within a single system. Background can be found in our report on Rubin CPX.

(nie)


This article was originally published in German. It was translated with technical assistance and editorially reviewed before publication.