15,000 watts: AI accelerators' thirst for power continues to rise rapidly
Korean chip researchers explain their roadmap for ever faster HBM memory and more powerful AI computing chips.
SK Hynix explains technical improvements to several generations of High Bandwidth Memory (HBM).
(Image: SK Hynix)
In around ten years, AI accelerator modules for data centers are expected to consume up to 15,000 watts of power. That is the expectation of researchers at the Terabyte Interconnection and Package Laboratory (Teralab) at the Korea Advanced Institute of Science and Technology (KAIST).
Of these 15 kW, just under 10 kW will presumably go to eight AI processor chiplets, each consuming 1.2 kW. The remaining roughly 5 kW will be needed by 32 memory chip stacks, each consisting of 24 individual DRAM dies with a capacity of 80 gigabits apiece. This seventh-generation High Bandwidth Memory (HBM7) is intended to provide a total of 6 TByte of AI memory with a data transfer rate of around 1 petabyte per second (PByte/s).
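A quick back-of-envelope check of these figures, sketched in Python: the input values are those cited above, while the per-stack breakdown is simple arithmetic rather than part of the KAIST roadmap.

```python
# Back-of-envelope check of the projected HBM7-era power budget.
# Input figures are taken from the KAIST Teralab projection cited above;
# the derived per-stack values are plain arithmetic, not roadmap data.

gpu_chiplets = 8
watts_per_chiplet = 1200            # W per AI processor chiplet
hbm_stacks = 32
total_module_power = 15_000         # W for the complete accelerator module

gpu_power = gpu_chiplets * watts_per_chiplet        # 9600 W, "just under 10 kW"
memory_power = total_module_power - gpu_power       # ~5.4 kW left for the memory side
power_per_stack = memory_power / hbm_stacks         # ~170 W per HBM7 stack

bandwidth_total = 1e15                              # ~1 PByte/s aggregate bandwidth
bandwidth_per_stack = bandwidth_total / hbm_stacks  # ~31 TByte/s per stack

print(f"GPU chiplets: {gpu_power / 1000:.1f} kW")
print(f"Memory budget: {memory_power / 1000:.1f} kW, about {power_per_stack:.0f} W per stack")
print(f"Bandwidth per stack: {bandwidth_per_stack / 1e12:.0f} TByte/s")
```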
AI accelerators with a power consumption of around 15 kW already exist as a special solution, namely Cerebras' Wafer Scale Engines. However, these are fundamentally different from the more common AI accelerators from Nvidia, AMD and other companies.
The current HBM roadmap of the KAIST Teralab does not aim to precisely predict the release dates of future AI accelerators. Rather, such a roadmap lays out foreseeable technical challenges and potential solutions. In this sense, the HBM roadmap is an estimate of how the capacity and data transfer rate of DRAM will develop, which new chip-packaging technologies will emerge, and roughly how much power the combined chips can be expected to consume.
From this, the researchers in turn deduce which cooling methods are required. Some will have to be newly developed to achieve the targeted packing densities.
(Image: KAIST Teralab)
Chiplet puzzle
The researchers at KAIST Teralab use Nvidia's roadmap for AI accelerators as a basis. Nvidia already nearly exhausts the maximum size that lithography systems can expose for a single chip. Experts expect this “reticle limit” to shrink somewhat in the future, possibly due to the restrictions of high-NA EUV lithography.
Future generations of AI accelerators will then no longer consist of just two GPU chiplets (2025: Blackwell/B200, 2026: Rubin/R200), but of four from 2028/2029 onward (Feynman, F400) and perhaps even eight in ten years (2035).
The power consumption per GPU chiplet will increase from 800 to 1200 watts during this time.
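Taken together with the chiplet counts above, the compute dies alone grow roughly sixfold over the decade; a minimal sketch, with the caveat that pairing 800 watts with 2025 and 1200 watts with 2035 is our reading of the roadmap's endpoints:

```python
# GPU-side package power at the endpoints of the projection.
# Chiplet counts and wattages as quoted above; pairing 800 W with 2025 and
# 1200 W with 2035 is our reading of "from 800 to 1200 watts", not roadmap data.

power_2025 = 2 * 800      # Blackwell class: two chiplets   -> 1600 W
power_2035 = 8 * 1200     # 2035 projection: eight chiplets -> 9600 W
print(power_2035 / power_2025)   # 6.0 -> a sixfold increase for the compute dies alone
```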
HBM plan
(Image: KAIST Teralab)
To supply each GPU chiplet with enough data quickly enough, the capacity and speed of HBM must increase. From HBM via HBM2 and HBM3 to the current HBM3E, this was achieved by turning several established knobs: more capacity per chip, more chips per stack (which therefore have to be ground ever thinner), and higher clock frequencies, which in turn require the supply and data signal voltages to be lowered to keep power consumption in check. The demands on signal processing also grow when more and more chips hang on a single line despite ever shorter clock cycles.
HBM4 additionally doubles the number of data signal lines per stack from 1024 to 2048. This requires changes to the memory controllers in the GPU chips, to the number of connections per GPU chip, and to the silicon interposers on which the chiplets are mounted and in which the connecting lines run.
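To illustrate what the wider interface buys: per-stack bandwidth is simply interface width times per-pin data rate. The per-pin rates in this small sketch are commonly cited values for HBM3E and the initial HBM4 specification, not figures from the KAIST roadmap.

```python
# Per-stack bandwidth as interface width times per-pin data rate.
# The per-pin rates (9.6 Gbit/s for HBM3E, 8 Gbit/s for the HBM4 baseline)
# are commonly cited values, not figures from the KAIST roadmap.

def stack_bandwidth_tbyte_s(io_lines: int, gbit_per_pin: float) -> float:
    """Per-stack bandwidth in TByte/s: lines x Gbit/s per line, converted to bytes."""
    return io_lines * gbit_per_pin / 8 / 1000

print(stack_bandwidth_tbyte_s(1024, 9.6))   # HBM3E: ~1.2 TByte/s per stack
print(stack_bandwidth_tbyte_s(2048, 8.0))   # HBM4:  ~2.0 TByte/s per stack
```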
In addition, the number of HBM stacks per GPU keeps growing: instead of four, there will soon be eight, and later 16 or 32.
Hot stacks
According to KAIST Teralab, a current HBM3E stack with eight or twelve layers of 24-gigabit chips (i.e., 24 or 36 GByte of capacity) already converts up to 32 watts into waste heat. For HBM4 with the same capacity but twice the speed, the figure rises to 43 watts, and with 48 GByte it reaches as much as 75 watts.
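Scaled up to the stack counts on the roadmap, the memory alone becomes a kilowatt-class heat source; a minimal sketch using the per-stack figures above, where combining them with future stack counts is our own extrapolation:

```python
# Waste heat of the HBM subsystem for different stack counts, using the
# per-stack figures quoted above; combining them with future stack counts
# is our extrapolation, not part of the roadmap.

configs = [
    ("HBM3E, 8 stacks", 32, 8),
    ("HBM4,  8 stacks", 75, 8),
    ("HBM4, 16 stacks", 75, 16),
]
for label, watts_per_stack, stacks in configs:
    print(f"{label}: {watts_per_stack * stacks} W")
# 256 W, 600 W and 1200 W -- before a single GPU chiplet is counted.
```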
The stacking methods must therefore improve not only the packing density but also the heat dissipation.
The experts at KAIST Teralab present version 1.7 of their HBM roadmap on YouTube, and a PDF version of the roadmap is also available on Google Drive.
(ciw)