HLRS Hunter: First German supercomputer with AMD's giant APU MI300A

The Hunter supercomputer is ready for use in Stuttgart. It primarily uses AMD's Instinct MI300A combined processor.

Image of the Hunter from the HLRS

(Image: Julian Holzwarth / HLRS)

Everything on schedule: The High Performance Computing Center Stuttgart (HLRS) inaugurates its Hunter supercomputer. It is now available to German researchers and corporate partners, for example for weather and climate modeling, biomedical research, materials science and engineering simulations.

With a peak computing power of 48 petaflops, Hunter is fairly small compared to other modern supercomputers. It is interesting from a technical perspective, however: most of the computing power comes from nodes without separate CPUs. Instead, the system relies on AMD's Instinct MI300A, which combines CPU cores, a GPU accelerator and high-bandwidth memory (HBM3) in a single package.

Hunter is intended as a transitional system to the Herder exascale supercomputer, which is due to go online in 2027.

The bulk of the computing power is provided by 752 MI300A accelerators. That is more than we had estimated, and at 48 petaflops Hunter is also faster than originally announced. The figure refers to demanding FP64 (double-precision) calculations. AI algorithms that use INT8 and similar data formats run much faster.

Each accelerator integrates 24 Zen 4 cores and 228 compute units. Because the chips are not designed for 3D rendering, AMD calls the 14,592 processing elements inside these compute units stream processors rather than shaders, the term used for graphics cards. In addition, there are 128 GBytes of HBM3 with a bandwidth of 5.3 TByte/s. A large amount of fast memory is particularly useful for AI algorithms, for which the system is explicitly designed.

In addition to the Instinct hardware, HLRS also operates CPU-only nodes in Hunter: 512 AMD Epyc 9374F processors with a combined 16,384 cores sit in 256 nodes. Each processor has only 32 cores, but a 256 MByte level 3 cache. This makes the CPUs particularly suitable for latency-critical applications. 768 GByte of DDR5-4800 RAM per node round off the system.
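The hardware figures quoted above can be cross-checked with a little arithmetic. The following Python sketch uses only numbers from this article; the constant and function names are our own, and the derived totals (for example, the combined HBM3 capacity) are simple products that HLRS has not published in this form.

```python
# Figures quoted in the article
MI300A_COUNT = 752
COMPUTE_UNITS_PER_MI300A = 228
STREAM_PROCESSORS_PER_MI300A = 14_592
ZEN4_CORES_PER_MI300A = 24
HBM3_GB_PER_MI300A = 128

CPU_NODES = 256
EPYC_COUNT = 512
CORES_PER_EPYC = 32
DDR5_GB_PER_NODE = 768


def hunter_totals() -> dict:
    """Derive per-unit ratios and system-wide totals from the quoted figures."""
    return {
        # 14,592 stream processors spread over 228 compute units
        "stream_processors_per_cu": STREAM_PROCESSORS_PER_MI300A // COMPUTE_UNITS_PER_MI300A,
        # Zen 4 cores and HBM3 summed over all MI300A accelerators
        "apu_zen4_cores": MI300A_COUNT * ZEN4_CORES_PER_MI300A,
        "apu_hbm3_gb": MI300A_COUNT * HBM3_GB_PER_MI300A,
        # CPU-only partition
        "epyc_per_node": EPYC_COUNT // CPU_NODES,
        "epyc_cores": EPYC_COUNT * CORES_PER_EPYC,
        "cpu_node_ddr5_gb": CPU_NODES * DDR5_GB_PER_NODE,
    }


if __name__ == "__main__":
    for key, value in hunter_totals().items():
        print(f"{key}: {value:,}")
```

The result confirms that the numbers are consistent: 64 stream processors per compute unit, two Epyc processors per CPU node, and 16,384 Epyc cores in total.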

HPE builds Hunter on the basis of its Cray EX4000 platform with liquid cooling. A Cray Clusterstor E2000 storage rack contains 2120 disks with a total capacity of 25 petabytes – typically a mix of HDDs and SSDs.

HLRS emphasizes that Hunter is almost twice as fast as the previous Hawk supercomputer. However, the facility compares Hunter only with Hawk's original main system, consisting of 8192 Epyc 7742 processors with 64 Zen 2 cores each, and excludes the retrofitted GPU cluster with 192 Nvidia A100s.

A welcome side effect: compared to the previous Hawk supercomputer, Hunter cuts power demand by 80 percent to 560 kilowatts. Beyond the efficiency gain from newer technology, the GPUs deliver more computing power per watt thanks to their massive parallelism.
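These two figures allow a rough back-of-the-envelope calculation. The sketch below assumes that the 80 percent reduction and the "almost twice as fast" claim refer to the same Hawk configuration, which the article does not confirm, and it rounds "almost twice" up to a factor of 2, so the efficiency gain is an upper bound. The function names are our own.

```python
def implied_previous_power_kw(new_kw: float, reduction_percent: float) -> float:
    """Power draw of the old system implied by a percentage reduction."""
    return new_kw * 100 / (100 - reduction_percent)


def perf_per_watt_gain(speedup: float, reduction_percent: float) -> float:
    """Factor by which performance per watt improves."""
    return speedup * 100 / (100 - reduction_percent)


if __name__ == "__main__":
    # 560 kW after an 80 percent reduction
    print(implied_previous_power_kw(560, 80))  # 2800.0
    # Speedup of 2 is an assumption standing in for "almost twice as fast"
    print(perf_per_watt_gain(2, 80))  # 10.0
```

Under these assumptions, Hawk would have drawn roughly 2.8 megawatts, and Hunter would deliver up to around ten times the computing power per watt.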

Dynamic power limitation also increases efficiency: many applications usually run in parallel on the supercomputer. In the case of a computationally intensive application, the processor or accelerator responsible is given the highest possible power budget. For memory-intensive but not computationally intensive applications, the chips are clocked down to save energy. On average, the supercomputer complies with the intended overall limit. According to HLRS, this reduces the electrical power consumption by around 20 percent, with negligible performance losses.
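The idea can be illustrated with a toy model. The following Python sketch is not the mechanism HLRS or HPE actually uses, which the article does not describe in detail; the class, the function and all wattages are hypothetical. Chips running memory-bound work receive a reduced power cap, and the budget freed up this way is split among the chips running compute-bound work, up to a per-chip maximum.

```python
from dataclasses import dataclass


@dataclass
class Chip:
    name: str
    compute_bound: bool  # True if the running application is compute-intensive


def assign_power_caps(chips, total_budget_w, max_w, reduced_w):
    """Return a power cap in watts per chip that stays within the total budget.

    Memory-bound chips get the reduced cap; the remaining budget is shared
    equally among compute-bound chips, limited to max_w each.
    """
    if total_budget_w < reduced_w * len(chips):
        raise ValueError("budget too small to give every chip the reduced cap")

    memory_bound = [c for c in chips if not c.compute_bound]
    compute_bound = [c for c in chips if c.compute_bound]

    caps = {c.name: reduced_w for c in memory_bound}
    remaining = total_budget_w - reduced_w * len(memory_bound)
    if compute_bound:
        share = min(max_w, remaining / len(compute_bound))
        for c in compute_bound:
            caps[c.name] = share
    return caps


if __name__ == "__main__":
    # Hypothetical example: four chips, 2000 W overall limit
    chips = [
        Chip("apu0", compute_bound=True),
        Chip("apu1", compute_bound=True),
        Chip("apu2", compute_bound=False),
        Chip("apu3", compute_bound=False),
    ]
    print(assign_power_caps(chips, total_budget_w=2000, max_w=600, reduced_w=400))
```

In this example, the two memory-bound chips are capped at 400 watts each, which leaves 600 watts apiece for the two compute-bound chips while the sum stays at the 2000-watt limit.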

(mma)

This article was originally published in German. It was translated with technical assistance and editorially reviewed before publication.