Supercomputers: HPE Discovery to follow Top500 runner-up Frontier in 2027

The next generation of exascale supercomputers will arrive from late 2027 and will feature AMD Epyc server CPUs of the "Venice" generation and AMD's Instinct MI430X accelerators.

Artistic representation of the Discovery supercomputer

(Image: HPE/ORNL)

The Discovery supercomputer, procured through a tender worth around 500 million US dollars, will be housed at Oak Ridge National Laboratory (ORNL) and, like its predecessor Frontier, will be built by HPE. The submission deadline for the OLCF-6 project passed just over a year ago, and HPE has now been named the winner of the tender.

The new supercomputer is expected to be operational in late 2027 or early 2028. Like Frontier, currently the runner-up on the Top500 list, it will pair AMD main processors with AMD accelerators, in this case the Instinct MI430X AI accelerator. The exact number of units to be installed is not yet known. Interestingly, an APU design apparently does not offer sufficient advantages here either, so the classic approach of separate CPUs and accelerators is being pursued. Nevertheless, we expect a successor to the MI300A APU to arrive with the Zen 6 architecture, perhaps as the MI400A.

Update

AMD has since advised that Discovery is scheduled for delivery in 2028 and will go into operation in 2029. The Lux AI cluster, however, will be delivered within the next six months.

What is certain, however, is that Discovery will be based on the HPE Cray Supercomputing GX5000 and will include an AI-optimized storage system (HPE Cray Supercomputing Storage Systems K3000). This system is all-flash and, according to the manufacturer, is the first commercially produced storage system based on Distributed Asynchronous Object Storage (DAOS).

DAOS was previously used in Intel's Aurora supercomputer. There, the 230 PByte storage system already achieved a transfer rate of more than 31 TByte/s, whereas the Lustre storage system "Eagle" managed at least 650 GByte/s for large amounts of data. Compared to the E2000 storage rack, the K3000 raises IOPS performance from 54 million to 75 million.
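To put these storage figures into proportion, here is a minimal sketch of the arithmetic. It uses only the numbers quoted above and assumes decimal (SI) units and that the quoted rates are sustained; the variable names are made up for illustration. Under those assumptions, the DAOS system is roughly 48 times faster than Eagle, a full pass over 230 PByte would take about two hours, and the IOPS gain comes to about 39 percent.

```python
# Figures quoted in the article. Decimal (SI) units and sustained rates
# are assumptions made for this illustration.
AURORA_DAOS_CAPACITY_PB = 230   # PByte
AURORA_DAOS_RATE_TBPS = 31      # "more than 31 TByte/s"
EAGLE_LUSTRE_RATE_GBPS = 650    # "at least 650 GByte/s"
OLD_MIOPS = 54                  # million IOPS, E2000 storage rack
NEW_MIOPS = 75                  # million IOPS, the higher figure quoted

# How many times faster the DAOS system is than the Lustre system.
bandwidth_ratio = AURORA_DAOS_RATE_TBPS * 1000 / EAGLE_LUSTRE_RATE_GBPS

# Time for one full pass over the whole capacity at the quoted rate.
full_pass_hours = AURORA_DAOS_CAPACITY_PB * 1000 / AURORA_DAOS_RATE_TBPS / 3600

# Relative IOPS improvement in percent.
iops_gain_percent = (NEW_MIOPS / OLD_MIOPS - 1) * 100

print(f"DAOS vs. Lustre bandwidth: {bandwidth_ratio:.1f}x")
print(f"Full pass over 230 PByte:  {full_pass_hours:.2f} h")
print(f"IOPS gain:                 {iops_gain_percent:.1f} %")
```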

The fact that HPE is bringing this technology to the mass market is also due to the entire DAOS team moving from Intel to HPE. The K3000 racks will be based on ProLiant server solutions, although HPE did not want to specify them more precisely in advance.

The new server rack generation relies on improved liquid cooling and a more compact design. The new cooling system covers not only CPUs, accelerators, and storage, but also virtually all components that generate significant heat, including the network infrastructure. This allowed HPE to reduce the width of each compute and cooling rack from 2.1 meters to 1.35 meters, with the aim of fitting 25 percent more racks into existing spaces. Each cabinet accommodates hardware with an electrical power of 400 to 600 kW. Currently, a maximum of about 150 kW is common.
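The rack figures can be checked with a short calculation. The sketch below uses only the numbers quoted above; the variable names are made up for illustration. Width alone would allow roughly 56 percent more racks in the same row length, noticeably more than the 25 percent HPE cites. The article does not explain the difference; presumably other floor-space constraints play a role, but that is an assumption.

```python
# Figures quoted in the article; everything derived from them is illustrative.
OLD_RACK_WIDTH_M = 2.1
NEW_RACK_WIDTH_M = 1.35
TYPICAL_CABINET_KW = 150           # "a maximum of about 150 kW is common"
NEW_CABINET_KW_RANGE = (400, 600)  # per cabinet in the new generation

# How much narrower the new rack is, in percent.
width_saving_percent = (1 - NEW_RACK_WIDTH_M / OLD_RACK_WIDTH_M) * 100

# How many new racks fit into the width of one old rack.
racks_per_old_width = OLD_RACK_WIDTH_M / NEW_RACK_WIDTH_M

# Power per cabinet relative to today's typical maximum.
power_factor_low = NEW_CABINET_KW_RANGE[0] / TYPICAL_CABINET_KW
power_factor_high = NEW_CABINET_KW_RANGE[1] / TYPICAL_CABINET_KW

print(f"Rack width reduced by {width_saving_percent:.1f} %")
print(f"New racks per old rack width: {racks_per_old_width:.2f}")
print(f"Cabinet power: {power_factor_low:.2f}x to {power_factor_high:.2f}x of 150 kW")
```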

Furthermore, HPE was able to increase the possible power consumption per compute blade from 11 to over 25 kilowatts. The water no longer needs to be cooled down to 32 degrees Celsius before re-circulation, but only to 40 degrees Celsius. This increases the energy efficiency of the entire system – also a requirement of the OLCF-6 tender and, according to HPE, an important point especially for European customers.

Moreover, the new cooling system now also allows for mixed configurations of individual cabinets with compute blades of different (electrical) power, as the flow rate per compute blade can be regulated separately.

According to HPE, the first systems based on the GX5000 are expected in early 2027.

Somewhat less exciting is the ORNL AI cluster "Lux", which has also been approved. It uses technology that is already available, including AMD Epyc CPUs, Pensando network cards (also from AMD), and MI355X accelerator cards. Lux will use HPE's ProLiant Compute Server XD685.

That Lux is not designed as a groundbreaking system is already evident from its announcement. While Discovery is intended to open up new scientific horizons, Lux is merely described as giving more scientists access to specialized AI resources.

(csp)

This article was originally published in German. It was translated with technical assistance and editorially reviewed before publication.