Computex

Nvidia GB200 NVL2: Rack server for large AI models

Nvidia's Blackwell accelerators are now available in a dual-chip rack insert with up to 40 petaflops. Rubin, with HBM4 memory, is to follow in 2026.


[Image: Jensen Huang with a GB200 module]

This article was originally published in German and has been automatically translated.

Nvidia CEO Jensen Huang presented another building block of the Blackwell accelerator family in his keynote ahead of Computex. Positioned below the GB200 NVL72 server rack introduced in March 2024, Nvidia is offering the GB200 NVL2, a rack insert with two GB200 processors for AI model training. The unit is built in the MGX format and therefore fits into existing data centers.

Each of the two soldered-in GB200s consists of a Blackwell dual chip and a Grace CPU with 72 ARM cores. According to Nvidia, the overall system is said to achieve a computing power of 40 petaflops, although this figure applies to the FP4 data type, a 4-bit precision format used in AI applications.
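To give a feel for how coarse 4-bit floating point is, the following minimal sketch enumerates the values representable in an E2M1 layout (1 sign bit, 2 exponent bits with bias 1, 1 mantissa bit), as defined in the OCP Microscaling (MXFP4) specification. This is an illustration under that assumption, not a confirmed description of Nvidia's internal FP4 format.

```python
def fp4_e2m1_value(bits: int) -> float:
    """Decode a 4-bit E2M1 pattern (0..15) to its real value."""
    sign = -1.0 if (bits >> 3) & 1 else 1.0
    exp = (bits >> 2) & 0b11
    man = bits & 0b1
    if exp == 0:
        # Exponent field 0 is subnormal: no implicit leading 1
        return sign * man * 0.5
    return sign * (1.0 + man * 0.5) * 2.0 ** (exp - 1)

# All 16 bit patterns collapse to just 15 distinct values (+/-0 coincide):
values = sorted({fp4_e2m1_value(b) for b in range(16)})
print(values)  # [-6.0, -4.0, -3.0, -2.0, -1.5, -1.0, -0.5, 0.0, 0.5, ... 6.0]
```

The entire number line of FP4 consists of eight magnitudes between 0 and 6, which is why FP4 throughput figures are roughly double the FP8 figures for the same silicon: half the bits per operand, twice the operations per cycle.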

The rack server offers a total of 1.3 TByte of RAM, 384 GByte of which is attached to the Blackwell chips as fast HBM3e memory. The HBM3e throughput is 8 TByte/s per chip. The two GB200s communicate with each other via NVLink at 2 x 900 GByte/s, which makes the GB200 NVL2 suitable for training large models for generative AI applications.
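A quick back-of-the-envelope check puts these figures in relation: how long one chip needs to stream its local HBM3e once, and how the NVLink coupling compares. The bandwidth numbers are taken from the article; the even split of the 384 GByte across the two chips is an assumption for illustration.

```python
# Figures from the article; even capacity split across the two GB200s assumed
HBM3E_PER_CHIP_GB = 384 / 2   # assumed: 192 GByte local to each chip
HBM3E_BW_TBPS = 8.0           # 8 TByte/s HBM3e throughput per chip
NVLINK_BW_TBPS = 2 * 0.9      # 2 x 900 GByte/s between the GB200s

full_sweep_s = (HBM3E_PER_CHIP_GB / 1000) / HBM3E_BW_TBPS
print(f"One full pass over local HBM3e: {full_sweep_s * 1000:.0f} ms")  # ~24 ms
print(f"Local HBM3e is {HBM3E_BW_TBPS / NVLINK_BW_TBPS:.1f}x faster than NVLink")  # ~4.4x
```

The ratio shows why keeping model shards local to a chip matters even with fast NVLink: cross-chip traffic is still several times slower than local memory.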

Finally, Jensen Huang gave an outlook on upcoming AI accelerator generations. The improved Blackwell Ultra with 12 HBM3e stacks is set to follow in 2025, before the Rubin platform with HBM4 memory arrives in 2026. Rubin will also include the ARM CPU Vera and new network chips to connect the high-bandwidth computing accelerators.

In a short roadmap, Jensen Huang gave an outlook up to the year 2027, ending with Rubin Ultra.

(chh)