Huawei counters Nvidia with sheer chip numbers
With a sheer mass of Ascend 910C accelerators, Huawei is building servers that are said to be faster than Nvidia's Blackwell systems in some cases.
Huawei has been making headlines since last week with its new Cloudmatrix 384 server system. In terms of raw performance, the Cloudmatrix 384 is said to surpass Nvidia's current top systems such as the GB200 NVL72. The number of AI accelerators used also gives the system its name: in this case, 384 Ascend 910C chips.
The well-connected website SemiAnalysis compares the two systems. A single Ascend 910C is considerably slower than Nvidia's Blackwell accelerator; Huawei achieves the high raw performance through the sheer number of AI accelerators. In the BFloat16 (BF16) data format, for example, 384 Ascend 910Cs are said to deliver 300 quadrillion computing operations per second (300 petaflops). Nvidia's GB200 NVL72 stands at 180 petaflops in BF16, but reaches far higher throughput in narrower data formats such as FP8, FP4, and INT8. For four-bit operations, for example, a GB200 NVL72 achieves 1.44 exaflops (1,440 petaflops) according to the data sheet.
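As a rough back-of-envelope sketch of the per-chip gap, the system figures above can be broken down per accelerator. Note that the split into 72 Blackwell GPUs for the GB200 NVL72 is an assumption based on the product name, not a figure from the article:

```python
# Per-accelerator comparison derived from the system-level BF16 figures quoted above.
# Assumptions: 384 Ascend 910C per Cloudmatrix 384, 72 Blackwell GPUs per GB200 NVL72.

cloudmatrix_bf16_pflops = 300   # whole system, BF16
nvl72_bf16_pflops = 180         # whole system, BF16

ascend_per_chip = cloudmatrix_bf16_pflops / 384    # roughly 0.78 PFLOPS per Ascend 910C
blackwell_per_gpu = nvl72_bf16_pflops / 72         # roughly 2.5 PFLOPS per Blackwell GPU

print(f"Ascend 910C:   {ascend_per_chip:.2f} PFLOPS (BF16)")
print(f"Blackwell GPU: {blackwell_per_gpu:.2f} PFLOPS (BF16)")
print(f"Per-chip gap:  {blackwell_per_gpu / ascend_per_chip:.1f}x")
```

Under these assumptions, a single Blackwell GPU ends up roughly three times faster than a single Ascend 910C in BF16, which is why Huawei needs more than five times as many accelerators to pull ahead at the system level.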
Complex networking and high power consumption
The high number of accelerators makes networking particularly demanding. Huawei relies on 400 Gbit/s optical links so that the chips can jointly train AI models. The hardware is spread across a total of 16 racks: twelve for the compute accelerators and four for the network switches.
In addition to the high complexity, the setup has another major drawback: electrical power consumption. A Cloudmatrix 384 system is said to draw around 560 kilowatts, almost four times as much as a GB200 NVL72. That in turn makes cooling the components more difficult.
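A similarly rough efficiency estimate follows from these figures. The roughly 145 kilowatts for the GB200 NVL72 is not a data-sheet value here; it is derived from the "almost four times" comparison and should be read as an assumption:

```python
# Rough BF16 efficiency estimate from the figures in the article.
# The NVL72 power draw is inferred from "almost four times" 560 kW, not an official number.

cloudmatrix_kw = 560
nvl72_kw = 560 / 3.9                                    # roughly 145 kW, an assumption

cloudmatrix_eff = 300e6 / (cloudmatrix_kw * 1000)        # BF16 GFLOPS per watt
nvl72_eff = 180e6 / (nvl72_kw * 1000)

print(f"Cloudmatrix 384: {cloudmatrix_eff:.0f} GFLOPS/W (BF16)")
print(f"GB200 NVL72:     {nvl72_eff:.0f} GFLOPS/W (BF16)")
```

On these assumptions, Nvidia's system delivers roughly twice the BF16 throughput per watt, which illustrates why the Cloudmatrix 384's raw-performance lead comes at a steep operating cost.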
Only possible with foreign help
Moreover, the Cloudmatrix 384 is not a purely Chinese answer to Nvidia. Most of the compute dies for the Ascend 910C are said to come from the Taiwanese contract chipmaker TSMC. Huawei allegedly circumvented TSMC's shipment halt for a while via an intermediary company and stockpiled several million chiplets. The Chinese contract chipmaker SMIC can also produce the chiplets with 7-nanometer technology, but presumably only in comparatively small quantities so far.
The stacked High-Bandwidth Memory (HBM2e) apparently comes from Samsung. The USA did not impose export restrictions on HBM shipments to China until the end of 2024. According to SemiAnalysis, however, Chinese companies are creative in circumventing trade restrictions.
The restrictions apply to HBM sold on its own and to fast AI accelerators. To get around them, exporters allegedly combine slow, cheap chips with HBM2e on a carrier and sell the packages to China, where the memory stacks are then desoldered and reused.
Some of the development tools and chemicals are also said to come from abroad.
(mma)