Cisco Live 2025: New server model for data center AI and cloud management options

Cisco wants to serve the AI trend in data centers with a new server model and concepts for its own AI cluster.



By Jens Söldner and Marco Brinkmann

In front of around 17,000 participants at the Amsterdam edition of its in-house exhibition "Cisco Live", network specialist Cisco presented a whole series of announcements relating to AI clusters. It began with the new UCS C845A M8, a four-rack-unit (4U) server optimized for AI workloads, which the manufacturer intends for retrieval-augmented generation (RAG) and inference tasks.

The server is based on Nvidia's modular MGX reference design and can be flexibly equipped with two, four, six, or a maximum of eight Nvidia GPUs. This flexibility is aimed at customers who cannot yet foresee how many GPUs their AI projects will actually need and therefore want the option to retrofit.

In addition to generative AI, the manufacturer also positions the server for render farms and demanding VDI tasks. According to the data sheet, customers can currently choose between Nvidia's GPU models H100 NVL, H200 NVL, and L40S, with upgrades to future GPUs planned. Systems based on Nvidia's H100/H200 NVL cards also come with a five-year license for Nvidia AI Enterprise, a software suite of AI frameworks and tools curated by Nvidia.

On the CPU side, the server is based on AMD's Turin series and, in the largest configuration, can currently use up to two AMD Epyc 9655 processors with 96 cores each. The maximum main memory configuration is 32 × 96 GB DDR5 modules. If required, the server can be equipped with SmartNICs from Nvidia to secure network traffic with additional security offloading through data processing units.
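As a quick sanity check, the maximum configuration stated above works out to 192 cores and 3 TiB of RAM; the figures below simply restate the article's numbers:

```python
# Maximum configuration of the UCS C845A M8 as described in the text.
cpus = 2
cores_per_cpu = 96   # AMD Epyc 9655
dimms = 32
dimm_size_gb = 96    # DDR5 modules

total_cores = cpus * cores_per_cpu
total_memory_gb = dimms * dimm_size_gb

print(f"{total_cores} cores, {total_memory_gb} GB RAM "
      f"({total_memory_gb / 1024:.1f} TiB)")
# 192 cores, 3072 GB RAM (3.0 TiB)
```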

Setting up a network infrastructure suitable for AI workloads is complicated: the design of the infrastructure is usually quite complex, components from different manufacturers have to work together smoothly and high demands are placed on the bandwidths in the network and storage system. The manufacturer offers customers who want to rely on Cisco products three basic options: “Product”, “Solution” or “Consumption”.

With the first option ("Product"), customers procure the necessary components and install the environment according to their own design, together with a knowledgeable partner. Those who want a more structured approach can take the "Solution" route: Cisco provides the "AI Pod", a detailed guide based on Cisco Validated Designs – comparable to a lengthy IKEA assembly plan – which is known to work well. The implementation, however, is still up to the customer. Cisco has introduced similar concepts in the virtualization environment before, such as the FlexPod architecture.

The third option, which will also become available for AI infrastructures later this year, is intended for customers who want to keep the operating expenses of their environment as low as possible: Cisco's Nexus HyperFabric AI Cluster. The manufacturer already announced it at its in-house exhibition in the USA in the summer of 2024. So far, however, only the non-AI-capable HyperFabric is available; the variant for AI infrastructure is expected in mid-to-late 2025.

The Nexus HyperFabric AI Cluster is a service developed together with Nvidia. The offering combines Cisco server hardware based on Nvidia's MGX architecture, Ethernet switches and fiber optics from Cisco, Nvidia's GPUs and DPUs, the Nvidia AI Enterprise software package, and the VAST Data Platform as a storage and database backend. On top of this come hypervisor and container management platforms such as VMware, Nutanix, or Red Hat OpenShift.

The management of the overall system resembles Cisco's Meraki approach: the hardware is owned by the customer and located on their premises, while the management interface is delivered from the cloud – currently from AWS in the US.


HyperFabric for AI is arguably Cisco's attempt to bridge the gap between the economies of scale of cloud outsourcing and a secure environment under the customer's own control. To increase acceptance of this approach in the German market, however, a "sovereign" instance of the management environment hosted in Germany would make sense. The regular HyperFabric service for managing conventional (non-AI) infrastructures was announced at Cisco Live Amsterdam as generally available.

(mki)


This article was originally published in German. It was translated with technical assistance and editorially reviewed before publication.