re:Invent 2025: AWS Ignites Next Level in Custom Silicon and AI Hardware
At the re:Invent conference, AWS presents a parade of new hardware and offers a preview of the upcoming Graviton5 processors.
(Image: Jens Söldner/heise Medien)
- Jens Söldner
At this year's re:Invent in Las Vegas, Amazon Web Services (AWS) flexed its infrastructure muscles and announced a massive overhaul of its hardware portfolio. The common thread in the hardware keynotes was clear: specialization. Instead of "one-size-fits-all," AWS is increasingly relying on custom-built chips for specific tasks – from in-house AI development and high-frequency computing to specialized Apple environments.
The lineup begins with in-house developments from Annapurna Labs, which now form the backbone of AWS's efficiency strategy. With the new EC2 M9g instances, AWS offers a first glimpse into the performance of its Graviton5 processors. These general-purpose instances, currently available as a preview, promise a significant performance leap of up to 25 percent compared to the recently established Graviton4 generation.
AWS has massively increased packing density, now fitting up to 192 physical cores onto a single socket, flanked by a fivefold increase in L3 cache. For customers, this means not only more computing power but also potentially lower operating costs due to increased energy efficiency.
Ultra Servers with Trainium3 Architecture
While Graviton handles the bread-and-butter workloads, AWS is targeting the booming market for generative AI training with its Trainium3 architecture. The EC2 Trn3 Ultra Servers, now generally available, mark a technological milestone: for the first time, the chips are manufactured on a 3-nanometer process. AWS has gone all out here: a single Ultra Server bundles the computing power of up to 144 Trainium3 chips. Compared to the previous Trn2 generation, performance increases 4.4-fold, making these systems the preferred choice for training massive large language models (LLMs).
Of course, AWS continues to rely on market leader NVIDIA in the AI space. The partnership has been reinforced with the immediate availability of the EC2 P6e-GB300 UltraServers. These are based on NVIDIA's GB300 NVL72 platform (Blackwell architecture) and are specifically optimized for inference – i.e., the execution – of AI models. With one and a half times the GPU memory and FP4 compute power of the GB200 predecessors, these servers address the challenge of running trillion-parameter models in real time and cost-effectively.
Beyond AI accelerators, there were also important updates for the classic x86 architecture, with AMD's 5th generation EPYC processors ("Turin") playing a prominent role. For compute-intensive standard tasks, the EC2 C8a instances are now available, delivering around 30 percent more performance than the C7a series.
Computing Power for the Niche
However, the new M8azn instances (preview) are more exciting for niche applications. These are tuned for maximum speed, reaching up to 5 GHz – the highest clock frequency currently available in the cloud. Each vCPU in an M8a or M8azn instance corresponds to a physical CPU core: AWS deliberately omits Simultaneous Multithreading (SMT) here to guarantee extremely low and constant latencies – a critical feature for high-frequency trading or multiplayer game servers. The AMD portfolio is rounded out by the new X8aedz instances, which combine high clock speeds with enormous amounts of RAM, specifically targeting memory-intensive Electronic Design Automation (EDA) workloads and large relational databases.
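Whether SMT is exposed can be checked directly on a running Linux instance. A quick illustrative check (standard `lscpu` output, not an AWS-specific procedure):

```shell
# Inspect threads per core as reported by the kernel.
# On instance types without SMT, such as M8azn, this is expected to read 1;
# on typical SMT-enabled x86 instances it reads 2.
lscpu | grep -i 'thread(s) per core'
```

On a 1:1 vCPU-to-core instance, schedulers and latency-sensitive applications can treat every vCPU as a full physical core without accounting for sibling-thread contention.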
Intel also remains an important partner, especially for memory-hungry enterprise applications. The EC2 X8i instances, presented in preview, utilize Intel Xeon 6 processors and are primarily aimed at operators of in-memory databases like SAP HANA. They offer 50 percent more memory capacity and significantly higher memory bandwidth than the X2i generation, minimizing data bottlenecks.
Rounding off the hardware parade is an offering for the Apple ecosystem. With the EC2 M4 Max Mac instances (Preview), AWS integrates the performance of current Mac Studio hardware into the cloud. Developers benefit from the M4 Max architecture, which offers double the number of GPU cores and more than two and a half times the Unified Memory compared to the Pro models. This likely drastically reduces build times for complex iOS and macOS apps.
(mki)