Meta's AI accelerators to overtake Nvidia & Co. in 2027
Four AI accelerators in two years: that is Meta's goal. In 2027, the MTIA 500 is set to arrive with an electrical power consumption of 1700 watts.
Meta's MTIA 400. Six silicon components apparently serve only to stabilize the package.
(Image: Meta)
The MTIA 300 AI accelerator is already in production use in Meta's data centers; now Meta is also introducing the MTIA 400, which has completed lab testing and is close to field deployment. And that's not all: as early as 2027, Meta wants to bring two more generations into its data centers and, in some application scenarios, outperform AI accelerators from AMD and Nvidia.
Although MTIA stands for “Meta Training and Inference Accelerator”, Meta's announcement says it intends to focus on inference in the future. In inference, the chips execute already trained AI models, for example to answer users' chat requests. Meta identifies inference for generative AI as the most important application area for its chips; the MTIA 400 is intended to be the last “general-purpose” accelerator without this focus.
Chipletization with RISC-V
The MTIA 400 consists of a total of five chiplets plus four memory stacks of the High-Bandwidth Memory (HBM) type (Meta does not specify the generation). The two largest chiplets house the compute units. Each so-called Processing Element (PE) is built around two RISC-V cores for management. They execute code and offload certain tasks to specialized circuits via a command processor, but can also perform SIMD-like (Single Instruction Multiple Data) calculations themselves via their vector units.
Meta's AI accelerators through 2027 (4 images): MTIA 300
(Image: Meta)
In addition, there are matrix units (Dot Product Engines), Reduction Engines for accumulation calculations and communication with other PEs, as well as DMA Engines (Direct Memory Access) for data transfers. Each PE has a local cache, and all PEs share a common SRAM cache. These compute chiplets also contain the memory controllers for the HBM.
Two further chiplets contain network controllers for a total of twelve 800 Gbit/s connections, via which Meta interconnects up to 72 AI accelerators. A System-on-Chip die houses the PCI Express controllers and a higher-level Control Core Processor (CCP), made up of multiple RISC-V cores, which controls the entire AI accelerator.
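Condensed into a rough schematic, purely as an illustration of the description above (all class and field names are chosen here for clarity; details Meta has not disclosed, such as the number of PEs per compute chiplet, are left open):

```python
from dataclasses import dataclass, field
from typing import List

# Rough schematic of the MTIA 400 package as described in the article.
# Counts and roles mirror the text only; nothing here is Meta terminology
# beyond PE, DPE and CCP.

@dataclass
class ProcessingElement:
    riscv_management_cores: int = 2       # execute code, offload work via a command processor
    has_vector_units: bool = True         # SIMD-style math on the RISC-V cores themselves
    has_dot_product_engines: bool = True  # matrix units
    has_reduction_engine: bool = True     # accumulation and PE-to-PE communication
    has_dma_engine: bool = True           # direct memory access for data transfers

@dataclass
class ComputeChiplet:
    pes: List[ProcessingElement] = field(default_factory=list)  # PE count undisclosed
    shared_sram_cache: bool = True        # all PEs share a common SRAM cache
    hbm_controllers: bool = True          # memory controllers for the HBM stacks

@dataclass
class Mtia400Package:
    compute_chiplets: int = 2             # the two largest dies
    network_chiplets: int = 2             # together 12 x 800 Gbit/s links (up to 72 accelerators)
    soc_dies: int = 1                     # PCIe controllers plus the Control Core Processor (CCP)
    hbm_stacks: int = 4                   # generation unspecified, 288 GB in total
    # 12 x 800 Gbit/s = 9.6 Tbit/s, i.e. roughly 1.2 TB/s of scale-out bandwidth per accelerator
```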
An MTIA 400 consumes 1200 watts, comes with 288 GB of HBM, and achieves 12 quadrillion four-bit floating-point operations per second (12 FP4 petaflops). Meta's MX4 figure refers to the so-called Microscaling Formats, which the Open Compute Project (OCP) specifies on the basis of FP4.
(Image: Meta)
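The block-scaling principle behind these Microscaling formats can be sketched in a few lines of Python. This is only an illustration of the idea: the 32-element block size and the E2M1 element values follow the published OCP MX specification, while the function and variable names are chosen here and have nothing to do with Meta's hardware.

```python
import numpy as np

# Minimal sketch of block scaling as used in the OCP Microscaling (MX) formats:
# each block of 32 values shares one power-of-two scale, and every element is
# stored as a 4-bit E2M1 number. Illustration only, not a reference implementation.

FP4_E2M1_VALUES = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
BLOCK_SIZE = 32  # block size defined by the OCP MX specification

def quantize_mxfp4_block(block: np.ndarray):
    """Return a shared power-of-two scale and the FP4-rounded elements of one block."""
    max_abs = np.max(np.abs(block))
    if max_abs == 0.0:
        return 1.0, np.zeros_like(block)
    # Shared scale: power of two so the largest element lands near the FP4 maximum
    # (6.0 = 1.5 * 2^2, hence the exponent offset of 2).
    scale = 2.0 ** (np.floor(np.log2(max_abs)) - 2)
    scaled = np.abs(block) / scale
    # Round each scaled magnitude to the nearest representable E2M1 value.
    idx = np.abs(scaled[:, None] - FP4_E2M1_VALUES[None, :]).argmin(axis=1)
    return scale, np.sign(block) * FP4_E2M1_VALUES[idx]

values = np.random.randn(BLOCK_SIZE).astype(np.float32)
scale, fp4 = quantize_mxfp4_block(values)
print(scale, np.max(np.abs(values - scale * fp4)))  # reconstruction is scale * fp4
```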
MTIA 450 and 500 approach 2000 watts
The MTIA 450 uses revised compute chiplets and faster HBM. The doubling of the transfer rate to 18.4 TB/s while maintaining a capacity of 288 GB suggests a new generation, possibly HBM4.
Meta primarily aims to increase FP4 throughput here and cites 21 petaflops, an increase of 75 percent. Electrical power consumption rises by 17 percent to 1400 watts. The MTIA 450 is expected to be ready for deployment by early 2027.
Later in 2027, the MTIA 500 is to follow, with Meta targeting a further performance increase of over 40 percent. From that generation on, the compute logic will be split across four chiplets instead of two. Memory capacity will also grow to 384 to 512 GB, and the transfer rate to 27.6 TB/s. Meta budgets 1700 watts for this.
(Image: Meta)
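The quoted percentages can be cross-checked against the absolute figures. The following is a back-of-the-envelope calculation, not data from Meta; the 9.2 TB/s for the MTIA 400 is merely implied by the "doubling to 18.4 TB/s" mentioned above.

```python
# Sanity check of the generation-over-generation figures quoted in the text.
mtia_400 = {"fp4_pflops": 12, "watts": 1200, "hbm_tbps": 9.2}   # 9.2 TB/s implied, not stated
mtia_450 = {"fp4_pflops": 21, "watts": 1400, "hbm_tbps": 18.4}

def growth_percent(new: float, old: float) -> float:
    return (new / old - 1.0) * 100.0

print(growth_percent(mtia_450["fp4_pflops"], mtia_400["fp4_pflops"]))  # 75.0, as stated
print(growth_percent(mtia_450["watts"], mtia_400["watts"]))            # ~16.7, rounded to 17 in the text
# MTIA 500: "over 40 percent" on top of 21 FP4 petaflops would mean roughly 30 petaflops.
print(mtia_450["fp4_pflops"] * 1.4)                                    # 29.4
```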
Shorter Development Times
Meta is deliberately using the chiplet approach to shorten the development of new AI accelerators. The surrounding hardware is also designed for quick changes: all four generations are intended to run in the same servers. This is how Meta wants to keep pace with the rapid progress in AI development. Nevertheless, the company intends to continue using AI accelerators from other manufacturers as well.
Broadcom assists Meta with the designs, as it does for many other hyperscalers and their AI accelerators. Meta does not comment on the manufacturing process; however, 2- or 3-nanometer production at TSMC seems plausible.
(mma)