AMD Battling Nvidia with the Launch of MI325X, a 288GB AI Accelerator

AMD's MI325X Raises the Bar in AI Acceleration

Details have emerged of AMD's newest entry in the AI accelerator landscape: the MI325X. With high-bandwidth memory capabilities, this powerful device is set to compete head-to-head with Nvidia's H200. The multinational firm aims to match Nvidia's yearly release cadence for its 'Instinct' series of AI accelerators, with the first entry, the MI325X, arriving later this year.

Inside the MI325X

The Instinct MI325X bears some similarities to Nvidia's H200, such as the use of HBM3e technology, which we first saw detailed during AMD's Advancing AI event in December 2023. The MI325X is composed of eight compute, four I/O, and eight memory chiplets, and uses 2.5D and 3D packaging technologies. Preliminary analysis indicates the CDNA 3 GPU tiles inside make it similar in many ways to AMD's previous offerings, with comparable FLOPS performance. On paper, a single MI325X is faster than the H200 at every precision.

Memory Improvements

Another standout feature of the MI325X is its significantly increased memory capacity. Against its Nvidia counterparts, the MI300X already offered more than double the HBM3 of the H100 and a sizable 51GB advantage over the soon-to-be-released H200. The MI325X pushes the envelope further, stretching capacity to 288GB: more than twice that of the H200 and a full 50% more than Nvidia's Blackwell chips, revealed earlier this spring at GTC. Time will tell whether the MI325X's HBM3e upgrade lives up to expectations, potentially lifting memory bandwidth to 6TB/sec from the MI300X's 5.3TB/sec.

Optimized for AI Inferencing

The MI325X is poised to directly address the memory-capacity and bandwidth limitations that frequently bottleneck AI inferencing work. In today's AI landscape, roughly 1GB of memory is needed for every billion parameters running at 8-bit precision. Ideally, with 288GB on board, the MI325X could hold a 250-billion-parameter model on a single accelerator, or a roughly 2-trillion-parameter model across an eight-GPU system, while leaving room for key-value caches. However, AMD's continued focus on FP16 processing means that a model which operates at FP8 on Nvidia's H200 would need twice the memory on the MI325X, eroding the potential advantage of its 288GB capacity.
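The sizing figures above follow from simple back-of-envelope arithmetic: weight memory is roughly parameters times bytes per parameter, with whatever is left over available for key-value caches and activations. The sketch below illustrates that arithmetic; the helper function and the node configuration are illustrative assumptions, not AMD specifications.

```python
# Back-of-envelope memory sizing for LLM inference (illustrative only).

def model_memory_gb(params_billions: float, bytes_per_param: int) -> float:
    """Approximate weight memory in GB: 1B params at 1 byte/param ~ 1GB."""
    return params_billions * bytes_per_param

HBM_PER_GPU_GB = 288   # MI325X capacity cited in the article
GPUS_PER_NODE = 8      # assumed eight-GPU system

# A 250B-parameter model at 8-bit precision fits on one 288GB accelerator,
# leaving ~38GB of headroom for key-value caches.
single_gpu = model_memory_gb(250, 1)                  # 250 GB

# A ~2-trillion-parameter model at 8-bit fits across an eight-GPU node.
node_model = model_memory_gb(2000, 1)                 # 2000 GB
node_total = HBM_PER_GPU_GB * GPUS_PER_NODE           # 2304 GB available

# The same model at FP16 (2 bytes/param) doubles the footprint and
# no longer fits, which is the article's point about FP8 support.
fp16_model = model_memory_gb(2000, 2)                 # 4000 GB

print(single_gpu, node_model, node_total, fp16_model)
```

The takeaway matches the article: precision support matters as much as raw capacity, since dropping from FP16 to FP8 halves the memory a given model needs.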
