Those stats represent a 50% improvement over currently shipping HBM3 solutions, and with a 2.5x improvement in performance per watt over previous generations, Micron is claiming that its HBM3 Gen2 sets new records for critical AI data centre metrics: performance, capacity and power efficiency.
According to Micron, these improvements reduce the training times of large language models such as GPT-4 and beyond, deliver more efficient infrastructure use for AI inference, and provide a better total cost of ownership (TCO).
At the heart of Micron’s high-bandwidth memory (HBM) solution is the company’s 1β (1-beta) DRAM process node, which allows a 24Gb DRAM die to be assembled into an 8-high cube within industry-standard package dimensions.
Micron’s 12-high stack with 36GB capacity will begin sampling in the first quarter of calendar 2024.
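The capacity figures follow directly from the die density and the stack height. A minimal sanity-check sketch in Python (the arithmetic only; assumes 8 bits per byte and no spare dies in the stack):

```python
GBIT_PER_DIE = 24  # 24Gb DRAM die on Micron's 1-beta node

def cube_capacity_gb(stack_height: int) -> float:
    """Capacity of an HBM cube in gigabytes for a given die stack height."""
    total_gbit = GBIT_PER_DIE * stack_height
    return total_gbit / 8  # 8 bits per byte

print(cube_capacity_gb(8))   # 24.0 GB -> the 8-high HBM3 Gen2 cube
print(cube_capacity_gb(12))  # 36.0 GB -> the 12-high stack sampling in 2024
```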
The HBM3 Gen2’s improved power efficiency is possible because of advancements such as a doubling of the through-silicon vias (TSVs) compared with competing HBM3 offerings, reduced thermal impedance through a fivefold increase in metal density, and an energy-efficient data path design.
Micron is a member of TSMC’s 3DFabric Alliance, and as part of the HBM3 Gen2 product development effort the two companies are collaborating to ensure a smooth introduction and integration of the memory into compute systems for AI and HPC design applications.
TSMC has received samples of Micron’s HBM3 Gen2 memory and is working closely with Micron on further evaluation and testing.
The Micron HBM3 Gen2 solution addresses the increasing demands of generative AI for multimodal, multitrillion-parameter models. With 24GB of capacity per cube and more than 9.2Gb/s of pin speed, Micron says training time for large language models is reduced by more than 30%.
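That pin speed translates into per-cube bandwidth once multiplied across the interface. A minimal sketch (the 1024-bit interface width is a standard HBM assumption, not a figure stated in the announcement):

```python
PIN_SPEED_GBPS = 9.2    # per-pin data rate, Gb/s
INTERFACE_WIDTH = 1024  # standard HBM interface width in bits (assumed)

# Aggregate cube bandwidth: per-pin rate times width, converted from Gb/s to GB/s.
bandwidth_gb_s = PIN_SPEED_GBPS * INTERFACE_WIDTH / 8
print(f"{bandwidth_gb_s:.0f} GB/s")  # ~1178 GB/s, i.e. about 1.2 TB/s per cube
```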
Additionally, the Micron offering unlocks a significant increase in queries per day, enabling trained models to be used more efficiently. Micron HBM3 Gen2 memory’s best-in-class performance per watt drives tangible cost savings for modern AI data centres. For an installation of 10 million GPUs, every five watts of power savings per HBM cube is estimated to save operational expenses of up to $550 million over five years.
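That $550 million figure can be roughly reproduced from first principles. A minimal sketch (Micron does not publish its inputs; the one-cube-per-GPU count and the all-in electricity price below are assumptions chosen to show the estimate is plausible):

```python
NUM_GPUS = 10_000_000        # installation size from Micron's estimate
WATTS_SAVED_PER_CUBE = 5     # power saving per HBM cube
CUBES_PER_GPU = 1            # assumption: one cube per GPU for this sketch
HOURS_PER_5_YEARS = 24 * 365 * 5
PRICE_PER_KWH = 0.25         # assumed all-in $/kWh (power plus cooling overhead)

# Energy saved over five years of continuous operation, in kWh.
energy_kwh = (NUM_GPUS * CUBES_PER_GPU * WATTS_SAVED_PER_CUBE / 1000
              * HOURS_PER_5_YEARS)
savings = energy_kwh * PRICE_PER_KWH
print(f"${savings / 1e6:.0f} million")  # ~$548 million, near the quoted $550M
```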
“Micron’s HBM3 Gen2 technology was developed with a focus on unleashing superior AI and high-performance computing solutions for our customers and the industry,” said Praveen Vaidyanathan, vice president and general manager of Micron’s Compute Products Group. “One important criterion for us has been the ease of integrating our HBM3 Gen2 product into our customers’ platforms. A fully programmable Memory Built-In Self Test (MBIST) that can run at the full specification pin speed positions us for improved testing capability with our customers, creates more efficient collaboration and delivers a faster time to market.”