The 3.5D XDSiP integrates more than 6000 mm² of silicon and up to 12 high bandwidth memory (HBM) stacks in one packaged device to enable high-efficiency, low-power computing for AI at scale.
Broadcom describes the platform as a significant milestone: it has developed and launched what it says is the industry's first Face-to-Face (F2F) 3.5D XPU.
The immense computational power required for training generative AI models relies on massive clusters of 100,000 XPUs, growing to 1 million. These clusters, in turn, demand increasingly sophisticated integration of compute, memory, and I/O capabilities to achieve the necessary performance while minimising power consumption and cost.
Traditional approaches, such as the process scaling described by Moore's Law, are struggling to keep up with these demands. Therefore, advanced system-in-package (SiP) integration is becoming crucial for next-generation XPUs.
Over the past decade, 2.5D integration, which places multiple chiplets (up to 2500 mm² of silicon) and up to 8 HBM modules on an interposer, has proven valuable for XPU development. However, as new and increasingly complex LLMs are introduced, their training necessitates 3D silicon stacking for better size, power, and cost. Consequently, 3.5D integration, which combines 3D silicon stacking with 2.5D packaging, is poised to become the technology of choice for next-generation XPUs in the coming decade.
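To put the quoted capacities side by side, the short sketch below divides the 3.5D XDSiP figures from the opening of this article (more than 6000 mm² of silicon, up to 12 HBM stacks) by the 2.5D figures above (up to 2500 mm², up to 8 HBMs). The variable and function names are illustrative only, and because all four inputs are quoted limits rather than measurements of a specific product, the ratios are rough indications.

```python
# Capacity figures as quoted in the article. They are limits ("up to" /
# "more than"), not measurements of any particular shipping device.
SILICON_MM2 = {"2.5D": 2500, "3.5D": 6000}
HBM_STACKS = {"2.5D": 8, "3.5D": 12}


def scale_factor(figures):
    """Return the 3.5D figure divided by the 2.5D figure."""
    return figures["3.5D"] / figures["2.5D"]


silicon_ratio = scale_factor(SILICON_MM2)
hbm_ratio = scale_factor(HBM_STACKS)

print(f"Silicon area: {silicon_ratio:.1f}x")  # prints "Silicon area: 2.4x"
print(f"HBM stacks: {hbm_ratio:.1f}x")        # prints "HBM stacks: 1.5x"
```

On these quoted limits, a 3.5D package holds roughly 2.4 times the silicon and 1.5 times the HBM stacks of the largest 2.5D configuration described.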
Broadcom’s 3.5D XDSiP platform achieves significant improvements in interconnect density and power efficiency compared to the Face-to-Back (F2B) approach.
This innovative F2F stacking directly connects the top metal layers of the top and bottom dies, which provides a dense and reliable connection with minimal electrical interference and exceptional mechanical strength.
Broadcom’s 3.5D platform includes IP and a proprietary design flow for efficient, correct-by-construction design of 3D die stacking for power, clock and signal interconnects.
Key benefits of Broadcom's 3.5D XDSiP include:
Enhanced Interconnect Density: Achieves a 7x increase in signal density between stacked dies compared to F2B technology.
Power Efficiency: Delivers a 10x reduction in power consumption in die-to-die interfaces by using 3D hybrid copper bonding (HCB) instead of planar die-to-die PHYs.
Reduced Latency: Minimises latency between compute, memory, and I/O components within the 3D stack.
Compact Form Factor: Enables smaller interposer and package sizes, resulting in cost savings and improved package warpage.
Broadcom’s lead F2F 3.5D XPU integrates four compute dies, one I/O die, and six HBM modules, leveraging TSMC's cutting-edge process nodes and 2.5D CoWoS packaging technologies.
Broadcom's proprietary design flow and automation methodology, built upon industry-standard tools, have ensured first-pass success despite the chip’s immense complexity. The 3.5D XDSiP has demonstrated complete functionality and performance across critical IP blocks, including high-speed SerDes, HBM memory interfaces, and die-to-die interconnects.
“Advanced packaging is critical for next generation XPU clusters as we hit the limits of Moore’s Law. In close collaboration with our customers, we have created a 3.5D XDSiP platform on top of the technology and tools from TSMC and EDA partners,” said Frank Ostojic, Senior Vice President and General Manager, ASIC Products Division, Broadcom. “By stacking chip components vertically, Broadcom's 3.5D platform enables chip designers to pair the right fabrication processes for each component while shrinking the interposer and package size, leading to significant improvements in performance, efficiency, and cost.”
Broadcom has more than five 3.5D products in development, and a majority of its consumer AI customers have adopted the 3.5D XDSiP platform technology, with production shipments starting in February 2026.