“Companies and countries are partnering with NVIDIA to shift the trillion-dollar traditional data centres to accelerated computing and build a new type of data centre – AI factories – to produce a new commodity: artificial intelligence,” Huang said.
Huang announced a range of new semiconductors, software and systems to power data centres, factories, consumer devices and robots, which he said would help drive a new industrial revolution.
“Generative AI is reshaping industries and opening new opportunities for innovation and growth, and we’re at the cusp of a major shift in computing,” Huang told a packed conference.
Huang revealed a roadmap for new semiconductors that will arrive on a one-year rhythm. Revealed for the first time, the Rubin platform will succeed the upcoming Blackwell platform, featuring new GPUs, a new Arm-based CPU – Vera – and advanced networking with NVLink 6, CX9 SuperNIC and the X1600 converged InfiniBand/Ethernet switch.
“Our company has a one-year rhythm. Our basic philosophy is very simple: build the entire data centre scale, disaggregate and sell to you parts on a one-year rhythm, and push everything to technology limits,” Huang explained.
According to Huang, NVIDIA is driving down the cost of turning data into intelligence.
“Accelerated computing is sustainable computing,” he emphasised, outlining how the combination of GPUs and CPUs can deliver up to a 100x speedup while only increasing power consumption by a factor of three, achieving 25x more performance per watt over CPUs alone.
A growing number of computer manufacturers, including ASRock Rack, ASUS, GIGABYTE, Ingrasys, Inventec, Pegatron, QCT, Supermicro, Wistron and Wiwynn, have embraced NVIDIA GPUs and networking solutions to create cloud, on-premises and edge AI systems.
The NVIDIA MGX modular reference design platform now supports Blackwell, including the GB200 NVL2 platform, and is designed for optimal performance in large language model inference, retrieval-augmented generation and data processing.
AMD and Intel are supporting the MGX architecture with plans to deliver, for the first time, their own CPU host processor module designs. Any server system builder can use these reference designs to save development time while ensuring consistency in design and performance.
In networking, Huang unveiled plans for the annual release of Spectrum-X products to cater to the growing demand for high-performance Ethernet networking for AI.
NVIDIA Spectrum-X is the first Ethernet fabric built for AI and is designed to deliver 1.6x the network performance of traditional Ethernet fabrics. It accelerates the processing, analysis and execution of AI workloads and, in turn, the development and deployment of AI solutions.
CoreWeave, GMO Internet Group, Lambda, Scaleway, STPX Global and Yotta are among the first AI cloud service providers embracing Spectrum-X to bring extreme networking performance to their AI infrastructures.
With NVIDIA NIM, the world’s 28 million developers will now be able to create generative AI applications. NIM — inference microservices that provide models as optimised containers — can be deployed on clouds, data centres or workstations.
NIM also enables enterprises to maximise their infrastructure investments. For example, running Meta Llama 3-8B in a NIM produces up to 3x more generative AI tokens on accelerated infrastructure than without NIM.
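Because NIM containers expose an OpenAI-compatible HTTP API, calling a deployed model is a matter of POSTing a standard chat-completion request. The sketch below builds such a request body; the endpoint URL and model identifier are illustrative assumptions for a locally deployed Llama 3 8B NIM, not details given in the article.

```python
import json

# Assumed local NIM deployment; NIM containers conventionally serve an
# OpenAI-compatible API, typically on port 8000 (assumption for illustration).
NIM_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(prompt: str, model: str = "meta/llama3-8b-instruct") -> str:
    """Serialise an OpenAI-style chat-completion request body as JSON.

    The model name is a hypothetical example of a NIM model identifier.
    """
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }
    return json.dumps(body)

payload = build_chat_request("Summarise today's keynote in one sentence.")
# In a real deployment this payload would be POSTed to NIM_URL with
# Content-Type: application/json, e.g. via urllib.request.
```

The same request shape works whether the container runs on a cloud instance, an on-premises server or a workstation, which is what lets the one packaged model move across those environments unchanged.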
Nearly 200 technology partners, including Cadence, Cloudera, Cohesity, DataStax, NetApp, Scale AI, and Synopsys are integrating NIM into their platforms to speed generative AI deployments for domain-specific applications, such as copilots, code assistants, and digital human avatars.
According to NVIDIA, its RTX AI PCs, powered by RTX technologies, will revolutionise consumer experiences with over 200 RTX AI laptops and more than 500 AI-powered apps and games.
The RTX AI Toolkit and newly available PC-based NIM inference microservices for the NVIDIA ACE digital human platform help to underscore NVIDIA’s commitment to AI accessibility.
Project G-Assist, an RTX-powered AI assistant technology demo, was also announced, showcasing context-aware assistance for PC games and apps.
Microsoft and NVIDIA are collaborating to help developers bring new generative AI capabilities to their Windows native and web apps, with easy API access to RTX-accelerated small language models (SLMs) that enable retrieval-augmented generation (RAG) capabilities running on-device as part of Windows Copilot Runtime.