The funds will be used to accelerate deployment of the company’s NR1 AI Inference Solution and to help the company move from an early deployment phase into growth across additional markets, regions, and generative AI applications.
The new funding brings NeuReality’s total funding to $70 million and marks a significant step in the delivery of its 7nm AI inference server-on-a-chip, the NR1 NAPU (Network Addressable Processing Unit), which is manufactured by TSMC and is the key component of its overall NR1 AI Inference Solution.
NeuReality’s AI-centric system architecture is intended to enable companies to run generative AI applications and large language models (LLMs) without overinvesting in scarce and underutilised GPUs.
“In order to mitigate GPU scarcity, optimisation at the system and datacentre level are key,” said Naveen Rao, VP of Generative AI at Databricks, a NeuReality Board member and early investor in the startup. “To enable greater access to compute for generative AI, we must remove market barriers to entry with a far greater sense of urgency. NeuReality's innovative system represents that tipping point.”
Enterprises face significant challenges in deploying trained AI models and applications, a stage known as the AI Inference process. Running live AI data can be complex and costly, with AI accelerators showing a record of poor scalability and CPUs causing system bottlenecks.
“Our disruptive AI Inference technology is unbound by conventional CPUs, GPUs, and NICs,” said NeuReality’s CEO Moshe Tanach, pointing to the 30-40 percent utilisation rate of AI accelerators. “Investing in more and more DLAs, GPUs, LPUs, TPUs…won’t address your core issue of system inefficiency. NeuReality provides an express lane for large AI pipelines, seamlessly routing tasks to purpose-built AI devices and swiftly delivering responses to your customers, while conserving both resources and capital.”
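The utilisation gap Tanach describes compounds quickly at fleet scale. As a rough illustration (only the 30-40 percent utilisation range comes from the article; the throughput figures below are hypothetical assumptions), the arithmetic looks like this:

```python
# Back-of-the-envelope illustration of why low accelerator utilisation
# inflates hardware cost. Only the 30-40 percent utilisation range comes
# from the article; target throughput and per-device peak are assumptions.
import math

def accelerators_needed(target_qps: float, peak_qps_per_device: float,
                        utilisation: float) -> int:
    """Devices required when each delivers only a fraction of its peak."""
    effective_qps = peak_qps_per_device * utilisation
    return math.ceil(target_qps / effective_qps)

# A hypothetical fleet serving 10,000 queries/s with 500-queries/s devices:
print(accelerators_needed(10_000, 500, 0.35))  # 58 devices at 35% utilisation
print(accelerators_needed(10_000, 500, 1.00))  # 20 devices at full utilisation
```

At the mid-point of the quoted range, nearly three times as many accelerators are needed for the same throughput, which is the system inefficiency Tanach argues more chips alone cannot fix.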
NeuReality's NR1-M and NR1-S systems, full-height PCIe cards that slot into server racks, are designed to drive 100 percent AI accelerator utilisation. Each system houses NR1 NAPUs that pair with any AI accelerator and operate independently of the CPU, eliminating the CPU requirement altogether. By connecting directly to Ethernet, the NR1 can efficiently manage AI queries from vast data pipelines originating from millions of users and billions of devices.
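A minimal sketch of what “network addressable” means in practice: the client opens a connection straight to the inference device's Ethernet endpoint rather than going through a CPU-hosted server process. The address, port, framing, and JSON payload below are purely hypothetical illustrations, not NeuReality's actual wire protocol.

```python
# Hypothetical client for a network-addressable inference device: the
# request goes straight to the device's Ethernet endpoint, with no
# CPU-hosted server in the path. Address, port, length-prefixed framing,
# and JSON payload are illustrative assumptions, not NeuReality's protocol.
import json
import socket

def query_device(host: str, port: int, payload: dict) -> dict:
    """Send one inference request directly to the device and read the reply."""
    request = json.dumps(payload).encode()
    with socket.create_connection((host, port), timeout=5.0) as sock:
        # Frame: 4-byte big-endian length prefix, then the JSON body.
        sock.sendall(len(request).to_bytes(4, "big") + request)
        size = int.from_bytes(sock.recv(4), "big")
        body = b""
        while len(body) < size:
            chunk = sock.recv(size - len(body))
            if not chunk:
                raise ConnectionError("device closed the connection early")
            body += chunk
    return json.loads(body.decode())

# Usage against a hypothetical device address:
# reply = query_device("10.0.0.42", 9000,
#                      {"model": "asr-demo", "input": "audio-chunk-0001"})
```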
Compatible server configurations were demonstrated by AMD, IBM, Lenovo, Qualcomm, and Supermicro at NeuReality's product launch at the SC23 international conference in Denver last November.
NeuReality is driving early AI deployments with select cloud service and enterprise customers in financial services, business services, and government, focusing initially on natural language processing, automated speech recognition, recommendation systems, and computer vision. The additional $20 million will help propel NeuReality towards broader AI Inference deployment as demand for both conventional and generative AI applications surges.
The investment from the EIC Fund, the venture arm of the European Commission’s EIC Accelerator program, signals its support for the firm’s solution and addresses two industries of strategic importance to Europe – advanced semiconductors and AI – both of which are expected to be major drivers of economic growth in the coming years.
NeuReality had already secured a substantial grant from the EIC Accelerator program last year to support development of the firm’s innovation, which aims to solve the cost and complexity of AI Inference at its core architectural level.