The system, code-named Hala Point, has initially been deployed at Sandia National Laboratories and uses Intel’s Loihi 2 processor. It is intended to support research into future brain-inspired artificial intelligence (AI) and to tackle challenges related to the efficiency and sustainability of today’s AI.
Hala Point advances Intel’s first-generation large-scale research system, Pohoiki Springs, with architectural improvements to achieve over 10 times more neuron capacity and up to 12 times higher performance.
“The computing cost of today’s AI models is rising at unsustainable rates. The industry needs fundamentally new approaches capable of scaling. For that reason, we developed Hala Point, which combines deep learning efficiency with novel brain-inspired learning and optimisation capabilities,” said Mike Davies, director of the Neuromorphic Computing Lab at Intel Labs.
Hala Point is the first large-scale neuromorphic system to demonstrate state-of-the-art computational efficiencies on mainstream AI workloads. Characterisation shows it can support up to 20 petaops (20 quadrillion operations per second), with an efficiency exceeding 15 trillion 8-bit operations per second per watt (TOPS/W) when executing conventional deep neural networks.
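As a rough sanity check on those two headline figures, the short Python sketch below divides the quoted peak throughput by the quoted efficiency to get the power draw they would imply together. Treating "up to 20 petaops" and "exceeding 15 TOPS/W" as holding at the same operating point is an assumption made here for illustration; the article does not say they were measured together, and the function and variable names are hypothetical.

```python
# Figures quoted in the article. Assuming they apply at the same operating
# point is an illustrative simplification, not something the article states.
PEAK_OPS_PER_SECOND = 20e15               # "up to 20 petaops"
EFFICIENCY_OPS_PER_SECOND_PER_WATT = 15e12  # "exceeding 15 TOPS/W"


def implied_power_watts(ops_per_second: float, ops_per_second_per_watt: float) -> float:
    """Power that a given throughput would draw at a given efficiency."""
    return ops_per_second / ops_per_second_per_watt


implied_watts = implied_power_watts(
    PEAK_OPS_PER_SECOND, EFFICIENCY_OPS_PER_SECOND_PER_WATT
)
print(f"Implied power at peak throughput: about {implied_watts:.0f} W")
```

Under that assumption the two figures imply a draw of roughly 1.3 kilowatts at peak throughput.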
According to Intel, Hala Point’s capabilities could enable future real-time continuous learning for AI applications such as scientific and engineering problem-solving, logistics, smart city infrastructure management, large language models (LLMs) and AI agents.
Researchers at Sandia National Laboratories plan to use Hala Point for advanced brain-scale computing research and will focus on solving scientific computing problems in device physics, computer architecture, computer science and informatics.
Currently, Hala Point is a research prototype that will advance the capabilities of future commercial systems. Intel said it anticipates that lessons from this research will lead to practical advancements, such as the ability for LLMs to learn continuously from new data, which could significantly reduce the unsustainable training burden of widespread AI deployments.
Recent trends in scaling up deep learning models to trillions of parameters have exposed daunting sustainability challenges in AI and have highlighted the need for innovation at the lowest levels of hardware architecture.
Neuromorphic computing is a fundamentally new approach that draws on neuroscience insights that integrate memory and computing with highly granular parallelism to minimise data movement.
Advancing on its predecessor, Pohoiki Springs, with numerous improvements, Hala Point now brings neuromorphic performance and efficiency gains to mainstream conventional deep learning models, notably those processing real-time workloads such as video, speech and wireless communications.
Loihi 2 neuromorphic processors, which form the basis for Hala Point, apply brain-inspired computing principles, such as asynchronous, event-based spiking neural networks (SNNs), integrated memory and computing, and sparse and continuously changing connections to achieve orders-of-magnitude gains in energy consumption and performance. Neurons communicate directly with one another rather than communicating through memory, reducing overall power consumption.
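To make the event-based idea concrete, here is a minimal Python sketch of a generic leaky integrate-and-fire network in which work happens only when a spike arrives. It is not Loihi 2's neuron model or programming interface; the class, function, threshold, leak factor and weights are all hypothetical values chosen for illustration.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class LIFNeuron:
    """Generic leaky integrate-and-fire neuron (illustrative, not Loihi 2's model)."""
    threshold: float = 1.0
    leak: float = 0.9          # fraction of potential retained per time step
    potential: float = 0.0
    last_update: int = 0
    targets: list = field(default_factory=list)  # (target_index, weight) pairs

    def receive(self, weight: float, now: int) -> bool:
        # Apply the leak lazily for the steps that elapsed since the last
        # event, so an idle neuron costs no computation at all.
        self.potential *= self.leak ** (now - self.last_update)
        self.last_update = now
        self.potential += weight
        if self.potential >= self.threshold:
            self.potential = 0.0   # reset after firing
            return True
        return False


def run(neurons, input_events):
    """Process (time, neuron_index, weight) events in time order."""
    queue = list(input_events)
    heapq.heapify(queue)
    fired = []
    while queue:
        time, index, weight = heapq.heappop(queue)
        if neurons[index].receive(weight, time):
            fired.append((time, index))
            # A spike is delivered directly to downstream neurons one step later.
            for target, target_weight in neurons[index].targets:
                heapq.heappush(queue, (time + 1, target, target_weight))
    return fired


# Three-neuron chain: 0 -> 1 (weak connection) -> 2 (strong connection).
network = [
    LIFNeuron(targets=[(1, 0.7)]),
    LIFNeuron(targets=[(2, 1.2)]),
    LIFNeuron(),
]
spikes = run(network, [(0, 0, 1.0), (5, 0, 1.0)])
print(spikes)
```

In this toy network the middle neuron fires only after it has accumulated two weak inputs, and no neuron is touched during the time steps in which nothing arrives, which is the property the event-driven approach relies on.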
Hala Point packages 1,152 Loihi 2 processors produced on the Intel 4 process node in a six-rack-unit data centre chassis the size of a microwave oven. The system supports up to 1.15 billion neurons and 128 billion synapses distributed over 140,544 neuromorphic processing cores, consuming a maximum of 2,600 watts of power. It also includes over 2,300 embedded x86 processors for ancillary computations.
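The system-level totals can be broken down into approximate per-chip and per-core figures with simple division, as in the Python sketch below. The inputs are the numbers quoted above; the derived values are averages computed here, not Intel specifications, and the power figure divides the whole-system maximum (which also covers the x86 processors and other components) across the Loihi 2 chips. Variable names are hypothetical.

```python
# System totals as quoted in the article.
CHIPS = 1_152
CORES = 140_544
NEURONS = 1.15e9
SYNAPSES = 128e9
MAX_POWER_WATTS = 2_600

# Derived averages (illustrative arithmetic, not published per-chip specs).
cores_per_chip = CORES // CHIPS               # divides evenly: 122 cores per chip
neurons_per_chip = NEURONS / CHIPS            # just under one million
neurons_per_core = NEURONS / CORES            # roughly eight thousand
synapses_per_neuron = SYNAPSES / NEURONS      # roughly 111 on average
# Upper bound: the system budget also powers non-Loihi components.
watts_per_chip = MAX_POWER_WATTS / CHIPS

print(f"Cores per chip:       {cores_per_chip}")
print(f"Neurons per chip:     {neurons_per_chip:,.0f}")
print(f"Neurons per core:     {neurons_per_core:,.0f}")
print(f"Synapses per neuron:  {synapses_per_neuron:.1f}")
print(f"Max watts per chip:   {watts_per_chip:.2f}")
```

On these averages, each chip carries 122 cores and close to a million neurons within a budget of a little over two watts.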
Hala Point integrates processing, memory, and communication channels in a massively parallelised fabric, providing a total of 16 petabytes per second (PB/s) of memory bandwidth, 3.5 PB/s of inter-core communication bandwidth, and 5 terabytes per second (TB/s) of inter-chip communication bandwidth. The system can process over 380 trillion 8-bit synapses and over 240 trillion neuron operations per second.
Applied to bio-inspired spiking neural network models, the system can execute its full capacity of 1.15 billion neurons 20 times faster than a human brain, and at rates up to 200 times faster at lower capacity. While Hala Point is not intended for neuroscience modelling, its neuron capacity is roughly equivalent to that of an owl brain or the cortex of a capuchin monkey.
Loihi-based systems can perform AI inference and solve optimisation problems using 100 times less energy at speeds as much as 50 times faster than conventional CPU and GPU architectures.
By exploiting up to 10:1 sparse connectivity and event-driven activity, early results on Hala Point show the system can achieve deep neural network efficiencies as high as 15 TOPS/W without requiring input data to be collected into batches, a common optimisation for GPUs that significantly delays the processing of data arriving in real time, such as video from cameras.
The delivery of Hala Point to Sandia National Labs marks the first deployment of a new family of large-scale neuromorphic research systems that Intel plans to share with its research collaborators.
Further development will enable neuromorphic computing applications to overcome the power and latency constraints that limit the real-world, real-time deployment of AI capabilities.