The GenAI NPU retains the key features of its predecessor, the GenAI v1: offline operation and autonomous functionality. In addition, it is now fully stand-alone, embedding all Large Language Model (LLM) operations directly in its hardware and eliminating the need for CPUs.
Thanks to its fully hardware-based design, the GenAI NPU achieves what the company describes as ‘unprecedented levels of efficiency, unattainable by hybrid designs’.
According to RaiderChip CTO Victor Lopez, “By eliminating latency caused by hardware-software communication, we achieve superior performance while removing external dependencies, such as CPUs. The performance that you see is what you will get, regardless of the target electronic system where the accelerator is integrated.
“This improves energy efficiency and ensures fully predictable performance - advantages which make the GenAI NPU the ideal solution for embedded systems.”
Furthermore, the new design increases token generation speed per unit of available memory bandwidth by a factor of 2.4. This enables the use of more cost-efficient memories such as DDR or LPDDR, rather than expensive options such as HBM, to achieve improved levels of performance.
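To put the 2.4x figure in practical terms, the short Python sketch below estimates how much memory bandwidth would be needed to sustain a given token rate before and after the improvement. Only the 2.4x factor comes from the company's claim. The baseline efficiency and target token rate are hypothetical values chosen for illustration, and the calculation assumes that token rate scales linearly with memory bandwidth, which RaiderChip has not stated.

```python
def required_bandwidth_gbps(target_tokens_per_s: float,
                            tokens_per_s_per_gbps: float) -> float:
    """Memory bandwidth (GB/s) needed to reach a target token rate,
    assuming token rate scales linearly with bandwidth (an assumption)."""
    if tokens_per_s_per_gbps <= 0:
        raise ValueError("efficiency must be positive")
    return target_tokens_per_s / tokens_per_s_per_gbps


SPEEDUP = 2.4  # improvement factor quoted by the company

# Hypothetical figures, not taken from the article.
baseline_efficiency = 0.5  # tokens/s per GB/s of memory bandwidth (assumed)
target_rate = 12.0         # desired tokens/s (assumed)

baseline_bw = required_bandwidth_gbps(target_rate, baseline_efficiency)
improved_bw = required_bandwidth_gbps(target_rate, baseline_efficiency * SPEEDUP)

print(f"Baseline design: {baseline_bw:.1f} GB/s needed")
print(f"Improved design: {improved_bw:.1f} GB/s needed")
```

Under these assumptions, the same token rate needs 2.4 times less memory bandwidth, which is the kind of reduction that could bring a design within reach of DDR or LPDDR rather than HBM.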
The device also delivers equivalent results with fewer components, reducing size, cost, and energy consumption. These features allow for the development of more affordable and sustainable generative AI solutions, with faster return on investment and seamless integration into a variety of products tailored to different needs.
With this innovation, RaiderChip is continuing to deliver on its strategy of offering optimised solutions based on affordable hardware, designed to bring generative AI to the Edge.
These solutions ensure complete privacy and security for applications thanks to their ability to operate entirely offline and on-premises, while eliminating dependence on the cloud and recurring monthly subscriptions.