The line has been designed for edge and end-point inference and includes the first generation of Andes' AI/ML hardware accelerator intellectual property (IP), the AndesAIRE AnDLA I350 (Andes Deep Learning Accelerator), together with the AndesAIRE NN SDK, a suite of neural network software tools and runtimes.
AI/ML applications are surging in popularity, creating a critical need for power-efficient, high-performance deep learning solutions at the edge and end-point, where strict power and energy constraints make it difficult to rely on a CPU architecture alone.
The AnDLA I350 is specifically designed to address this challenge by delivering efficiency, low power, and small area, making it suitable for a wide range of edge inference applications, from smart IoT devices and smart cameras to smart home appliances and robotics.
The AndesAIRE AnDLA I350 supports popular deep learning frameworks, such as TensorFlow Lite, PyTorch, and ONNX, and performs versatile neural network operations, such as convolution, fully-connected, element-wise, activation, pooling, channel padding, up-sampling, and concatenation, in the int8 data type.
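For reference, int8 inference generally relies on affine quantization, in which floating-point tensor values are mapped to 8-bit integers through a scale and zero point. The following minimal Python/NumPy sketch illustrates that generic mapping; it is not AnDLA-specific code, and the scale and zero-point values are arbitrary examples.

```python
import numpy as np

def quantize_int8(x, scale, zero_point):
    """Generic affine quantization: q = round(x / scale) + zero_point, clipped to int8 range."""
    q = np.round(x / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)

def dequantize_int8(q, scale, zero_point):
    """Recover approximate float values: x ~ (q - zero_point) * scale."""
    return (q.astype(np.int32) - zero_point) * scale

x = np.array([-1.0, 0.0, 0.5, 1.0], dtype=np.float32)
scale, zero_point = 1.0 / 127, 0          # symmetric example covering roughly [-1, 1]
q = quantize_int8(x, scale, zero_point)
print(q)                                  # int8 codes, e.g. [-127 0 64 127]
print(dequantize_int8(q, scale, zero_point))
```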
It also features an internal Direct Memory Access (DMA) engine and local memory, which keep the hardware compute engines fully utilised.
Operation fusion techniques are also adopted in the AnDLA I350 to execute common operator sequences more efficiently. Its key configurable parameters include the number of MACs (from 32 to 4,096) and the SRAM size (from 16 KB to 4 MB), providing flexible computing power from 64 GOPS to 8 TOPS (at 1 GHz) for a wide range of applications.
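The quoted range follows from the usual convention of counting one multiply-accumulate (MAC) as two operations. The short calculation below, a sketch assuming that convention, reproduces the 64 GOPS and 8 TOPS endpoints at 1 GHz (8.192 TOPS rounded to the quoted 8 TOPS).

```python
def peak_ops(macs, freq_hz, ops_per_mac=2):
    """Peak throughput in operations per second, counting each MAC as a multiply plus an add."""
    return macs * ops_per_mac * freq_hz

print(peak_ops(32,   1e9) / 1e9)   # 64.0  GOPS for the smallest configuration
print(peak_ops(4096, 1e9) / 1e12)  # 8.192 TOPS for the largest configuration (quoted as 8 TOPS)
```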
The AndesAIRE NN SDK is a comprehensive set of software tools and runtimes for end-to-end development and deployment and includes the following components:
- AndesAIRE NNPilot: a neural network optimisation tool suite
- AndesAIRE TFLM for AnDLA: An AnDLA-optimised inference framework running on a host based on TensorFlow Lite for Microcontrollers
- AnDLA driver and runtime
The NNPilot can automatically analyse input NN models, apply model pruning and quantization, and generate an AnDLA executable, based on the AnDLA configuration, that performs inference together with the TFLM framework.
The NNPilot also generates sample host C code to invoke the AnDLA driver in a bare-metal environment.
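NNPilot's own interface is not shown here, but the kind of int8 post-training quantization it automates can be sketched with the standard TensorFlow Lite converter API; the `build_model` and `representative_dataset` helpers below are hypothetical placeholders for a real model and calibration data.

```python
import tensorflow as tf

# Hypothetical helpers: any Keras model and a small calibration-data generator.
def build_model():
    return tf.keras.Sequential([
        tf.keras.Input(shape=(32, 32, 3)),
        tf.keras.layers.Conv2D(8, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10),
    ])

def representative_dataset():
    for _ in range(100):
        yield [tf.random.uniform((1, 32, 32, 3))]

converter = tf.lite.TFLiteConverter.from_keras_model(build_model())
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Force full-integer (int8) quantization of weights and activations.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

with open("model_int8.tflite", "wb") as f:
    f.write(converter.convert())
```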
To address advancing AI technologies, Andes is driving the development of an AI subsystem that integrates the AndesAIRE AnDLA, AndesCore RISC-V CPUs, and Andes Custom Extension (ACE). In such a subsystem, the highly structured and time-consuming parts of AI workloads can be computed efficiently on the AnDLA, while less structured computations, such as non-linear functions, can be offloaded to the powerful and flexible RISC-V CPUs with DSP/SIMD or Vector extensions.
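As a purely hypothetical illustration of that partitioning principle (not the subsystem's actual scheduling interface), the sketch below splits a flat list of operator names into an accelerator group, drawn from the AnDLA's supported operations listed above, and a CPU fallback group for non-linear functions.

```python
# Hypothetical operator partitioning: supported operators run on the AnDLA,
# everything else (e.g. non-linear functions) falls back to the RISC-V CPU.
ANDLA_SUPPORTED = {
    "conv2d", "fully_connected", "element_wise_add", "relu",
    "max_pool", "channel_pad", "upsample", "concat",
}

def partition(graph_ops):
    """Split a flat list of operator names into accelerator and CPU groups."""
    accel = [op for op in graph_ops if op in ANDLA_SUPPORTED]
    cpu = [op for op in graph_ops if op not in ANDLA_SUPPORTED]
    return accel, cpu

accel_ops, cpu_ops = partition(
    ["conv2d", "relu", "max_pool", "softmax", "layer_norm", "fully_connected"]
)
print("AnDLA:", accel_ops)   # structured compute stays on the accelerator
print("CPU:  ", cpu_ops)     # non-linear functions offloaded to the RISC-V CPU
```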
The ACE plays a key role in efficient data movement between the CPU and the AnDLA, significantly reducing memory bandwidth and power consumption while increasing hardware utilisation.
The ACE can facilitate even faster processing through customised instructions for domain-specific applications, such as data pre- or post-processing. In addition to the extensibility of its hardware IPs, Andes is committed to the continuous advancement of the AndesAIRE NN SDK and AndesAIRE NN Library so that mass-production SoCs can adopt future AI algorithms.
Andes has added more than one hundred compute library APIs each year since 2021 and says it will keep optimising and adding new functions to the NN SDK and NN Library.
“AndesAIRE product line projects our vision for the AI/ML market,” said Simon TC Wang, Senior Technical Marketing Manager of Andes Technology. “By combining the advantages of RISC-V CPUs, the AnDLA, and the ACE into an AI subsystem, performance, power consumption, and area can be well balanced for customers to deliver competitive solutions. Furthermore, flexibility is guaranteed by the RISC-V CPUs and the NN software stack, and the extensibility gives customers ample room to differentiate their unique value for target AI/ML applications.”