Siemens describes Catapult AI NN as a complete solution that starts with a neural network description from an AI framework, converts it into C++, and synthesizes it into an RTL accelerator in Verilog or VHDL for implementation in silicon.
This solution brings together hls4ml, an open-source package for machine learning hardware acceleration, and Siemens' Catapult HLS software for High-Level Synthesis.
Developed in close collaboration with Fermilab, a US Department of Energy Laboratory, and other contributors to hls4ml, Catapult AI NN is intended to address the requirements associated with machine learning accelerator design for power, performance, and area on custom silicon.
“The handoff process and manual conversion of a neural network model into a hardware implementation is very inefficient, time consuming and error-prone, especially when it comes to creating and verifying variants of a hardware accelerator tailored to specific performance, power and area,” said Mo Movahed, Vice President and General Manager for High-Level Design, Verification and Power, Siemens Digital Industries Software.
“By empowering scientists and AI experts to leverage industry-standard AI frameworks for neural network model design, and by synthesizing these models into hardware designs optimised for power, performance, and area (PPA), we're opening a whole new realm of possibilities for AI and machine learning software engineers.
“Our new Catapult AI NN solution allows developers to automate and implement their neural network models for optimal PPA concurrently during the software development process, ushering in a new era of efficiency and innovation in AI development.”
As runtime AI and machine learning tasks migrate from the datacentre into everything from consumer appliances to medical devices, there is a growing requirement for "right-sized" AI hardware that minimises power consumption, lowers cost and maximises end-product differentiation.
However, most machine learning experts are more comfortable working with tools such as TensorFlow, PyTorch or Keras than with synthesizable C++, Verilog or VHDL. As a result, there has traditionally been no easy path for AI experts to accelerate their machine learning applications in a right-sized ASIC or SoC implementation.
The hls4ml initiative is intended to help bridge this gap by generating C++ from a neural network described in AI frameworks such as TensorFlow, PyTorch or Keras. The C++ can then be deployed for an FPGA, ASIC or SoC implementation.
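To make the idea of generating C++ from a network description concrete, here is a toy sketch in Python. It is not hls4ml's generator or its API: the function name `dense_layer_to_cpp` is hypothetical, and the emitted C++ uses `double` where a real HLS flow would normally use a fixed-point type. It only shows the principle of turning trained weights for one fully connected layer into C++ source with the constants baked in.

```python
from typing import List


def dense_layer_to_cpp(name: str, weights: List[List[float]], biases: List[float]) -> str:
    """Return C++ source for y = W*x + b with W and b baked in as constants.

    Hypothetical helper for illustration only; not part of hls4ml or Catapult.
    """
    n_out = len(weights)
    n_in = len(weights[0])
    if len(biases) != n_out or any(len(row) != n_in for row in weights):
        raise ValueError("weights must be n_out x n_in and biases must have n_out entries")

    # Format each weight row as a C++ brace initialiser.
    rows = ",\n".join(
        "    {" + ", ".join(repr(float(w)) for w in row) + "}" for row in weights
    )
    bias_list = ", ".join(repr(float(b)) for b in biases)

    return (
        f"// Toy output: 'double' stands in for a fixed-point type.\n"
        f"static const double {name}_w[{n_out}][{n_in}] = {{\n{rows}\n}};\n"
        f"static const double {name}_b[{n_out}] = {{{bias_list}}};\n\n"
        f"void {name}(const double x[{n_in}], double y[{n_out}]) {{\n"
        f"  for (int o = 0; o < {n_out}; ++o) {{\n"
        f"    double acc = {name}_b[o];\n"
        f"    for (int i = 0; i < {n_in}; ++i) acc += {name}_w[o][i] * x[i];\n"
        f"    y[o] = acc;\n"
        f"  }}\n"
        f"}}\n"
    )


if __name__ == "__main__":
    # Two inputs, two outputs; the weights are made-up example values.
    print(dense_layer_to_cpp("dense1", [[0.5, -1.0], [0.25, 2.0]], [0.0, 0.1]))
```

Because the loop bounds and weights are compile-time constants in the generated C++, a high-level synthesis tool is free to unroll or pipeline the loops, which is where the hardware trade-offs come from.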
Catapult AI NN extends the capabilities of hls4ml to ASIC and SoC design. It includes a dedicated library of specialized C++ machine learning functions that are tailored to ASIC design. Using these functions, designers can optimise PPA by making latency and resource trade-offs across alternative implementations from the C++ code.
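The latency and resource trade-off can be illustrated with a first-order model. hls4ml exposes a "reuse factor" setting that controls how many times each hardware multiplier is reused; the article does not say whether Catapult AI NN uses the same knob, so treat that link as an assumption. The sketch below is a deliberately crude estimate for a single fully connected layer, with a hypothetical function name, and it ignores pipeline depth, adder trees and memory bandwidth.

```python
from typing import List, Tuple


def dense_tradeoff(n_in: int, n_out: int, reuse_factors: List[int]) -> List[Tuple[int, int, int]]:
    """Return (reuse_factor, multipliers, cycles) points for one dense layer.

    Simplified model (an assumption, not a tool report):
      multipliers = ceil(n_in * n_out / reuse_factor)
      cycles      = reuse_factor  (each multiplier does one product per cycle)
    """
    total_mults = n_in * n_out
    points = []
    for rf in reuse_factors:
        if rf < 1:
            raise ValueError("reuse factor must be at least 1")
        multipliers = (total_mults + rf - 1) // rf  # integer ceiling division
        points.append((rf, multipliers, rf))
    return points


if __name__ == "__main__":
    # A 16-input, 8-output layer needs 128 multiplications in total.
    for rf, mults, cycles in dense_tradeoff(16, 8, [1, 4, 128]):
        print(f"reuse={rf:>3}  multipliers={mults:>3}  cycles~{cycles}")
```

In this model a fully parallel design (reuse factor 1) uses 128 multipliers for roughly one cycle of multiplication, while a fully serial design (reuse factor 128) uses a single multiplier for roughly 128 cycles. Real PPA figures come from synthesis results, not from an estimate like this.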
Moreover, designers can now evaluate the impact of different neural net designs to determine the best neural network structure for hardware.
Catapult AI NN is available for early adopters now and will be available to all users in the fourth quarter of 2024.