Artificial Intelligence (AI) can make embedded systems for the Industrial Internet of Things (IIoT) far more responsive and reliable. The technology is already being used to monitor the condition of machinery, identify whether a failure is imminent and schedule routine maintenance work more cost-effectively.
When deploying AI technology in embedded systems, a critical consideration is where the bulk of the data processing takes place. AI algorithms vary widely in the computing performance they require, and that demand strongly influences both the hardware needed to run the algorithm and where the processing is best carried out.
There are three clear approaches open to system designers developing embedded AI-based systems: using a cloud-based AI service, deploying a system with built-in AI, or creating their own algorithms, generally building on open-source software.
A Deep Neural Network (DNN) is an example of an algorithm that is particularly compute-intensive, especially during the training phase, when billions of floating-point calculations may be needed each time the model is updated. Because of this intense demand, the typical approach is to send data to the cloud to be processed remotely. AI-enabled devices in industrial control can take advantage of this remote processing, as well as of the tools and frameworks created to work with cloud services, many of which are provided in open-source form.
A popular example is Google’s TensorFlow, which provides multiple levels of abstraction, both for engineers experienced in creating AI algorithms and for those just getting started. The Keras API, which forms part of the TensorFlow framework, makes it easy to explore machine-learning techniques and get applications up and running.
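As a rough illustration of how little code the Keras API requires, the sketch below defines and trains a small classifier on windows of sensor readings. The layer sizes, input shape and randomly generated training data are illustrative assumptions, not details from a real deployment.

```python
# Minimal Keras sketch: a small network that classifies a window of
# sensor readings as "healthy" or "fault". Shapes and data are placeholders.
import numpy as np
import tensorflow as tf
from tensorflow import keras

model = keras.Sequential([
    keras.Input(shape=(64,)),                     # 64 recent sensor samples per window (assumed)
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),  # probability of imminent failure
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Training would normally use labelled historical data; random data here
# only demonstrates the shape of the API call.
x_train = np.random.rand(1000, 64).astype("float32")
y_train = np.random.randint(0, 2, size=(1000, 1))
model.fit(x_train, y_train, epochs=5, batch_size=32)
```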
A drawback with cloud-based processing, however, lies in the communications bandwidth required. A reliable internet connection is essential to maintain service, and it is worth noting that many consumer applications of cloud AI rely on broadband connections. Machine tools in a factory may not have access to the data rates required to update a remote AI model in real time.
By doing more processing locally, it is possible to scale back bandwidth requirements, sometimes dramatically. For many industrial applications, the volume of data sent can be reduced simply by paying attention to its content. In an application that monitors environmental variables, many of those variables do not change for long periods of time; what matters to the model are excursions above or below certain thresholds. Even though the device may need to analyse sensor inputs millisecond by millisecond, the update rate to the cloud server may be on the order of a few updates per second, or even less frequent.
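A minimal sketch of this idea is shown below, assuming hypothetical read_sensor and publish_to_cloud functions for the device’s I/O: the loop samples every millisecond but only sends an update when a reading crosses a threshold or a periodic heartbeat is due.

```python
# Threshold-based reporting sketch: sample locally at high rate,
# publish to the cloud only on excursions or a periodic heartbeat.
import time

LOW_THRESHOLD = 18.0    # example band limits (assumed values)
HIGH_THRESHOLD = 42.0
HEARTBEAT_S = 60.0      # still send one reading per minute as a keep-alive

def monitor(read_sensor, publish_to_cloud):
    last_sent = 0.0
    while True:
        value = read_sensor()                 # sampled every millisecond
        now = time.monotonic()
        out_of_band = value < LOW_THRESHOLD or value > HIGH_THRESHOLD
        if out_of_band or (now - last_sent) > HEARTBEAT_S:
            publish_to_cloud({"t": now, "value": value})  # only rare updates leave the device
            last_sent = now
        time.sleep(0.001)
```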
Building AI software
For more complex forms of data, such as audio or video, a greater degree of pre-processing is required. Performing image processing before passing the output to an AI model may not only save on communications bandwidth but also improve the system’s overall performance. For example, de-noising before compression will often improve the efficiency of the compression algorithm; this is particularly relevant to lossy compression techniques, which are sensitive to high-frequency signals. Edge detection can be used with image segmentation to focus the model only on objects of interest, reducing the amount of irrelevant data that needs to be fed to the model during both training and inference.
Although image processing is a complex field, in many cases developers can run these algorithms locally, taking advantage of readily available libraries and eliminating the requirement for high-bandwidth internet connections. A popular example is the open-source computer-vision library OpenCV, which is often used to pre-process data for AI models. Written in C++ for high performance, it can be called from C++, Java, Python and Matlab code, supporting easy prototyping before the algorithms are ported to an embedded target.
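As a sketch of the pre-processing steps described above, the following uses OpenCV’s Python bindings to de-noise a frame, run edge detection and extract regions of interest before anything is passed to a model. The file name, Canny thresholds and minimum region area are illustrative assumptions.

```python
# OpenCV pre-processing sketch: de-noise, detect edges, then crop only
# the regions that contain structure worth sending to the AI model.
import cv2

frame = cv2.imread("inspection_frame.png")        # placeholder file name
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

denoised = cv2.fastNlMeansDenoising(gray, None, 10)   # suppress high-frequency noise
edges = cv2.Canny(denoised, 50, 150)                  # highlight object boundaries

# Keep only regions above an (assumed) minimum area threshold.
contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
regions = [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) > 500]

crops = [denoised[y:y + h, x:x + w] for (x, y, w, h) in regions]
# Only these crops, rather than the full frame, are fed to the model.
```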
By employing OpenCV and processing data locally, integrators also avoid many of the security risks associated with transmitting and storing data in the cloud. A major concern among end users is the privacy and security of data as it is passed up to the cloud. Condition monitoring and industrial inspection are critical processes that need data analysis to be as good as possible, but the data they generate can contain information advantageous to unscrupulous competitors. Although cloud operators have put measures in place to prevent data from being compromised, keeping as much data as possible within each device limits the risk of exposure in the event of a successful hack.
In addition to support for image processing, recent releases of OpenCV incorporate direct support for machine-learning models built using a number of popular frameworks, including Caffe, PyTorch and TensorFlow. One method that has demonstrated success for many users is to use the cloud for initial development and prototyping before porting the model to the embedded platform.
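A sketch of that workflow, assuming a network has already been exported from TensorFlow as a hypothetical frozen_model.pb file, could use OpenCV’s DNN module to run the model locally on a pre-processed frame:

```python
# OpenCV DNN sketch: load a cloud-trained TensorFlow model and run
# inference locally, with no cloud round trip at runtime.
import cv2

net = cv2.dnn.readNetFromTensorflow("frozen_model.pb")   # placeholder model file

frame = cv2.imread("inspection_frame.png")
blob = cv2.dnn.blobFromImage(frame, scalefactor=1.0 / 255, size=(224, 224),
                             mean=(0, 0, 0), swapRB=True, crop=False)
net.setInput(blob)
scores = net.forward()                 # class scores computed on the device
print("predicted class:", scores.argmax())
```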
Performance is the primary concern for any machine-learning model that is ported to an embedded device. As training places very high demands on compute performance, one option is to perform it on local or cloud servers (depending on privacy concerns), with inferencing – when the trained model is fed real-time data – performed on the device itself.
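One common way to realise this split, sketched below, is to export the cloud-trained model to a compact format such as TensorFlow Lite and run only the interpreter on the device. The model file name and input shape here are assumptions for illustration.

```python
# On-device inference sketch: the model was trained elsewhere and exported
# as a TensorFlow Lite file; only the lightweight interpreter runs locally.
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="condition_model.tflite")  # placeholder file
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

def infer(sensor_window: np.ndarray) -> float:
    """Run one inference on a window of real-time sensor data."""
    interpreter.set_tensor(input_details[0]["index"],
                           sensor_window.astype(np.float32).reshape(1, -1))
    interpreter.invoke()
    return float(interpreter.get_tensor(output_details[0]["index"])[0, 0])
```

On constrained targets the same model file can be run with the smaller tflite-runtime package or a C++ interpreter rather than the full TensorFlow installation.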
Where local performance is a requirement, a possible solution is the Avnet Ultra96-V2, which incorporates the Xilinx Zynq UltraScale+ ZU3EG MPSoC. The combination of Arm processor cores with embedded signal-processing engines and a fully programmable logic array provides effective support for DNN models as well as image-processing routines. Reconfiguration provides the ability to handle training locally as well as inference where the application has high-throughput demands.
Inference incurs a lower overhead than training and, for sensor rather than image streams, a microcontroller running the DNN kernel in software may be satisfactory. But even lower-data-rate streams may still be too much for a low-power device to handle. Some teams therefore employ optimisation techniques to reduce the number of calculations needed for inferencing, even though this increases development complexity. AI models often contain a high degree of redundancy: by pruning the connections between neurons and reducing the precision of calculations to 8-bit integer or even lower resolution, it is possible to make significant savings in processing power.
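As an example of one such optimisation, the sketch below applies post-training 8-bit integer quantisation using the TensorFlow Lite converter. The saved-model path and the representative data generator are placeholders, and pruning would be a separate step not shown here.

```python
# Post-training int8 quantisation sketch: shrink a trained model so that
# weights and activations use 8-bit integers for on-device inference.
import numpy as np
import tensorflow as tf

def representative_data():
    # In practice, yield a few hundred real input windows captured in the field
    # so the quantisation ranges match operating conditions.
    for _ in range(100):
        yield [np.random.rand(1, 64).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")  # placeholder path
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

with open("condition_model_int8.tflite", "wb") as f:
    f.write(converter.convert())
```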
Edge devices with AI built-in
An alternative option is to offload the inferencing to a local gateway device. One gateway could handle the inferencing duties for a number of sensor nodes if the per-node throughput is comparatively low. The need to distribute workloads and to port and optimise models from cloud-oriented frameworks all adds development complexity, so another option is to employ a framework that is already optimised for embedded use. The Brainium platform developed by Octonion provides a complete development framework aimed at embedded systems. Its software environment directly supports prototyping using cloud systems, with deployment on IoT devices and gateways built using Avnet’s SmartEdge Agile hardware.
The Brainium software environment coordinates the activities of the device, gateway and cloud layers to form a holistic environment for AI. To make it possible to scale applications down to deeply embedded nodes, the environment supports several AI techniques that are less compute-intensive than DNNs. The gateway software can be deployed on ready-made hardware such as the Raspberry Pi, or on any platform able to run Android or iOS. Where higher performance is needed, Brainium’s cloud layer can be deployed on AWS, Microsoft Azure or custom server solutions.
Schneider Electric and Festo have also incorporated local AI support into control products for specific applications. The former offers its Predictive Analytics application to identify subtle changes in system behaviour that affect performance. In 2018, Festo acquired the data-science specialist Resolto, whose SCRAITEC software learns the healthy state of a system in order to detect any anomaly.
Which approach an original equipment manufacturer or integrator takes when deploying AI will depend on individual circumstances. As well as available processing power, there will be other factors that encourage the use of cloud computing, building new software and/or integrating an edge device to manage AI. For example, users seeking to exploit big-data analytics may want to pull information from many systems into a larger database and therefore favour the use of cloud services, while others will want to ensure high levels of privacy for their data. Where processing offload is a key factor, there are many ways to approach it, from local gateway-based offload engines to the extensive use of cloud computing. What is important is that there are numerous environments that enable easy prototyping and deployment to the architecture of your choice.
Author details: Cliff Ortmeyer is Global Head of Technical Marketing, Farnell