While the majority of AI technology seems focused on the cloud, a growing amount of intelligence is gathering at the edge, interfacing with the physical world. We’ll call this ‘Edge AI’.
Two primary factors are driving this Edge AI paradigm: the generation of virtually unlimited amounts of data and the increasing availability of computing resources, even down at the microcontroller level. By shifting computing to the edge, we can drastically reduce or eliminate the latency of pushing data to the cloud, and alleviate growing privacy concerns by keeping the generated data local. Whilst advanced computing resources are a prerequisite for Edge AI, as they are for AI in the cloud, software is the key to unlocking its real potential. AI and its subset, Machine Learning (ML), have been around since 1959; the recent resurgence has been driven by exponential advances in algorithms and in the software tools and frameworks that support these technologies.
One size does not fit all
From a hardware standpoint, all AI computing platforms have one thing in common: resource limitations. Even in datacentres, where performance seems infinite, energy and cost dictate reality. These limitations are even more pronounced in Edge AI, for reasons we are all familiar with. As a result, we recognise a three-axis trade-off between cost, decision accuracy and inference time (user experience).
Whilst cost is a factor we have encountered in many domains, accuracy and inference time are unique to AI, and all three factors are closely interrelated. Specifically, increased accuracy implies longer training times and the need for more data during training. It can also mean more complex AI models, which in turn require higher-performing devices, more memory and, ultimately, more energy. Conversely, a decrease in inference time (i.e. an enhanced user experience) may also demand higher-performing devices, perhaps paired with a more relaxed requirement on accuracy.
Each of these factors - cost, accuracy and inference time - is highly dependent on the application in which the AI is deployed. Consider, for example, Edge AI performing food recognition in a microwave oven. This is a case where the inference time can be in the range of 200-500 milliseconds (or even longer), comfortably within the range of human tolerance.
A high-performance microcontroller would be well suited to this type of application. And although a compromise on accuracy in a microwave would not have dire consequences, increased memory capacity would allow a larger model that ‘stores’ more food items (i.e. classifications) and minimises the chance that the user receives a ‘food item not recognised’ message.
A doorbell security system can make do with inference times of the same order of magnitude as the microwave, but in this application the Edge AI system recognises a person, so accuracy is extremely critical. In addition to requiring larger memory capacity, increased accuracy and more classifications to ‘decide’ between could also mean an increased number of computations, thereby requiring a higher-performance applications processor to meet acceptable inference times.
The security doorbell is a good example of where an OEM might offer low- and high-end products, differentiated by the number of faces each can recognise within the acceptable time window.
At the other end of the product spectrum, Edge AI is involved in matters of life and death. Consider the different requirements of an application such as autonomous driving, which demands that many accurate decisions be made simultaneously every second to protect the vehicle’s occupants and those in the surrounding environment. Inside the car, another Edge AI application monitors drivers’ reactions through their eyes - a real-time function that requires the high-performance computing of an applications processor.
For Edge AI applications, computing performance requirements can and will vary widely, implying the need for a wide range of processing devices, from MCUs to high-end applications processors. What shouldn’t change is the software environment used to build those applications.
Software is the synapse of AI
For Edge AI to proliferate and become mainstream technology, it cannot remain the bastion of maths PhDs. We must simplify Edge AI to the point where models can be trained and inference engines developed and deployed with a Python script or a pull-down menu, rather than burdening developers with creating complex mathematical algorithms.
The good news is that there are many ‘programming’ options available or becoming available. You might ask why the word programming appears in quotes. The interesting aspect of programming for AI, often referred to as Software 2.0, is that it is not about employing traditional programming methods; instead, it is about using neural networks and classical ML libraries that someone else (individuals, open-source groups, or even corporations, e.g. Google with TensorFlow) has developed. In Software 2.0, programming is more about determining weights and parameters.
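To make that idea concrete, here is a minimal, illustrative sketch of Software 2.0 in practice, using the Keras API built into TensorFlow: the developer describes the network’s shape and lets training determine the weights, rather than hand-coding a recognition algorithm. The data, dimensions and class count below are placeholders, not a real application.

```python
import numpy as np
import tensorflow as tf

# Placeholder stand-in data: 256 labelled samples of 16 sensor readings,
# each belonging to one of four illustrative classes.
x_train = np.random.rand(256, 16).astype("float32")
y_train = np.random.randint(0, 4, size=(256,))

# Describe the network's structure; no recognition logic is written by hand.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(16,)),
    tf.keras.layers.Dense(4, activation="softmax"),
])

# The 'programming' is choosing the loss and optimiser, not writing algorithms.
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Training determines the weights - the 'program' in Software 2.0.
model.fit(x_train, y_train, epochs=5, verbose=0)
```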
The growing number of options (most of them open source) indicates that Edge AI is headed towards mainstream deployment, but some consolidation is needed to make it easier for the typical user to select an approach. In the meantime, software technologies for Edge AI are expanding almost daily and will continue to do so for the foreseeable future.
We are witnessing this expansion in many forms, including model frameworks, inference engines, neural network optimisation techniques, converter tools and data augmentation methods.
TensorFlow probably leads the pack in terms of popularity and functionality (and has become a de facto standard), but others include TensorFlow Lite, MXNet, PyTorch, Caffe2, Keras, and the list goes on. Converter tools allow users to move between their favourites, for example converting from TensorFlow to ONNX (Open Neural Network Exchange) or NNEF (Neural Network Exchange Format), both industry-standard formats intended to reduce fragmentation.
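As a sketch of what such a conversion can look like, the open-source tf2onnx package exports a TensorFlow/Keras model to an ONNX file in a few lines; the model here is a stand-in, and the exact API varies between tf2onnx versions.

```python
import tensorflow as tf
import tf2onnx

# A stand-in for a trained Keras model to be exchanged via ONNX.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(16,)),
    tf.keras.layers.Dense(4, activation="softmax"),
])

# tf2onnx traces the TensorFlow graph and writes an industry-standard
# .onnx file that other frameworks and inference engines can consume.
spec = (tf.TensorSpec((None, 16), tf.float32, name="input"),)
onnx_model, _ = tf2onnx.convert.from_keras(
    model, input_signature=spec, output_path="model.onnx")
```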
On the inference engine and framework front, there is also a wide variety of open-source options, depending on the target application. For example, on NXP’s i.MX products running Android, users can take advantage of the Android NN API. For users working with Arm-based platforms (mobile or embedded), Arm NN parses and translates neural network frameworks into a high-performance inference engine, drawing on the Arm Compute Library (also open source) for its optimised software functions.
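To illustrate the deployment end of the pipeline, here is a minimal sketch of running a single inference through TensorFlow Lite’s interpreter, a lightweight open-source engine typical of what runs on embedded targets. The model file name is a placeholder, and the random input merely stands in for real sensor or camera data.

```python
import numpy as np
import tensorflow as tf

# Load a pre-converted edge model (placeholder path) into the TFLite engine.
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# One illustrative input frame, shaped and typed to the model's expectations.
frame = np.random.rand(*input_details[0]["shape"]).astype(
    input_details[0]["dtype"])

interpreter.set_tensor(input_details[0]["index"], frame)
interpreter.invoke()  # the step whose latency the inference-time axis measures

scores = interpreter.get_tensor(output_details[0]["index"])
print("predicted class:", int(np.argmax(scores)))
```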
Bringing the Cloud to the Edge
The key takeaway is to proceed cautiously, as the options are at various stages of maturity and capability. We will always need the mathematical gurus to take care of the complex functions ‘under the hood’ - for example, craftily implementing advanced neural network optimisations or creating better ways of accurately training models.
But for Edge AI to reach its full potential, embedded system developers building Edge AI products need Software 2.0 tools that provide a comprehensive ML development environment; such environments must be targeted not just at the computational units (e.g. processor cores, AI accelerators), but also at the SoC architecture level, to deliver the most efficient implementations.
We’re now at the crossroads where Edge Artificial Intelligence meets genuine intelligence: where the mathematics meets practical deployment.
Author details: Geoff Lees is SVP and GM, Microcontrollers Business, NXP