According to Mythic, the M1108 AMP heralds an exciting new era for AI, delivering for the first time an analogue compute solution that achieves best-in-class performance and power with accuracy comparable to digital devices.
The M1108 AMP opens up possibilities for device deployment at the edge in a broad range of applications and markets, including smart home, AR/VR, drones, video surveillance, smart city, and automation on the factory floor.
Running AI models at the edge rather than in the cloud offers significant advantages to manufacturers: designs can be simpler, privacy is significantly enhanced, and the overall user experience is superior. However, it also means deploying millions of devices, which creates its own challenges: the devices must be low-cost and small, with low latency, high performance, and low power consumption. Digital inference solutions based on SoCs, TPUs, CPUs, GPUs, and FPGAs can meet some of these challenges, but inherent limitations in memory, clock speeds, and process technology force difficult trade-offs. According to Mythic, only processors capable of high-performance analogue compute, such as the M1108 Mythic AMP, can address them all.
“This is a significant inflection point in the industry. We are delivering technology that was previously thought to be impossible,” said Mike Henry, co-founder, CEO, and chairman of Mythic. “Our Analog Compute Engine eliminates the memory bottlenecks that plague digital solutions by efficiently performing matrix multiplication directly inside the flash memory array itself. The high performance and low power of the Mythic AMP combine to open up AI technology to broader application areas and address product categories that are currently inaccessible to comparable digital solutions.”
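As a rough illustration of the idea in Henry's quote, the Python sketch below models a matrix-vector multiplication in which the weights stay in place, the products for each output are summed as they would be on a shared analogue line, and an ADC digitises the result. It is a conceptual toy, not Mythic's design: the function names, the 8-bit ADC resolution, and the assumption that weights and inputs lie in [-1, 1] are all illustrative choices that do not come from the announcement.

```python
def quantize(value, full_scale, bits=8):
    """Toy ADC model: clamp to [-full_scale, full_scale] and round to the nearest step.

    The signed, uniform quantiser and the 8-bit default are assumptions.
    """
    levels = 2 ** (bits - 1) - 1          # 127 steps on each side of zero for 8 bits
    step = full_scale / levels
    clamped = max(-full_scale, min(full_scale, value))
    return round(clamped / step) * step


def exact_matvec(weights, inputs):
    """Reference digital result: y = W x computed in floating point."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def analog_matvec(weights, inputs, adc_bits=8):
    """Hypothetical in-memory multiply: sum the products per output, then digitise.

    weights[i][j] stands for the value stored in the cell linking input j to
    output i. Values are assumed to lie in [-1, 1], so the largest possible
    sum has magnitude len(inputs); that is used as the ADC full-scale range.
    """
    full_scale = float(len(inputs))
    outputs = []
    for row in weights:
        summed = sum(w * x for w, x in zip(row, inputs))  # stands in for analogue summation
        outputs.append(quantize(summed, full_scale, adc_bits))
    return outputs


if __name__ == "__main__":
    W = [[0.5, -0.25, 0.75],
         [-1.0, 0.5, 0.25]]
    x = [1.0, 0.5, -0.5]
    print("exact :", exact_matvec(W, x))
    print("analog:", analog_matvec(W, x))
```

In this model the only difference from the exact result is the ADC rounding error, which is at most half a quantisation step per output. A real analogue array would also contend with noise and device variation, which the sketch ignores.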
The M1108 integrates 108 AMP tiles, each with a Mythic Analog Compute Engine (Mythic ACE) featuring an array of flash cells and ADCs, a 32-bit RISC-V nano-processor, a SIMD vector engine, SRAM, and a high-throughput Network-on-Chip (NoC) router. In addition, four control tiles provide a high-bandwidth PCIe 2.0 interface to a system host processor.
With 108 AMP tiles, the M1108 delivers up to 35 trillion operations per second (TOPS), enabling the power-efficient execution of complex AI models such as ResNet-50, YOLOv3, and OpenPose Body25 on a single chip with extremely low latency.
The typical power consumption of the M1108 while running complex AI models at peak throughput is approximately 4 W. Because it is built on mature 40 nm process technology and requires no external DRAM or SRAM, the M1108 has up to a 10-times cost advantage over comparable digital architectures.
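The headline figures can be combined into a back-of-the-envelope efficiency estimate. The Python sketch below takes the quoted 35 TOPS peak, approximately 4 W typical power, and 108 tiles at face value. The derived numbers are not published in the announcement, and because they pair a peak throughput with a typical power figure they are best read as an optimistic upper bound. The function and constant names are illustrative.

```python
PEAK_TOPS = 35.0        # "up to 35 TOPS", as quoted
TYPICAL_POWER_W = 4.0   # "approximately 4 W", as quoted
AMP_TILES = 108         # number of AMP tiles, as quoted


def tops_per_watt(peak_tops, power_w):
    """Compute efficiency implied by the quoted throughput and power."""
    return peak_tops / power_w


def tops_per_tile(peak_tops, tiles):
    """Average peak throughput per tile, assuming an even split across tiles."""
    return peak_tops / tiles


if __name__ == "__main__":
    print(f"{tops_per_watt(PEAK_TOPS, TYPICAL_POWER_W):.2f} TOPS/W")
    print(f"{tops_per_tile(PEAK_TOPS, AMP_TILES):.3f} TOPS per tile")
```

On these assumptions the chip works out to roughly 8.75 TOPS/W and about 0.32 TOPS per tile.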