Designed as a dedicated neural-network-optimised DSP, the Vision C5 is said to accelerate all neural network computational layers – convolution, fully connected, pooling and normalisation. This, says Cadence, frees the main vision/imaging DSP to run image enhancement applications independently, while the Vision C5 DSP runs inference tasks.
Pulin Desai, product marketing director for Cadence’s IP group, said: “The C5 is a parallel machine with a special instruction set to take care of neural network processing. We looked at all the instructions and created new ones specifically for neural networks.”
Delivering a computational capacity of 1TMAC/s, the device supports 1024 8bit MACs or 512 16bit MACs and has a VLIW SIMD architecture with 128-way 8bit SIMD or 64-way 16bit SIMD.
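The arithmetic behind the headline figure can be checked quickly. The sketch below assumes a nominal 1GHz clock, which the article does not state; at that rate the 1024 parallel 8bit MACs account for roughly the quoted 1TMAC/s.

```python
# Back-of-the-envelope check of the quoted 1TMAC/s figure.
# Assumption: a nominal 1GHz clock -- the article does not give a clock speed.
macs_8bit = 1024          # parallel 8bit MAC units
macs_16bit = 512          # parallel 16bit MAC units
clock_hz = 1_000_000_000  # assumed 1GHz

tmacs_8bit = macs_8bit * clock_hz / 1e12
tmacs_16bit = macs_16bit * clock_hz / 1e12
print(tmacs_8bit)   # 1.024 -- roughly the quoted 1TMAC/s in 8bit mode
print(tmacs_16bit)  # 0.512 -- half that in 16bit mode
```

Halving the MAC count when moving from 8bit to 16bit operands is typical of SIMD datapaths, since each 16bit multiply occupies two 8bit lanes.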
Described as ‘flexible and future-proof’, the Vision C5 supports variable kernel sizes, depths and input dimensions, as well as several coefficient compression/decompression techniques. “There will be places where people want to compress coefficients and need to do on-the-fly decompression,” Desai added.
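The article does not describe which compression schemes Cadence supports, but linear 8bit quantisation is one common technique of this kind: coefficients are stored as bytes plus a scale factor and decompressed on the fly at inference time. A minimal sketch, purely illustrative:

```python
# Illustrative only: the article does not specify Cadence's actual
# compression schemes. This sketches one common approach -- linear 8bit
# quantisation of float coefficients, decompressed on the fly.

def compress(weights):
    """Quantise float weights to the int8 range plus a per-tensor scale."""
    scale = max(abs(w) for w in weights) / 127.0
    return [round(w / scale) for w in weights], scale

def decompress(quantised, scale):
    """Recover approximate float weights at inference time."""
    return [q * scale for q in quantised]

coeffs = [0.42, -1.27, 0.05, 0.9]
q, s = compress(coeffs)
restored = decompress(q, s)
# storage drops 4x (1 byte vs 4 bytes per coefficient);
# reconstruction error is bounded by half a quantisation step
assert all(abs(a - b) <= s / 2 for a, b in zip(coeffs, restored))
```

Storing coefficients at a quarter of their original size cuts memory bandwidth, which is usually the motivation for decompressing on the fly rather than keeping full-precision weights in memory.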
Occupying less than 1mm² of silicon in a 16nm process, the core is said to perform up to six times faster than commercially available GPUs when running the AlexNet CNN performance benchmark and to run the Inception V3 CNN benchmark up to nine times faster.