Alongside these new devices, the company has introduced a new engagement program, the Cortex-X Custom Program, which is intended to give its partners more flexibility and scalability when increasing performance.
The Cortex-X Custom Program allows for customisation and differentiation beyond the traditional roadmap of Arm Cortex products, according to the company.
The Cortex-X1 is the program’s first CPU and is, according to Arm, its most powerful Cortex processor to date, delivering a 30% improvement in peak performance over the current Cortex-A77 CPU. It also offers a 22% improvement in single-thread integer performance over the Cortex-A78.
“This short high-performance burst is best for reactivity and responsiveness when using devices, enabling the highest performance ever for smartphones and large screen devices,” said Arm. “The Cortex-X1 offers 2x machine learning performance improvements over the Cortex-A77. This is part of our wider push for more local compute performance.”
The Cortex-A78 CPU aims to combine performance gains with greater efficiency.
According to Arm, “The Cortex-A78 is our most efficient Cortex-A CPU ever designed for mobile devices, providing double digit improvements for sustained performance.” While it has the same architecture as the Cortex-A77, it has a modified microarchitecture that increases performance per watt and performance per area, and it provides a sustained 20% performance improvement over the Cortex-A77 within a 1W power budget.
Arm said that it has maximised efficiency by reducing structures that deliver little performance relative to their area, such as the L1-I and L1-D caches. “We have optimised existing structures to consume less power, such as the branch prediction structures. This leads to 4% less power for performance and 5% less area,” the company added.
At a cluster level, a DynamIQ cluster of 4x Cortex-A77 CPUs and 4x Cortex-A55 CPUs can be upgraded to 4x Cortex-A78 CPUs and 4x Cortex-A55 CPUs to provide a 20% sustained performance improvement in 15% less space. By introducing the Cortex-X1, however, it is possible to boost peak performance by over 30%.
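Taken together, the two cluster-level figures imply a gain in sustained performance per unit of area. The short Python sketch below works that out; the function name is illustrative, and the resulting ratio is a derived estimate that assumes both percentages apply to the same cluster comparison. It is not a figure quoted by Arm.

```python
def relative_performance_density(perf_gain_pct: float, area_reduction_pct: float) -> float:
    """Ratio of new to old performance-per-area, given a performance gain
    and an area reduction, both expressed as percentages."""
    new_performance = 1 + perf_gain_pct / 100   # relative to a baseline of 1.0
    new_area = 1 - area_reduction_pct / 100     # relative to a baseline of 1.0
    return new_performance / new_area


# Figures from the cluster comparison: 20% more sustained performance, 15% less space.
density_ratio = relative_performance_density(20, 15)
print(f"Sustained performance per unit area: {density_ratio:.2f}x the Cortex-A77 cluster")
```

On these assumptions the upgraded cluster would deliver roughly 1.4 times the sustained performance per unit of area.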
The Cortex-X1 has been designed as a higher-performance processor and, due to various microarchitecture upgrades, it has more resources than the Cortex-A78: its decode bandwidth is increased by 25% to five instructions decoded per cycle, and MOP cache throughput has been increased by 33% to 8 MOP/cycle. On the Cortex-X1, the Neon engine gets two additional pipes, doubling its compute capacity over the Cortex-A78. The Cortex-X1 also supports 64kbyte L1 and up to 1Mbyte L2 cache.
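The percentage increases quoted for decode bandwidth and MOP cache throughput can be turned back into the implied Cortex-A78 baselines. The Python sketch below does this arithmetic; the helper name is illustrative, and the baselines are inferred from the quoted percentages rather than stated in the announcement.

```python
def implied_baseline(new_value: float, percent_increase: float) -> float:
    """Return the value before a given percentage increase was applied."""
    return new_value / (1 + percent_increase / 100)


# Cortex-X1 figures: 5 instructions/cycle after a 25% increase,
# and 8 MOP/cycle after a 33% increase.
decode_a78 = implied_baseline(5, 25)
mop_a78 = implied_baseline(8, 33)

print(f"Implied Cortex-A78 decode bandwidth: {decode_a78:.2f} instructions/cycle")
print(f"Implied Cortex-A78 MOP cache throughput: {mop_a78:.2f} MOP/cycle")
```

The results come out at four instructions per cycle and about six MOP per cycle, with the small remainder on the second figure reflecting that 33% is a rounded percentage.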
While last year saw the release of the Mali-G77, the new Mali-G78, based on Arm’s Valhall architecture, is said to deliver a further 25% improvement in graphics performance.
The device supports up to 24 cores, and these advances have been made possible by an asynchronous top level, tiler enhancements and improved fragment dependency tracking, according to Arm.
The company has also announced a new sub-tier of GPUs. The move, in response to demand from partners, sees the introduction of the Mali-G68. According to Arm, “It supports up to six cores and inherits all the latest Mali-G78 features.”
The Ethos-N78 is a neural network processor intended to address the expanding demand for machine learning (ML). “Compared to the Ethos-N77 it is capable of delivering greater on-device ML capabilities but with up to 25% more performance efficiency,” said the company. “The Ethos-N78 also offers unprecedented levels of configurability, with available configurations starting at 1Top/s on up to 10Top/s.”