Power optimisation for low power SoCs targeting mobile devices
Mobile devices are packing in more features than ever, pushing performance and bandwidth requirements to new levels. A plethora of new and evolving mobile devices, such as netbooks, tablets and smartphones, has introduced a convergence of computer, communication and consumer functionalities into a single device.
The advanced baseband and application SoCs which power these mobile devices need to offer ever higher processing horsepower while still being powered by a battery. Reducing the power consumed by SoCs is essential if mobile devices are to keep adding performance and features. As a result, power optimisation has become a top priority in SoC design.
The total power consumed by a digital design consists of dynamic power and static (leakage) power. Dynamic power – the power consumed during device operation – is associated directly with processing load. Static power is the power consumed by the existing logic without any relation to the operation being performed.
For modern SoCs, power optimisation requires close attention to both dynamic power and leakage (see fig 1).
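The split between dynamic and static power can be made concrete with the usual first-order CMOS model. The following is a minimal sketch: the formulas are the standard textbook approximations rather than anything specific to this article, and every number in the example is a hypothetical placeholder.

```python
def dynamic_power(alpha, c_eff, vdd, freq):
    """Switching power in watts: activity factor * switched capacitance * Vdd^2 * clock frequency."""
    return alpha * c_eff * vdd ** 2 * freq


def static_power(vdd, i_leak):
    """Leakage power in watts: supply voltage * total leakage current."""
    return vdd * i_leak


def total_power(alpha, c_eff, vdd, freq, i_leak):
    """Total power is the sum of the dynamic and static components."""
    return dynamic_power(alpha, c_eff, vdd, freq) + static_power(vdd, i_leak)


# Hypothetical block: 15% activity, 2 nF switched capacitance, 1.2 V, 500 MHz, 10 mA leakage
busy = total_power(0.15, 2e-9, 1.2, 500e6, 10e-3)
idle = total_power(0.0, 2e-9, 1.2, 500e6, 10e-3)  # no switching: only leakage remains
print(f"busy: {busy * 1e3:.1f} mW, idle: {idle * 1e3:.1f} mW")
```

With no switching activity the dynamic term disappears but the leakage term does not, which is why both components need attention.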
Key optimisation techniques include:
• Efficient architecture: the most important step in reducing power is to support the most demanding use case efficiently. An efficient architecture will result in lower dynamic and leakage power.
• Clock gating: shutting off unused logic.
• Mixed threshold voltages (Vt): a mix of fast and slow cells reduces leakage.
• Multivoltage: using different voltage levels for different modules inside the SoC reduces both dynamic and leakage power.
• Dynamic voltage control: turning off unused functional blocks reduces leakage. Changing voltage levels adaptively reduces both dynamic and leakage power.
An efficient architecture is the place to start. For computationally intensive applications like wireless baseband and multimedia, architectural efficiency is largely related to application specific instructions and parallelism. For that reason, on the one hand, modern processors are becoming more and more specialised for their target applications: communication processors, graphics processors and video processors, for instance.
On the other hand, these modern processors offer higher levels of parallelism, issuing several instructions per cycle and processing wider data elements at once, through architecture types such as VLIW (very long instruction word), SIMD (single instruction, multiple data) and vector processors. For instance, modern VLIW DSPs, such as TI's C64x+ and CEVA-X1643, support the execution of up to eight instructions in every processor cycle.
The challenge with a highly parallel architecture is that not all execution units are needed at all times. To keep power down, you want to limit power consumed by idle logic. Thus, modern architectures disable the clock signal for unused execution units – a process known as clock gating. The clock distribution network can be responsible for up to 70% of dynamic power consumption, so clock gating can bring considerable savings by turning clocks off when they are not required.
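A rough way to see what clock gating buys is to compare the clock power of a parallel machine with and without gating over a short instruction schedule. The sketch below is illustrative only: the unit names and per-unit clock power figures are hypothetical, and it ignores the overhead of the gating cells and the ungated trunk of the clock tree.

```python
# Hypothetical per-unit clock power in mW while the unit's clock is toggling
CLOCK_POWER_MW = {"alu0": 4.0, "alu1": 4.0, "mac0": 6.0, "mac1": 6.0, "load_store": 5.0}


def average_clock_power(schedule, gated):
    """Average clock power in mW over a schedule.

    schedule: list of sets, one per cycle, naming the units that cycle uses.
    gated: if True, a unit's clock toggles only in the cycles where it is used.
    """
    total = 0.0
    for used_units in schedule:
        active = used_units if gated else CLOCK_POWER_MW.keys()
        total += sum(CLOCK_POWER_MW[unit] for unit in active)
    return total / len(schedule)


# Four cycles of a hypothetical VLIW schedule; only the last one uses every unit
schedule = [
    {"alu0", "load_store"},
    {"mac0", "mac1"},
    {"alu0"},
    {"alu0", "alu1", "mac0", "mac1", "load_store"},
]
ungated = average_clock_power(schedule, gated=False)
gated = average_clock_power(schedule, gated=True)
saving = 1 - gated / ungated
print(f"ungated {ungated} mW, gated {gated} mW, saving {saving:.0%}")
```

The less often the full width of the machine is used, the larger the saving, which is why gating matters most for highly parallel architectures.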
The effectiveness of clock gating depends on the processor's pipeline. Well-designed pipelines detect the required execution units in the initial pipeline stages and allow the insertion of clock gating without changing functionality, hence maximising the benefit of the approach. The pipelines also spread memory accesses across several pipeline stages, enabling the use of slower (and thus lower-power) memories.
Mixed Vt designs offer another way to save power. Logic cell libraries typically come in three variants: fast but leaky low Vt cells; slow but low leakage high Vt cells; and standard Vt cells that balance speed and leakage. Because this method is very effective and involves no design changes (only implementation techniques), it has become common. However, it becomes less effective at smaller geometries (65nm and beyond).
The general idea for power reduction is to minimise the usage of fast and leaky low Vt cells. SoC designers can optimise their design by using these power hungry cells only for critical paths, while using power efficient cells everywhere else. This approach can achieve dramatic power savings. For example, CEVA reduced the leakage power of its CEVA-X1622 core by 81% simply by moving to a mixed-Vt design without any significant effect on performance (assuming the core is built on a 65nm LP process).
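The idea of reserving low Vt cells for critical paths can be sketched as a simple slack-driven swap. This is a deliberately simplified illustration, not how a production synthesis or place-and-route tool works: it assumes each cell sits on exactly one path, and the leakage and delay figures are hypothetical rather than taken from any real library.

```python
# Hypothetical library figures: leakage per cell, and the delay a high Vt cell adds over low Vt
LOW_VT = {"leakage_nw": 100.0, "extra_delay_ps": 0}
HIGH_VT = {"leakage_nw": 10.0, "extra_delay_ps": 30}


def assign_vt(paths):
    """paths: dict of path name -> (cell count, timing slack in ps with all low Vt cells).

    Returns dict of path name -> number of cells swapped to high Vt,
    swapping as many cells as the path's slack can absorb.
    """
    swapped = {}
    for name, (cells, slack_ps) in paths.items():
        affordable = slack_ps // HIGH_VT["extra_delay_ps"]
        swapped[name] = max(0, min(cells, affordable))
    return swapped


def leakage_nw(paths, swapped):
    """Total leakage in nW for a given high Vt assignment."""
    total = 0.0
    for name, (cells, _) in paths.items():
        high = swapped[name]
        total += high * HIGH_VT["leakage_nw"] + (cells - high) * LOW_VT["leakage_nw"]
    return total


paths = {"critical": (20, 0), "near_critical": (20, 150), "relaxed": (60, 5000)}
all_low = leakage_nw(paths, {name: 0 for name in paths})
mixed_assignment = assign_vt(paths)
mixed = leakage_nw(paths, mixed_assignment)
print(mixed_assignment, f"leakage {all_low:.0f} nW -> {mixed:.0f} nW")
```

In this toy example the critical path keeps all of its fast cells, so the worst-case timing is unchanged while most of the leakage is removed from the paths that have slack to spare.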
There are also ample opportunities to cut power by modifying the operating voltage. In the past, an entire SoC would be tuned to the needs of the most critical component. Modern technologies allow us to change this requirement. Today, different modules inside an SoC can be powered by multiple voltage supplies with different operating voltages.
Lowering the voltage for the modules where speed is less critical can lead to significant reductions in both dynamic and leakage power. Because dynamic power scales with the square of the supply voltage but only linearly with clock frequency, a decrease in voltage yields a larger power reduction than an equivalent decrease in frequency, and multivoltage design is becoming a key method.
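The difference between scaling voltage and scaling frequency follows from the first-order relation that dynamic power is proportional to V² × f. A minimal sketch, using that standard approximation (which is an assumption here, not something the article states) and an arbitrary 20% reduction:

```python
def relative_dynamic_power(v_scale, f_scale):
    """Dynamic power relative to nominal, using the approximation P ~ V^2 * f."""
    return v_scale ** 2 * f_scale


freq_only = relative_dynamic_power(1.0, 0.8)  # 20% lower frequency: about 0.80 of nominal
volt_only = relative_dynamic_power(0.8, 1.0)  # 20% lower voltage: about 0.64 of nominal
both = relative_dynamic_power(0.8, 0.8)       # both reduced: about 0.51 of nominal

for label, value in [("frequency only", freq_only), ("voltage only", volt_only), ("both", both)]:
    print(f"{label}: {value:.2f} of nominal dynamic power")
```

In practice the two are linked, since a lower supply voltage also lowers the maximum frequency a module can sustain, so the voltage of a module can only be reduced as far as its timing requirements allow.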
Designing an SoC for multivoltage operation is challenging, with new complications created at every step in the process. Taking advantage of multivoltage technology is much easier when SoC designers use a solution like the CEVA-XC323, which features multiple voltage domains. The CEVA-XC323 is equipped with an innovative power scaling unit, which lets different processor units run from different voltage supplies and makes full use of a multivoltage design approach (see fig 2).
Introducing multiple power domains into the design opens up a new world of power saving techniques that allow dynamic control of the voltage supplied to the different SoC blocks. The advantage of such dynamic methods is that they let the SoC power management unit minimise power at run time. Turning off unused modules can reduce leakage significantly; however, it can take hundreds of clock cycles to turn off and restore a module.
SoC designers must take this delay into account when implementing power down capabilities. SoC designers must also consider how they will go about disabling and enabling modules. Several hardware and software approaches are available, but each approach involves tradeoffs. For example, a hardware based approach can increase a design's footprint by 20%.
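One way to account for the shutdown and restore delay is a break-even check: a module is only worth powering down if the idle period is longer than the transition time and the leakage saved exceeds the energy spent on the transition. The sketch below is a simplified model under those assumptions; the function name, parameters and figures are all hypothetical.

```python
def should_power_gate(idle_cycles, leakage_mw, clock_mhz, transition_cycles, transition_energy_uj):
    """Return True if powering the module down for this idle period saves energy.

    transition_cycles: cycles needed to turn the module off and restore it.
    transition_energy_uj: energy spent on that off/on transition, in microjoules.
    """
    if idle_cycles <= transition_cycles:
        return False  # the module could not be shut down and restored in time
    gated_seconds = (idle_cycles - transition_cycles) / (clock_mhz * 1e6)
    saved_uj = leakage_mw * 1e-3 * gated_seconds * 1e6  # W * s = J, then J -> uJ
    return saved_uj > transition_energy_uj


# Hypothetical module: 5 mW leakage, 500 MHz clock, 400-cycle transition costing 0.05 uJ
for idle in (300, 1_000, 100_000):
    decision = should_power_gate(idle, 5.0, 500.0, 400, 0.05)
    print(f"idle for {idle} cycles -> power down: {decision}")
```

Short idle periods fail the check even when they are longer than the transition itself, because the leakage saved does not repay the cost of switching the module off and on.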
The final power optimisation technique is adaptive voltage and frequency scaling. This technique modulates the operating frequency and voltage of each module on a use case basis. The idea is straightforward: different use cases exercise different parts of the hardware to a greater or lesser degree. When a use case puts a low load on a module, that module can have its frequency and voltage reduced.
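At its simplest, this amounts to choosing, for each module, the lowest frequency and voltage pair that still covers the load of the current use case. The following is a minimal sketch of that selection; the operating point table and the function name are hypothetical, and a real power management unit would also have to handle the transition between points.

```python
# Hypothetical operating points for one module: (frequency in MHz, supply voltage in V),
# sorted by ascending frequency
OPERATING_POINTS = [(100, 0.9), (200, 1.0), (400, 1.1), (600, 1.2)]


def select_operating_point(required_mhz):
    """Pick the lowest operating point whose frequency covers the required load."""
    for freq_mhz, vdd in OPERATING_POINTS:
        if freq_mhz >= required_mhz:
            return freq_mhz, vdd
    raise ValueError(f"no operating point can sustain {required_mhz} MHz")


# Hypothetical use cases and the clock rate each one needs from this module
use_cases = {"audio playback": 60, "video decode": 350, "video encode": 580}
for name, required in use_cases.items():
    freq_mhz, vdd = select_operating_point(required)
    print(f"{name}: needs {required} MHz -> run at {freq_mhz} MHz, {vdd} V")
```

The number of entries in such a table is itself a design decision: more points track the load more closely but add levels that must be characterised and verified.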
Although the concept is simple, implementing this technique is difficult. SoC designers must consider how many voltage levels to support, how to move between different frequencies and voltages and how to manage the transition time required to switch levels. Power scaling also makes it more difficult to perform timing analysis, because a slow clock tree leaf in one voltage level may become the fastest leaf in other voltage levels.
SoC designers can save themselves a tremendous amount of effort by using a solution that is predesigned with frequency and voltage scaling support.
Eyal Bergman is director of product marketing and Ran Snir is VLSI department manager at CEVA.