Floating point enabled microcontrollers in embedded designs
Floating point units (fpus) can increase the range and precision of mathematical calculations, or complete a given workload in fewer cycles, making it easier to meet real time requirements. Alternatively, by allowing a system to finish its routines sooner and spend more time in sleep mode, an fpu can save power and extend battery life.
With floating point numbers – in the form A x 10^B – the decimal point in the mantissa (A) is free to float; you can place it anywhere, according to what best suits the calculation being carried out (see fig 1). The value of the exponent (B) can then be adjusted to keep the magnitude of the number unchanged. For example, 1.234 x 10^6 is identical to 1234 x 10^3. It is common to present and store numbers in a normalised form, with the decimal point following the first non-zero digit.
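The same idea can be demonstrated in a few lines of C. This is an illustrative sketch: the `float_notation_demo` function is a name invented here, and `frexp()` from the standard library is used to expose the normalised binary form – mantissa times a power of two – that binary floating point hardware actually stores.

```c
#include <math.h>
#include <assert.h>

/* Shows that 1.234 x 10^6 and 1234 x 10^3 denote the same number (the
   point in the mantissa "floats" while the exponent compensates), then
   uses frexp() to expose the normalised binary equivalent:
   value = mantissa * 2^exponent, with the mantissa in [0.5, 1). */
static void float_notation_demo(void)
{
    assert(1.234e6 == 1234e3);          /* same value, point moved */

    int exponent;
    double mantissa = frexp(1.234e6, &exponent);
    assert(mantissa >= 0.5 && mantissa < 1.0);  /* normalised form */
    assert(exponent == 21);             /* 1234000 = 0.58841... * 2^21 */
}
```

Binary hardware normalises on powers of two rather than ten, but the principle is the same as the decimal form described above.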
The value of using floating point maths for processing lies not in the freedom to place the decimal point, but in the range of numbers the notation can represent. The most frequently employed standard is IEEE754, in which a so called single precision number can represent magnitudes up to approximately ±3.4 x 10^38. No matter where the number lies in that range, the mantissa always provides 23 stored bits of resolution – 24 significant bits, counting the implicit leading bit – making it a good match for signal content such as 24bit audio.
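The three fields of the IEEE754 single precision format can be pulled apart directly. The sketch below assumes nothing beyond the standard layout; the struct and function names are invented for illustration.

```c
#include <stdint.h>
#include <string.h>

/* IEEE 754 single precision layout, most significant bit first:
   1 sign bit | 8 exponent bits (biased by 127) | 23 mantissa bits.
   The stored mantissa has an implicit leading 1, so the format carries
   24 significant bits in total. */
typedef struct {
    uint32_t sign;      /* 0 = positive, 1 = negative             */
    uint32_t exponent;  /* biased: actual power is exponent - 127  */
    uint32_t mantissa;  /* the 23 stored fraction bits             */
} ieee754_fields;

static ieee754_fields decompose(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);     /* safe type pun */
    ieee754_fields r = {
        .sign     = bits >> 31,
        .exponent = (bits >> 23) & 0xFFu,
        .mantissa = bits & 0x7FFFFFu
    };
    return r;
}
```

For example, `decompose(1.0f)` yields sign 0, exponent field 127 (unbiased power 0) and mantissa 0, because the leading 1 is implicit.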
The fundamental design challenges in an audio signal chain have not changed since the move from analogue to digital domains. Audio signals have large dynamic ranges with information critical to faithful content reproduction, which must remain uncompromised at both extremes of the signal range. Throughout this path, the signal may be filtered, mixed, level shifted or amplified in multiple processing steps. When the design task was in the analogue domain, designers had to monitor signal levels to keep them above the noise floor, while ensuring that peaks in the content did not get too close to amplifiers' maximum levels.
An analogous situation exists in the digital domain. The data values that represent the content must stay within the overall number range if information is not to be lost by overflow or truncation. Multiple stages of signal processing, especially filtering, involve successive mathematical operations – particularly multiplication – that can shift the absolute value of the data over wide ranges. With a limited number range, care has to be taken that the 'sliding window' of relative values stays well within the available range – using the analogue analogy, out of the noise floor, but below the voltage rail.
Conventionally, floating point dsps have been more expensive than fixed point devices. A frequently used methodology has been to develop a signal processing project in the floating point environment, allowing numerical values to range freely.
Then, when the algorithms are working according to specification, the prototype is converted to fixed point hardware. Part of that process involves inspecting the numerical values the working product generates along the signal chain and introducing scaling factors at appropriate points to keep the values within the usable number ranges. The reverse approach is also valid; project managers can opt to host designs on more expensive target hardware to shorten development and reduce its cost.
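The kind of explicit scaling a fixed point port has to insert can be seen in Q15 arithmetic, a common 16bit fixed point convention (the `Q15` macro and `q15_mul` helper below are illustrative names, not from any particular library).

```c
#include <stdint.h>

/* Q15 fixed point: a signed 16-bit value x represents x / 32768.0,
   covering [-1.0, +1.0). Multiplying two Q15 numbers produces a Q30
   result, so an explicit shift is needed to rescale it back to Q15 --
   exactly the sort of scaling step that must be inserted at each stage
   of a fixed-point signal chain to keep values inside the format's
   narrow window. */
#define Q15(x) ((int16_t)((x) * 32768.0))   /* valid only for x in [-1, 1) */

static int16_t q15_mul(int16_t a, int16_t b)
{
    int32_t product = (int32_t)a * b;       /* Q30 intermediate    */
    return (int16_t)(product >> 15);        /* rescale, truncating */
}
```

For instance, `q15_mul(Q15(0.5), Q15(0.5))` returns `Q15(0.25)`, i.e. 8192. Omitting the shift, or letting a value reach ±1.0, silently corrupts the result – the class of error floating point development avoids.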
Both approaches assume that floating point coprocessors require substantial amounts of silicon and are expensive. However, today's silicon fabrication technology means this is no longer necessarily true. It is now possible to pair a 32bit microprocessor core with a fully IEEE754 compliant fpu in an economically viable device. For example, Atmel's AVR UC3 microcontrollers offer high dsp performance, with fixed point and integer arithmetic support.
The addition of a single precision floating point unit changes the options open to designers in a number of ways. The first is freedom from concerns about number ranges and scaling. Through successive processing steps, the signal can be allowed to take broadly whatever absolute value it needs; the essential information will always be contained within the 24bit representation of the mantissa. The need to attend constantly to scaling disappears.
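The contrast shows up clearly in a filter stage. The biquad below – a standard audio building block, sketched here with invented names and caller-supplied coefficients – contains no scaling or saturation logic at all; intermediates simply take whatever magnitude the arithmetic produces.

```c
/* One direct form I biquad stage in single-precision float. Unlike a
   fixed-point implementation, no rescaling or overflow checks are
   needed between the multiplies and adds: the floating point format
   absorbs wide swings in absolute value. Coefficients are supplied by
   the caller; none are assumed here. */
typedef struct {
    float b0, b1, b2, a1, a2;   /* filter coefficients         */
    float x1, x2, y1, y2;       /* input/output delay elements */
} biquad;

static float biquad_step(biquad *f, float x)
{
    float y = f->b0 * x + f->b1 * f->x1 + f->b2 * f->x2
            - f->a1 * f->y1 - f->a2 * f->y2;
    f->x2 = f->x1;  f->x1 = x;
    f->y2 = f->y1;  f->y1 = y;
    return y;
}
```

Cascading several such stages in fixed point would require a carefully chosen scaling factor between each pair; in floating point the stages simply chain.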
However, the benefits extend beyond design freedom; there is also a gain in throughput, as the fpu can perform, in a handful of clock cycles, operations that consume many tens of cycles in an unaugmented core. In the AVR UC3, the fpu performs most 32bit floating point instructions in one clock cycle and a 32bit MAC in two cycles, rather than the 30 to 50 cycles required without an fpu. That additional throughput could be exploited to increase the amount of signal processing that a microcontroller can accomplish or to reduce the power needed to achieve a given output.
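The cycle counts matter because multiply-accumulate is the inner loop of most dsp work. A sketch of an FIR filter's kernel (illustrative function name) shows why: an N-tap filter costs roughly N MACs per output sample, so a two-cycle hardware MAC versus tens of cycles of software emulation scales directly into throughput.

```c
/* Inner loop of an FIR filter: one multiply-accumulate per tap.
   With a hardware fpu MAC this runs in a few cycles per tap; emulated
   in software, each iteration can cost tens of cycles. */
static float fir_dot(const float *coeff, const float *samples, int taps)
{
    float acc = 0.0f;
    for (int i = 0; i < taps; i++)
        acc += coeff[i] * samples[i];   /* one MAC per tap */
    return acc;
}
```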
There are numerous examples of 'mission critical' system designs in which code must meet real time scheduling deadlines. These benefit from the ability to carry out high precision calculations in few cpu cycles. An engine management unit has to handle sensor inputs with very large dynamic ranges, but the time available to complete each computation cycle is defined by the engine's mechanical rotation.
In precision electric motor control, the ability to handle large number ranges is also valuable, because algorithms require that a number of complex transformations are applied sequentially without loss of data through truncation. However, the time for the computation is set by the motor's rotation period.
The availability of floating point arithmetic in this class of processor alters the dynamics of device selection for many types of design; in particular, consumer audio. The audio signal chain demonstrates the benefit to the designer of employing a floating point architecture. In most cases, the signal source will be a digital representation that can be converted to the required number format in a few cycles. At the end of the signal chain, the audio will be returned to the analogue domain; after normalisation, a d/a converter operates on the fractional value held in the mantissa, which carries all the data it needs. Whatever excursions the signal has taken through the full floating point number range, the exponent is then discarded and maximum precision will have been maintained.
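The final conversion step can be sketched as follows – an illustrative routine (invented name, assumed 24bit two's-complement d/a interface) that maps a normalised float sample onto a 24bit converter word.

```c
#include <stdint.h>

/* End of chain: convert a normalised float sample, nominally in
   [-1.0, 1.0), to a 24-bit two's-complement word for the d/a
   converter. Only the fractional value matters at this point; any
   wide-range excursions earlier in the chain were absorbed by the
   exponent. A clamp guards against stray out-of-range samples. */
static int32_t float_to_s24(float x)
{
    if (x >  1.0f) x =  1.0f;
    if (x < -1.0f) x = -1.0f;
    int32_t s = (int32_t)(x * 8388608.0f);  /* scale by 2^23 */
    if (s > 8388607) s = 8388607;           /* +1.0 would hit 2^23 */
    return s;
}
```

Full scale maps to ±2^23, matching the 24 significant bits the single precision mantissa has carried through the whole chain.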
Designers working with signal capture and processing systems in sectors such as industrial systems or medical instrumentation will find the integration of a general purpose mcu core with an fpu changes their options. Previously, a signal path with processing implemented in floating point inevitably meant a pairing of mpu or mcu with a dedicated dsp, with cost, size and power implications. The option of placing the design on a dsp alone would often be ruled out because the dsp lacked the ability to host the necessary control functions.
The markets usually associated with microcontrollers have not previously had access to floating point mathematics. The very term fpu has implied complex, high end and often dsp based systems. Now that process technology has made fpus accessible to a wider range of projects, engineers will gain access to the benefits of computational precision, wide dynamic range and simplified coding that floating point math brings.
Haakon Skar is AVR marketing director with Atmel.