Virtualisation in the embedded systems market
Chris Edwards outlines the benefits of virtualisation in the embedded systems market.
Embedded operating system vendors have spent the past few years adding support for hardware virtualisation to their portfolios. They have anticipated a move to more complex systems, where two or more operating systems have to run side by side, without conflicting unnecessarily over I/O and memory.
Conceptually, virtualisation dates back to the mid 1960s, when IBM worked with the Massachusetts Institute of Technology to develop one of the first time sharing computer systems. This research resulted in the VM/370 operating system, which could not only host its own applications, but also other operating systems – including IBM's big money spinner MVS. VM/370 provided the image of a complete hardware mainframe to a guest operating system that, potentially, had more memory and resources than the hardware on which VM/370 ran.
The trade off was time. As VM/370 would have to keep spooling the virtual memory to disk, the actual performance would be nothing like the real thing. But in many respects, the guest OS saw a 'real' machine. As interest in mainframes died away, so too did virtualisation – until recently. The reason for its rebirth is concern over data centre costs: virtualisation makes it more economical to consolidate multiple server computers into one blade.
Because many servers run at less than full capacity, consolidation not only cuts costs, but also reduces power consumption. Since a server draws close to full power even when running an idle loop, bringing multiple environments onto the same machine is more energy efficient. A similar principle has been applied to low end smartphones, where virtualisation, invisibly to the consumer, is being used to cut the bill of materials.
Normally, a smartphone splits responsibilities between at least two microprocessors. An RTOS running on the baseband processor coordinates the interface between the phone and the wireless network; this OS is rarely encountered by the user. The other processor, usually a higher performance model, is intended to run user visible OSs, such as Android, Apple's iOS or Windows Phone 7.
Similar to the situation for compute servers, the baseband processor is often underused in a multicore handset, leaving spare capacity for the applications oriented OS. According to Open Kernel Labs, a supplier of hypervisors for mobile phones, it is feasible to run an Android installation on a 200MHz ARM926 with 128Mbyte of memory alongside the baseband RTOS.
A further argument for using virtualisation in embedded systems is software portability. Application writers can develop code for a generic canonical machine and expect that code to run on any platform that can emulate the behaviour of the original machine, even if the physical implementation has completely different registers and memory maps. This should reduce the cost of porting code to new hardware as only low level firmware needs to change – something on which phone vendors with large hardware portfolios are keen.
Conventionally, embedded systems have primarily used forms of virtualisation for resilience. An example is the ARINC 653 executive for avionics systems (NE, 22 February 2011), in which guest OSs run in time slices to ensure one environment does not starve another of processor time.
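As a rough, hedged sketch of that time sliced approach, the fragment below lays out a fixed major frame of the kind an ARINC 653 style executive works from. The structure, field names and the 20ms split between a critical and a less critical partition are illustrative assumptions, not any vendor's configuration format.

```c
/* Illustrative only: a fixed major-frame schedule in the ARINC 653 style.
 * Each guest gets a guaranteed window in a repeating frame, so no
 * environment can starve another of processor time. */
#include <stdint.h>

struct partition_window {
    int      partition_id;   /* which guest OS runs in this slot */
    uint32_t offset_us;      /* start offset within the major frame */
    uint32_t duration_us;    /* guaranteed execution time */
};

/* Assumed example: a 20ms frame split between a critical partition (0)
 * and a less critical one (1). */
static const struct partition_window schedule[] = {
    { 0,     0, 15000 },
    { 1, 15000,  5000 },
};

static const uint32_t major_frame_us = 20000;
```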
Because it offers the ability to seal off guest OSs from physical I/O, virtualisation has acquired the image of making embedded systems more secure. This is only true if the hypervisor itself is secure. But it is easy to introduce 'sneak paths' that allow attacks, particularly through device drivers.
Because drivers need low level access to peripherals, they often need to run in a privileged mode. If they are not sufficiently protected against attacks such as buffer overflows, they can provide a means to take control of the entire machine by acting as a trap door into the hypervisor functions.
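The kind of flaw involved can be very small. The hypothetical fragment below shows a privileged driver copying a request into a fixed buffer using a caller supplied length with no bounds check; the names are invented for illustration, but an attacker who controls the length can overwrite memory in privileged space and, from there, reach the hypervisor's functions.

```c
#include <stdint.h>
#include <string.h>

#define CMD_BUF_SIZE 64

/* Hypothetical privileged driver entry point. */
void driver_handle_request(const uint8_t *payload, size_t len)
{
    uint8_t cmd[CMD_BUF_SIZE];

    /* BUG: no check that len <= CMD_BUF_SIZE before copying,
     * so a long payload overflows cmd and corrupts privileged memory. */
    memcpy(cmd, payload, len);

    /* ... parse cmd and program the peripheral ... */
    (void)cmd;
}
```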
One advantage that a hypervisor has in terms of security is that its codebase can often be much smaller than that of a full OS, making it more amenable to intensive testing and formal verification. And the hypervisor can block calls to peripherals that are deemed potentially harmful or can remove access to ports that should not be available to unprivileged code.
By locking down parts of the hardware, it becomes possible to use a wider range of software in systems that previously only had access to a highly restricted approved list of OSs and applications. For example, industrial control and military systems can make use of the user interface development tools available for Android and similar products, but leave vital functions to a dedicated RTOS.
Green Hills' CTO David Kleidermacher uses the example of a field radio system, where the bulk of the operator's interface can be built using Android, but control over the encryption keys is maintained by other software, so they can be zeroed out reliably if there is a danger the unit will fall into enemy hands.
Hypervisors divide into Type 1 and Type 2 implementations. A Type 1 hypervisor runs directly on the hardware; a Type 2 hypervisor runs on top of a host OS, which provides access to the I/O hardware and other services. For many embedded applications, the Type 1 hypervisor makes more sense and is likely to become more common, particularly as chip vendors add more support for the technique. However, a Type 2 implementation can be easier to use if the hypervisor is not needed for security.
It allows a guest OS, such as Linux, to be added as a task underneath a conventional RTOS without too many changes to the scheduling scheme. There is, however, a further consideration: the execution model. You need a way to isolate the guest code from the actual hardware. The simplest way to do this is through emulation. This makes it extremely difficult for code running inside a guest OS to 'break out' and affect the machine as all instructions have to be fed to a software interpreter to run.
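A minimal sketch, assuming a made up two instruction machine, shows why break out is so hard under emulation: the guest's instructions only ever pass through the interpreter loop and only touch the state the interpreter chooses to expose.

```c
#include <stdint.h>

enum { OP_ADD = 0, OP_HALT = 1 };

/* The whole 'machine' the guest can see: a program counter, four
 * registers and 256 bytes of memory, all owned by the interpreter. */
struct vcpu {
    uint32_t pc;
    uint32_t regs[4];
    uint8_t  mem[256];
};

void run_guest(struct vcpu *v)
{
    for (;;) {
        if (v->pc + 2 >= sizeof v->mem)
            return;                                 /* ran off the end of guest memory */

        uint8_t op = v->mem[v->pc];                 /* fetch */
        switch (op) {                               /* decode and execute */
        case OP_ADD: {
            uint8_t rd = v->mem[v->pc + 1] & 3;
            uint8_t rs = v->mem[v->pc + 2] & 3;
            v->regs[rd] += v->regs[rs];
            v->pc += 3;
            break;
        }
        case OP_HALT:
        default:
            return;                                 /* anything unknown stops the guest */
        }
    }
}
```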
However, this approach is also much slower, because every operation needs to be mapped to the target processor. In the desktop environment, virtualisation products such as VMware use dynamic binary translation, where software is compiled 'on the fly' into safe native machine code. The initial compilation takes longer but, once translated and cached, repeated loops of the code will run more quickly. Calls to the OS are translated into direct calls to the hypervisor.
Running the guest code directly on the processor hardware is clearly going to be faster. When the code makes an OS call using, say, an int instruction on an x86, the hypervisor traps the software interrupt and emulates the function on behalf of the guest, mapping any accesses to registers to the actual hardware, assuming the procedure is allowed by the hypervisor's policies.
The 'trap and emulate' approach demands hardware support from the processor as there needs to be a way to prevent the guest code from accessing sensitive memory addresses directly. These can be protected using virtual memory techniques – any attempt to read or write to an area in privileged space can be trapped by not mapping it directly into the guest's memory. Instead, the memory management unit (MMU) registers a fault when the memory block is accessed, passing control to the hypervisor.
There are clear performance penalties to the trap and emulate technique, although it is generally faster than full emulation. Take the example of accesses to a network card. First, the guest OS prepares a packet to transmit in a memory buffer. It writes the start address of the buffer to a register in the network card and the length to a second register, then issues the 'go' command to a third. The network card should then read the packet, transmit it and raise a 'ready' interrupt to the OS.
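In code, the guest's side of that sequence might look like the sketch below, written against a hypothetical network card with three memory mapped registers; the base address and register layout are invented for illustration. Under trap and emulate, each of the three writes faults into the hypervisor.

```c
#include <stdint.h>

/* Hypothetical MMIO layout for the network card described above. */
#define NIC_BASE     0x40001000u
#define NIC_TX_ADDR  (*(volatile uint32_t *)(uintptr_t)(NIC_BASE + 0x0))
#define NIC_TX_LEN   (*(volatile uint32_t *)(uintptr_t)(NIC_BASE + 0x4))
#define NIC_TX_GO    (*(volatile uint32_t *)(uintptr_t)(NIC_BASE + 0x8))

void nic_send(uint32_t buffer_addr, uint32_t length)
{
    NIC_TX_ADDR = buffer_addr;   /* trap 1: start address of the packet buffer */
    NIC_TX_LEN  = length;        /* trap 2: length of the packet */
    NIC_TX_GO   = 1;             /* trap 3: tell the card to transmit */
    /* the card later raises a 'ready' interrupt, handled elsewhere */
}
```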
Each register access is really an access to a memory location held in the hypervisor's space rather than the guest's, and so triggers a trap to the hypervisor. This implies a task and mode switch into the hypervisor, which then accesses the relevant register in the actual network card. And each of these traps slows the system down.
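On the other side of each of those traps sits a handler along the lines of the sketch below, again with invented names: the MMU has been set up so the card's registers are not mapped into the guest, each write faults into the hypervisor, and the handler checks its policy before touching the real hardware on the guest's behalf.

```c
#include <stdint.h>

/* Range of card registers this guest is allowed to reach (assumed policy). */
#define NIC_REGS_START 0x40001000u
#define NIC_REGS_END   0x4000100Cu

struct guest { int id; };   /* per-guest state, trimmed to nothing here */

/* Perform the access on the real device on the guest's behalf. */
static void write_real_device(uint32_t addr, uint32_t value)
{
    *(volatile uint32_t *)(uintptr_t)addr = value;
}

/* Called when the guest's write to an unmapped MMIO address faults. */
int handle_mmio_write_fault(struct guest *g, uint32_t fault_addr, uint32_t value)
{
    (void)g;
    if (fault_addr < NIC_REGS_START || fault_addr >= NIC_REGS_END)
        return -1;                          /* outside policy: refuse, fault the guest */

    write_real_device(fault_addr, value);   /* emulate the register access */
    return 0;                               /* hypervisor then resumes the guest */
}
```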
Paravirtualisation, more commonly found in embedded environments than in the desktop world, allows the guest OS to make calls directly to the hypervisor instead of relying on software traps. This demands the OS be tweaked subtly to remove direct hardware calls and replace them with accesses to the hypervisor. The method only works where you have access to the OS code or where a vendor decides to support a hypervisor directly, but the reward is runtime speed.
In the case of the network interface, instead of triggering a software interrupt for every single register access, the guest OS can assemble all the necessary data, such as the buffer's start address and the packet length, in one buffer and then make a single call to the hypervisor.
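A hedged sketch of that paravirtualised transmit path is shown below. The descriptor layout is an assumption, and hypercall_send stands in for whatever entry mechanism a real hypervisor exposes, whether a dedicated instruction or a well known trap number.

```c
#include <stdint.h>

/* Everything one transmit needs, gathered into a single descriptor. */
struct tx_descriptor {
    uint32_t buffer_addr;   /* guest-physical start of the packet */
    uint32_t length;        /* packet length in bytes */
};

/* Assumed hypervisor entry point, provided by the hypervisor rather than here. */
extern int hypercall_send(const struct tx_descriptor *desc);

int nic_send_paravirt(uint32_t buffer_addr, uint32_t length)
{
    struct tx_descriptor desc = { buffer_addr, length };
    return hypercall_send(&desc);   /* one transition instead of three traps */
}
```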
But there is another way to handle I/O: map the hardware directly into the guest's memory space so that it has direct access. This has the benefit of speed – it avoids trapping into the hypervisor – but removes many of the benefits of virtualisation, such as the ability to share I/O resources between guests and to have a monitoring function watching these sensitive parts of the machine. However, in the context of an embedded system, if an RTOS based application needs direct access to a network port or part of the user interface and there is no reason to share that resource, then this can make sense.
The higher performance that comes from direct access to I/O is why chipmakers have steadily been adding hardware support for virtualisation: the aim is to allow that access while still supporting virtualisation. In the x86 environment, Intel and AMD have subtly different implementations, but broadly similar approaches.
ARM has recently added hardware virtualisation support to its v7 architecture, implemented on processors such as the Cortex-A15. Freescale has introduced virtualisation features to its QorIQ processors and made it possible for code running in a guest OS to perform functions directly, such as locking data in cache lines to aid performance if the hypervisor allows it. Interrupts can be tied to virtual machines so they are held until the hypervisor allows the guest OS to run.
One of the main techniques being used today is the I/O MMU. As its name suggests, the I/O MMU borrows the concept of address virtualisation from memory management techniques.
Again, it's not new, although it has only recently appeared on hardware platforms such as the x86 or QorIQ. The concept dates back to the mid 1970s, envisaged then for the failed Multics OS that provided the inspiration for the way in which privileged modes are implemented in Intel's x86 architecture.
The I/O MMU allows the addresses used for functions such as DMA to be mapped into the guest's address space so the OS can access them directly. The hardware takes care of the address translation from the guest's space to that of the physical hardware. Using an I/O MMU, requests can be controlled so that only certain memory ranges are accessible from a particular guest. Any access outside that range will be blocked and force a trap to the hypervisor.
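In software terms, the check the I/O MMU performs in hardware amounts to something like the sketch below, assuming a single DMA window per guest; the structure and names are illustrative only.

```c
#include <stdint.h>

/* One per-guest DMA window (assumed single-window layout). */
struct dma_window {
    uint64_t guest_base;   /* start of the range the guest may use */
    uint64_t size;         /* length of that range */
    uint64_t host_base;    /* where it really sits in physical memory */
};

/* Returns the translated physical address, or 0 if the access is outside
 * the window and must be blocked, forcing a trap to the hypervisor. */
uint64_t iommu_translate(const struct dma_window *w,
                         uint64_t guest_addr, uint64_t length)
{
    if (guest_addr < w->guest_base ||
        guest_addr + length > w->guest_base + w->size)
        return 0;

    return w->host_base + (guest_addr - w->guest_base);
}
```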
There is a problem with this technique: only one guest can safely have access to a particular peripheral, because there is nothing outside the guest OS to manage transactions and to prevent different guests from conflicting with each other.
For example, with no knowledge of the environment outside it, a guest OS might issue a command to disable interrupts so that it can write atomically to a network card. However, that call will only disable interrupts routed to that guest. It could be interrupted at any time by the hypervisor in favour of another guest, which might attempt to access the same hardware.
If that guest writes a start address to the network card and control then passes back to the original guest, which issues the 'go' command, the packet will be copied from entirely the wrong place.
The answer, called single root I/O virtualisation, is to carry the hardware virtualisation concept further – and this has happened with PCI Express. The bus can provide virtual copies that are isolated from one another, but which are mapped to a single interface that is managed by a hardware I/O controller whose job is to prevent the type of race conditions that could present themselves with a simple I/O MMU structure. The big problem with single root virtualisation is the patchy support for it: typically, you will only find support in higher end hardware and rarely in silicon aimed at the embedded market.
However, as virtualisation becomes more common, you can expect a number of these features to migrate down because of the impact they have on performance, particularly in multicore processors, where OSs tied to different cores have to talk to shared I/O resources, such as networking ports. A hypervisor can be used to manage the machine, providing near bare metal access to I/O for the RTOS partition while exercising a greater degree of I/O control over OSs, such as Linux or Windows, that are intended to run user visible applications.
The hypervisor makes it possible to partition chips with more than two cores into virtual multiprocessors; an RTOS might run on one core, with multiprocessor aware environments such as Windows running on a virtual dual core machine the hypervisor presents to it. As large multicore processors proliferate, you can expect virtualisation to become much more prevalent in the embedded systems space.
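A hedged sketch of such a static partitioning for a hypothetical quad core part is shown below; the configuration structure and the core assignments are assumptions for illustration rather than any particular hypervisor's format.

```c
#include <stdint.h>

/* Illustrative static partition table: which cores, how much memory and
 * what kind of I/O access each guest is given. */
struct partition_cfg {
    const char *name;       /* label for the guest */
    uint32_t    core_mask;  /* physical cores dedicated to this partition */
    uint32_t    ram_mb;     /* memory handed to the guest */
    int         direct_io;  /* 1 = near bare metal access to its devices */
};

static const struct partition_cfg partitions[] = {
    { "rtos",    0x1, 64,  1 },   /* core 0: RTOS with direct I/O */
    { "windows", 0x6, 512, 0 },   /* cores 1-2: presented as a virtual dual core machine */
};
```

However a particular hypervisor expresses it, the principle is the same: the cores, memory and I/O are carved up once, and each guest only ever sees the machine it has been given.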