The last non volatile technology to have made it big is flash, which evolved from earlier work on ultraviolet erasable (eprom) and electrically erasable read only memories (eeproms). Although the precise mechanisms for reading and erasing data differ between these types of memory, they all rely on the ability to trap electrons on the gate of a transistor.
In 1967, Dawon Kahng of Bell Laboratories filed the first patent on a technique to make electrons stay for long periods on the floating gate of a transistor and so act as a memory device, although the precise structure he used would never become commercially feasible. A year before, Stanford Ovshinsky of Energy Conversion Devices filed the first patent on phase change materials – alloys that can flip from amorphous to crystalline states, depending on how they are heated and cooled. The technology attracted the attention of Intel's Gordon Moore. The fledgling company had started as a memory manufacturer, rather than a processor supplier, and phase change memory seemed to be a superior alternative to the magnetic core memory used at the time in mainframe computers. While both were non volatile, the entirely solid state phase change architecture could be made far more cheaply than the mechanically intensive magnetic memories – if it worked.
Because they depend on being heated by current, phase change memories are potentially power hungry. And the energy that passes through the memory cell over the long term threatens to destroy it. The metalloid alloys in phase change memories are disrupted by the heat and current that pass through them every time they are programmed or read.
To 'reset' a phase change element to its amorphous, high resistance state – which is used as a logic 'zero' – a relatively high pulse of current is needed. Because this pulse is of the order of hundreds of microamps and is forced through a cell that is just tens of nanometres across, the current density is immense: close to 10⁹ A/cm². But this state is metastable – the material tends to return to a crystalline phase if heated to a lower level for longer periods. This potentially limits retention time, as the memory cells will, over time, revert to a conductive state. Careful choice of materials can limit this effect, but the high temperatures needed to reset a phase change memory lead to movement of the individual metalloids in the alloy. And as the materials are highly prone to electromigration, over time the memory elements become harder to reset and fail.
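The scale of that current density is easy to check with a rough calculation. The sketch below is illustrative only: the 300µA reset current, the circular contact shape and the contact diameters are assumptions chosen to match 'hundreds of microamps' and a cell dimension in the nanometre to tens of nanometres range, not figures from any particular device.

```python
import math


def current_density_a_per_cm2(current_amps: float, contact_diameter_nm: float) -> float:
    """Return the current density, in A/cm^2, through a circular contact."""
    radius_cm = (contact_diameter_nm * 1e-7) / 2.0  # 1 nm = 1e-7 cm
    area_cm2 = math.pi * radius_cm ** 2
    return current_amps / area_cm2


# Assumed reset current: 300 microamps (hypothetical, for illustration).
RESET_CURRENT_AMPS = 300e-6

# Assumed contact diameters in nanometres (hypothetical).
for diameter_nm in (10, 20, 50):
    density = current_density_a_per_cm2(RESET_CURRENT_AMPS, diameter_nm)
    print(f"{diameter_nm}nm contact: {density:.2e} A/cm^2")
```

Under these assumptions, the result climbs by more than an order of magnitude as the contact shrinks from 50nm to 10nm, which is why the smallest heater contacts see the most severe stress.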
Although these problems with phase change memories were first encountered in the 1970s, they have never quite gone away. Yet, with concern growing over the ability of flash to continue scaling much past the 22nm generation, phase change has the advantage of being much easier to use: it behaves much more like an sram in terms of reads and writes. Phase change memory does not need the complex erase and programming sequences that flash does, although on chip circuitry has to generate the right set and reset pulses.
The problem for flash is that there is less and less space in which to trap enough electrons to produce a reliably readable memory, especially as the memory cells are now so close together that they easily disturb each other when programmed or even read. Kinam Kim of Samsung estimated at IEDM in late 2010 that flash would reach its ultimate limit by the 10nm generation. While that is almost a decade away, the tricks needed to get flash to perform well beyond 20nm could become prohibitively expensive.
One option is to say goodbye to 2d scaling and start to build flash transistors on top of each other.
At the VLSI Technology Symposium in 2007, Toshiba unveiled a vertical structure for NAND flash devices: placing multiple transistors on top of each other along a string that reaches up from the surface of the wafer. Samsung developed an experimental vertical flash array two years later, tagging it the 'terabit cell array transistor'. Hynix unveiled its own approach to the vertical flash string at last year's IEDM, claiming it overcomes problems with the previous architectures in which charges can 'spread' from one flash transistor to another, potentially corrupting stored bits. The downside of the Hynix structure is that it demands a larger area on the die. To achieve the same density as the Samsung or Toshiba memories, it will have to be built much taller than either.
Whichever vertical structure you pick, it is not going to be easy or cheap to make: it involves laying down multiple layers of silicon and oxide or nitride insulators, flattening each one using chemical-mechanical polishing. Narrow cylinders then need to be etched out of the sandwich, lined with thin barrier layers and filled with a conductive silicon core to form the channel for each transistor.
The advantage of the vertical structure is that – in principle – dense memories can be made on existing production lines, saving memory makers the cost of investing in the more sophisticated lithography equipment they would need to continue 2d scaling. If 3d flash makes it to production, it could force commercial phase change memory into a niche.
The only phase change memory found in a mass-market design so far is a 512Mbit drop in replacement for NOR flash that has shipped in a small number of Samsung GSM phones, according to teardowns by Chipworks and TechInsights. The NOR flash layout does not have a 3d counterpart being developed, so is more vulnerable to competition from new memory technologies.
Some of the lifetime issues with phase change may be solved by work conducted by IBM's Almaden laboratories. The researchers there have analysed how the elements within the phase change alloy move around under stress. For example, tellurium – used in one of the main alloys – will tend to move away from the main current path. However, the team found that a type of current pulse that is normally bad for these memories will partially reverse this process and extend the lifetime of a memory cell.
Although Samsung has shipped at least some phase change parts, the company's Kim was more enthusiastic over the prospects for another heat triggered memory: metal oxide resistive ram (ReRAM). He is not alone: researchers around the world have seized on ReRAM as a potentially better bet than phase change or flash memory.
A big attraction is material simplicity. Resistance switching was first found in materials such as lead zirconate titanate (PZT) – a material that has been used in ferroelectric memories, another one time contender for low cost non volatile storage. Unfortunately, PZT is hard to combine with cmos processes – the main reason why ferroelectric memories are today restricted to low density, specialised applications, such as data storage for energy meters.
Once device researchers observed similar resistive switching in binary oxides such as tantalum pentoxide – which has been used in dram capacitors – they became much more enthusiastic about the prospects for ReRAM. Things seemed to get better with the realisation that ReRAMs could be formed into 3d stacks. Although Intel and Numonyx have demonstrated that phase change memories should also be stackable – metal stack capacitors are already commonplace in cmos – ReRAM looks to be a simpler, more manufacturable option.
The surge of interest among researchers in resistive metal oxide memory has puzzled some of those working in other areas. IBM researcher Geoffrey Burr wryly observed at IEDM last year: "Phase change seems to have fallen a little off the excitement radar. But I'm not sure where the excitement over resistive ram comes from."
Seemingly more manufacturable, ReRAM is facing plenty of problems. One is that nobody is entirely sure how it works. On the one hand, this is attractive. There seems to be a variety of switching mechanisms at work in different material combinations – extending all the way to memristor behaviour.
"Memristor memories are all part of a wide spectrum of options," claims Dmitri Strukov of the University of California at Santa Barbara, a member of the team at Hewlett-Packard Laboratories that found a way to realise practical memristor devices almost 40 years after they were first proposed by Professor Leon Chua of the University of California at Berkeley.
The problem is that making progress towards a reliable memory suitable for long term storage is tough if you do not know how it works. And further work is needed before ReRAM becomes viable commercially. Many of today's research level ReRAMs can hold data only for hours or days at room temperature.
The most common switching mechanism seems to be based on changes to the resistivity of cone shaped conductive filaments running almost all the way through an insulating oxide. According to work presented by Sematech at last year's IEDM, both temperature and electric fields seem to act on the filaments, triggering small scale chemical reactions that release oxygen from the metals to create more conductive forms or, for the reverse reaction, reoxidise the metals.
Similar to phase change memories, the high temperatures that the filaments reach could prove a problem for the long term reliability of ReRAMs. Better understanding of what causes reaction in ReRAMs may make it possible to reset a cell to its oxidised, high resistance, state without resorting to high temperatures induced by current.
If both phase change and ReRAM fail, there is still magnetic ram (MRAM). This uses the magnetic properties of a sandwich of metal oxides to switch between high and low resistance states. Like ferroelectric memory, MRAM was once a promising replacement for dram and flash – especially when integrated onto SoCs. But Freescale Semiconductor spinout EverSpin is one of a very select group of manufacturers still involved in MRAM development. EverSpin sells its MRAMs into military and aerospace applications, primarily because the technology is highly resistant to upsets caused by radiation.
A big problem for the type of MRAM produced by EverSpin is that it does not scale well to smaller geometries because of the way that programming disturbs adjacent bits. And the energy needed to write bits into the MRAM array remains higher than for sram, although it compares well with flash.
In 2005, Grandis and Sony discovered that a slightly different mechanism based on spin polarisation could be used to control the magnetic orientation of layers in an MRAM cell. The spin-torque transfer MRAM was born and it kickstarted development of a new generation of devices that may prove to be the answer for the long awaited unified memory.
The spin torque transfer MRAM faces big problems. The materials degrade at the temperatures needed to lay down the dielectric and metal layers used for on chip interconnect, making the memory hard to integrate on SoCs. The relative size of the memory cell will limit its use in standalone memories and, if other 3d techniques take over, threaten its use on chip.
Meanwhile, 3d integration is already reshaping the industry. Stacked memory packages are commonplace in mobile phones as gigabytes of both non volatile RAM and DRAM are squeezed into a tiny space.
A limiting factor in the interface between a processor and off chip memory is bandwidth. Transactions have to be forced through a small number of relatively long circuit board traces and package wire bonds: their impedance and capacitance limit the maximum clock speed at which these traces can run.
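A first order way to see why capacitance caps the signalling rate is to treat the driver and its load as a simple RC low pass filter, with a cut-off at 1/(2πRC). The sketch below rests on that simplification, and the 50Ω source impedance and the two load capacitances are hypothetical values picked purely for illustration; real board traces behave as transmission lines and need a fuller model.

```python
import math


def rc_bandwidth_hz(resistance_ohms: float, capacitance_farads: float) -> float:
    """Return the -3dB bandwidth of a first order RC low pass filter."""
    return 1.0 / (2.0 * math.pi * resistance_ohms * capacitance_farads)


# Assumed driver source impedance (hypothetical).
SOURCE_IMPEDANCE_OHMS = 50.0

# Assumed load capacitances (hypothetical): a longer link with more
# capacitance compared with a much shorter, lower capacitance link.
loads_farads = {
    "long link, 5pF": 5e-12,
    "short link, 0.5pF": 0.5e-12,
}

for label, capacitance in loads_farads.items():
    bandwidth = rc_bandwidth_hz(SOURCE_IMPEDANCE_OHMS, capacitance)
    print(f"{label}: {bandwidth / 1e6:.0f} MHz")
```

In this simple model, bandwidth is inversely proportional to capacitance, so cutting the load by a factor of ten raises the usable frequency by the same factor.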
One option is to drill holes through chips and fill them with conductive copper to provide a more direct route for signals that cross from one chip to another – or even pass straight to another in the stack. High performance processor makers see this as a viable alternative to putting large quantities of embedded dram on-chip. Commercial dram is denser and cheaper. Using these so called through silicon vias, they can achieve processor to memory bandwidths as high as those from on chip memory and get the boost of much larger third or fourth level caches.
But there is a catch. Heat does not dissipate well through the stack. While the processor can sit on top, so it gets the benefit of a heatsink, the memories underneath get warmer than they would otherwise, which could reduce reliability. The situation is not necessarily better for low power processors because they do not have the benefit of the forced air cooling that servers employ.
So, if they do not need to pack as much as possible into the smallest space, manufacturers are looking again at the multichip module, last used in volume 15 years ago by Intel and others when their need for larger caches outpaced Moore's Law for a while. The terminology has changed in the meantime. Now, the talk is of silicon interposers, rather than substrates.
The interposers may be passive or include active components such as transistors. To maximise the number of connections, they will be made on slightly older cmos processes than the main chips in the stack. Because interposers allow chips to be mounted side by side, cooler running memory devices can sit in one stack on the interposer, while the processor takes up the rest of the area.
How the rise of 3d stacking will affect memory will depend on relative cost. Antun Domic, general manager of the implementation group at Synopsys, reckons the incremental cost of stacking as the technique moves into volume will be around $150 per wafer – or around 5% of the cost of a single fully processed wafer. Some, such as Jan Vardaman, president of Techsearch, think this could be an underestimate. However, if 3d is to be used in volume, its incremental cost will need to be a lot less than 10% per stacked wafer.
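The arithmetic behind those figures can be laid out explicitly. The sketch below takes Domic's $150 and 5% estimates from the text, derives the wafer cost they imply, and then shows how the percentage shifts if the true stacking cost turns out to be a multiple of that estimate. The multipliers are hypothetical; the article gives no figure for how large any underestimate might be.

```python
def implied_wafer_cost(incremental_cost: float, fraction_of_wafer: float) -> float:
    """Return the wafer cost implied by a stacking cost and its share of that wafer."""
    return incremental_cost / fraction_of_wafer


def incremental_fraction(incremental_cost: float, wafer_cost: float) -> float:
    """Return the stacking cost as a fraction of one fully processed wafer."""
    return incremental_cost / wafer_cost


# Figures quoted in the text: $150 per wafer, around 5% of wafer cost.
ESTIMATED_STACKING_COST = 150.0
ESTIMATED_FRACTION = 0.05

wafer_cost = implied_wafer_cost(ESTIMATED_STACKING_COST, ESTIMATED_FRACTION)
print(f"Implied wafer cost: ${wafer_cost:,.0f}")

# Hypothetical multipliers on the estimate, to test sensitivity.
for multiplier in (1.0, 2.0, 3.0):
    cost = ESTIMATED_STACKING_COST * multiplier
    share = incremental_fraction(cost, wafer_cost)
    print(f"Stacking at ${cost:,.0f} per wafer: {share:.0%} of wafer cost")
```

On these numbers, the estimate only has to be out by a factor of two for stacking to reach the 10% level that the text suggests is too high for volume use.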
If manufacturing costs come down – and memory stacking for mobile phones has demonstrated how cheaply it can be done – it will give memory makers another degree of freedom. It may be cheaper to stick with flash and simply use more chips than to bear the cost of moving to completely new memory technologies that, because of thermal problems during manufacture or low yields, never quite become clear winners.