As we approach the 100th anniversary of the invention of the dynamic driver speaker, it’s as good a time as any to reflect on the ubiquitous, familiar coil-based transducer that reproduces sound in nearly every true wireless stereo (TWS) earbud currently on the market.
This venerable technology has come a long way thanks to advancements in physics, materials, and simulation. However, the transduction concept, or put more simply, the manner in which sound is produced in speakers, has remained exactly the same as it was nearly a century ago - long before the first earbud was ever created.
In a dynamic driver speaker, the speaker diaphragm moves at audible frequencies and directly displaces air to create the sound we hear. Other transducer innovations - electrostatic, planar magnetic, and now even solid-state MEMS transducers - have moved the industry forward, but they all share the same “push-air” concept, along with its associated product design and performance limitations.
A few startups, xMEMS included, have leveraged the superior characteristics of silicon and piezoelectric materials to enable previously unachievable bandwidth, performance, and musical detail in micro speakers.
But it's what’s on the horizon for MEMS transducers that promises to be the most revolutionary yet.
Sound from Ultrasound
Sound from ultrasound has been a research topic since the 1960s but has never achieved the acoustic performance required for broad commercial appeal, until now.
The key innovation is an ultrasonic amplitude modulation transduction principle, which converts analogue audio signals into ultrasonic air pulses that our ears hear as rich, detailed, bass-heavy, high-fidelity sound. To do so, a companion controller/amplifier ASIC modulates the audio onto an ultrasonic carrier signal, which drives the MEMS speaker to create acoustic air pulses.
These air pulses generate variable pressure within the ear canal that we then hear as sound in the audible frequency range due to the high acoustic impedance of the ear.
Better Sound, a More Flexible Package
When this companion design is dropped into a TWS earbud, the most noteworthy benefit is the ability to reach 140dB SPL output at 20Hz - the target sound pressure level for high performance Active Noise Cancellation (ANC) applications in vented, or leaky, TWS designs.
Vented designs are quite common in the TWS space because they prevent the occlusion effect, which occurs when a blockage of the ear canal produces internal resonance and amplifies sounds made by the wearer’s body during activities such as walking, running, talking or chewing.
One downside of vented designs is decreased SPL at low frequencies, sometimes as much as 20dB. Because an SPL of 120dB at 20Hz is most desirable for ANC performance, 140dB SPL occluded (not vented) is needed to account for the 20dB loss.
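The budget above is simple decibel arithmetic, and a short Python sketch makes the scale of the gap concrete. The 120dB target and 20dB vent loss come from the figures quoted here; the 20µPa reference pressure is the standard definition of dB SPL in air. The function names are my own and purely illustrative.

```python
P_REF = 20e-6  # standard reference pressure for dB SPL in air, in pascals


def spl_to_pascals(spl_db: float) -> float:
    """Convert a sound pressure level in dB SPL to RMS pressure in pascals."""
    return P_REF * 10 ** (spl_db / 20)


def required_occluded_spl(target_vented_db: float, vent_loss_db: float) -> float:
    """Occluded SPL needed so the target level survives the vent loss."""
    return target_vented_db + vent_loss_db


target_db = 120.0     # desired SPL at 20Hz in the vented earbud
vent_loss_db = 20.0   # worst-case low-frequency loss through the vent

needed_db = required_occluded_spl(target_db, vent_loss_db)
pressure_ratio = spl_to_pascals(needed_db) / spl_to_pascals(target_db)

print(f"occluded SPL needed: {needed_db:.0f} dB SPL")
print(f"pressure ratio versus target: {pressure_ratio:.1f}x")
```

Because SPL is a logarithmic measure of pressure, making up a 20dB loss means the transducer has to generate ten times the sound pressure, which is why the extra headroom is so hard to reach at 20Hz.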
Today’s MEMS transducers, despite significantly improving sound quality over dynamic drivers, can only achieve 120dB SPL at 20Hz on their own, meaning they would still need to be paired with a dynamic driver to cancel out low frequency noise. Ultrasonic MEMS speakers have no such limitations.
In addition, ultrasonic speakers have extremely low latency - theoretically as low as 3µs - and a flat phase response with virtually no delay across the entire frequency band. This low latency and flat phase response, along with the high SPL output, result in a speaker that can more accurately reproduce today’s advanced sound formats, including high-resolution and spatial audio.
Also, one less appreciated aspect of ANC performance is the requirement of a relatively soft speaker diaphragm in a dynamic driver to reach 140dB SPL at 20Hz. The downside of this soft surface is that it's not effective at blocking ambient noise on its own.
In contrast, an ultrasonic MEMS speaker generates ultrasonic airflow pulses that are unaffected by external noise at ambient frequencies. Furthermore, the speaker is made from extremely stiff silicon with a resonance frequency more than 5x the highest audible frequency. This means an ultrasonic MEMS speaker can effectively act as a form of Passive Noise Isolation by minimising noise ingress - enhancing ANC instead of detracting from it.
Ultrasonic speakers also have a very low acoustic THD of below 1% from 20Hz to 20kHz, and below 3% out to 40kHz. The low THD, low latency, and flat phase response ensure clearer and more precise sound in TWS earbuds, with each instrument and voice sounding more distinct and natural.
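For readers less familiar with the metric, THD is commonly defined as the combined level of the harmonics relative to the fundamental test tone. The sketch below uses that textbook definition; the function name and the harmonic amplitudes are hypothetical values chosen for illustration, not xMEMS measurement data.

```python
import math
from typing import Sequence


def thd_percent(fundamental: float, harmonics: Sequence[float]) -> float:
    """THD relative to the fundamental, in percent.

    fundamental: RMS amplitude of the test tone.
    harmonics:   RMS amplitudes of the 2nd, 3rd, ... harmonics.
    """
    if fundamental <= 0:
        raise ValueError("fundamental amplitude must be positive")
    # Harmonics add in power, so sum the squares before taking the root.
    distortion = math.sqrt(sum(h * h for h in harmonics))
    return 100.0 * distortion / fundamental


# Hypothetical example: 2nd and 3rd harmonics at 0.3% and 0.4% of the fundamental.
example_thd = thd_percent(1.0, [0.003, 0.004])
print(f"THD = {example_thd:.2f}%")
```

In this made-up case the two harmonics combine to 0.5%, which would sit inside the sub-1% band quoted above for 20Hz to 20kHz.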
On the design front, the speaker/controller combination is actually smaller in diameter than comparable dynamic drivers and a fraction of their thickness, meaning brands can deliver improved audio performance without compromising form factor. On top of that, the ultrasonic speaker does not have any acoustic back volume requirements, which reduces mechanical design and acoustic tuning time. The result will be TWS earbud packages shrinking in size or including new features that were previously too large to integrate.
Above: xMEMS' Cypress Earbud
The Real Deal?
I know all these improvements sound great on paper, but many designers will probably be wondering if there’s any real proof behind them. Is this a real product or some pie-in-the-sky theoretical project? It’s the former and it’s called xMEMS Cypress.
Cypress is driven by a companion IC, called Alta. Operating closer to AM radio frequency than audible sound, Cypress and Alta work in concert to convert analogue voltage into acoustic air pulses. Alta modulates the incoming baseband signal into a Double-Sideband Suppressed-Carrier (DSB-SC) AM signal, and Cypress demodulates in the acoustic domain, generating high frequency air pulses.
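To illustrate the DSB-SC idea in general terms, here is a minimal Python sketch. It is not a model of Alta or Cypress: the 100kHz carrier, the 1kHz test tone, and the sample rate are assumed values, and the demodulator is the textbook electrical product detector (re-multiply by the carrier, then average), standing in for the acoustic-domain demodulation described above, whose details are not covered here. The point is only that the audio is fully recoverable from a signal that contains no carrier term.

```python
import math

FS = 1_000_000       # simulation sample rate in Hz (assumed)
F_AUDIO = 1_000      # audio test tone in Hz (assumed)
F_CARRIER = 100_000  # hypothetical ultrasonic carrier in Hz, not an xMEMS figure
N = 2_000            # 2 ms of signal, i.e. two full cycles of the test tone
WINDOW = 10          # averaging length: one carrier cycle (FS / F_CARRIER samples)


def carrier(n, fc=F_CARRIER, fs=FS):
    """Carrier sample at index n."""
    return math.cos(2 * math.pi * fc * n / fs)


def dsb_sc_modulate(message):
    """DSB-SC: multiply the message by the carrier; no carrier term is added."""
    return [m * carrier(n) for n, m in enumerate(message)]


def product_demodulate(signal, window=WINDOW):
    """Textbook product detector: mix with the carrier again, then apply a
    moving average as a crude low-pass filter to remove the 2x-carrier term."""
    mixed = [2 * s * carrier(n) for n, s in enumerate(signal)]
    return [sum(mixed[i:i + window]) / window
            for i in range(len(mixed) - window + 1)]


message = [math.sin(2 * math.pi * F_AUDIO * n / FS) for n in range(N)]
modulated = dsb_sc_modulate(message)
recovered = product_demodulate(modulated)

# The moving average delays the output by about half a window.
delay = WINDOW // 2
max_error = max(abs(r - message[i + delay]) for i, r in enumerate(recovered))
print(f"max recovery error: {max_error:.4f}")
```

Mixing the modulated signal with the carrier a second time yields the original audio plus a component at twice the carrier frequency, and the averaging step removes the latter. The residual error is small relative to the unit-amplitude tone in this toy setup.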
xMEMS’ Cypress is sampling to tier-1 customers now and was demoed at CES 2024. Alta prototype silicon is available now and will start sampling to customers in July 2024. Cypress and Alta will be combined in an easy-to-integrate System-in-Package (SiP) that acts as a drop-in replacement for typical dynamic drivers found in TWS earbuds. The solution is scheduled to be in mass production in the first half of 2025.
In short, ultrasonic speakers beat out dynamic counterparts in SPL, frequency response, phase response, and THD, and let’s not forget greater design flexibility. They provide superior audio performance for both audiophile-level listening and the most demanding ANC applications. They also improve upon the size of dynamic drivers, and they do it all by creating sound from ultrasonic airflow pulses.
It’s this sound from ultrasound principle that I believe will lead the next revolution in personal audio, in particular for the TWS earbuds many of us use every day.
Author details: Robb Zimmerman, Principal Systems Engineer at xMEMS Labs