Delivering a post-microphone audio chain that draws just 25µA when always listening and collecting preroll data, Aspinity’s front-to-back system combines the latest generation of its Reconfigurable Analog Modular Processor (RAMP) chip with two of STMicroelectronics’ microcontrollers (MCUs). According to the company, the solution opens the way for a new wave of always-on, high-accuracy voice-first devices with extended battery lifetimes.
“From smart speakers and hearables to voice-controlled TV remotes, consumers love using voice to interact with the always-on smart electronic products in their lives,” said Tom Doyle, founder and CEO, Aspinity. “But there’s a serious problem facing system designers today: Legacy system architectures that digitize all the sound data up front are notoriously inefficient, clogging the cloud and wasting battery life on irrelevant data. We designed our RAMP IC to address this power challenge right at the root by enabling a more intelligent, data-efficient, system-level edge architecture.”
Unlike digital-only solutions built on low-power MCUs or DSPs, the company's RAMP IC is a neuromorphic analogue processing chip that designers can drop into the front end of a typical wake word-driven system. According to Aspinity, doing so immediately improves the system’s efficiency by focusing downstream processing on the data that matters.
By detecting voice at the earliest point in the signal chain — when the data is still analogue — the RAMP IC eliminates the digitization and analysis of irrelevant non-voice data and reduces system power by up to 10x.
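The arithmetic behind an analyze-first architecture can be sketched as a simple duty-cycle average. The snippet below is a minimal illustration, not Aspinity's model: only the 25µA always-listening figure comes from the announcement, while the awake-chain current, the fraction of time voice is present, and the function name `average_current_ua` are hypothetical placeholders chosen for the example.

```python
def average_current_ua(voice_fraction, listen_ua, awake_ua):
    """Time-averaged current (in microamps) of a gated audio chain.

    The chain sits in its low-power listening state until voice is detected,
    and runs the full digitize-and-analyze path only for the fraction of
    time that voice is present.
    """
    if not 0.0 <= voice_fraction <= 1.0:
        raise ValueError("voice_fraction must be between 0 and 1")
    return (1.0 - voice_fraction) * listen_ua + voice_fraction * awake_ua


LISTEN_UA = 25.0       # always-listening draw quoted in the announcement
AWAKE_UA = 1000.0      # hypothetical draw of the fully awake digital chain
VOICE_FRACTION = 0.10  # hypothetical share of time that voice is present

# A digitize-everything design keeps the full chain awake all the time.
always_on_ua = AWAKE_UA
gated_ua = average_current_ua(VOICE_FRACTION, LISTEN_UA, AWAKE_UA)

print(f"always-on chain: {always_on_ua:.1f} uA")
print(f"analyze-first chain: {gated_ua:.1f} uA")
print(f"reduction factor: {always_on_ua / gated_ua:.2f}x")
```

Under this model the saving depends mostly on how rarely voice occurs and how much the awake chain draws, so the actual factor for any given product would have to come from measured figures rather than these placeholders.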
Aspinity has demonstrated its low-power voice-activity detection (VAD) plus preroll solution in combination with two ST MCUs. Pairing its RAMP IC with the ultra-high-performance STM32H7 and the power-efficient STM32L4, for connected and non-connected applications, Aspinity says it can provide “analyze-first” always-listening systems that deliver an order-of-magnitude improvement in battery life for users while maintaining wake word accuracy.
Testing these VAD systems with both the Amazon and the Sensory TrulyHandsfree wake word engines (WWE), the company demonstrated that its preroll compression and reconstruction algorithm does not affect WWE accuracy.
“Voice has become the preferred human-machine interface for device interaction, and we are seeing the voice revolution extend to low-power devices like wearables, hearables, TV remote controls and more,” said Todd Mozer, CEO, Sensory. “Combining Aspinity’s analogue voice activity detection and preroll collection with Sensory’s TrulyHandsfree voice control algorithms brings a new level of low-power, on-device voice solutions, focusing system processing only on voice. This will allow device designers to deliver secure, high-performance and efficient battery-operated voice-first solutions that never send wake words to the cloud.”