Voice recognition features are part of most smartphones, but they have shortcomings. Sometimes the feature won’t activate, especially when a person relies solely on a vocal cue to turn it on instead of using a command on the smartphone keyboard. That often happens because the surrounding environment is too loud for the microphone to detect the request.
In other cases, the phone’s microphone mistakenly interprets a person’s voice as a command at potentially embarrassing times, such as when they’re in a meeting. Problems can also occur if an individual speaks too softly in a setting with many other, more prominent sounds.
The microphones inside devices with voice recognition technology typically convert variations in air pressure (in other words, sounds) into electrical signals, which are then analyzed.
However, sounds other than a person’s voice frequently affect those microphones, causing undesirable performance. The microphones recognize voices via changes in air vibration, but mechanical resonance and the damping effect decrease the sensitivity of a microphone.
A research team from Pohang University of Science & Technology came up with a technology designed to overcome those weaknesses in current voice recognition options. It could lead to better accuracy even when people use voice recognition in potentially noisy areas, like train stations or shopping malls.
Voice recognition through neck skin vibrations
If a person puts their hand against their throat while speaking, it’s easy to feel the vibrations associated with the voice. The researchers took that into account while developing their voice recognition sensor. It’s a wearable device that recognizes a person’s voice according to how their neck skin vibrates. That approach means ambient noise or the volume of a person’s speech should not make the voice harder to decipher.
The scientists determined that sound pressure is proportional to the acceleration of the neck skin’s vibration at certain sound levels, and that they could use that knowledge to create a sensor that quantitatively measures the voice. They made a device composed of a slim polymer film, plus a diaphragm featuring tiny holes.
The film attaches to the skin and is flexible enough to adhere to the natural curves in a person’s neck. Then, the diaphragm moves up and down as the individual speaks.
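To make the proportionality concrete, here is a minimal sketch. It assumes the simplest possible model, pressure = k × acceleration, where the constant k would have to come from calibrating a real sensor. The function names and the calibration numbers are invented for illustration and are not taken from the researchers’ paper.

```python
def fit_proportionality(accelerations, pressures):
    """Least-squares slope k for the through-origin model: pressure = k * acceleration."""
    numerator = sum(a * p for a, p in zip(accelerations, pressures))
    denominator = sum(a * a for a in accelerations)
    if denominator == 0:
        raise ValueError("need at least one non-zero acceleration")
    return numerator / denominator


def estimate_pressure(acceleration, k):
    """Convert a skin-acceleration reading into an estimated sound pressure."""
    return k * acceleration


# Made-up calibration pairs (acceleration in m/s^2, pressure in Pa), NOT measured data.
calibration_accel = [0.1, 0.2, 0.4]
calibration_pressure = [0.05, 0.10, 0.20]

k = fit_proportionality(calibration_accel, calibration_pressure)
print(estimate_pressure(0.3, k))
```

Because the model is a straight line through the origin, one calibration constant is enough to turn any acceleration reading in the valid range into a pressure estimate.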
Potential use cases for the sensor
In the full-text paper describing the team’s work, the authors detail a real-world test in which a person wearing the voice recognition sensor attempted to get through a locked door by speaking a voice command. The researchers published a video demonstrating how the voice recognition sensor could either allow or deny access to a restricted area.
They also showed the sensor continues to function as expected in spite of things that could distort a person’s voice, such as a mask. Those results are promising for using the technology for security-related voice recognition purposes. It’s also possible that the sensor could work well for a disabled person who wears medical equipment over their mouth to help them breathe.
The researchers clarified it was possible to set certain security levels for particular commands. For example, the sensor could respond to anyone in a group of people who says a “wake up” command to activate the sensor, but only to the one person who utters a personalized phrase to enter an area.
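One way to picture those per-command security levels is as a small lookup table. The sketch below is an assumption-laden illustration, not the researchers’ implementation: it presumes some upstream step has already identified the speaker, and the names `POLICIES`, `GROUP`, `is_authorized`, the speaker IDs, and the phrase “open sesame” are all hypothetical.

```python
# Hypothetical enrolled group; a real system would get speaker IDs from the sensor pipeline.
GROUP = frozenset({"alice", "bob", "carol"})

# Each phrase maps to the speakers allowed to use it.
# None means "any enrolled member of the group" (the low-security level).
POLICIES = {
    "wake up": None,
    "open sesame": frozenset({"alice"}),  # personalized phrase, one person only
}


def is_authorized(speaker_id, phrase, policies=POLICIES, group=GROUP):
    """Return True if this speaker may trigger this phrase."""
    normalized = phrase.strip().lower()
    if normalized not in policies:
        return False  # unknown command
    if speaker_id not in group:
        return False  # speaker is not enrolled at all
    allowed = policies[normalized]
    return allowed is None or speaker_id in allowed


print(is_authorized("bob", "Wake up"))
print(is_authorized("bob", "open sesame"))
```

Keeping the policy in a table, separate from the check itself, means a new phrase or a stricter level can be added without touching the logic.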
“Partnering with nonprofits and volunteers, Project Euphonia is a @GoogleAI research effort to help people with speech impairments communicate faster and gain independence → https://t.co/JAzC1aMNZg #io19” (Google, @Google, on Twitter, May 7, 2019)
Google is already working on something called Project Euphonia, which seeks to improve the company’s voice recognition technology for people with speech impediments. Perhaps the combination of Google’s efforts and the innovation of this sensor could break new ground in the assistive technology sector.
The sensor could also bring convenience to health care workers who want to dictate patient notes in the perpetually busy environment of a hospital. Another feasible use case is a manager who wants to record what they say while training a new hire, knowing that an audio record will help ensure they cover everything the person needs to know.
Exciting possibilities
It’s evident the worthwhile ways to use this sensor span far beyond the few mentioned here. People rely on voice recognition technology when composing emails, chatting with friends through messaging services, and more. And, regardless of why individuals turn to voice recognition tools, better accuracy that resists compromise from ambient noises could benefit everyone.
This article has been provided by a guest contributor. Caleb Danziger writes about science and technology on his blog, The Byte Beat. You are invited to visit the blog and read more posts from Caleb over there.
Photo credit: The feature image “the voice of RI” is by hnt6581.
Source: Pohang University of Science & Technology (Phys.org) / Siyoung Lee, Junsoo Kim, Inyeol Yun, Geun Yeol Bae, Daegun Kim, Sangsik Park, Il-Min Yi, Wonkyu Moon, Yoonyoung Chung, Kilwon Cho (Springer Nature) / Ruth Umoh (Forbes)