Voice recognition features are part of most smartphones, but they have shortcomings. Sometimes the feature won’t activate, especially when a person relies solely on a vocal cue to turn it on instead of a smartphone keyboard command. That often happens because the surrounding environment is too loud for the microphone to detect the request.
In other cases, the phone’s microphone wrongly interprets a person’s voice as a command at potentially embarrassing times, such as during a meeting. Problems can also occur if an individual speaks too softly in a setting with many other, more prominent sounds.
The microphones inside devices with voice recognition technology typically convert variations in pressure — sounds — into electrical signals that get analyzed.
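Professor Kilwon Cho and doctoral candidate Siyoung Lee of the Department of Chemical Engineering, together with Professor Yoonyoung Chung's team in the Department of Electrical Engineering, have succeeded in developing a "skin-attachable, high-performance flexible vibration sensor" for voice recognition. It is far more sensitive than conventional microphones and, because it attaches to the neck, is unaffected by obstacles such as noise and masks. The sensor is said to detect the voice accurately and without distortion even in noisy environments and when the wearer has on a gas mask that makes the voice almost inaudible. It is expected to find use in a range of fields, including electronic skin that can recognize speech, human-machine interfaces, and wearable devices for vocal-cord health monitoring. #POSTECH #포스텍 #포항공과대학교 #연구성과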
However, sounds other than a person’s voice frequently affect those microphones and degrade their performance. The microphones recognize voices via changes in air vibration, but mechanical resonance and the damping effect decrease a microphone’s sensitivity.
A team of researchers from Pohang University of Science & Technology came up with a technology intended to outperform current voice recognition options. It could lead to more accuracy even when people use voice recognition in potentially noisy areas, like train stations or shopping malls.
Voice recognition through neck skin vibrations
If a person puts their hand against their throat while speaking, it’s easy to feel the vibrations associated with the voice. The researchers took that into account while developing their voice recognition sensor. It’s a wearable device that recognizes a person’s voice according to how their neck skin vibrates. With that approach, neither ambient noise nor the volume of a person’s speech makes the voice harder to decipher.
The scientists determined that sound pressure is proportional to the acceleration of the neck skin’s vibration at certain sound levels, and that they could use that knowledge to create a sensor that quantitatively measures the voice. They made a device composed of a slim polymer film, plus a diaphragm featuring tiny holes.
The film attaches to the skin and is flexible enough to adhere to the natural curves in a person’s neck. Then, the diaphragm moves up and down as the individual speaks.
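The proportional relationship between sound pressure and skin acceleration can be illustrated with a small calculation. The sketch below is not the researchers' method. It only assumes a straight-line relationship through the origin, uses made-up calibration numbers, and the function names fit_sensitivity and estimate_pressure are hypothetical.

```python
def fit_sensitivity(accelerations, pressures):
    """Least-squares slope k for the model pressure = k * acceleration."""
    numerator = sum(a * p for a, p in zip(accelerations, pressures))
    denominator = sum(a * a for a in accelerations)
    if denominator == 0:
        raise ValueError("need at least one non-zero acceleration sample")
    return numerator / denominator


def estimate_pressure(acceleration, k):
    """Convert a measured skin acceleration into an estimated sound pressure."""
    return k * acceleration


# Made-up calibration pairs (arbitrary units), chosen only for illustration.
calibration_accel = [0.5, 1.0, 1.5, 2.0]
calibration_pressure = [0.01, 0.02, 0.03, 0.04]

k = fit_sensitivity(calibration_accel, calibration_pressure)
estimate = estimate_pressure(3.0, k)
```

Once the constant k is known, any new acceleration reading maps directly to a pressure estimate, which is what makes a quantitative measurement possible under this assumption.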
Potential use cases for the sensor
In the full-text paper describing the team’s work, the authors detail a real-world test they carried out that involved using the voice recognition sensor on a person who attempted to get through a locked door by speaking a voice command. The researchers published a video demonstrating how the voice recognition sensor could either allow or deny access to a restricted area.
They also showed the sensor continues to function as expected in spite of things that could distort a person’s voice, such as wearing a mask. Those results are promising for using the technology for security-related voice recognition purposes. It’s also possible that the sensor could work well for a person with a disability who wears medical equipment over their mouth to help them breathe.
The researchers clarified it was possible to set certain security levels for particular commands. For example, the sensor could accept a “wake up” command from a whole group of people to activate it, but grant entry to an area only to the one person who utters a personalized phrase.
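The idea of per-command security levels can be sketched as a simple lookup. This is an illustration only, not the team's implementation. It assumes the speaker has already been identified from the sensor signal, and the speaker names, phrases, and the is_allowed helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceCommand:
    phrase: str
    allowed_speakers: frozenset  # speaker IDs permitted to use this command


# Hypothetical policy: a group may wake the sensor, one person may open the door.
COMMANDS = (
    VoiceCommand("wake up", frozenset({"alice", "bob", "carol"})),
    VoiceCommand("open the lab door", frozenset({"alice"})),
)


def is_allowed(phrase, speaker_id, commands=COMMANDS):
    """Return True only if the phrase is known and the speaker may use it."""
    normalized = phrase.strip().lower()
    for command in commands:
        if command.phrase == normalized:
            return speaker_id in command.allowed_speakers
    return False  # unknown phrases are always rejected
```

Under this policy, is_allowed("wake up", "bob") is accepted, while is_allowed("open the lab door", "bob") is denied because only "alice" is listed for that phrase.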
Partnering with nonprofits and volunteers, Project Euphonia is a @GoogleAI research effort to help people with speech impairments communicate faster and gain independence → https://t.co/JAzC1aMNZg #io19 pic.twitter.com/SBg4lru3RW
— Google (@Google) May 7, 2019
Google is already working on something called Project Euphonia, which seeks to improve the company’s voice recognition technology for people with speech impediments. Perhaps the combination of Google’s efforts and the innovation of this sensor could break new ground in the assistive technology sector.
Alternatively, the sensor could bring convenience to health care workers who want to dictate patient notes in the perpetually busy environment of hospitals. Another feasible use case is a manager recording what they say while training a new hire, so the audio record helps them confirm they covered everything the person needs to know.
It’s evident the worthwhile ways to use this sensor span far beyond the few mentioned here. People rely on voice recognition technology when composing emails, chatting with friends through messaging services, and more. And, regardless of why individuals turn to voice recognition tools, better accuracy that resists compromise from ambient noises could benefit everyone.
This article has been provided by a guest contributor. Caleb Danziger writes about science and technology on his blog, The Byte Beat. You are invited to visit the blog and read more posts from Caleb over there.
Photo credit: The feature image “the voice of RI” is by hnt6581.
Source: Pohang University of Science & Technology (Phys.org) / Siyoung Lee, Junsoo Kim, Inyeol Yun, Geun Yeol Bae, Daegun Kim, Sangsik Park, Il-Min Yi, Wonkyu Moon, Yoonyoung Chung, Kilwon Cho (Springer Nature) / Ruth Umoh (Forbes)