Development of mpAB, a solution that automatically separates the user's voice from noise and removes the noise without separate tuning, even when microphone placement is freely changed.
– Applicable to various fields such as automobiles, robots, smart homes, kiosks, and home IoT.
– Provision of the 'Clean Ear' app for the hearing impaired
– Widely used in everyday life, including barrier-free kiosks, video conferencing, and meeting minutes.
Sometimes, when I give commands to my AI speaker, nothing happens. What could be the problem?
Conversations with AI have become commonplace. While AI's voice recognition capabilities have improved, practical applications still face challenges. However good the recognition engine is, real-world environments are full of noise, and the engine is of little use if that noise cannot be controlled.
Speech recognition research tests performance in controlled environments, i.e., clean situations with virtually all noise removed. However, real-world speech environments are quite different. Background noises like the TV in a living room, the murmur of a cafe, the mechanical hum of a factory, and the roar of a car engine are picked up by the microphone, distorting the signal.
One company is tackling the problem of AI voice interfaces failing to function properly in noisy real-life environments. mpWAV focuses on preprocessing technology that creates the conditions in which AI can function properly, rather than on building a "better speech recognition AI."
CEO Park Hyung-min received his Ph.D. in speech signal processing from KAIST and worked as a researcher at the Language Technology Research Institute at Carnegie Mellon University. Since being appointed a professor in the Department of Electrical Engineering at Sogang University in 2007, he has been researching signal processing technologies that overcome speech signal distortion in real-world environments. He successfully developed commercially viable source technology and founded mpWAV.
We met with CEO Park Hyung-min at Sogang University's research lab to learn how voice enhancement, voice recognition, and voice preprocessing technologies can change our lives in complex, noisy environments.

Extract only the desired sound, even in the noisiest and most complex environments.
So, how can mpWAV cleanly capture only the audio you want to hear in a noisy environment?
mpAB, mpWAV's core solution, integrates multi-channel echo cancellation technology (mpAEC, Acoustic Echo Canceller) and beamforming technology (mpBeamforming). mpWAV won the Prime Minister's Award at the 2024 Korea Invention and Patent Exhibition for mpAB and received New Excellent Technology (NET) certification in 2025.
Echo is the phenomenon in which sound played by a device is picked up again by its own microphone. For example, when an AI speaker says, "I'll tell you today's weather," the speaker's own voice is re-recorded by the microphone. Typically, multiple microphones and speakers are used, and the sound emitted by each speaker differs from the echo that reaches each microphone.
Existing multi-channel echo cancellation technology must determine whether the user is speaking. When this determination is wrong, which happens often, the user's voice is removed along with the echo. The system also halts training whenever it detects the user's voice, which extends training time and degrades performance. Multi-channel echo cancellation is particularly difficult because the sounds from multiple speakers are partly identical and partly different, which makes it hard to tell their echoes apart in the microphone signal and drastically reduces performance.
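To make the conventional approach concrete, here is a minimal single-channel sketch of a normalized LMS (NLMS) echo canceller. This is a standard textbook technique, not mpWAV's method. The function name, the four-tap echo_path, and the user_talking flag are assumptions made up for illustration. The adaptive filter learns how the speaker signal reaches the microphone and subtracts its prediction. The user_talking flag shows where a conventional system would pause learning while the user speaks.

```python
import random

def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-8, user_talking=None):
    """Subtract an adaptively estimated echo of far_end from mic (NLMS)."""
    w = [0.0] * taps      # current estimate of the echo path
    buf = [0.0] * taps    # most recent far-end samples, newest first
    residual = []
    for n, (x, d) in enumerate(zip(far_end, mic)):
        buf = [x] + buf[:-1]
        predicted_echo = sum(wi * bi for wi, bi in zip(w, buf))
        e = d - predicted_echo            # what is left after echo removal
        residual.append(e)
        # Conventional systems skip the update while the user is speaking,
        # so a wrong speech/no-speech decision hurts the estimate.
        if user_talking is None or not user_talking[n]:
            norm = sum(b * b for b in buf) + eps
            w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
    return residual, w

# Hypothetical test signal: white noise played by the loudspeaker.
random.seed(0)
far_end = [random.uniform(-1.0, 1.0) for _ in range(4000)]
echo_path = [0.6, 0.3, -0.2, 0.1]   # assumed room response, illustration only
mic = [
    sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
    for n in range(len(far_end))
]

residual, w = nlms_echo_cancel(far_end, mic)
print([round(v, 3) for v in w[:4]])   # learned taps should approach echo_path
```

In this idealized case there is no user speech and no noise, so the filter can adapt on every sample. The difficulty described above arises once the user's voice is mixed into the microphone signal and the update has to be gated.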
mpWAV's multi-channel echo cancellation technology can effectively identify and remove the complex relationships among multi-channel echoes. It delivers fast, high-quality echo removal without pausing training, regardless of whether the user is speaking.
mpWAV's beamforming technology automatically optimizes sound based solely on the signal, without requiring pre-configured microphone positions. Beamforming combines signals from multiple microphones to amplify sounds from specific directions and weaken sounds from other directions.
Conventional beamforming technology required the precise location of every microphone to be known and entered in advance. For example, microphone 1 had to be registered at the "10cm location," microphone 2 at the "15cm location," and so on. This is because the way the signals are combined, and the weight given to each microphone signal, can only be determined once the microphone locations are known.
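The dependence on microphone positions can be illustrated with a small delay-and-sum beamformer, the simplest conventional design. This is a sketch under stated assumptions, not mpWAV's technology. It assumes a linear array, a far-field source, a 16 kHz sample rate, and microphones at 0 cm, 10 cm, and 15 cm. The function names are made up for this example.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second, approximate value in air

def steering_delays(mic_positions_m, angle_deg):
    """Arrival delay (seconds) at each microphone of a linear array for a
    far-field source at angle_deg (0 = broadside, 90 = along the array)."""
    s = math.sin(math.radians(angle_deg))
    return [p * s / SPEED_OF_SOUND for p in mic_positions_m]

def delay_and_sum(signals, delays_s, sample_rate):
    """Undo each channel's delay (rounded to whole samples) and average."""
    shifts = [round(d * sample_rate) for d in delays_s]
    base = min(shifts)
    shifts = [s - base for s in shifts]
    length = min(len(sig) - s for sig, s in zip(signals, shifts))
    return [
        sum(sig[n + s] for sig, s in zip(signals, shifts)) / len(signals)
        for n in range(length)
    ]

def mean_power(x):
    return sum(v * v for v in x) / len(x)

fs = 16000
positions = [0.0, 0.10, 0.15]            # metres, as in the example above
delays = steering_delays(positions, 90.0)

# Simulate a 1 kHz tone reaching each microphone with its own delay.
source = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(400)]
mics = [[0.0] * round(d * fs) + source[: len(source) - round(d * fs)] for d in delays]

aligned = delay_and_sum(mics, delays, fs)                      # correct positions
wrong = delay_and_sum(mics, steering_delays([0.0, 0.0, 0.0], 90.0), fs)  # positions not updated

print(mean_power(aligned) > mean_power(wrong))
```

With the correct positions the three channels add in phase. With stale positions they partly cancel, which is why this kind of design has to be retuned whenever the microphone layout changes.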
"Even the slightest change in product design requires engineers to re-calibrate all settings, as the microphone position changes. This leads to time-consuming re-tuning with each new product release, increasing costs, and, most importantly, locking manufacturers into the hands of technology suppliers."
mpWAV's beamforming technology automatically optimizes the signal mix by analyzing only the signals coming from the microphones. Its core strength is the ability to pick out only the target sound source in real time, without any additional tuning, even when the microphone placement is freely changed.
mpAB combines these two technologies. mpAB operates based on whether the final output signal resembles the user's voice, making it independent of microphone placement. Even if the position or number of microphones changes because of a product design change, mpAB automatically optimizes based solely on the microphone signals, maintaining voice quality without the burden of retuning.
The real problem is not technology, but the environment.
mpWAV was selected for this year's "Super Gap Startup 1000+" project (DeepS). The company is developing technology that, simply by installing a module, enables a variety of functions entirely on the device, including voice and wake-word recognition, language processing, and speech synthesis. Last September, it was selected for the "AI Startup Accelerator" hosted by SK Telecom.
mpWAV is collaborating with several major corporations and national research institutes. With A Electronics, the solution was applied to home robots and TVs, significantly improving voice command recognition rates in real-world living room environments. With B Automobile, the technology was applied to a store guide robot. Even in environments with multiple speakers and background music, such as showrooms and dealerships, voice recognition functioned stably, enabling natural conversations between customers and robots. With C, a robot equipped with mpWAV's preprocessing solution was demonstrated at an academic conference, where it accurately recognized and responded to human voices.
Every time voice recognition is integrated into devices like guide robots, home robots, and voice ordering systems in kiosks, development teams face the same challenges. Park argues that the real problem isn't the technology, but the environment. Background noise in the store, other people's voices, the robot's own motor, and air conditioning noise—all of these things interfere with AI's voice recognition. Voice recognition AI has already achieved a certain level of performance. The problem lies in the complex acoustic environment of real life.
"Noise removal typically involves distortion of the target voice, which inevitably reduces speech recognition performance. However, mpAB removes noise without distortion, making it readily applicable to any customer speech recognition engine without performance degradation. We offer full implementation support, from software to embedded porting and SoC chip fabrication, enabling us to meet a wide range of customer requirements."
Clean Ear, a hearing assistance app for the hearing impaired
mpWAV also offers the Clean Ear app for people with hearing loss, as well as for anyone who needs clear conversations and meetings in noisy environments.
It's estimated that approximately 2.5 billion people worldwide will have some degree of hearing loss by 2050, with over 700 million of them needing hearing aids or other assistive devices. While the number of people registered as hearing impaired in Korea was 440,000 as of 2024, the actual number of people with hearing loss is estimated to be much higher.
The problem lies with hearing aids. They are extremely expensive. Because of their price and usability, over 90% of hearing-impaired people in Korea don't use them. An even bigger problem is that a hearing aid's primary function is to amplify sound, so background noise is amplified along with speech.
Instead of amplifying voices, Clean Ear removes background noise and makes speech clearer. All you need is your smartphone's microphone and earphones. No additional equipment is required, and it's affordable.
Clean Ear's success has already been proven. It won two Innovation Awards at CES 2024 in the "Digital Health" and "Mobile Device" categories. It also won an "AccessABILITY Award" from USA TODAY's review site Reviewed. Selected for the Seoul Metropolitan Government's Technology Development Support Project for the Underprivileged, it successfully completed demonstrations at two senior welfare centers.
Used in various places where voice is needed
With the rapid development of generative AI, it's clear that voice interfaces will spread across all devices. The market is growing quickly as more commercial technologies reach the level users expect. In the long term, virtually every smart device will have a voice interface.
The voice interface market is projected to grow from approximately $30.2 billion in 2025 to $76.1 billion in 2030, at an average annual growth rate of over 20%. mpWAV's technology is expected to be applied to a wide range of sectors, including automobiles, robots, smart homes, kiosks, and home IoT.
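As a quick check of the growth figure, the compound annual growth rate implied by the two market sizes can be computed directly. This is a sketch that assumes five annual compounding periods between 2025 and 2030. The variable names are illustrative.

```python
start_usd_bn = 30.2   # projected market size in 2025
end_usd_bn = 76.1     # projected market size in 2030
years = 2030 - 2025

# Compound annual growth rate: (end / start) ** (1 / years) - 1
cagr = (end_usd_bn / start_usd_bn) ** (1 / years) - 1
print(f"Implied CAGR: {cagr:.1%}")
```

The result comes out a little above 20%, consistent with the "over 20%" stated above.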
These technologies can be used in a variety of ways in everyday life.
Barrier-free kiosk: People with visual impairments can order without assistance from a clerk, even amidst background noise, counter voices, and music.
Recording Meeting Minutes: Even when multiple people are speaking simultaneously in a conference room, each speaker's voice is separated and recognized in real time. You can immediately check that your remarks during the meeting are accurately recorded.
Video Conferencing: Previously, if you were in a cafe, you had to turn off your microphone because of the background noise, but with the mpWAV solution, the cafe noise is removed and only the voice is transmitted.

"I hope that our technology can make the world a better place."
When asked about mpWAV's ultimate goal, CEO Park Hyung-min responded as follows:
mpWAV's slogan is "Masterpiece Wave for Humanity." It signifies the company's commitment to improving the quality of life, realizing social value through voice interface solutions, and spreading technologies that connect people with people and people with technology.
While voice AI recognition rates are improving, real life is still filled with noise. If people could hear only the voices they need, even in noisy environments, everyone would be able to communicate comfortably. mpWAV is creating that world.