2pNS8 – Noise Dependent Coherence-Super Gaussian based Dual Microphone Speech Enhancement for Hearing Aid Application using Smartphone
Nikhil Shankar – nxs162330@utdallas.edu
Gautam Shreedhar Bhat – gxs160730@utdallas.edu
Chandan K A Reddy – cxk131330@utdallas.edu
Dr. Issa M S Panahi – imp015000@utdallas.edu
Statistical Signal Processing Research Laboratory (SSPRL)
The University of Texas at Dallas
800 W Campbell Road,
Richardson, TX – 75080, USA
Popular Version of Paper 2pNS8, “Noise dependent coherence-super Gaussian based dual microphone speech enhancement for hearing aid application using smartphone” will be presented Tuesday afternoon, May 8, 2018, 3:25 – 3:40 PM, NICOLLET D3
175th ASA Meeting, Minneapolis
Records from the National Institute on Deafness and Other Communication Disorders (NIDCD) indicate that nearly 15% of adults in the United States aged 18 and over (37 million) report some kind of hearing loss. Worldwide, 360 million people suffer from hearing loss.
Over the past decade, researchers have developed many feasible solutions for the hearing impaired in the form of Hearing Aid Devices (HADs) and Cochlear Implants (CIs). However, the performance of HADs degrades in the presence of different types of background noise, and, because of design constraints, HADs lack the computational power to run the required signal processing algorithms. Lately, HAD manufacturers have been using a pen or a necklace as an external microphone that captures speech and transmits the signal and data, by wire or wirelessly, to the HADs. The expense of these auxiliary devices is a limitation. An alternative solution is to use a smartphone, which can capture the noisy speech data using its two microphones, perform complex computations using the Speech Enhancement algorithm, and transmit the enhanced speech to the HADs.
In this work, the coherence between speech and noise signals [1] is used to obtain a Speech Enhancement (SE) gain function, which is combined with a Super Gaussian Joint Maximum a Posteriori (SGJMAP) [2,3] single-microphone SE gain function. The weighted combination of these two gain functions strikes a balance between noise suppression and speech distortion. The idea behind the coherence method is that the speech captured by the two microphones is correlated, while the noise is uncorrelated with the speech. The block diagram of the proposed method is shown in Figure 1.
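The idea of weighting a coherence-based gain against a single-microphone gain can be illustrated with a minimal Python sketch. It is not the authors' implementation: the magnitude-squared coherence is used directly as the dual-microphone gain, a simple spectral-subtraction gain stands in for SGJMAP, the noise power is assumed known, and the noise is assumed independent at the two microphones. The function names, frame length, hop size, smoothing factor `alpha`, and `weight` are all hypothetical choices made for illustration.

```python
import numpy as np

def stft_frames(x, frame_len=256, hop=128):
    """Split x into Hann-windowed frames and return their one-sided FFTs."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)  # shape: (n_frames, frame_len // 2 + 1)

def coherence_gain(X1, X2, alpha=0.8, eps=1e-12):
    """Magnitude-squared coherence per frame and bin, used here as a gain.

    Auto- and cross-spectra are smoothed recursively over frames
    (alpha is an assumed smoothing factor).
    """
    p11 = np.zeros(X1.shape[1])
    p22 = np.zeros(X1.shape[1])
    p12 = np.zeros(X1.shape[1], dtype=complex)
    msc = np.empty(X1.shape)
    for t in range(X1.shape[0]):
        p11 = alpha * p11 + (1 - alpha) * np.abs(X1[t]) ** 2
        p22 = alpha * p22 + (1 - alpha) * np.abs(X2[t]) ** 2
        p12 = alpha * p12 + (1 - alpha) * X1[t] * np.conj(X2[t])
        msc[t] = np.abs(p12) ** 2 / (p11 * p22 + eps)
    return np.clip(msc, 0.0, 1.0)

def single_channel_gain(X, noise_psd, eps=1e-12):
    """Spectral-subtraction style gain; a stand-in for SGJMAP, not SGJMAP itself."""
    return np.maximum(1.0 - noise_psd / (np.abs(X) ** 2 + eps), 0.0)

def combine_gains(g_coherence, g_single, weight=0.5):
    """Weighted combination of the two gains (weight is an assumed parameter)."""
    return weight * g_coherence + (1.0 - weight) * g_single

# Synthetic demo: a 1 kHz tone plays the role of speech (identical at both
# microphones); each microphone adds its own independent white noise.
fs = 16000
t = np.arange(fs) / fs
rng = np.random.default_rng(0)
speech = np.sin(2 * np.pi * 1000 * t)
sigma = 0.5
mic1 = speech + sigma * rng.standard_normal(fs)
mic2 = speech + sigma * rng.standard_normal(fs)

X1 = stft_frames(mic1)
X2 = stft_frames(mic2)
msc = coherence_gain(X1, X2)

# Noise power per bin for white noise seen through the Hann window (assumed known).
noise_psd = sigma ** 2 * np.sum(np.hanning(256) ** 2)
g_single = single_channel_gain(X1, noise_psd)

combined = combine_gains(msc, g_single, weight=0.5)
enhanced = combined * X1  # enhanced spectrum of microphone 1
```

In this toy setup the coherence stays close to 1 in the bin that contains the tone (bin 16 for these settings) and is much lower in bins that contain only noise, so the combined gain passes the correlated component and attenuates the rest. Moving `weight` toward 1 relies more on the dual-microphone coherence, while moving it toward 0 relies more on the single-microphone gain.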
For the objective measure of speech quality, we use Perceptual Evaluation of Speech Quality (PESQ), and the Coherence Speech Intelligibility Index (CSII) is used to measure speech intelligibility. PESQ ranges between 0.5 and 4, with 4 indicating high speech quality. CSII ranges between 0 and 1, with 1 indicating high intelligibility. Figure 2 shows PESQ and CSII versus SNR for two noise types and compares the performance of the proposed SE method with the conventional Coherence and LogMMSE SE methods.
Along with the objective measures, we perform Mean Opinion Score (MOS) tests on 20 normal-hearing subjects, both male and female. The subjective test results are shown in Figure 3, which illustrates the effectiveness of the proposed method in various types of background noise.
Please refer to our lab website, https://www.utdallas.edu/ssprl/hearing-aid-project/, for video demos. Sample audio files are attached below.
Audio samples: Noisy | Enhanced
Key References:
[1] N. Yousefian and P. Loizou, “A Dual-Microphone Speech Enhancement Algorithm Based on the Coherence Function,” IEEE Trans. Audio, Speech, and Lang. Processing, vol. 20, no. 2, pp. 599-609, Feb. 2012.
[2] T. Lotter and P. Vary, “Speech Enhancement by MAP Spectral Amplitude Estimation Using a Super-Gaussian Speech Model,” EURASIP Journal on Applied Signal Processing, pp. 1110-1126, 2005.
[3] C. Karadagur Ananda Reddy, N. Shankar, G. Shreedhar Bhat, R. Charan and I. Panahi, “An Individualized Super-Gaussian Single Microphone Speech Enhancement for Hearing Aid Users With Smartphone as an Assistive Device,” IEEE Signal Processing Letters, vol. 24, no. 11, pp. 1601-1605, Nov. 2017.
*This work was supported by the National Institute on Deafness and Other Communication Disorders (NIDCD) of the National Institutes of Health (NIH) under grant number 5R01DC015430-02. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. The authors are with the Statistical Signal Processing Research Laboratory (SSPRL), Department of Electrical and Computer Engineering, The University of Texas at Dallas.