计算机工程与应用
COMPUTER ENGINEERING AND APPLICATIONS
2015
Issue 17
pp. 222-227
6 pages
曹龙涛%李如玮%鲍长春%吴水才
语音增强%助听器%噪声估计%二值掩蔽
speech enhancement%hearing aids%noise estimate%binary masking
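Existing speech enhancement algorithms for hearing aids leave a large amount of residual background noise in non-stationary noise environments and also introduce "musical noise", which results in unsatisfactory intelligibility and signal-to-noise ratio of the enhanced speech. To address these problems, a binary masking speech enhancement algorithm based on noise estimation is proposed. The algorithm draws on the theory of human auditory perception, combining the auditory characteristics of the human ear with the working mechanism of the cochlea. First, the Minima-Controlled Recursive Averaging (MCRA) algorithm is used to obtain the estimated noise and an initially enhanced speech signal. Second, the estimated noise and the initially enhanced speech are each filtered by a gammatone filter bank, which can simulate an artificial cochlea model, to obtain their respective time-frequency representations. Third, the auditory masking property of the human ear is used to compute a binary mask for the noisy speech in the time-frequency domain. Finally, the binary mask is used to obtain the enhanced speech. Experimental results show that the algorithm largely removes the "musical noise" introduced by spectral subtraction and that, compared with MCRA-based spectral subtraction, the Speech Intelligibility Index (SII), Perceptual Evaluation of Speech Quality (PESQ) score, and Signal to Noise Ratio (SNR) of the enhanced speech are all improved.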
To reduce the residual background noise and the musical noise produced by existing speech enhancement algorithms for hearing aids, this paper proposes a binary masking speech enhancement algorithm based on noise estimation. The estimated background noise and an initially enhanced speech signal are obtained with the minima-controlled recursive averaging (MCRA) algorithm. The estimated noise and the initially enhanced speech are then processed by a gammatone filter bank and an inner hair cell model to obtain their time-frequency representations. A binary mask for the noisy speech is computed in the time-frequency domain by exploiting human auditory masking, and the mask is used to synthesize the enhanced speech. Experimental results show that, compared with the MCRA algorithm, the proposed algorithm improves the Speech Intelligibility Index (SII), Perceptual Evaluation of Speech Quality (PESQ), and SNR.
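
The binary masking step can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the time-frequency energies of the initially enhanced speech and of the estimated noise are already available as arrays (channels by frames), and it assumes a simple local-SNR threshold as the masking rule, which the abstract does not specify. The names `binary_mask`, `apply_mask`, and `lc_db` are hypothetical.

```python
import numpy as np

def binary_mask(speech_energy, noise_energy, lc_db=0.0):
    """Return 1.0 for time-frequency units whose local SNR (dB) exceeds lc_db, else 0.0.

    The threshold rule and its default of 0 dB are assumptions for illustration.
    """
    eps = np.finfo(float).eps  # guards against division by zero and log of zero
    local_snr_db = 10.0 * np.log10((speech_energy + eps) / (noise_energy + eps))
    return (local_snr_db > lc_db).astype(float)

def apply_mask(noisy_tf, mask):
    """Keep the noisy time-frequency units selected by the mask and zero the rest."""
    return noisy_tf * mask

# Toy data: 2 filter-bank channels x 3 frames (made-up values, not from the paper).
speech_energy = np.array([[4.0, 0.1, 2.0],
                          [0.5, 3.0, 0.2]])
noise_energy = np.ones((2, 3))
noisy_tf = np.array([[2.0, 1.0, 1.5],
                     [1.0, 2.0, 1.0]])

mask = binary_mask(speech_energy, noise_energy)
enhanced_tf = apply_mask(noisy_tf, mask)
print(mask)
print(enhanced_tf)
```

In a complete system the two energy arrays would come from passing the MCRA outputs through the gammatone filter bank, and the masked representation would be resynthesized into a waveform; both stages are omitted here.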