Technical Acoustics (声学技术)
2015
Issue 5
pp. 424-430
7 pages
Zhang Jianwei (张建伟)%Tao Liang (陶亮)%Zhou Jian (周健)%Wang Huabin (王华彬)
noise spectrum estimation%spectral subtraction%time-frequency blocks%minimum statistics (MS)%Short-Time Objective Intelligibility (STOI)%speech intelligibility
Noise spectrum estimation is a key step in single-channel speech enhancement. Most current speech enhancement algorithms aim to improve speech quality, and few target speech intelligibility; in traditional single-channel algorithms, gains in quality often come at the cost of intelligibility. This paper analyzes how several mainstream noise spectrum estimation algorithms affect speech intelligibility. The noise spectrum is estimated under different noise types at SNRs between -9 dB and 3 dB, and spectral subtraction is then applied to denoise the noisy speech. The Short-Time Objective Intelligibility (STOI) values of the speech before and after enhancement are compared across noise types and SNRs. Finally, for each SNR, the proportion of time-frequency blocks in which speech energy exceeds noise energy is compared before and after enhancement in the different noise environments. Experimental results show that, compared with the other noise estimation algorithms, the minimum statistics (MS) algorithm retains more speech-dominated time-frequency blocks and therefore yields higher intelligibility in the denoised speech.
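
The processing chain named in the abstract (noise spectrum estimate, spectral subtraction, proportion of speech-dominated time-frequency blocks) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the `window` and `floor` parameters, and the input layout (power spectrograms as lists of frames, each frame a list of frequency bins) are assumptions made for illustration. The noise estimator is a plain sliding-window minimum, a simplified stand-in that omits the recursive smoothing and bias compensation of the full minimum statistics algorithm. The ratio function assumes speech and noise power are known separately, as they are when noisy speech is synthesized by mixing clean speech with noise.

```python
def min_tracking_noise(noisy_power, window=5):
    """Crude noise estimate: per-bin minimum of noisy power over the last `window` frames.

    Simplified stand-in for minimum statistics (assumption); the full MS
    algorithm additionally smooths the spectrum and compensates the bias.
    """
    estimate = []
    for t in range(len(noisy_power)):
        recent = noisy_power[max(0, t - window + 1): t + 1]
        # zip(*recent) groups the values of each frequency bin across frames
        estimate.append([min(bin_values) for bin_values in zip(*recent)])
    return estimate


def spectral_subtraction(noisy_power, noise_power, floor=0.01):
    """Power spectral subtraction; a spectral floor keeps the result positive."""
    return [
        [max(y - n, floor * y) for y, n in zip(y_frame, n_frame)]
        for y_frame, n_frame in zip(noisy_power, noise_power)
    ]


def speech_dominated_ratio(speech_power, noise_power):
    """Fraction of time-frequency blocks where speech energy exceeds noise energy."""
    total = 0
    dominated = 0
    for s_frame, n_frame in zip(speech_power, noise_power):
        for s, n in zip(s_frame, n_frame):
            total += 1
            dominated += s > n
    return dominated / total if total else 0.0
```

A typical use would compute `min_tracking_noise` on the noisy power spectrogram, pass the result to `spectral_subtraction`, and call `speech_dominated_ratio` on the speech and noise components before and after enhancement to compare how many speech-dominated blocks survive.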