计算机工程与应用
COMPUTER ENGINEERING AND APPLICATIONS
2009
Issue 22
pp. 194-196, 205
4 pages in total
Gaussian Mixture Model (GMM); pathological voice; Mel Frequency Cepstrum Coefficient (MFCC); wavelet transform
By analyzing the mechanism of voice production and the frequency-domain differences between pathological and normal voices, the paper uses the wavelet transform to decompose the signal and highlight the characteristics of pathological voices. It proposes a feature vector set, the Entropy Coefficient based on De-noise, Decomposition of Multi-scale Analysis (ECDDMA), obtained from wavelet de-noising and decomposition under multi-scale analysis, and compares it with the Mel Frequency Cepstrum Coefficient (MFCC), a classic feature in speech recognition. Both feature sets are used with a Gaussian Mixture Model (GMM) to recognize 242 normal and 234 pathological voice samples. The results show that the ECDDMA coefficients characterize the difference between normal and pathological voices more accurately than the traditional MFCC, which models the nonlinear characteristics of human hearing, and its dynamic features. This helps raise the recognition rates for pathological and normal voices at the same time.
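The abstract does not give the details of ECDDMA, so the following is only a minimal sketch of the general idea: decompose a frame with a wavelet, de-noise the detail bands, take one entropy value per band as the feature vector, and score that vector under a GMM. Everything specific here is an assumption for illustration and not the paper's method: the Haar wavelet, soft thresholding, Shannon entropy of normalized coefficient energy, the number of levels, the diagonal covariance, and all function names.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t (assumed de-noising rule)."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]

def shannon_entropy(coeffs):
    """Shannon entropy (in nats) of the normalized energy within one sub-band."""
    energy = [c * c for c in coeffs]
    total = sum(energy)
    if total == 0.0:
        return 0.0
    return -sum((e / total) * math.log(e / total) for e in energy if e > 0.0)

def entropy_features(frame, levels=3, threshold=0.0):
    """One entropy per detail band, plus one for the final approximation band.

    The frame length should be divisible by 2**levels; otherwise trailing
    samples are dropped at the level where the length becomes odd.
    """
    features = []
    approx = list(frame)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        features.append(shannon_entropy(soft_threshold(detail, threshold)))
    features.append(shannon_entropy(approx))
    return features

def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of feature vector x under a diagonal-covariance GMM."""
    logs = []
    for w, mu, var in zip(weights, means, variances):
        lp = math.log(w)
        for xi, m, v in zip(x, mu, var):
            lp += -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        logs.append(lp)
    peak = max(logs)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(l - peak) for l in logs))

# Usage with a synthetic frame (not real voice data).
frame = [math.sin(0.3 * n) for n in range(64)]
feats = entropy_features(frame, levels=3, threshold=0.01)
```

In a two-class setup of this kind, one GMM would be trained per class (for example with EM, which is not shown here), and a sample would be labeled normal or pathological according to which model gives the higher log-likelihood for its feature vector.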