Computer Engineering (计算机工程)
2013
Issue 11
pp. 197-199, 204
4 pages
speaker recognition%vocal mechanism%human ear perceptual characteristics%Hilbert-Huang Transform (HHT) cepstrum coefficient%perceptual linear prediction cepstrum coefficient%Relative Spectra (RASTA) filtering
Fusing Mel Frequency Cepstral Coefficients (MFCC) with Linear Prediction Cepstrum Coefficients (LPCC) captures only the static characteristics of speech, and LPCC describes the local low-frequency features of speech poorly. To address this, the paper proposes fusing Hilbert-Huang Transform (HHT) cepstrum coefficients with Relative Spectra-Perceptual Linear Prediction Cepstrum Coefficients (RASTA-PLPCC), yielding a speaker recognition algorithm that reflects both the vocal mechanism and the perceptual characteristics of the human ear. HHT cepstrum coefficients reflect the vocal mechanism, capture the dynamic characteristics of speech, and describe the local low-frequency features of the signal better, which compensates for the weakness of LPCC. PLPCC reflects the perceptual characteristics of the human ear and achieves better recognition performance than MFCC. The two features are combined using three fusion algorithms, and the fused features are fed to a Gaussian mixture model for speaker recognition. Simulation results show that the proposed fusion algorithm raises the recognition rate by 8.0% compared with the existing fusion of MFCC and LPCC.
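The fuse-then-score pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the three fusion algorithms, so frame-wise weighted concatenation is used here purely as an assumed stand-in. Feature extraction (HHT cepstrum, RASTA-PLPCC) and GMM training are omitted: the inputs are placeholder feature frames, and the per-speaker GMM parameters are assumed to be already trained. The names `fuse_frames`, `gmm_log_likelihood`, and `identify_speaker` are hypothetical.

```python
import math


def fuse_frames(hht_frames, plp_frames, w_hht=1.0, w_plp=1.0):
    """Frame-wise weighted concatenation of two feature streams.

    Assumption: this is one plausible fusion rule, not necessarily
    any of the three used in the paper.
    """
    if len(hht_frames) != len(plp_frames):
        raise ValueError("both streams must have the same number of frames")
    return [[w_hht * x for x in h] + [w_plp * x for x in p]
            for h, p in zip(hht_frames, plp_frames)]


def gmm_log_likelihood(frame, weights, means, variances):
    """Log-density of one frame under a diagonal-covariance GMM."""
    comp_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for x, m, v in zip(frame, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        comp_logs.append(log_p)
    peak = max(comp_logs)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(c - peak) for c in comp_logs))


def identify_speaker(frames, speaker_models):
    """Pick the speaker whose GMM gives the highest average log-likelihood.

    speaker_models maps a speaker name to (weights, means, variances).
    """
    scores = {
        name: sum(gmm_log_likelihood(f, *model) for f in frames) / len(frames)
        for name, model in speaker_models.items()
    }
    return max(scores, key=scores.get), scores
```

In use, the fused frames of a test utterance would be passed to `identify_speaker` together with one model per enrolled speaker. Diagonal covariances are chosen here only to keep the sketch short; the abstract does not state which covariance structure the paper uses.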