北京师范大学学报(自然科学版)
Journal of Beijing Normal University (Natural Science)
2015
Issue 5
pp. 469-474 (6 pages in total)
speech recognition; wavelet analysis; MFCC; MFCC_Wavelet
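Speech recognizers built on HMMs and MFCC features discriminate poorly among the initial-consonant phonemes of Mandarin, yet consonant recognition is particularly important in recognition-based computer-assisted pronunciation teaching systems. Because initial-consonant phonemes that share a place of articulation but differ in manner of articulation change rapidly and carry more high-frequency information, this paper introduces wavelet analysis into the extraction of Mel-frequency cepstral coefficients (MFCC) to raise the time-domain resolution of the high-frequency part of the signal, and proposes a wavelet-based Mel cepstral feature, MFCC_Wavelet. Using a speech recognizer that combines HMMs with MFCC_Wavelet features computed with different framing for the high and low bands, the paper tests how well MFCC and MFCC_Wavelet discriminate among four classes of pronunciation. The experimental results show that, across four types of pronunciation error (same place but different manner of articulation, plosive versus non-plosive, aspirated versus unaspirated, and fricative versus non-fricative), MFCC_Wavelet performs better overall than MFCC.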
Most consonants in Mandarin Chinese change rapidly over time and contain more high-frequency content, so they call for a shorter analysis frame in automatic speech recognition (ASR). Vowels, which are comparatively stable and concentrated at lower frequencies, are better served by a longer frame. This paper introduces a new speech feature, MFCC_Wavelet, which combines wavelet analysis with Mel-frequency cepstral coefficients (MFCC). Like wavelet analysis, it offers higher time resolution at high frequencies, while retaining the Mel-scale frequency resolution of MFCC, and so meets the requirements of both consonant and vowel recognition. In experiments, MFCC_Wavelet outperformed MFCC at differentiating plosive/non-plosive, fricative/non-fricative, and aspirated/unaspirated phonemes in Mandarin Chinese recognition. These distinctions are especially important in ASR-based computer-assisted pronunciation teaching (CAPT).
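As a rough illustration of the idea in the abstract (shorter frames for the high band, longer frames for the low band), the following Python sketch splits a signal with a one-level Haar wavelet and frames each band differently. The Haar wavelet, the single decomposition level, the 16 kHz sampling rate, the frame and hop lengths, and the function names are all assumptions made for illustration; the abstract does not state which wavelet, depth, or frame sizes were used. The Mel filterbank and cepstral (DCT) steps of MFCC are omitted, and a per-frame log energy stands in for them.

```python
import numpy as np

def haar_split(x):
    """One-level Haar wavelet split into (low band, high band).

    Assumption: Haar is used only because it is the simplest wavelet;
    the source does not say which wavelet the authors chose.
    """
    x = np.asarray(x, dtype=float)
    x = x[: len(x) - len(x) % 2]          # drop a trailing sample if length is odd
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / np.sqrt(2.0)     # approximation coefficients
    high = (even - odd) / np.sqrt(2.0)    # detail coefficients
    return low, high

def frame(x, length, hop):
    """Slice x into overlapping frames, dropping any incomplete tail."""
    n = max(0, 1 + (len(x) - length) // hop)
    idx = np.arange(length)[None, :] + hop * np.arange(n)[:, None]
    return x[idx]

def log_energy(frames):
    """Hamming-windowed log energy per frame (placeholder for Mel + DCT)."""
    window = np.hamming(frames.shape[1])
    return np.log(np.sum((frames * window) ** 2, axis=1) + 1e-12)

# One second of noise at an assumed 16 kHz; each band then runs at 8 kHz.
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)
low, high = haar_split(signal)

# Hypothetical frame sizes: 32 ms frames for the low band, 8 ms for the high band.
low_frames = frame(low, length=256, hop=128)
high_frames = frame(high, length=64, hop=32)
print(low_frames.shape, high_frames.shape)
```

With these assumed settings the high band yields roughly four times as many frames per second as the low band, which is the kind of finer time resolution the abstract attributes to MFCC_Wavelet for rapidly changing consonants.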