北京生物医学工程
BEIJING BIOMEDICAL ENGINEERING
2015
Issue 4
pp. 361-366, 418
7 pages in total
董睿%李立峰%牛海军%史晚晴%李阳
电子喉%普通话%语音转换%语音增强
electrolarynx%Mandarin%voice conversion%speech enhancement
目的:电子喉是喉切除患者使用最多的语音恢复工具,但是电子喉语音存在发声机械、音调单一、辐射噪声大等缺点,本文拟运用语音转换技术改善电子喉语音的发声效果,提高语音自然度和可懂度。方法选择200句分别以自然发声和电子喉发声的标准普通话日常用语作为训练语料,采用基于混合高斯模型( Gaussian mixed model,GMM)的语音转换方法对电子喉语音进行转换,转换参数为基频轨迹和声道谱参数(0~24阶梅尔倒谱系数),然后对转换后的语音质量进行主客观评价。结果转换语音的高频辐射噪声得到了有效抑制,基频变化出现。主观分析结果显示,转换语音的自然度和可接受度有所提高,但可懂度变化不大。结论使用语音转换技术可以降低电子喉语音的高频辐射噪声,改变声调和韵律信息,提高自然度和可接受度,对改善电子喉语音的听觉质量有较大帮助。
Objective: The electrolarynx (EL) is the most widely used assistive device for restoring speech in laryngectomees. However, EL speech suffers from several severe problems, including a mechanical, monotonous sound and considerable radiation noise. This study applies voice conversion (VC) technology to enhance EL speech, aiming to improve its naturalness and intelligibility. Methods: 200 pairs of everyday Mandarin utterances, each recorded as both normal speech and EL speech, served as training data. A Gaussian mixture model (GMM) based conversion method was applied to the EL speech, with the F0 contour and spectral parameters (0th through 24th Mel-cepstral coefficients) as the converted features. The converted speech was then evaluated both objectively and subjectively. Results: The objective results showed that the VC-based method effectively suppressed the high-frequency radiation noise and produced an F0 contour with variation, closer to that of the target speech. The subjective results indicated that the naturalness and acceptability of Mandarin EL speech improved, while intelligibility did not change significantly after conversion. Conclusions: VC technology can reduce the high-frequency radiation noise of EL speech, supplement tone and prosody information, and improve naturalness and acceptability, which is of considerable help in improving the perceived quality of EL speech.
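The abstract does not spell out how the GMM maps EL features to normal-speech features. A common formulation, assumed here rather than taken from the paper, fits a GMM on joint source/target feature vectors and converts each source frame to the conditional expectation of the target. The sketch below illustrates that mapping for scalar features only; the function names, the dictionary layout, and the parameter values in `toy_gmm` are hypothetical, and a real system would use full 25-dimensional Mel-cepstral vectors with parameters estimated from the paired training utterances.

```python
import math

def gauss_pdf(x, mean, var):
    """Density of a 1-D Gaussian N(mean, var) at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def convert(x, components):
    """Map a source feature x to the expected target feature E[y | x].

    Each component is a dict with keys:
      w    : mixture weight
      mu_x : source mean,      mu_y : target mean
      s_xx : source variance,  s_yx : target/source covariance
    """
    # Posterior probability of each component given the source feature only.
    likes = [c["w"] * gauss_pdf(x, c["mu_x"], c["s_xx"]) for c in components]
    total = sum(likes)
    post = [like / total for like in likes]
    # Posterior-weighted sum of the per-component linear regressions.
    return sum(
        p * (c["mu_y"] + c["s_yx"] / c["s_xx"] * (x - c["mu_x"]))
        for p, c in zip(post, components)
    )

# Hypothetical two-component joint model (values invented for illustration).
toy_gmm = [
    {"w": 0.5, "mu_x": -1.0, "mu_y": 2.0, "s_xx": 1.0, "s_yx": 0.5},
    {"w": 0.5, "mu_x": 1.0, "mu_y": 4.0, "s_xx": 1.0, "s_yx": 0.5},
]
```

Calling `convert(0.0, toy_gmm)` weights both components equally because the input lies midway between the two source means. An input close to one source mean is mapped almost entirely by that component's regression line.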