计算机工程与应用
COMPUTER ENGINEERING AND APPLICATIONS
2014
Issue 23
pp. 21-25
5 pages
speech recognition%stochastic segment model%tone modeling%articulatory feature%hierarchical multilayer perceptron
This paper proposes an improved tone modeling method based on articulatory features and applies it in the one-pass decoding of a stochastic segment model (SSM) for Mandarin speech recognition. Based on the pronunciation characteristics of Mandarin, seven articulatory features that distinguish Chinese vowel and consonant information are defined. With these features as targets, hierarchical multilayer perceptrons compute the posterior probabilities that the speech signal belongs to each of the 35 categories of the articulatory features, and these posteriors are used together with traditional prosodic features for tone modeling. In keeping with how the SSM decodes, the tone model score is computed only for the paths that survive two-level pruning; it is then weighted and added to each path's total score. Experiments on the “863-test” set show that the new articulatory feature set improves tone recognition accuracy by about 3.11% (absolute), and that integrating tone information reduces the character error rate of the SSM from 13.67% to 12.74%. These results demonstrate the feasibility of applying tone information to the stochastic segment model.
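The integration step described in the abstract (score the surviving paths with the tone model, weight that score, and add it to the path total) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract gives neither the weight value nor the exact combination formula, so the log-domain sum, the `Path` class, the `rescore_with_tone` function, and the `tone_weight` parameter are all hypothetical names introduced here, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Path:
    """A decoding hypothesis that survived pruning (hypothetical structure)."""
    chars: str                    # hypothesized character sequence
    base_log_score: float         # path log score before tone information is added
    tone_log_probs: List[float]   # per-syllable tone model log-probabilities


def rescore_with_tone(paths: List[Path], tone_weight: float) -> List[Tuple[float, Path]]:
    """Add the weighted tone log score to each surviving path and re-rank.

    Assumption: scores are combined additively in the log domain, i.e.
    total = base_log_score + tone_weight * sum(tone_log_probs).
    """
    rescored = []
    for path in paths:
        tone_score = sum(path.tone_log_probs)
        total = path.base_log_score + tone_weight * tone_score
        rescored.append((total, path))
    # Highest total log score first.
    rescored.sort(key=lambda pair: pair[0], reverse=True)
    return rescored


if __name__ == "__main__":
    surviving = [
        Path("A", -10.0, [-3.0, -3.0]),   # better base score, poor tone match
        Path("B", -10.5, [-0.5, -0.5]),   # slightly worse base score, good tone match
    ]
    for total, path in rescore_with_tone(surviving, tone_weight=0.5):
        print(path.chars, total)
```

Applying the tone model only to paths that remain after pruning keeps the extra cost proportional to the number of surviving hypotheses rather than to the full search space. In a sketch like this, `tone_weight` would have to be tuned on held-out data.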