电子学报
ACTA ELECTRONICA SINICA
2014, Issue 4, pp. 779-785 (7 pages)
Zhu Zhengyu (朱铮宇), He Qianhua (贺前华), Feng Xiaohui (奉小慧), Ye Wanling (叶婉玲), Li Yanxiong (李艳雄), Yang Jichen (杨继臣)
spatiotemporal characteristics; consistency analysis; co-inertia analysis (CoIA); correlation degree fusion
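Building on the traditional model for analyzing speech articulation and lip motion, this paper constructs a spatiotemporal model of lip motion during speech. It proposes methods for measuring the correlation between speech and the temporal and spatial characteristics of lip motion, together with a speech and lip motion consistency detection method that fuses the temporal and spatial measures. A temporal consistency score is obtained from the relationship between lip width, lip height, and changes in audio amplitude. The initial correlation between speech and the spatial lip features is obtained by co-inertia analysis, and a method is proposed for correcting this correlation to account for the natural delay between speech and lip motion. Finally, the temporal and spatial scores are fused to decide whether the speech and lip motion are consistent. Preliminary experimental results show that, on four types of inconsistent audio-video data, the EER (Equal Error Rate) falls by about 8.2% on average compared with the commonly used co-inertia method.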
This paper constructs a spatiotemporal lip motion model based on the traditional simple spatial model of pronunciation and lip motion, and proposes methods for measuring the correlation degree between voice and the spatial and temporal characteristics of lip motion. In addition, a fusion scheme for the spatial and temporal correlation degrees is proposed to measure the consistency of voice and lip motion. The temporal consistency score is defined as the correlation between lip shape (height and width) and the speech amplitude. The co-inertia is used as the initial correlation degree between speech and the spatial characteristics of the lips. Both the spatial and temporal correlation degrees are corrected for the initial audiovisual delay. Experimental results show that the proposed method reduces the EER by about 8.2% compared with the CoIA method.
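
The co-inertia step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the feature matrices, and the small frame-shift search used to tolerate audiovisual delay are all assumptions made for illustration. It relies only on the standard result that co-inertia axes are the singular vectors of the cross-covariance matrix between the two feature sets.

```python
import numpy as np

def coinertia_score(audio, lips):
    """Correlation of the first pair of co-inertia components.

    audio: (n_frames, p) audio features; lips: (n_frames, q) lip features.
    Both feature sets are hypothetical placeholders.
    """
    a = audio - audio.mean(axis=0)
    v = lips - lips.mean(axis=0)
    # Co-inertia axes: singular vectors of the cross-covariance matrix.
    u, _, wt = np.linalg.svd(a.T @ v / len(a), full_matrices=False)
    pa, pv = a @ u[:, 0], v @ wt[0]
    denom = np.linalg.norm(pa) * np.linalg.norm(pv)
    return float(abs(pa @ pv) / denom) if denom > 0 else 0.0

def delay_tolerant_score(audio, lips, max_lag=3):
    """Best co-inertia score over small frame shifts (assumed lag search)."""
    n = len(audio)
    scores = []
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, v = audio[lag:], lips[:n - lag]
        else:
            a, v = audio[:n + lag], lips[-lag:]
        scores.append(coinertia_score(a, v))
    return max(scores)

# Synthetic demo: audio features that depend linearly on lip features.
rng = np.random.default_rng(0)
lips = rng.standard_normal((200, 2))
mix = np.array([[1.0, 0.5, 0.0], [0.0, 1.0, 0.5]])
audio = lips @ mix + 0.05 * rng.standard_normal((200, 3))
delayed = np.roll(audio, 2, axis=0)      # audio lags the lips by 2 frames
unrelated = rng.standard_normal((200, 3))
```

On this synthetic data, `coinertia_score(audio, lips)` should be close to 1, `coinertia_score(unrelated, lips)` should be much lower, and `delay_tolerant_score(delayed, lips)` should recover a high score despite the two-frame shift. How the paper actually corrects for delay and fuses the temporal and spatial scores is not specified in the abstract.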