计算机辅助设计与图形学学报
Journal of Computer-Aided Design & Computer Graphics
2015
Issue 10
1890-1899
10 pages in total
video highlight extraction; audio emotion perception; emotion semantics; audio classification
To use the emotion semantics of a video's associated audio to guide the extraction of video highlights, a highlight extraction method driven by audio emotion perception is proposed. To obtain the emotion semantics of the associated audio, an audio classifier based on a hierarchical binary-tree support vector machine extracts the mid-level audio type, and an integrated emotion-mapping model then perceives the high-level emotion semantics. This audio emotion perception model is used to analyze the fluctuation of the emotion semantics of the associated audio. Video highlights are then located with two auxiliary steps: a start-stop positioning strategy for highlights and an audio-video synchronization correction. With the emotion-semantic fluctuation series of the audio as its core and the two-stage audio emotion perception model as the leading analysis, the method forms a complete framework for video highlight extraction driven by audio emotion. Experimental results show that, while maintaining a certain level of precision, the proposed framework generalizes well and achieves relatively high recall and completeness.
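The abstract does not give the mapping values, thresholds, or positioning rules, so the following is only a minimal sketch of the general idea, under stated assumptions: per-window audio type labels are taken as already produced by some classifier (the binary-tree SVM is not reproduced here), the `EMOTION_SCORE` table is hypothetical, and the start-stop rule is a plain threshold test rather than the paper's strategy. The names `emotion_series` and `locate_highlights` are illustrative only.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping from mid-level audio type to an emotion score in [0, 1].
# The paper's actual emotion-mapping model is not described in the abstract.
EMOTION_SCORE: Dict[str, float] = {
    "silence": 0.0,
    "speech": 0.3,
    "music": 0.5,
    "applause": 0.8,
    "cheering": 1.0,
}


def emotion_series(labels: List[str]) -> List[float]:
    """Map per-window audio type labels to an emotion score series."""
    return [EMOTION_SCORE[label] for label in labels]


def locate_highlights(
    series: List[float], threshold: float = 0.6, min_len: int = 2
) -> List[Tuple[int, int]]:
    """Return half-open (start, stop) window index pairs where the score
    stays at or above `threshold` for at least `min_len` windows.

    This thresholding rule is an assumption standing in for the paper's
    start-stop positioning strategy.
    """
    segments: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(series):
        if value >= threshold and start is None:
            start = i  # a candidate highlight begins
        elif value < threshold and start is not None:
            if i - start >= min_len:
                segments.append((start, i))
            start = None  # candidate ended (kept or discarded)
    # Close a segment that runs to the end of the series.
    if start is not None and len(series) - start >= min_len:
        segments.append((start, len(series)))
    return segments


labels = ["speech", "speech", "cheering", "applause", "cheering", "speech", "silence"]
print(locate_highlights(emotion_series(labels)))
```

Window indices would still need to be converted to timestamps and aligned with the video track, which corresponds to the audio-video synchronization correction mentioned in the abstract and is not sketched here.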