软件学报
JOURNAL OF SOFTWARE
2009, No. 10, pp. 2679-2691 (13 pages)
杨跃东 (YANG Yue-Dong), 郝爱民 (HAO Ai-Min), 褚庆军 (CHU Qing-Jun), 赵沁平 (ZHAO Qin-Ping), 王莉莉 (WANG Li-Li)
动作识别; 角度无关; 动作图; 兴趣点; Naive Bayes
action recognition; view-invariant; action graph; interest point; Naive Bayes
For view-invariant action recognition, this paper proposes a weighted codebook vector representation and an action-graph recognition model. Local interest-point features and global shape descriptors of a video are combined into a weighted codebook vector; this representation retains the noise robustness of interest points while overcoming their inability to recognize static actions. Energy curves are built from 3D motion data such as motion capture and point clouds, key poses are extracted, and primitive motion segments are generated; these segments are organized into a directed graph, called the essential graph, through three link types: self-links, forward links, and backward links. Projecting the essential graph in each direction and connecting nodes by a nearest-neighbor rule yields a directed graph called the action graph. Naive Bayes is used to train the action-graph model, the Viterbi algorithm computes the match score between a video and the action graph, and the video sequence is labeled by the maximum match score. Because the action graph provides multi-view projections with smooth transitions between adjacent views, it can recognize video sequences captured from arbitrary viewpoints and with arbitrary motion directions. Experimental results show that the algorithm achieves good recognition performance on monocular videos, multi-view videos, and multi-action videos.
This paper proposes a weighted codebook vector representation and an action graph model for view-invariant human action recognition. A video is represented as a weighted codebook vector combining dynamic interest points and static shapes. This combined representation has strong noise robustness and high classification performance on static actions. Several 3D key poses are extracted from motion capture data or point cloud data, and a set of primitive motion segments is generated. A directed graph called the Essential Graph is built from these segments according to self-links, forward links, and back links. The Action Graph is generated by projecting the essential graph from a wide range of viewpoints. This paper uses Naive Bayes to train a statistical model for each node. Given an unlabeled video, the Viterbi algorithm is used to compute the match score between the video and the action graph, and the video is then labeled based on the maximum score. Finally, the algorithm is tested on the IXMAS dataset and the CMU motion capture library. The experimental results demonstrate that this algorithm can recognize actions in a view-invariant manner and achieves high recognition rates.
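The matching step described in the abstract (Naive Bayes observation models on the nodes of a directed action graph, scored with the Viterbi algorithm) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the node names, the Bernoulli-style codeword models, and the smoothing constant are all assumptions introduced for the example.

```python
import math

# Hypothetical sketch: each action-graph node carries a Naive Bayes model
# mapping codebook words to probabilities; an observed frame is the set of
# codewords active in it. Unseen codewords get a small smoothing probability.
SMOOTH = 1e-6

def naive_bayes_loglik(frame, node_model):
    """Log-likelihood of one frame (a set of codewords) under a node model."""
    return sum(math.log(node_model.get(word, SMOOTH)) for word in frame)

def viterbi_match(frames, nodes, edges, node_models):
    """Best-path log-score of a frame sequence against one action graph.

    frames      -- list of codeword sets, one per video frame
    nodes       -- list of node ids
    edges       -- dict node -> set of successor nodes (self-links included,
                   so a pose may persist across consecutive frames)
    node_models -- dict node -> Naive Bayes model for that node
    """
    # Initialize with the first frame's likelihood at every node.
    score = {n: naive_bayes_loglik(frames[0], node_models[n]) for n in nodes}
    for obs in frames[1:]:
        new_score = {}
        for n in nodes:
            preds = [p for p in nodes if n in edges.get(p, ())]
            if not preds:
                continue  # unreachable node at this step
            best_prev = max(score.get(p, float("-inf")) for p in preds)
            new_score[n] = best_prev + naive_bayes_loglik(obs, node_models[n])
        score = new_score
    return max(score.values()) if score else float("-inf")
```

An unlabeled video would then be scored against every trained action graph and assigned the label of the graph with the maximum `viterbi_match` score, mirroring the maximum-match-score labeling rule in the abstract.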