Journal of Xi'an Jiaotong University (西安交通大学学报)
2010
Issue 2
pp. 20-24 (5 pages)
knowledge fusion; topic map; similarity algorithm
A similarity algorithm for extended topic maps, ETMSC, is proposed for multi-source knowledge fusion. It addresses two problems: knowledge organization models based on metadata or traditional topic maps cannot represent knowledge at multiple levels and granularities, and the low accuracy of existing similarity algorithms degrades fusion quality. Three principles for selecting the threshold are presented: correlation, level correspondence, and experimental determination. The algorithm combines comprehensive information theory with the structural and semantic information of the extended topic map. It considers syntactic, semantic, and pragmatic similarity together: it extends the structural similarity between topic map elements and also takes into account their meaning and the context in which they occur. The similarity criterion for topic maps depends on a threshold, and the choice of threshold depends on the data set. Experiments comparing ETMSC with traditional algorithms based purely on syntactic or semantic similarity show that ETMSC improves the F-measure by 9.2% to 11.1%.
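The abstract does not give the ETMSC formulas, so the following is only a minimal sketch of the general idea it describes: three similarity components are combined into one score, the score is compared with a threshold, and results are evaluated with the F-measure. The weighted average, the equal default weights, and all names (`SimilarityScores`, `combined_similarity`, `is_match`, `f_measure`) are assumptions made for illustration, not the authors' method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimilarityScores:
    """Hypothetical per-pair scores, each assumed to lie in [0, 1]."""
    syntactic: float   # structural similarity of topic map elements
    semantic: float    # similarity of meaning
    pragmatic: float   # similarity of the context of use


def combined_similarity(scores, weights=(1.0, 1.0, 1.0)):
    """Weighted average of the three components (an assumed combination rule)."""
    w_syn, w_sem, w_prag = weights
    total = w_syn + w_sem + w_prag
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = (w_syn * scores.syntactic
                + w_sem * scores.semantic
                + w_prag * scores.pragmatic)
    return weighted / total


def is_match(scores, threshold, weights=(1.0, 1.0, 1.0)):
    """Treat two elements as similar when the combined score reaches the threshold.

    The threshold is data-set dependent and would be chosen experimentally.
    """
    return combined_similarity(scores, weights) >= threshold


def f_measure(precision, recall):
    """Standard F1 score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

In this sketch a pair with scores `SimilarityScores(0.9, 0.6, 0.3)` would be accepted at a threshold of 0.5 and rejected at 0.7, which illustrates why the threshold has to be tuned to the data set.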