模式识别与人工智能 (Pattern Recognition and Artificial Intelligence)
Moshi Shibie yu Rengong Zhineng
2014
Issue 6
pp. 524-532
9 pages in total
Knowledge Discovery; Association Rule; Interestingness Measure; Information Entropy
With the development of data collection and storage techniques, traditional association rule mining generates a massive, disordered set of rules, most of which are redundant and fail to meet users' interests. To solve this problem, an interestingness measure for association rules based on information entropy is proposed and used to mine interesting association rules. Correlation analysis for categorical variables is adopted to eliminate false and erroneous rules from the original rule set, and a framework for evaluating the interestingness of rules based on information entropy is put forward. Since the method does not depend on users' prior knowledge, it can represent the information hidden in the data without bias. Experiments on both real and synthetic datasets show that the proposed algorithm discovers interesting rules from large databases efficiently and performs better than traditional algorithms.
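The abstract names two ingredients, correlation analysis for categorical variables and an entropy-based interestingness score, without giving their formulas. The following is only a minimal sketch of how such a pair of scores might be computed for a single rule; it is not the paper's algorithm. It assumes a Pearson chi-square statistic as the correlation test and mutual information as the entropy-based score, and the names `entropy` and `rule_scores` are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def rule_scores(transactions, antecedent, consequent):
    """Score the rule antecedent -> consequent over a list of item sets.

    Returns (chi_square, mutual_information_bits) for the 2x2 table of
    "antecedent present?" versus "consequent present?".
    Both scores are assumed stand-ins, not the measures from the paper.
    """
    n = len(transactions)
    # counts[a][c]: a = 1 if the antecedent is contained in the transaction,
    # c = 1 if the consequent is contained in the transaction.
    counts = [[0, 0], [0, 0]]
    for t in transactions:
        counts[int(antecedent <= t)][int(consequent <= t)] += 1

    row = [sum(counts[a]) for a in (0, 1)]
    col = [counts[0][c] + counts[1][c] for c in (0, 1)]

    # Pearson chi-square: squared deviation from the independence expectation.
    chi2 = 0.0
    for a in (0, 1):
        for c in (0, 1):
            expected = row[a] * col[c] / n
            if expected > 0:
                chi2 += (counts[a][c] - expected) ** 2 / expected

    # Mutual information I(A;C) = H(A) + H(C) - H(A,C).
    joint = [counts[a][c] / n for a in (0, 1) for c in (0, 1)]
    mi = entropy([r / n for r in row]) + entropy([x / n for x in col]) - entropy(joint)
    return chi2, mi


# Toy data: "bread" and "milk" always appear together.
baskets = [{"bread", "milk"}, {"bread", "milk"}, {"tea"}, {"tea"}]
chi2, mi = rule_scores(baskets, frozenset({"bread"}), frozenset({"milk"}))
```

Under these assumptions, a rule whose chi-square statistic falls below a chosen critical value (3.841 at the 0.05 level with one degree of freedom) would be treated as uncorrelated and dropped, and the remaining rules could be ranked by the entropy-based score. Neither score requires any user-supplied prior knowledge, which is the property the abstract emphasizes.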