信息网络安全
Netinfo Security
2015
Issue 11
pp. 77-83
7 pages in total
胡雪%封化民%李明伟%丁钊
data mining%association rules%frequent itemsets%transaction count%support count
In today's highly developed information society, network data is growing rapidly, and much important information lies hidden behind this surge of data, so analyzing large volumes of data is necessary. Apriori is a frequent itemset algorithm for mining association rules. Its core idea is to mine frequent itemsets in two stages: candidate generation and a downward-closure check. Its two major drawbacks are that it may generate a large number of candidate sets and may need to scan the database repeatedly. This paper proposes an improved Apriori algorithm that requires less scanning time and that, while pruning candidate itemsets, also eliminates the generation of redundant sub-itemsets. By avoiding the transfer of database records that are not needed, the improved algorithm effectively reduces the time spent on I/O and greatly improves efficiency. The paper presents the idea behind the implementation together with a proof, and compares and analyzes the traditional and improved Apriori algorithms.
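The two stages named in the abstract, candidate generation and the downward-closure check, can be illustrated with a minimal Python sketch of the classic (traditional) Apriori algorithm. This is not the improved algorithm proposed in the paper, whose details the abstract does not give; the function name `apriori` and the parameter `min_support` (an absolute support count) are assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Classic Apriori sketch: return {itemset: support count}.

    Illustrative only; `apriori` and `min_support` are hypothetical names,
    and this is the textbook algorithm, not the paper's improved variant.
    """
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: count single items with one scan of the database.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        prev = set(current)
        # Stage 1, candidate generation: join frequent (k-1)-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Stage 2, downward-closure check: every (k-1)-subset of a
        # candidate must itself be frequent, otherwise prune it.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # One more database scan to count support for surviving candidates.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent
```

Each iteration of the loop rescans every transaction, which is the repeated-scan cost the abstract identifies as a drawback of the traditional algorithm.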