计算机与现代化
Computer and Modernization
2015
Issue 9
pp. 1-5
5 pages in total
data mining; association rules; Apriori algorithm; frequent itemsets; matrix reduction
When searching for frequent itemsets, the Apriori algorithm usually has to scan the database repeatedly and generates a large number of useless candidate sets. To address this problem, an improved Apriori algorithm based on matrix reduction is proposed. The algorithm scans the database only once, converts the database information into a Boolean matrix, and reduces this data structure using conclusions derived from the properties of frequent k-itemsets, which effectively cuts down the number of invalid candidate itemsets that are generated. A comparison with existing algorithms verifies that the proposed algorithm effectively improves the efficiency of mining frequent itemsets.
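The abstract does not state the specific reduction rules, so the following is only a minimal Python sketch of the general idea, not the paper's algorithm. It assumes two standard pruning rules that follow from the Apriori property: an item that appears in no frequent k-itemset cannot appear in a frequent (k+1)-itemset, and a transaction holding fewer than k+1 of the remaining items cannot support any (k+1)-itemset. The names `build_boolean_matrix`, `frequent_itemsets`, and `min_count` are hypothetical, and candidates are enumerated by brute force over the surviving columns rather than by the classic Apriori join step.

```python
from itertools import combinations


def build_boolean_matrix(transactions):
    """Single pass over the data: one row per transaction, one column per item."""
    items = sorted({item for t in transactions for item in t})
    matrix = [[item in t for item in items] for t in transactions]
    return items, matrix


def frequent_itemsets(transactions, min_count):
    """Return {frozenset(itemset): support count} using a shrinking Boolean matrix.

    Assumed reduction rules (illustrative, not taken from the paper):
      1. keep only columns (items) that occur in some frequent k-itemset;
      2. keep only rows (transactions) with at least k+1 of those items.
    """
    items, rows = build_boolean_matrix(transactions)
    cols = list(range(len(items)))
    result = {}
    k = 1
    while cols and rows:
        # Count support of every k-combination of the surviving columns.
        level = {}
        for combo in combinations(cols, k):
            count = sum(all(row[c] for c in combo) for row in rows)
            if count >= min_count:
                level[combo] = count
        if not level:
            break
        for combo, count in level.items():
            result[frozenset(items[c] for c in combo)] = count
        # Reduction 1: drop columns that are in no frequent k-itemset.
        cols = sorted({c for combo in level for c in combo})
        # Reduction 2: drop rows too short to contain a (k+1)-itemset.
        k += 1
        rows = [row for row in rows if sum(row[c] for c in cols) >= k]
    return result
```

For example, calling `frequent_itemsets([{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}], 3)` finds the three single items (support 4 each) and the three pairs (support 3 each). Before the triple is counted, the matrix has already shrunk to the two transactions that contain three items, so `{"a", "b", "c"}` is rejected with support 2.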