COMPUTER APPLICATION
2010
Issue 3
pp. 806-809
4 pages
data mining%frequent closed itemset%Compressed Frequent Pattern Tree (CFP-Tree)%partition matrix
Mining frequent closed itemsets is an important problem in many data mining applications. To reduce the number of candidate itemsets and the cost of support counting, a new algorithm, depth-first search for frequent closed itemsets (DFFCI), is proposed. DFFCI projects the dataset information stored in an improved Compressed Frequent Pattern Tree (CFP-Tree) onto a partition matrix and computes support with logical operations on binary vectors, which simplifies the computation and reduces running time. Global 2-itemset pruning based on support pre-counting and local extension pruning are used to prune the search space effectively. Experimental results show that DFFCI outperforms other mainstream depth-first search algorithms.
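The binary-vector support counting mentioned in the abstract can be illustrated with a small sketch. This is not the DFFCI algorithm itself: the CFP-Tree and the partition matrix are omitted, each item is simply mapped to a Python integer used as a bit vector over transaction IDs, and every function name here is a hypothetical choice made for illustration. Under those assumptions, support is the number of set bits in the AND of the item vectors, 2-itemset supports can be pre-counted for pruning, and an itemset is closed when no single-item extension keeps the same support.

```python
from itertools import combinations


def build_bit_vectors(transactions):
    """Map each item to an int whose bit i is set when transaction i contains it."""
    vectors = {}
    for tid, items in enumerate(transactions):
        for item in items:
            vectors[item] = vectors.get(item, 0) | (1 << tid)
    return vectors


def support(itemset, vectors, n_transactions):
    """Support = number of set bits in the AND of the item vectors."""
    mask = (1 << n_transactions) - 1  # start with every transaction selected
    for item in itemset:
        mask &= vectors.get(item, 0)
    return bin(mask).count("1")


def frequent_pairs(vectors, n_transactions, min_support):
    """Pre-count every 2-itemset; pairs below min_support can be pruned up front."""
    result = {}
    for pair in combinations(sorted(vectors), 2):
        count = support(pair, vectors, n_transactions)
        if count >= min_support:
            result[pair] = count
    return result


def is_closed(itemset, vectors, n_transactions):
    """Closed: adding any other single item strictly lowers the support."""
    base = support(itemset, vectors, n_transactions)
    return all(
        support(set(itemset) | {item}, vectors, n_transactions) < base
        for item in vectors
        if item not in itemset
    )


# Toy dataset (made up for illustration only).
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a", "b", "c"}]
vectors = build_bit_vectors(transactions)
n = len(transactions)

print(support({"a", "b"}, vectors, n))
print(frequent_pairs(vectors, n, min_support=3))
print(is_closed({"b"}, vectors, n), is_closed({"a", "b"}, vectors, n))
```

In this toy dataset every transaction containing `b` also contains `a`, so `{b}` has the same support as `{a, b}` and is not closed, whereas `{a, b}` is.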