计算机工程与应用
COMPUTER ENGINEERING AND APPLICATIONS
2014, Issue 2, pp. 103-106 (4 pages)
frequent itemsets; parallel mining; FP-Growth; Map/Reduce
FP-Growth is a classic algorithm for mining frequent itemsets based on the frequent pattern tree (FP-tree). To improve the efficiency of FP-Growth when mining frequent itemsets from large-scale data, a parallel frequent itemset mining algorithm based on FP-Growth, called FPPM, is proposed. The algorithm follows the Map/Reduce parallel model: each computing node first builds a local frequent pattern tree and mines it to obtain local frequent itemsets, and the local frequent itemsets are then merged to obtain the global frequent itemsets. Because the merged result is not yet complete, the support counts of the itemsets that fall below the minimum support threshold after merging are recomputed. This paper introduces the design of FPPM and tests its performance. The experimental results show that FPPM scales well.
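The mine-locally, merge, then recount flow described in the abstract can be illustrated with a minimal single-process sketch. It is not the paper's implementation: the names `mine_local` and `fppm_sketch` are hypothetical, the partitions stand in for computing nodes, each partition is assumed to use the same relative support threshold as the whole dataset, and local mining is done by brute-force enumeration instead of building an FP-tree.

```python
from collections import Counter
from itertools import combinations
from math import ceil


def mine_local(transactions, min_count):
    """Map-step stand-in: brute-force local mining (assumption: NOT FP-Growth).

    Counts every itemset that occurs in the partition and keeps those whose
    local count reaches min_count. Only suitable for tiny examples.
    """
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    return {s: c for s, c in counts.items() if c >= min_count}


def fppm_sketch(partitions, min_support):
    """Merge local frequent itemsets, then recount the ones that fall short."""
    total = sum(len(p) for p in partitions)
    global_min = ceil(min_support * total)

    # Map: each partition (a stand-in for a computing node) mines locally.
    local_results = [
        mine_local(p, ceil(min_support * len(p))) for p in partitions
    ]

    # Reduce: add up the local counts. A partition that did not report an
    # itemset contributes nothing, so a merged count is only a lower bound.
    merged = Counter()
    for local in local_results:
        merged.update(local)

    # Recount: for itemsets below the global threshold, scan the partitions
    # that did not report them and add the missing occurrences.
    result = {}
    for itemset, count in merged.items():
        if count < global_min:
            for part, local in zip(partitions, local_results):
                if itemset not in local:
                    count += sum(1 for t in part if itemset <= set(t))
        if count >= global_min:
            # Counts that passed without a recount may still be lower bounds.
            result[itemset] = count
    return result
```

For example, with the two partitions `[["a","b"],["a","c"],["a","b"],["b","c"]]` and `[["a","b"],["c"],["c"],["a","c"]]` and a minimum support of 0.5, item `b` is locally frequent only in the first partition, so its merged count (3) falls below the global threshold (4) until the recount adds its one occurrence in the second partition.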