计算机工程与应用
COMPUTER ENGINEERING AND APPLICATIONS
2013, Issue 20, pp. 112-117 (6 pages)
李伟卫, 赵航, 张阳, 王勇 (Li Weiwei, Zhao Hang, Zhang Yang, Wang Yong)
cloud computing, data mining, Hadoop, MapReduce
MapReduce is a programming model for parallel computation over large-scale data sets. It runs in heterogeneous environments and is simple to program, because the developer does not need to handle the underlying implementation details. This paper applies MapReduce to three data mining algorithms: the Naive Bayes classification algorithm, the K-modes clustering algorithm, and the ECLAT frequent itemset mining algorithm. Experimental results indicate that MapReduce effectively improves the efficiency of mining massive data sets while preserving the accuracy of the algorithms.
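
The abstract does not describe how each algorithm is split into map and reduce steps. As a minimal sketch only, and not the authors' implementation, the Python below shows one common way to express Naive Bayes training as a MapReduce-style count aggregation. It runs in a single process rather than on Hadoop, and every name in it (`map_record`, `reduce_counts`, `train`, `predict`), the record layout, and the add-one smoothing are assumptions made for illustration.

```python
import math
from collections import defaultdict
from itertools import chain


def map_record(record):
    """Map step: emit (key, 1) pairs for one labelled record."""
    features, label = record
    yield ("class", label), 1
    for index, value in enumerate(features):
        yield ("feature", label, index, value), 1


def reduce_counts(pairs):
    """Reduce step: sum the values emitted for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def train(records):
    """Run the map step over all records, then reduce to a count table."""
    return reduce_counts(chain.from_iterable(map_record(r) for r in records))


def predict(counts, features):
    """Return the label with the highest add-one-smoothed log score."""
    labels = [key[1] for key in counts if key[0] == "class"]
    total = sum(counts[("class", label)] for label in labels)

    # Distinct values seen for each feature index (assumed smoothing scheme).
    values = defaultdict(set)
    for key in counts:
        if key[0] == "feature":
            values[key[2]].add(key[3])

    best_label, best_score = None, -math.inf
    for label in labels:
        class_count = counts[("class", label)]
        score = math.log(class_count / total)
        for index, value in enumerate(features):
            seen = counts.get(("feature", label, index, value), 0)
            score += math.log((seen + 1) / (class_count + len(values[index])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: (categorical features, class label).
records = [
    (("sunny", "hot"), "no"),
    (("sunny", "mild"), "no"),
    (("rainy", "mild"), "yes"),
    (("rainy", "cool"), "yes"),
]
counts = train(records)
print(predict(counts, ("rainy", "cool")))
```

Because the reduce step only sums counts per key, the map output could in principle be partitioned across workers and merged afterwards, which is the property that makes this kind of training a natural fit for the model the abstract describes.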