计算机工程与应用
COMPUTER ENGINEERING AND APPLICATIONS
2013
Issue 16
pp. 117-120, 136
5 pages in total
虞倩倩%戴月明%李晶晶
data mining%MapReduce%Ant Colony Optimization(ACO)%K-means%cloud computing
K-means runs into severe memory shortages when processing massive data sets, so this paper uses MapReduce to parallelize it. Because ordinary K-means converges slowly, easily falls into local optima, and is sensitive to the selection of initial cluster centers, Ant Colony Optimization (ACO) is introduced into K-means, giving the ACO-K-means clustering algorithm. Experimental results show that ACO-K-means parallelized with MapReduce not only achieves good speedup and scalability, but also improves convergence and clustering accuracy.
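
As a rough illustration of how one K-means iteration can be split into map and reduce steps, the sketch below runs both phases in a single Python process. It is not the authors' implementation. The function names (`map_assign`, `reduce_mean`, `kmeans_iteration`, `kmeans`) are hypothetical, no Hadoop API is used, and the ACO-based improvement is not shown; the initial centers are simply passed in by the caller.

```python
from collections import defaultdict
from math import dist


def map_assign(point, centers):
    """Map step: emit (index of the nearest center, (point, 1))."""
    nearest = min(range(len(centers)), key=lambda i: dist(point, centers[i]))
    return nearest, (point, 1)


def reduce_mean(values):
    """Reduce step: average all points that were assigned to one center."""
    total = [0.0] * len(values[0][0])
    count = 0
    for point, n in values:
        total = [t + x for t, x in zip(total, point)]
        count += n
    return tuple(t / count for t in total)


def kmeans_iteration(points, centers):
    """One iteration: map every point, group by key, reduce each group."""
    groups = defaultdict(list)  # stands in for the shuffle phase
    for p in points:
        key, value = map_assign(p, centers)
        groups[key].append(value)
    # A center that attracted no points keeps its previous position.
    return [
        reduce_mean(groups[i]) if i in groups else tuple(centers[i])
        for i in range(len(centers))
    ]


def kmeans(points, centers, max_iter=100, tol=1e-9):
    """Repeat iterations until no center moves by more than tol."""
    for _ in range(max_iter):
        new_centers = kmeans_iteration(points, centers)
        if all(dist(a, b) <= tol for a, b in zip(centers, new_centers)):
            return new_centers
        centers = new_centers
    return centers
```

In a real MapReduce job each mapper would see only its own split of the data, which is what removes the single-machine memory limit mentioned in the abstract. The current centers would have to be distributed to every mapper before each iteration.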