计算机科学与探索
JOURNAL OF FRONTIERS OF COMPUTER SCIENCE & TECHNOLOGY
2015
Issue 5
pp. 611-620
10 pages in total
XIE Juanying (谢娟英)%LU Xiaoxiao (鲁肖肖)%QU Yanan (屈亚楠)%GAO Hongchao (高红超)
granular computing%initial seeds%max-min distance method%K-medoids clustering algorithm
The fast K-medoids clustering algorithm may select the initial seeds for different clusters from within the same cluster, and the granular computing based K-medoids clustering algorithm requires a subjectively chosen threshold to construct the defuzzified similarity matrix. To overcome these defects, this paper proposes two new K-medoids clustering algorithms whose initial seeds are optimized by granular computing. The proposed algorithms combine granular computing with the max-min distance method, so that K instances that lie in dense regions and are far apart from each other are selected as initial seeds, and they adopt the mean similarity between all instances as the threshold for constructing the defuzzified similarity matrix. The algorithms are tested on synthetically generated datasets and on datasets from the UCI machine learning repository. The experimental results, evaluated in terms of clustering accuracy, Adjusted Rand Index, and other measures, demonstrate that the proposed K-medoids algorithms give more stable clustering results and are superior to the traditional K-medoids algorithm, the fast K-medoids algorithm, and the previous K-medoids clustering algorithm based on granular computing.
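The seed-selection idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract does not give the similarity function, the granule construction, or the density measure, so everything here is an assumption made for illustration. Similarity is taken as 1 - d/d_max on Euclidean distance, the threshold is the mean similarity over all distinct pairs, a sample's density is the number of other samples at or above that threshold, and the max-min distance step is restricted to samples of at least average density. The name `select_seeds` is hypothetical.

```python
import math


def select_seeds(points, k):
    """Pick k initial seeds that are dense and far apart (illustrative only)."""
    n = len(points)
    if n < 2 or not 1 <= k <= n:
        raise ValueError("need at least 2 points and 1 <= k <= len(points)")

    # Pairwise Euclidean distances.
    dist = [[math.dist(p, q) for q in points] for p in points]
    d_max = max(max(row) for row in dist) or 1.0

    # Assumed similarity: 1 - d / d_max, so values lie in [0, 1].
    sim = [[1.0 - d / d_max for d in row] for row in dist]

    # Threshold: mean similarity over all distinct pairs (no manual tuning).
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    threshold = sum(sim[i][j] for i, j in pairs) / len(pairs)

    # Assumed density: size of the granule of samples similar enough to i.
    density = [
        sum(1 for j in range(n) if j != i and sim[i][j] >= threshold)
        for i in range(n)
    ]

    # Candidates: samples of at least average density (assumed filter).
    mean_density = sum(density) / n
    candidates = [i for i in range(n) if density[i] >= mean_density]

    # First seed: the densest candidate.
    seeds = [max(candidates, key=lambda i: density[i])]

    # Remaining seeds: max-min distance, i.e. the sample whose nearest
    # already-chosen seed is as far away as possible.
    while len(seeds) < k:
        remaining = [i for i in candidates if i not in seeds]
        if not remaining:  # fall back to all samples if candidates run out
            remaining = [i for i in range(n) if i not in seeds]
        seeds.append(max(remaining, key=lambda i: min(dist[i][s] for s in seeds)))
    return seeds


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 0), (10, 1), (11, 0), (11, 1),
            (5, 10), (5, 11), (6, 10), (6, 11)]
    print(select_seeds(data, 3))  # indices of the chosen seed samples
```

The returned indices would then serve as the starting medoids for an ordinary K-medoids iteration; that iteration is unchanged and is omitted here.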