郑州大学学报(工学版)
鄭州大學學報(工學版)
정주대학학보(공학판)
JOURNAL OF ZHENGZHOU UNIVERSITY(ENGINEERING SCIENCE)
2010
Issue 1
89-92
4 pages in total
k_means算法%聚类%加权%变异系数
k_means算法%聚類%加權%變異係數
k_means산법%취류%가권%변이계수
k_means algorithm%clustering%weight%coefficient of variation
传统的k_means算法将欧式距离作为最常用的距离度量方法.针对基于欧式距离计算样本点与类间相似度的不足,用"相对距离"代替"绝对距离"可以更好地反映样本的实际分布,提出一种在领域知识未知的情况下基于加权欧式距离的k_means算法.针对公共数据库UCI里的数据实验表明改进后的算法能产生质量较高的聚类结果.
傳統的k_means算法將歐式距離作為最常用的距離度量方法.針對基于歐式距離計算樣本點與類間相似度的不足,用"相對距離"代替"絕對距離"可以更好地反映樣本的實際分佈,提出一種在領域知識未知的情況下基于加權歐式距離的k_means算法.針對公共數據庫UCI裏的數據實驗表明改進後的算法能產生質量較高的聚類結果.
전통적k_means산법장구식거리작위최상용적거리도량방법.침대기우구식거리계산양본점여류간상사도적불족,용"상대거리"대체"절대거리"가이경호지반영양본적실제분포,제출일충재영역지식미지적정황하기우가권구식거리적k_means산법.침대공공수거고UCI리적수거실험표명개진후적산법능산생질량교고적취류결과.
The traditional k_means algorithm most commonly uses Euclidean distance as its distance measure. Euclidean distance has shortcomings when used to compute the similarity between a sample point and a cluster: a "relative distance" reflects the actual distribution of the samples better than an "absolute distance" does. To address this, a k_means algorithm based on weighted Euclidean distance is proposed for the case where no domain knowledge about the data objects is available. Experiments on data sets from the public UCI database show that the improved algorithm produces higher-quality clustering results.
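
The abstract does not give the weighting formula, so the following is only a minimal sketch of what a k_means variant with weighted Euclidean distance might look like. It assumes, based on the "coefficient of variation" keyword, that each feature's weight is proportional to its coefficient of variation (standard deviation divided by the absolute mean) and that the weights are normalized to sum to 1. The function names `cv_weights`, `weighted_distance`, and `weighted_kmeans` are hypothetical, and the paper's actual method may differ.

```python
import math
import random


def cv_weights(data):
    """Per-feature weights from the coefficient of variation (std / |mean|).

    Assumption: weights are the CVs normalized to sum to 1; the source
    abstract does not state the exact formula.
    """
    n, dims = len(data), len(data[0])
    cvs = []
    for j in range(dims):
        col = [row[j] for row in data]
        mean = sum(col) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        # Assumption: a zero-mean feature gets weight 0 to avoid dividing by zero.
        cvs.append(std / abs(mean) if mean != 0 else 0.0)
    total = sum(cvs)
    if total == 0:
        # Fall back to uniform weights when every CV is zero.
        return [1.0 / dims] * dims
    return [c / total for c in cvs]


def weighted_distance(x, c, w):
    """Weighted Euclidean distance: sqrt(sum_j w_j * (x_j - c_j)^2)."""
    return math.sqrt(sum(wj * (xj - cj) ** 2 for wj, xj, cj in zip(w, x, c)))


def weighted_kmeans(data, k, max_iter=100, seed=0):
    """Standard k-means loop, with the weighted distance in the assignment step."""
    rng = random.Random(seed)
    w = cv_weights(data)
    centers = [list(p) for p in rng.sample(data, k)]
    labels = [-1] * len(data)
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda i: weighted_distance(x, centers[i], w))
            for x in data
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers will not move
        labels = new_labels
        for i in range(k):
            members = [x for x, lab in zip(data, labels) if lab == i]
            if members:  # keep the old center if a cluster becomes empty
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers, w
```

Under this assumed weighting, a feature whose spread is large relative to its mean counts for more in the distance than a feature with a large absolute scale but little relative variation, which is one way to read the abstract's contrast between "relative" and "absolute" distance.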