CAJ | Academic paper

Among current clustering methods, K-means and the potential function method are the most commonly used algorithms. Both have many advantages, but each also has its own limitations. For K-means, the number of clusters cannot be determined by the algorithm itself and must be estimated in advance; the algorithm is also sensitive to the initial cluster centers and easily disturbed by outliers. For potential function clustering, the clustering interval range is limited, and efficiency is low on multidimensional data. To address these shortcomings, this paper proposes an improved clustering algorithm based on K-means and the potential function method. It first uses the potential function method to determine the number of clusters and the initial centers, and then performs the clustering with K-means. The improved algorithm combines the "blind" property of the potential function method with the high efficiency of K-means. Experiments verify the effectiveness of the improved algorithm, and the results show that it considerably improves both clustering accuracy and convergence speed.
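The two-stage idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual potential function or stopping rule, so this sketch swaps in a Gaussian potential with peak suppression in the style of subtractive clustering. The function names and the parameters `radius`, `stop_ratio`, and `max_centers` are hypothetical choices made for illustration, not the authors' method.

```python
import numpy as np


def potential_centers(X, radius=1.0, stop_ratio=0.3, max_centers=20):
    """Pick initial centers as successive peaks of a Gaussian potential.

    Assumption: subtractive-clustering-style potential; the paper's own
    potential function is not specified in the abstract.
    """
    X = np.asarray(X, dtype=float)
    alpha = 4.0 / radius ** 2            # width of the potential kernel
    beta = 4.0 / (1.5 * radius) ** 2     # wider kernel used for suppression
    # Pairwise squared Euclidean distances, shape (n, n).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    potential = np.exp(-alpha * d2).sum(axis=1)
    first_peak = potential.max()
    centers = []
    while len(centers) < max_centers:
        i = int(np.argmax(potential))
        peak = potential[i]
        # Stop once the remaining peak is small relative to the first one.
        if peak < stop_ratio * first_peak:
            break
        centers.append(X[i])
        # Remove the chosen peak's influence so its neighbours are not re-picked.
        potential = potential - peak * np.exp(-beta * d2[i])
    return np.array(centers)


def _assign(X, centers):
    """Return the index of the nearest center for every point."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def kmeans(X, centers, max_iter=100, tol=1e-6):
    """Plain Lloyd iterations starting from the given centers."""
    X = np.asarray(X, dtype=float)
    centers = np.asarray(centers, dtype=float).copy()
    for _ in range(max_iter):
        labels = _assign(X, centers)
        # Keep a center unchanged if no point is assigned to it.
        new_centers = np.array([
            X[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(len(centers))
        ])
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift < tol:
            break
    return centers, _assign(X, centers)


def potential_kmeans(X, radius=1.0):
    """Stage 1: potential peaks give k and initial centers. Stage 2: K-means."""
    init = potential_centers(X, radius=radius)
    return kmeans(X, init)
```

Calling `potential_kmeans(X, radius=1.0)` on an `(n, d)` array returns the refined centers and a label per point. The number of clusters is never passed in; it falls out of how many potential peaks survive the stopping rule, which is the "blind" property the abstract refers to. The pairwise distance matrix makes the first stage quadratic in the number of points, so this form is only suitable for small data sets.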