漳州师范学院学报:自然科学版
Journal of ZhangZhou Teachers College (Natural Science)
2012
Issue 1
pp. 43-47
5 pages in total
分类%KNN算法%信息熵%聚类
classification%K-Nearest Neighbor algorithm%information entropy%clustering
KNN算法通过近邻样本的个数分类,Entropy-KNN算法给出新的相似度定义,而且投票时综合待测样本与近邻样本的个数和各类近邻的平均距离,但两种算法均未考虑近邻样本间的相似.提出的基于层次聚类法的Entropy-KNN算法,首先对训练集按类别进行层次聚类,接着在与待测样本最相似的子类中选取近邻样本,使得近邻样本具有较高的相似度,最后结合Entropy-KNN算法进行分类.在蘑菇数据集上的实验结果表明,该算法的分类准确率高于Entropy-KNN算法.
The KNN algorithm assigns the class label of a test sample according to how many of its K nearest neighbors belong to each class. The Entropy-KNN algorithm defines a new similarity measure between samples and, when voting, combines the number of nearest neighbors in each class with the average distance from the test sample to the neighbors of that class. Neither algorithm, however, considers the similarity among the nearest neighbors themselves, which is informative for labeling the test sample. We therefore propose an Entropy-KNN algorithm based on hierarchical clustering. First, the training set is hierarchically clustered class by class. Second, the K nearest neighbors are selected from the sub-clusters most similar to the test sample, so that the neighbors are highly similar to one another. Finally, the class label of the test sample is decided by the Entropy-KNN algorithm. Experiments on the mushroom data set show that the proposed algorithm achieves higher classification accuracy than Entropy-KNN.
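The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation, and several parts are assumptions. The abstract does not give the Entropy-KNN similarity definition, so a plain Hamming mismatch distance stands in for it. The voting score (neighbor count divided by mean distance) is only one plausible way to combine count and average distance. Taking one nearest sub-cluster per class is one reading of the neighbor selection step. The names `hamming`, `avg_link`, `agglomerate`, `classify`, and `max_dist`, and the toy records, are hypothetical.

```python
from collections import defaultdict

def hamming(a, b):
    """Fraction of categorical attributes on which two samples differ.
    Stand-in for the paper's entropy-based similarity (assumption)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def avg_link(c1, c2):
    """Average pairwise distance between two clusters (average linkage)."""
    return sum(hamming(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))

def agglomerate(samples, max_dist):
    """Bottom-up clustering: merge the closest pair of clusters until
    the closest pair is farther apart than max_dist."""
    clusters = [[s] for s in samples]
    while len(clusters) > 1:
        d, i, j = min((avg_link(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        if d > max_dist:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def classify(train, query, k=3, max_dist=0.5):
    # Step 1: cluster the training samples of each class separately.
    by_class = defaultdict(list)
    for features, label in train:
        by_class[label].append(features)
    # Step 2: per class, keep only the sub-cluster nearest to the query.
    candidates = []
    for label, samples in by_class.items():
        clusters = agglomerate(samples, max_dist)
        nearest = min(clusters, key=lambda c: avg_link(c, [query]))
        candidates += [(hamming(s, query), label) for s in nearest]
    neighbors = sorted(candidates, key=lambda t: t[0])[:k]
    # Step 3: vote with both neighbor count and mean distance per class
    # (assumed scoring rule; the paper's exact formula is not in the abstract).
    dists = defaultdict(list)
    for d, label in neighbors:
        dists[label].append(d)
    def score(label):
        ds = dists[label]
        return len(ds) / (sum(ds) / len(ds) + 1e-9)
    return max(dists, key=score)

# Toy categorical records invented for illustration (not the mushroom data set).
train = [
    (("x", "s", "n"), "edible"),
    (("x", "s", "y"), "edible"),
    (("b", "s", "n"), "edible"),
    (("f", "y", "w"), "poisonous"),
    (("f", "y", "g"), "poisonous"),
    (("k", "y", "w"), "poisonous"),
]
print(classify(train, ("x", "s", "w")))  # edible
```

Restricting candidates to the nearest sub-cluster of each class is what distinguishes this from plain KNN: neighbors that are close to the test sample but isolated from the rest of their class are less likely to take part in the vote.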