鞍山师范学院学报
JOURNAL OF ANSHAN TEACHERS COLLEGE
2013
Issue 6
pp. 38-41, 59
5 pages in total
Discrete attribute; Continuous attribute; KNN algorithm; Multi-attribute classification
To improve on the classification accuracy of both the conventional Euclidean KNN algorithm and the improved KNN algorithm based on information entropy, this paper proposes an improved KNN algorithm based on multi-attribute classification. The algorithm proceeds in three steps: i) classify each attribute by the ratio of its number of distinct values to the number of samples it covers, separating discrete attributes, which are handled by the entropy-based KNN algorithm, from continuous attributes, which are handled by the conventional Euclidean KNN similarity measure; ii) process the two types of attributes separately, then take the weighted sum of the two results as the distance between samples; iii) select the k samples closest to the test sample and use them to determine the decision attribute class of the test sample.
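The three steps can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so several details here are assumptions rather than the authors' method: the function names, the distinct-value ratio cutoff `threshold`, the mixing weight `alpha`, the use of information gain as the per-attribute weight on discrete mismatches, and the min-max scaling of continuous attributes are all hypothetical choices made only to keep the example concrete.

```python
import math
from collections import Counter


def split_attributes(X, threshold=0.5):
    """Step i: split column indices into (discrete, continuous) by the
    ratio of distinct values to sample count (threshold is an assumption)."""
    n = len(X)
    discrete, continuous = [], []
    for j in range(len(X[0])):
        ratio = len({row[j] for row in X}) / n
        (discrete if ratio <= threshold else continuous).append(j)
    return discrete, continuous


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def info_gain(X, y, j):
    """Information gain of attribute j with respect to the labels y."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(row[j], []).append(label)
    conditional = sum(len(g) / len(X) * entropy(g) for g in groups.values())
    return entropy(y) - conditional


def predict(X, y, query, k=3, threshold=0.5, alpha=0.5):
    """Steps ii and iii: weighted sum of the two partial distances,
    then a majority vote among the k nearest training samples."""
    disc, cont = split_attributes(X, threshold)
    # Assumed entropy-based part: a mismatch costs the attribute's info gain.
    gains = {j: info_gain(X, y, j) for j in disc}
    # Assumed scaling: divide continuous differences by the column range.
    spans = {}
    for j in cont:
        values = [row[j] for row in X]
        spans[j] = (max(values) - min(values)) or 1.0

    def distance(row):
        d_disc = sum(gains[j] for j in disc if row[j] != query[j])
        d_cont = math.sqrt(
            sum(((row[j] - query[j]) / spans[j]) ** 2 for j in cont)
        )
        return alpha * d_disc + (1 - alpha) * d_cont

    nearest = sorted(range(len(X)), key=lambda i: distance(X[i]))[:k]
    return Counter(y[i] for i in nearest).most_common(1)[0][0]
```

For a training set whose first column takes only two values across six samples and whose second column is distinct in every sample, `split_attributes` would place the first column in the discrete group and the second in the continuous group under the assumed cutoff of 0.5. Both `threshold` and `alpha` would need tuning on real data.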