北京交通大学学报
JOURNAL OF NORTHERN JIAOTONG UNIVERSITY
2009
Issue 6
pp. 106-109
4 pages
孙雪%李昆仑%胡夕坤%赵瑞
半监督聚类%constrained-K均值%K均值算法%投票%阈值
semi-supervised clustering%constrained-Kmeans%K-means%voting%threshold
提出一种基于半监督K-means的K值全局寻优算法,该算法打破传统方法中采用样本类别作为K值的限定,利用少量标记数据即可指导和规划大量无监督数据.结合数据集自身的分布特点及聚类后各个簇内的监督信息,根据投票方法来指导簇中数据集的类别标记.实验表明,本文所提出的方法可以有效的寻找适合数据集的最佳K值和聚类的中心,提高聚类性能.
In this paper, we propose an algorithm based on semi-supervised K-means that globally searches for the optimal value of K. The algorithm removes the restriction in traditional methods that K must equal the number of sample classes, and it uses only a small amount of labeled data to guide and organize a large amount of unlabeled data. Combining the distribution characteristics of the data set with the supervision information inside each cluster after clustering, it applies a voting method to guide the class labeling of the data in each cluster. Experiments show that the proposed method can effectively find the best K value and cluster centers for a data set and improve clustering performance.
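The abstract does not give the selection criterion or say how the threshold is applied, so the following is only a minimal Python sketch of the general idea, not the authors' algorithm. It assumes plain K-means with a deterministic farthest-point initialization, a majority vote of the labeled points inside each cluster, and a hypothetical stopping rule that accepts the first K whose labeled points agree with their cluster's voted label at a rate of at least `threshold`. All function names and the toy data are illustrative.

```python
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def init_centers(points, k):
    """Farthest-point initialization (an assumption, chosen for determinism)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, iters=100):
    """Plain Lloyd's K-means; returns (centers, cluster index per point)."""
    centers = init_centers(points, k)
    assign = None
    for _ in range(iters):
        new_assign = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        if new_assign == assign:
            break  # assignments stable, so the centers are already up to date
        assign = new_assign
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, assign

def vote_cluster_labels(assign, labeled, k):
    """Majority vote of labeled points per cluster; None if a cluster has none."""
    votes = [Counter() for _ in range(k)]
    for idx, lab in labeled.items():
        votes[assign[idx]][lab] += 1
    return [v.most_common(1)[0][0] if v else None for v in votes]

def search_k(points, labeled, k_values, threshold=0.9):
    """Try each K in order; stop at the first whose agreement reaches the threshold.

    Agreement = fraction of labeled points whose label matches the voted label
    of their cluster (a hypothetical criterion, not taken from the paper).
    """
    best = None
    for k in k_values:
        centers, assign = kmeans(points, k)
        cluster_labels = vote_cluster_labels(assign, labeled, k)
        hits = sum(1 for i, lab in labeled.items() if cluster_labels[assign[i]] == lab)
        score = hits / len(labeled)
        if best is None or score > best[1]:
            best = (k, score, centers, [cluster_labels[a] for a in assign])
        if score >= threshold:
            break
    return best

# Toy data: class "a" forms two separate groups, class "b" sits between them,
# so K equal to the number of classes (2) cannot separate the classes.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (5, 0), (5, 1), (6, 0), (6, 1),
          (10, 0), (10, 1), (11, 0), (11, 1)]
labeled = {0: "a", 1: "a", 4: "b", 6: "b", 8: "a", 9: "a"}  # index -> known class
best_k, score, centers, predicted = search_k(points, labeled, [2, 3, 4])
```

On this toy data the search moves past K = 2, where the middle group is split between the two outer clusters, and settles on K = 3, which illustrates why the number of clusters need not equal the number of classes.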