High Technology Letters (English Edition)
2007, Issue 2, pp. 131-135 (5 pages in total)
Keywords: classification; clustering; data mining
The problem of scalable classification by clustering in large databases is discussed. A clustering-based classification method first generates clusters using a clustering algorithm. To classify a new data point, it finds the k nearest clusters of that point as neighbors and assigns the point to the dominant class among these neighbors. Existing algorithms incorporate class information when making clustering decisions and produce pure clusters (each cluster is associated with only one class). We present hybrid cluster-based algorithms, which produce clusters by unsupervised clustering and allow each cluster to be associated with multiple classes. Experimental results show that the hybrid cluster-based algorithms outperform the pure ones in both classification accuracy and training speed.
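
The hybrid scheme described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the clustering algorithm, the distance measure, or how the classes of the k nearest clusters are combined, so everything concrete here is an assumption made for illustration: plain k-means as the unsupervised clusterer, Euclidean distance to cluster centroids, and a vote that sums the per-class counts of the k nearest clusters. The function names (`kmeans`, `train`, `classify`) are hypothetical and are not taken from the paper.

```python
import math
import random
from collections import Counter


def nearest(p, centroids):
    """Index of the centroid closest to point p (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))


def kmeans(points, n_clusters, n_iter=20, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in for the clusterer)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, n_clusters)
    for _ in range(n_iter):
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else old
            for g, old in zip(groups, centroids)
        ]
    return centroids


def train(points, labels, n_clusters):
    """Cluster without looking at labels, then record class counts per cluster.

    A cluster may hold several classes, i.e. it is "hybrid" rather than pure.
    """
    centroids = kmeans(points, n_clusters)
    counts = [Counter() for _ in centroids]
    for p, y in zip(points, labels):
        counts[nearest(p, centroids)][y] += 1
    return centroids, counts


def classify(p, model, k=3):
    """Assign p to the dominant class among its k nearest non-empty clusters.

    Assumption: votes are the summed class counts of those clusters.
    """
    centroids, counts = model
    candidates = [i for i in range(len(centroids)) if counts[i]]
    order = sorted(candidates, key=lambda i: math.dist(p, centroids[i]))
    votes = Counter()
    for i in order[:k]:
        votes.update(counts[i])
    return votes.most_common(1)[0][0]
```

Because the labels are used only after clustering, training amounts to one unsupervised clustering pass plus one counting pass. This is one plausible reading of the training-speed advantage the abstract reports, not a claim about the authors' implementation.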