计算机工程与应用
COMPUTER ENGINEERING AND APPLICATIONS
2010
Issue 6
pp. 139-143
5 pages in total
semi-supervised clustering; pairwise constraint; principal component analysis; linear discriminant analysis
Compared with unsupervised clustering, semi-supervised clustering uses a small amount of prior information to better uncover and understand the intrinsic structure of the data and to conform more closely to the user's preferences. Most existing semi-supervised clustering methods are designed for low-dimensional data. In this paper, a novel Semi-supervised Clustering Approach with Discriminant Analysis (SCADA) is presented for clustering high-dimensional data. First, the data are projected onto a low-dimensional space by principal component analysis, and a constrained spherical K-means algorithm clusters the projected data. Second, linear discriminant analysis, guided by the clustering results, reduces the dimensionality of the data in the input space. Finally, the data are clustered again in the resulting embedded space. Experimental results on several real-world data sets show that SCADA handles high-dimensional data effectively and improves clustering performance.
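The three stages named in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract gives no algorithmic details, so plain (unconstrained) spherical K-means is swapped in for the constrained variant, the pairwise constraints are omitted entirely, and all function names, the ridge term, and the choice of `k - 1` discriminant dimensions are hypothetical.

```python
import numpy as np

def pca_project(X, n_components):
    """Project centered data onto its top principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def spherical_kmeans(X, k, n_iter=50, seed=0):
    """Plain spherical K-means (assumption: no pairwise constraints are used)."""
    rng = np.random.default_rng(seed)
    # Normalize rows so that the dot product equals cosine similarity.
    U = X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-12)
    centers = U[rng.choice(len(U), size=k, replace=False)]
    for _ in range(n_iter):
        labels = np.argmax(U @ centers.T, axis=1)
        for j in range(k):
            members = U[labels == j]
            if len(members):  # keep the old center if a cluster becomes empty
                m = members.sum(axis=0)
                centers[j] = m / max(np.linalg.norm(m), 1e-12)
    return np.argmax(U @ centers.T, axis=1)

def lda_project(X, labels, n_components):
    """Linear discriminant projection using cluster labels as class labels."""
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))  # within-cluster scatter
    Sb = np.zeros((d, d))  # between-cluster scatter
    for c in np.unique(labels):
        Xc = X[labels == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # Small ridge term (an assumption) keeps Sw invertible in high dimensions.
    Sw += 1e-6 * np.eye(d)
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(-evals.real)[:n_components]
    return (X - mean) @ evecs[:, order].real

def scada_sketch(X, k, pca_dim, seed=0):
    """PCA -> cluster -> LDA on the input-space data -> cluster again."""
    Z = pca_project(X, pca_dim)
    first_labels = spherical_kmeans(Z, k, seed=seed)
    Y = lda_project(X, first_labels, k - 1)
    return spherical_kmeans(Y, k, seed=seed)
```

A call such as `scada_sketch(X, k=3, pca_dim=2)` on an `(n_samples, n_features)` float array returns one integer cluster label per row. Like any K-means variant, the result depends on the random initialization, so several seeds may be worth trying.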