电脑知识与技术
COMPUTER KNOWLEDGE AND TECHNOLOGY
2014
Issue 4
pp. 673-676
4 pages in total
High-dimensional uncertain objects; agglomerative hierarchical clustering; similarity measure; uncertain clustering
The curse of dimensionality, noisy data, and the strong dependence of input parameters on domain knowledge are challenging problems in uncertain data clustering. To address them, the HDUDEC (High Dimensional Uncertain Data Efficient Clustering) algorithm is proposed, building on a similarity measure and agglomerative hierarchical clustering. The algorithm first computes pairwise similarities with a measure function that can accurately express the similarity between high-dimensional uncertain objects, and then clusters the objects bottom-up according to a similarity threshold. Experiments show that the new algorithm requires little prior knowledge, filters noisy data effectively, and efficiently obtains high-dimensional uncertain clusters of arbitrary shape.
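The two-step procedure described in the abstract (pairwise similarity, then bottom-up merging under a similarity threshold) can be illustrated with a minimal sketch. This is not the HDUDEC algorithm itself: the abstract does not define its similarity measure or its noise rule, so the `similarity` function, the sample-list representation of an uncertain object, and the `min_size` noise rule below are all hypothetical stand-ins. Merging every pair whose similarity reaches the threshold is equivalent to cutting a single-linkage hierarchy at that threshold, which is one common way to obtain clusters of arbitrary shape.

```python
import math
from itertools import combinations


def similarity(a, b):
    """Hypothetical similarity in (0, 1] between two uncertain objects.

    Each object is a list of sample points (possible realisations).
    Illustrative stand-in only, not the measure defined in the paper.
    """
    dims = len(a[0])
    # Sum of squared distances over all cross pairs of samples.
    total = sum(
        sum((x - y) ** 2 for x, y in zip(p, q)) for p in a for q in b
    )
    # Average per pair and per dimension, so the value does not grow
    # merely because the objects have more dimensions.
    mean_sq = total / (len(a) * len(b) * dims)
    return math.exp(-mean_sq)


def threshold_agglomerative(objects, threshold, min_size=2):
    """Merge objects bottom-up whenever similarity >= threshold.

    Returns (clusters, noise) as lists of object indices. Groups smaller
    than min_size are reported as noise (an assumed rule).
    """
    parent = list(range(len(objects)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Every object starts as its own cluster; sufficiently similar
    # pairs cause their clusters to be merged.
    for i, j in combinations(range(len(objects)), 2):
        if similarity(objects[i], objects[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(objects)):
        groups.setdefault(find(i), []).append(i)

    clusters = [g for g in groups.values() if len(g) >= min_size]
    noise = [i for g in groups.values() if len(g) < min_size for i in g]
    return clusters, noise
```

In this sketch the threshold is the only parameter that shapes the clusters, which mirrors the abstract's claim of needing little prior knowledge; the number of clusters is never specified in advance. Note that the pairwise loop is quadratic in the number of objects, so the sketch says nothing about the efficiency reported for HDUDEC.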