JOURNAL OF COMPUTER RESEARCH AND DEVELOPMENT
2015
Issue 7
pp. 1463-1476
14 pages
Zhang Bo (张博)%Hao Jie (郝杰)%Ma Gang (马刚)%Yue Jinpeng (岳金朋)%Zhang Jianhua (张建华)%Shi Zhongzhi (史忠植)
canonical correlation analysis%probabilistic canonical correlation analysis%mixture probabilistic model%cluster ensembles%pattern recognition
Canonical correlation analysis (CCA) is a statistical tool for analyzing the correlation between two sets of random variables. A critical limitation of CCA is that it can only detect linear correlation that holds globally across both data sets, which is not enough to reveal the many nonlinear correlations found in the real world. There are three main ways to address this limitation: kernel mapping, neural networks, and localization. In this paper, a mixture model of local linear probabilistic canonical correlation analysis (PCCA), called MixPCCA, is constructed based on the idea of localization, and a two-stage expectation maximization (EM) algorithm is proposed to estimate the model parameters. Determining the number of local linear models is a fundamental issue, which is addressed here within the framework of cluster ensembles. In addition, a theoretical framework for applying the MixPCCA model to pattern recognition is put forward. Results on the USPS and MNIST handwritten image datasets demonstrate that the proposed MixPCCA model not only provides a way to capture complex global nonlinear correlation, but can also detect correlation that exists only in local regions, which traditional CCA and PCCA fail to discover.
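
The limitation described in the abstract can be illustrated with a minimal sketch. It is not the authors' MixPCCA model or their EM algorithm. It is only classical linear CCA, computed by whitening each view and taking the singular values of the whitened cross-covariance. The function name `first_canonical_correlation`, the regularizer `reg`, and the synthetic data (a noisy parabola) are all assumptions introduced for illustration.

```python
import numpy as np


def first_canonical_correlation(X, Y, reg=1e-8):
    """Largest canonical correlation between two views (rows are samples).

    Hypothetical helper for illustration: classical linear CCA only.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Sample covariances, with a tiny ridge term so Cholesky always succeeds.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    # Whiten both views: M = Lx^{-1} Sxy Ly^{-T}, where Sxx = Lx Lx^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T
    # Singular values of M are the canonical correlations, largest first.
    return float(np.linalg.svd(M, compute_uv=False)[0])


# Synthetic example (assumed data): y depends on x only through x**2.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(2000, 1))
y = x**2 + 0.05 * rng.normal(size=(2000, 1))

# Global linear CCA over the whole data set.
global_rho = first_canonical_correlation(x, y)

# The same analysis restricted to one local region (x > 0).
mask = x[:, 0] > 0
local_rho = first_canonical_correlation(x[mask], y[mask])

print(f"global: {global_rho:.3f}, local (x > 0): {local_rho:.3f}")
```

On data like this, the global canonical correlation should be close to zero even though the two variables are strongly dependent, while the correlation within a single region is high. That gap is the motivation for combining several local linear models. How the regions are found and weighted is exactly what the paper's mixture model and cluster-ensemble step address, and it is not reproduced here.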