新型工业化
New Industrialization Strategy
2013
Issue 3
pp. 85-96
12 pages
Machine learning%Nonlinear dimensionality reduction%Manifold learning%Well-separated multi-manifold%Tangent space%DC-ISOMAP
Manifold learning has become an important research topic in machine learning and data mining. Existing manifold learning algorithms assume that the high-dimensional data lie on a single manifold, so they cannot handle the many data sets that are sampled from multiple manifolds. This paper proposes the DC-ISOMAP algorithm for nonlinear dimensionality reduction of data lying on well-separated multi-manifolds of equal intrinsic dimension. The algorithm first decomposes the data set into individual sub-manifolds by propagating the tangent space outward from the points of highest sampling density, and computes the low-dimensional embedding of each sub-manifold independently. It then assembles these embeddings, with their proper positions and orientations, according to the positional relationships among the sub-manifolds to obtain the final embedding. Experimental results on synthetic data and real face-image data show that the algorithm computes low-dimensional embeddings of high-dimensional data effectively.
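The divide-and-conquer idea in the abstract can be illustrated with a small sketch. It is not the paper's DC-ISOMAP: the abstract does not give the details of tangent-space propagation, so the decomposition step is replaced here by a simpler stand-in (connected components of an ε-neighborhood graph), which only works when the manifolds are well separated. The second step computes Isomap-style geodesic distances inside each component. The final embedding (classical MDS) and the composition of sub-manifold embeddings are omitted. All function names and the `eps` parameter are hypothetical.

```python
import math
from collections import deque


def neighborhood_graph(points, eps):
    """Weighted adjacency {i: {j: distance}} linking points at most eps apart."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            if d <= eps:
                graph[i][j] = d
                graph[j][i] = d
    return graph


def split_components(graph):
    """Stand-in for the decomposition step: breadth-first connected components."""
    seen, parts = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue, part = deque([start]), []
        while queue:
            i = queue.popleft()
            part.append(i)
            for j in graph[i]:
                if j not in seen:
                    seen.add(j)
                    queue.append(j)
        parts.append(sorted(part))
    return parts


def geodesic_distances(graph, part):
    """Floyd-Warshall shortest-path distances restricted to one component."""
    index = {p: k for k, p in enumerate(part)}
    m = len(part)
    dist = [[0.0 if a == b else math.inf for b in range(m)] for a in range(m)]
    for p in part:
        for q, d in graph[p].items():
            dist[index[p]][index[q]] = d
    for k in range(m):
        for a in range(m):
            for b in range(m):
                if dist[a][k] + dist[k][b] < dist[a][b]:
                    dist[a][b] = dist[a][k] + dist[k][b]
    return dist


# Toy data (an assumption for illustration): an L-shaped curve and a
# distant straight segment, i.e. two well-separated 1-D manifolds in 2-D.
points = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2),
          (10, 10), (11, 10), (12, 10)]
graph = neighborhood_graph(points, eps=1.1)
parts = split_components(graph)
geodesics = [geodesic_distances(graph, part) for part in parts]
```

On the toy data, the two curves end up in separate components, and the geodesic distance between the ends of the L-shaped curve follows the curve rather than the straight-line shortcut. A full pipeline would feed each distance matrix to classical MDS and then place the resulting embeddings relative to one another.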