软件学报
JOURNAL OF SOFTWARE
2013, Issue 11, pp. 2498-2507 (10 pages)
陶卿%高乾坤%姜纪远%储德军
L1-regularization%online optimization%stochastic optimization%coordinate optimization
Machine learning faces a serious challenge from the ever-increasing scale of data. How to handle large-scale and even huge-scale data is a key scientific problem that statistical learning urgently needs to solve. The training sets of large-scale learning problems are often redundant and sparse, and the regularizer and loss function of a learning problem carry structural meaning of their own. Batch black-box methods that directly use the gradient of the entire objective function not only struggle with large-scale problems but also fail to meet the structural requirements of machine learning. Recently, coordinate descent, online, and stochastic optimization methods, which have developed rapidly because they are driven by the characteristics of machine learning itself, have become effective means of solving large-scale problems. This paper focuses on L1-regularized problems and reviews some of the research advances in these large-scale algorithms.
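
As an illustration of how a coordinate method can exploit the structure of the L1 regularizer, the following is a minimal sketch of cyclic coordinate descent for L1-regularized least squares. It is not taken from the paper: the least-squares loss, the 1/(2n) scaling, and the names `soft_threshold`, `lasso_coordinate_descent`, `lam`, and `n_sweeps` are assumptions chosen for illustration.

```python
def soft_threshold(z, tau):
    """Proximal operator of tau * |w|: shrink z toward zero by tau."""
    if z > tau:
        return z - tau
    if z < -tau:
        return z + tau
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Sketch: minimize (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1.

    X is a list of n rows with d features each, y is a list of n targets.
    The loss, scaling, and parameter names are illustrative assumptions.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    residual = list(y)  # residual = y - Xw; w starts at zero
    for _ in range(n_sweeps):
        for j in range(d):
            col = [X[i][j] for i in range(n)]
            col_sq = sum(c * c for c in col) / n
            if col_sq == 0.0:
                continue  # an all-zero feature cannot reduce the loss
            # Correlation of feature j with the residual, with w[j]'s
            # own contribution added back in.
            rho = sum(c * r for c, r in zip(col, residual)) / n + col_sq * w[j]
            # Exact one-dimensional minimizer; small correlations map to 0.
            new_wj = soft_threshold(rho, lam) / col_sq
            delta = new_wj - w[j]
            if delta != 0.0:
                # Keep the residual in sync with the updated coordinate.
                residual = [r - c * delta for r, c in zip(residual, col)]
                w[j] = new_wj
    return w
```

Each coordinate update solves its one-dimensional subproblem in closed form through the soft-thresholding step, so weakly correlated features receive a weight of exactly zero. A plain subgradient step on the whole objective generally does not produce exact zeros, which is one way to read the abstract's remark that black-box gradient methods cannot exploit the regularizer's structure.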