计算机研究与发展
JOURNAL OF COMPUTER RESEARCH AND DEVELOPMENT
2009, Issue 11, pp. 1925-1933 (9 pages)
曾志强, 吴群, 廖备水, 朱顺痣
Keywords: sequential minimal optimization; working set; feasible direction; cache; convergence
Working set selection is a key step in sequential minimal optimization (SMO) type methods for training support vector machines (SVMs). However, the feasible-direction strategy that standard SMO uses to select the working set can degrade the hit rate of the kernel cache. This paper derives a second-order expression for the decrease in the objective function produced by each iteration of an SMO-type algorithm and, based on it, proposes an improved working-set selection strategy. The new strategy weighs both the number of iterations needed for convergence and kernel-cache performance, reducing the total number of kernel evaluations and thereby greatly improving training efficiency. Moreover, SMO with the new selection strategy is guaranteed in theory to converge to an optimal solution. Experiments on well-known data sets show that the proposed method is markedly faster than standard SMO, and the improvement grows as the kernel becomes more complex, the input dimension grows higher, and the cache becomes smaller relative to the size of the training set.
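The second-order selection criterion the abstract refers to can be illustrated with a sketch of the standard (LIBSVM-style) second-order working-set rule that the paper builds on: pick the first index by maximal violation, then pick the second index to maximize the predicted objective decrease b²/a. This is a minimal illustration, not the paper's cache-aware variant; all names (`select_working_set`, `Q`, `grad`) are illustrative, and `Q` is assumed to be the full label-signed kernel matrix Q[s,t] = y[s]·y[t]·K(x_s, x_t).

```python
import numpy as np

def select_working_set(y, grad, alpha, C, Q, tau=1e-12):
    """Second-order working-set selection for the SVM dual (sketch).

    y     : labels in {+1, -1}
    grad  : gradient of the dual objective f(a) = 0.5*a'Qa - e'a, i.e. Qa - e
    alpha : current dual variables, C : box bound, Q : signed kernel matrix
    Returns the pair (i, j) maximizing the predicted decrease b**2 / a.
    """
    n = len(y)
    # I_up: indices whose multiplier can still move "up" along its sign
    up = [(t, -y[t] * grad[t]) for t in range(n)
          if (y[t] == 1 and alpha[t] < C) or (y[t] == -1 and alpha[t] > 0)]
    # I_low: indices whose multiplier can still move "down"
    low = [t for t in range(n)
           if (y[t] == 1 and alpha[t] > 0) or (y[t] == -1 and alpha[t] < C)]
    i, m = max(up, key=lambda p: p[1])          # first index: maximal violation
    best_j, best_dec = -1, 0.0
    for t in low:
        b = m + y[t] * grad[t]                  # pairwise violation
        if b <= 0:
            continue                            # pair already optimal
        a = Q[i, i] + Q[t, t] - 2.0 * Q[i, t]   # curvature along the pair
        a = max(a, tau)                         # guard against non-PSD kernels
        dec = b * b / a                         # second-order decrease estimate
        if dec > best_dec:
            best_dec, best_j = dec, t
    return i, best_j
```

The paper's contribution, per the abstract, is to bias this choice toward rows already resident in the kernel cache, trading a slightly larger iteration count for far fewer kernel evaluations overall.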