计算机应用
COMPUTER APPLICATION
2010, Issue 3, pp. 695-698 (4 pages)
semi-supervised learning; ensemble learning; intrusion detection
To address the difficulty of obtaining labeled data for intrusion detection, this paper proposes regularized Self-training, a Self-training method based on ensemble learning. The method combines active learning with regularization theory and uses unlabeled data to further improve an existing classifier that has already learned the classification patterns well. Comparative experiments were run on three main ensemble learning methods under different proportions of labeled data. The results show that large amounts of unlabeled data can improve the decision boundary of the ensemble classifier, and that the algorithm significantly reduces the error rate of the resulting classifier.
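The general idea of self-training with an ensemble can be sketched as follows. This is a minimal illustration of plain self-training, not the paper's regularized variant: the abstract does not specify the active-learning or regularization steps, so they are not reproduced here. The bagged nearest-centroid base learner, the agreement threshold, and all function names (`train_bagging`, `vote`, `self_train`) are assumptions made for the example.

```python
import random
from collections import Counter


def train_centroid(samples):
    """Fit a nearest-centroid classifier: one mean vector per class."""
    sums, counts = {}, Counter()
    for x, y in samples:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] += 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def predict_centroid(model, x):
    """Return the label whose centroid is closest to x (squared distance)."""
    return min(model, key=lambda y: sum((a - b) ** 2 for a, b in zip(model[y], x)))


def train_bagging(samples, n_members, rng):
    """Train each ensemble member on a bootstrap resample of the labeled set."""
    return [
        train_centroid([rng.choice(samples) for _ in samples])
        for _ in range(n_members)
    ]


def vote(ensemble, x):
    """Majority vote; returns (label, fraction of members that agree)."""
    votes = Counter(predict_centroid(m, x) for m in ensemble)
    label, n = votes.most_common(1)[0]
    return label, n / len(ensemble)


def self_train(labeled, unlabeled, n_members=11, threshold=0.9, rounds=5, seed=0):
    """Grow the labeled set with pseudo-labels the ensemble agrees on."""
    rng = random.Random(seed)
    labeled, pool = list(labeled), list(unlabeled)
    ensemble = train_bagging(labeled, n_members, rng)
    for _ in range(rounds):
        confident, rest = [], []
        for x in pool:
            label, agreement = vote(ensemble, x)
            (confident if agreement >= threshold else rest).append((x, label))
        if not confident:
            break  # nothing new to learn from; stop early
        labeled.extend(confident)          # accept confident pseudo-labels
        pool = [x for x, _ in rest]        # keep uncertain points for later
        ensemble = train_bagging(labeled, n_members, rng)
    return ensemble, labeled
```

Calling `self_train(labeled, unlabeled)` with `labeled` as a list of `(features, label)` pairs and `unlabeled` as a list of feature tuples returns the retrained ensemble and the enlarged labeled set. Only points on which at least `threshold` of the members agree are pseudo-labeled, which is one simple way to limit the error propagation that self-training is prone to.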