北京交通大学学报
北京交通大學學報
북경교통대학학보
JOURNAL OF NORTHERN JIAOTONG UNIVERSITY
2009
Issue 6
pp. 70-75
6 pages in total
林正奎%唐焕玲%鲁明羽%王敬东
林正奎%唐煥玲%魯明羽%王敬東
림정규%당환령%로명우%왕경동
AdaBoost%加权朴素贝叶斯%文本分类%特征多视图%样本权重
AdaBoost%加權樸素貝葉斯%文本分類%特徵多視圖%樣本權重
AdaBoost%가권박소패협사%문본분류%특정다시도%양본권중
AdaBoost%weighted Naive Bayesian classifier%text categorization%multiple views%examples' weight
AdaBoost作为一种有效的集成学习方法,能够明显提高不稳定学习算法的分类正确率,但对稳定的Naive Bayesian分类算法的提升效果却不明显.为此,利用多种特征评估函数建立不同的特征视图,生成多个有差异的加权朴素贝叶斯(WNB)基分类器;尝试使用几种不同的方式将样本权重嵌入WNB基分类器的参数中,对WNB产生扰动,进一步增加基分类器的不稳定性.实验结果表明,对比AdaBoost所提算法,BoostMV-WNB能够明显提升WNB文本分类器的性能.
AdaBoost作為一種有效的集成學習方法,能夠明顯提高不穩定學習算法的分類正確率,但對穩定的Naive Bayesian分類算法的提升效果卻不明顯.為此,利用多種特徵評估函數建立不同的特徵視圖,生成多個有差異的加權樸素貝葉斯(WNB)基分類器;嘗試使用幾種不同的方式將樣本權重嵌入WNB基分類器的參數中,對WNB產生擾動,進一步增加基分類器的不穩定性.實驗結果表明,對比AdaBoost所提算法,BoostMV-WNB能夠明顯提升WNB文本分類器的性能.
AdaBoost작위일충유효적집성학습방법,능구명현제고불은정학습산법적분류정학솔,단대은정적Naive Bayesian분류산법적제승효과각불명현.위차,이용다충특정평고함수건립불동적특정시도,생성다개유차이적가권박소패협사(WNB)기분류기;상시사용궤충불동적방식장양본권중감입WNB기분류기적삼수중,대WNB산생우동,진일보증가기분류기적불은정성.실험결과표명,대비AdaBoost소제산법,BoostMV-WNB능구명현제승WNB문본분류기적성능.
AdaBoost is an effective ensemble learning method that markedly improves the classification accuracy of unstable learning algorithms, but it yields little improvement for the Naive Bayesian classifier, which is relatively stable. To address this, a revised AdaBoost algorithm with weighted Naive Bayesian (WNB) base classifiers, named BoostMV-WNB, is proposed. First, during the boosting iterations, multiple feature views are constructed on the same training set using different term evaluation functions, and diverse WNB base classifiers are generated from these views. Second, the weights of the training examples are embedded into the parameters of the WNB classifier by means of a function (several variants are tried); this perturbation makes the WNB base classifiers more unstable. Experimental comparison shows that BoostMV-WNB outperforms AdaBoost with WNB base classifiers on text categorization.
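The abstract does not give the term evaluation functions or the exact way the example weights enter the WNB parameters, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes binary labels in {+1, -1}, cycles through two hypothetical evaluation functions (`doc_freq_score`, `class_gap_score`) to pick a top-`k` feature view per round, and embeds the example weights by using them as fractional counts in a plain multinomial Naive Bayes model. All function names, the view size `k`, and the toy data are illustrative assumptions.

```python
import math
from collections import defaultdict

CLASSES = (1, -1)  # assumption: binary task with labels +1 / -1

def doc_freq_score(docs, labels, weights, term):
    # Hypothetical evaluation function 1: weighted document frequency.
    return sum(w for d, w in zip(docs, weights) if term in d)

def class_gap_score(docs, labels, weights, term):
    # Hypothetical evaluation function 2: gap in weighted document
    # frequency between the two classes.
    return abs(sum(w * y for d, y, w in zip(docs, labels, weights) if term in d))

def build_view(docs, labels, weights, score_fn, k):
    # A "view" here is simply the k highest-scoring terms.
    vocab = sorted({t for d in docs for t in d})
    ranked = sorted(vocab, key=lambda t: -score_fn(docs, labels, weights, t))
    return set(ranked[:k])

def train_nb(docs, labels, weights, view):
    # Multinomial NB restricted to the view; example weights (rescaled to
    # sum to n) are used as fractional counts, which perturbs the parameters.
    n = len(docs)
    prior = {c: 0.0 for c in CLASSES}
    counts = {c: defaultdict(float) for c in CLASSES}
    for d, y, w in zip(docs, labels, weights):
        prior[y] += w * n
        for t in d:
            if t in view:
                counts[y][t] += w * n
    totals = {c: sum(counts[c].values()) for c in CLASSES}

    def predict(doc):
        def log_post(c):
            s = math.log((prior[c] + 1.0) / (n + len(CLASSES)))
            for t in doc:
                if t in view:
                    # Laplace-smoothed term likelihood
                    s += math.log((counts[c][t] + 1.0) / (totals[c] + len(view)))
            return s
        return max(CLASSES, key=log_post)
    return predict

def boost_mv_nb(docs, labels, score_fns, rounds, k):
    n = len(docs)
    weights = [1.0 / n] * n
    ensemble = []
    for r in range(rounds):
        # Switch to a different feature view each round.
        view = build_view(docs, labels, weights, score_fns[r % len(score_fns)], k)
        clf = train_nb(docs, labels, weights, view)
        preds = [clf(d) for d in docs]
        err = sum(w for w, p, y in zip(weights, preds, labels) if p != y)
        if err >= 0.5:
            break  # base classifier no better than chance: stop boosting
        err = max(err, 1e-10)
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, clf))
        # Standard AdaBoost reweighting, then renormalisation.
        weights = [w * math.exp(-alpha * y * p)
                   for w, p, y in zip(weights, preds, labels)]
        z = sum(weights)
        weights = [w / z for w in weights]

    def predict(doc):
        return 1 if sum(a * clf(doc) for a, clf in ensemble) >= 0 else -1
    return predict

# Toy usage with made-up documents (token lists).
docs = [["goal", "match", "team"], ["team", "score", "goal"],
        ["match", "win", "team"], ["stock", "market", "price"],
        ["market", "trade", "stock"], ["price", "fall", "market"]]
labels = [1, 1, 1, -1, -1, -1]
model = boost_mv_nb(docs, labels, [doc_freq_score, class_gap_score], rounds=4, k=4)
print(model(["team", "goal", "win"]), model(["stock", "market", "fall"]))
```

Using the weights as fractional counts is only one possible embedding; the abstract indicates that several embedding functions were compared, and the WNB model in the paper may additionally weight individual features, which this sketch does not attempt.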