计算机工程与设计
COMPUTER ENGINEERING AND DESIGN
2015
Issue 7
pp. 1808-1812 (5 pages)
text classification%selection%transfer learning%ensemble bagging algorithm%negative transfer
When the target domain has too few training samples to build a high-quality classification model, a cross-domain classification method based on an ensemble bagging algorithm is proposed within a transfer learning framework. Source-domain data are introduced and filtered, the resulting mixed data set is used for training, a classification model based on the ensemble bagging algorithm is built, and the prediction is obtained by voting. Comparative simulation results show that the ensemble bagging algorithm with Bayesian base classifiers improves the transfer from the source domain and raises classification accuracy and generalization performance on the target domain. An analysis of the number of noisy samples in the source domain shows that the algorithm can partially avoid negative transfer.
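The pipeline summarized in the abstract (filter source data, train on the mixed set, bag Bayesian classifiers, vote) can be illustrated with a minimal sketch. The abstract does not state the selection criterion, the naive Bayes variant, or the number of base classifiers, so everything below is an assumption: `select_source` keeps only source samples that a model trained on the target data labels correctly, `NaiveBayes` is a multinomial model with add-one smoothing, and `n_models=11` and the toy data are arbitrary. It is not the authors' implementation.

```python
import math
import random
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial naive Bayes over token lists, with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, tokens):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            n_words = sum(self.word_counts[label].values())
            score = math.log(self.class_counts[label] / total)
            for w in tokens:
                if w in self.vocab:  # ignore words never seen in training
                    count = self.word_counts[label][w]
                    score += math.log((count + 1) / (n_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def select_source(source, target):
    """Assumed filter: keep source samples a target-trained model labels correctly."""
    docs, labels = zip(*target)
    model = NaiveBayes().fit(docs, labels)
    return [(d, y) for d, y in source if model.predict(d) == y]


def bagging_fit(data, n_models=11, seed=0):
    """Train one naive Bayes model per bootstrap resample of the mixed data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        sample = [rng.choice(data) for _ in data]  # sample with replacement
        docs, labels = zip(*sample)
        models.append(NaiveBayes().fit(docs, labels))
    return models


def bagging_predict(models, tokens):
    """Majority vote over the base classifiers."""
    votes = Counter(m.predict(tokens) for m in models)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): a small target set and a source set with two noisy labels.
target = [("goal match team".split(), "sport"),
          ("stock market price".split(), "finance")]
source = [("team wins match".split(), "sport"),
          ("market price falls".split(), "finance"),
          ("goal team score".split(), "finance"),   # mislabeled source sample
          ("stock price rises".split(), "sport")]   # mislabeled source sample

selected = select_source(source, target)   # drops the two mislabeled samples
models = bagging_fit(target + selected)    # train on the mixed data set
print(bagging_predict(models, "team match goal".split()))
```

Filtering before mixing is what limits negative transfer in this sketch: source samples that disagree with the target-trained model never reach the bootstrap resamples. A real system would need a less brittle criterion, since a model trained on very few target samples can itself be unreliable.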