Computer Science (计算机科学)
2010, Issue 1, pp. 204-207 (4 pages)
Keywords: Concept drift; Selective ensemble; Naive Bayes; Error-ambiguity decomposition
A selective ensemble learning algorithm for mining data streams with concept drift is proposed. The algorithm chooses which base classifiers take part in the ensemble according to how far the direction of each base classifier's output vector on a validation set deviates from the direction of a reference vector. Experiments are carried out on the synthetic data sets SEA and Hyperplane, which exhibit abrupt and gradual concept drift respectively. The results show that this way of selecting base classifiers substantially improves the classification accuracy of the ensemble on concept-drifting data streams. Error-ambiguity decomposition is used to analyze the performance of the naive Bayes ensembles built by the algorithm on classification problems. The results indicate that the algorithm succeeds mainly because it significantly reduces the average generalization error.
In data streams, concepts are often not stable but change over time. We propose a selective integration algorithm, OSEN (Orientation based Selected ENsemble), for handling data streams with concept drift. The algorithm selects a near-optimal subset of base classifiers based on the output of each base classifier on a validation data set. Our experiments with synthetic data sets simulating abrupt (SEA) and gradual (Hyperplane) concept drift show that selective integration of classifiers built over small time intervals or fixed-size data blocks can be significantly better than majority voting and weighted voting, which are currently the most commonly used integration techniques for handling concept drift with ensembles. The paper also explains the working mechanism of OSEN through error-ambiguity decomposition. In the experiments, OSEN improves generalization ability by reducing the average generalization error of the base classifiers that constitute the ensemble.
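For reference, the error-ambiguity decomposition mentioned above is usually written as follows for a weighted-average ensemble under squared error. This is the standard textbook form. The abstract does not state how the paper maps naive Bayes class outputs onto it, so the symbols here ($f_i$, $w_i$, $\bar{f}$, $y$) are introduced only for illustration.

```latex
% Ensemble output as a convex combination of M base classifiers
\bar{f}(x) = \sum_{i=1}^{M} w_i f_i(x), \qquad w_i \ge 0, \quad \sum_{i=1}^{M} w_i = 1

% Pointwise identity (holds for any target y)
\bigl(\bar{f}(x) - y\bigr)^2
  = \sum_{i=1}^{M} w_i \bigl(f_i(x) - y\bigr)^2
  - \sum_{i=1}^{M} w_i \bigl(f_i(x) - \bar{f}(x)\bigr)^2

% Taking expectations over the data distribution
E = \bar{E} - \bar{A}, \qquad
\bar{E} = \sum_{i=1}^{M} w_i \, \mathbb{E}\bigl[(f_i(x) - y)^2\bigr], \qquad
\bar{A} = \sum_{i=1}^{M} w_i \, \mathbb{E}\bigl[(f_i(x) - \bar{f}(x))^2\bigr]
```

Here $\bar{E}$ is the weighted average generalization error of the base classifiers and $\bar{A}$ is the ensemble ambiguity (their disagreement with the ensemble). Read this way, the abstract's finding is that the selection step helps mainly by lowering $\bar{E}$ rather than by raising $\bar{A}$.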