计算机科学与探索
JOURNAL OF FRONTIERS OF COMPUTER SCIENCE & TECHNOLOGY
2014
Issue 6
pp. 727-734
8 pages
imbalanced dataset; classification; genetic operator; synthetic minority over-sampling technique (SMOTE)
To address the shortcomings of SMOTE (synthetic minority over-sampling technique) in synthesizing new minority class samples, this paper proposes an improved algorithm, GA-SMOTE. The key idea is to introduce the three basic operators of the genetic algorithm (GA) into SMOTE: the selection operator selects minority class samples differentially rather than uniformly, and the crossover and mutation operators control the quality of the synthesized samples. GA-SMOTE is then combined with SVM (support vector machine) to handle classification on imbalanced datasets. Extensive experiments on UCI datasets show that GA-SMOTE performs well in the overall quality of the synthesized samples and effectively improves the classification performance of SVM on imbalanced datasets.
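The abstract names the three operators but does not define them, so the following is only a minimal sketch of how selection, crossover and mutation could be layered on SMOTE-style interpolation. Everything specific here is an assumption for illustration, not the paper's method: the function names, the fitness (share of majority class samples among a point's nearest neighbours, plus a small floor), roulette-wheel selection, and Gaussian mutation with the hypothetical parameters `p_mut` and `sigma`. It also assumes at least two minority samples.

```python
import math
import random


def k_nearest(point, pool, k):
    """Return the k points in pool closest to point (point itself excluded)."""
    others = [p for p in pool if p is not point]
    return sorted(others, key=lambda p: math.dist(point, p))[:k]


def fitness(point, minority, majority, k):
    """Hypothetical fitness: fraction of majority samples among the k nearest
    neighbours, plus a floor so that every minority sample can be selected."""
    labelled = [(math.dist(point, p), 0) for p in minority if p is not point]
    labelled += [(math.dist(point, q), 1) for q in majority]
    labelled.sort(key=lambda pair: pair[0])
    nearest = labelled[:k]
    return 0.1 + sum(label for _, label in nearest) / max(len(nearest), 1)


def ga_smote_sketch(minority, majority, n_new, k=3, p_mut=0.1, sigma=0.05, seed=0):
    """Generate n_new synthetic minority samples (illustrative only)."""
    rng = random.Random(seed)
    weights = [fitness(x, minority, majority, k) for x in minority]
    synthetic = []
    for _ in range(n_new):
        # Selection (assumed roulette wheel): fitter samples are picked more often.
        parent = rng.choices(minority, weights=weights, k=1)[0]
        mate = rng.choice(k_nearest(parent, minority, k))
        # Crossover: the usual SMOTE interpolation between parent and neighbour.
        u = rng.random()
        child = [a + u * (b - a) for a, b in zip(parent, mate)]
        # Mutation (assumed Gaussian): occasionally perturb the new sample.
        if rng.random() < p_mut:
            child = [c + rng.gauss(0.0, sigma) for c in child]
        synthetic.append(tuple(child))
    return synthetic
```

With `p_mut=0.0` the sketch reduces to SMOTE interpolation with non-uniform parent selection, so every synthetic point lies on a segment between two minority samples. The augmented minority set would then be passed, together with the majority class, to an SVM trainer of the reader's choice.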