Acta Electronica Sinica (电子学报)
2009
Issue 11
pp. 2489-2495
7 pages
Zeng Zhiqiang (曾志强)%Wu Qun (吴群)%Liao Beishui (廖备水)%Gao Ji (高济)
imbalanced data set%support vector machine%input space%feature space%pre-image
This paper presents a classification approach based on kernel SMOTE (Synthetic Minority Over-sampling Technique) to address Support Vector Machine (SVM) classification on imbalanced data sets. The method first oversamples the minority class in feature space using the kernel SMOTE algorithm, then finds the pre-images of the synthetic instances in input space based on a distance relation between feature space and input space. Finally, these pre-images are appended to the original data set to train an SVM. Experiments on real data sets indicate that the samples synthesized by kernel SMOTE are of higher quality than those produced by the SMOTE algorithm, and as a result the classification performance of SVM on imbalanced data sets is effectively improved.
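The three steps in the abstract (interpolate in feature space, recover a pre-image from distances, then train) can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything below is an assumption: a Gaussian (RBF) kernel with a hypothetical width `gamma`, a synthetic point of the form phi(x_i) + delta * (phi(x_j) - phi(x_i)), and a simple linearised least-squares step for the pre-image in place of whatever procedure the authors actually use. The function names `rbf_kernel`, `synth_preimage`, and `kernel_smote` are illustrative only.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian kernel matrix K[a, b] = exp(-gamma * ||A[a] - B[b]||^2)."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def synth_preimage(X, K, i, j, delta, gamma):
    """Approximate pre-image of phi(x_i) + delta * (phi(x_j) - phi(x_i)).

    Assumption: RBF kernel, so feature-space and input-space squared
    distances are linked by d_feat = 2 - 2 * exp(-gamma * d_in).
    """
    # Inner products of the synthetic point with itself and with each phi(x_k).
    zz = (1 - delta)**2 * K[i, i] + 2 * delta * (1 - delta) * K[i, j] + delta**2 * K[j, j]
    zk = (1 - delta) * K[i] + delta * K[j]
    # Squared feature-space distance from the synthetic point to every sample.
    d_feat = zz - 2.0 * zk + np.diag(K)
    # Invert the kernel relation to get target squared input-space distances.
    d_in = -np.log(np.clip(1.0 - d_feat / 2.0, 1e-12, None)) / gamma
    # Linearised multilateration: subtract the first distance equation
    # ||z - x_0||^2 = d_in[0] from the others and solve by least squares.
    A = 2.0 * (X[1:] - X[0])
    b = (X[1:]**2).sum(1) - (X[0]**2).sum() - d_in[1:] + d_in[0]
    z, *_ = np.linalg.lstsq(A, b, rcond=None)
    return z

def kernel_smote(X_min, n_new, k=3, gamma=0.5, seed=0):
    """Generate n_new synthetic minority samples (input-space pre-images)."""
    rng = np.random.default_rng(seed)
    K = rbf_kernel(X_min, X_min, gamma)
    diag = np.diag(K)
    # Pairwise squared distances in feature space; exclude self-matches.
    D = diag[:, None] + diag[None, :] - 2.0 * K
    np.fill_diagonal(D, np.inf)
    nn = np.argsort(D, axis=1)[:, :k]  # k nearest neighbours per sample
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        j = nn[i, rng.integers(k)]
        out.append(synth_preimage(X_min, K, i, j, rng.random(), gamma))
    return np.array(out)
```

In this sketch the returned pre-images would be appended to the original training set with the minority label before fitting an SVM (for example scikit-learn's `sklearn.svm.SVC`). Note that with an RBF kernel the nearest neighbours in feature space coincide with those in input space, so any difference from plain SMOTE comes only from where the interpolated point lands; other kernels would need a different distance relation.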