通信学报
JOURNAL OF CHINA INSTITUTE OF COMMUNICATIONS
2013
Issue 5
42-51
10 pages
XIA Zhanguo (夏战国)%XIA Shixiong (夏士雄)%CAI Shiyu (蔡世玉)%WAN Ling (万玲)
class imbalance%semi-supervised%Gaussian process classification%self-training
Traditional supervised learning methods struggle with real-world datasets that carry little label information and whose training sets are class-imbalanced. To address this, a semi-supervised Gaussian process classification algorithm for class-imbalanced data is proposed. The algorithm adopts the self-training idea from semi-supervised learning: it uses Gaussian process classification to compute posterior probabilities and assigns class labels to unlabeled data, yielding more accurate and credible labeled samples. This makes the class distribution of the training samples relatively balanced, and the classifier is adaptively optimized to achieve better classification performance. Experimental results show that, when the training set is class-imbalanced and label information is scarce, the algorithm obtains effective labels through the self-trained classifier and improves classification accuracy in comparison with other related works, providing a new approach to classifying class-imbalanced data.
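The self-training loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the record gives no selection rule, so the confidence `threshold`, the "only grow the currently smallest class" balancing rule, and the names `self_train_balanced` and `CentroidClassifier` are all hypothetical. To keep the sketch dependency-free, a simple centroid-based softmax classifier stands in for Gaussian process classification; any classifier exposing `fit`, `predict_proba`, and `classes_` could be substituted.

```python
import math


class CentroidClassifier:
    """Hypothetical stand-in for a Gaussian process classifier.

    The "posterior" is a softmax over negative squared distances
    to the class centroids; it is NOT a Gaussian process.
    """

    def fit(self, X, y):
        self.classes_ = sorted(set(y))
        self.centroids_ = []
        for c in self.classes_:
            rows = [x for x, label in zip(X, y) if label == c]
            self.centroids_.append([sum(col) / len(rows) for col in zip(*rows)])
        return self

    def predict_proba(self, X):
        out = []
        for x in X:
            scores = [-sum((a - b) ** 2 for a, b in zip(x, c))
                      for c in self.centroids_]
            top = max(scores)  # subtract the max for numerical stability
            exps = [math.exp(s - top) for s in scores]
            total = sum(exps)
            out.append([e / total for e in exps])
        return out


def self_train_balanced(clf, X_lab, y_lab, X_unlab, threshold=0.9, max_rounds=10):
    """Self-training with an assumed class-balancing acceptance rule."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(max_rounds):
        if not pool:
            break
        clf.fit(X_lab, y_lab)
        classes = list(clf.classes_)
        counts = {c: y_lab.count(c) for c in classes}
        # Collect confident predictions as (posterior, pool index, class).
        candidates = []
        for i, row in enumerate(clf.predict_proba(pool)):
            j = max(range(len(classes)), key=lambda k: row[k])
            if row[j] >= threshold:
                candidates.append((row[j], i, classes[j]))
        candidates.sort(key=lambda t: t[0], reverse=True)
        accepted = set()
        for _conf, i, c in candidates:
            # Assumed rule: accept a pseudo-label only for a class that is
            # currently (one of) the smallest, so minority classes catch up.
            if counts[c] <= min(counts.values()):
                X_lab.append(pool[i])
                y_lab.append(c)
                counts[c] += 1
                accepted.add(i)
        if not accepted:
            break  # nothing confident and admissible is left
        pool = [x for i, x in enumerate(pool) if i not in accepted]
    clf.fit(X_lab, y_lab)  # final fit on the enlarged training set
    return clf, X_lab, y_lab


# Toy data: class 0 has four labeled points, class 1 has only one.
X_lab = [[0.0], [0.2], [0.4], [0.1], [5.0]]
y_lab = [0, 0, 0, 0, 1]
X_unlab = [[4.8], [5.2], [4.9], [0.3], [5.1]]
clf, X_new, y_new = self_train_balanced(CentroidClassifier(), X_lab, y_lab, X_unlab)
print({c: y_new.count(c) for c in clf.classes_})
```

Because the loop only relies on `fit`, `predict_proba`, and `classes_`, a real Gaussian process classifier with that interface (for example scikit-learn's `GaussianProcessClassifier`) could in principle be passed in place of the stand-in; the balancing rule would remain an assumption of this sketch.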