计算机工程
COMPUTER ENGINEERING
2014
Issue 1
pp. 209-212 (4 pages)
incremental learning; semantic concept; hierarchical classification; self-adaptive; degree of confidence
Classifier blocking and adaptive evolution are two key issues that affect the accuracy of hierarchical document classification. To address them, this paper proposes a hierarchical classification model based on category context features, together with an incremental learning algorithm. Following the taxonomy, the algorithm progressively builds and maintains a category-related context feature set for each decision node, and uses the support of a document in these feature sets to find the most likely classification path from the root to a leaf category. Because the context feature sets may be incomplete under incremental learning, semantic similarity is incorporated into the path confidence computation. Experimental results show that, compared with hierarchical Bayes and hierarchical SVM models, the algorithm is self-adaptive and improves classification accuracy by about 8% on the test document set.
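The abstract gives no formulas, so the following is only a minimal Python sketch of the general idea, not the paper's algorithm. Every name in it (`Node`, `PathClassifier`, `learn`, `support`, `classify`, `exact_match`) is hypothetical. It assumes that a context feature set is a term-count table per taxonomy node, that a document's support at a node is the similarity-weighted share of that node's feature mass matched by the document's terms, and that path confidence is the mean support along a root-to-leaf path. Scoring every complete path, rather than committing to one branch per level, is one plausible way to avoid blocking.

```python
from collections import Counter


class Node:
    """One decision node of the taxonomy with its context feature set."""

    def __init__(self, label):
        self.label = label
        self.children = {}         # child label -> Node
        self.features = Counter()  # term -> accumulated count (assumed weighting)


def exact_match(a, b):
    """Default similarity: 1.0 for identical terms, otherwise 0.0."""
    return 1.0 if a == b else 0.0


class PathClassifier:
    def __init__(self, similarity=exact_match):
        self.root = Node("ROOT")
        # Pluggable term-to-term similarity in [0, 1]; a semantic measure
        # can be supplied here to soften incomplete feature sets.
        self.similarity = similarity

    def learn(self, terms, path):
        """Incrementally add one labelled document; no retraining needed.

        Unknown categories on `path` are created on the fly.
        """
        node = self.root
        for label in path:
            node = node.children.setdefault(label, Node(label))
            node.features.update(terms)

    def support(self, terms, node):
        """Assumed support: best similarity-weighted feature match per term,
        normalised by the node's feature mass and the document length."""
        total = sum(node.features.values())
        if not terms or total == 0:
            return 0.0
        score = 0.0
        for term in terms:
            score += max(
                self.similarity(term, feat) * weight
                for feat, weight in node.features.items()
            )
        return score / (total * len(terms))

    def classify(self, terms):
        """Score every root-to-leaf path and return (best_path, confidence)."""
        best_path, best_conf = [], 0.0
        stack = [(self.root, [], [])]
        while stack:
            node, path, supports = stack.pop()
            if path and not node.children:
                conf = sum(supports) / len(supports)  # assumed: mean support
                if conf > best_conf:
                    best_path, best_conf = path, conf
            for child in node.children.values():
                stack.append(
                    (child, path + [child.label],
                     supports + [self.support(terms, child)])
                )
        return best_path, best_conf


clf = PathClassifier()
clf.learn(["goal", "match", "team"], ["sports", "football"])
clf.learn(["racket", "match", "serve"], ["sports", "tennis"])
clf.learn(["stock", "market", "bank"], ["finance", "banking"])
path, conf = clf.classify(["goal", "team"])
print(path, conf)
```

Passing a semantic `similarity` function instead of `exact_match` lets a term that is absent from a node's feature set still contribute through its closest known feature, which loosely mirrors the role the abstract assigns to semantic similarity. Averaging support along the path is a design choice made here so that deeper paths are not penalised merely for being longer; a product of supports would be an equally plausible assumption.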