计算机系统应用
APPLICATIONS OF THE COMPUTER SYSTEMS
2010
Issue 4
pp. 81-84
4 pages in total
text classification%difference frequency%class space model%vector space model%binary classification
To address the dependence of current text classification on the vector space model and the shortcomings of the document frequency (DF) feature extraction method in binary classification, this paper proposes a binary classification method based on a class space model with difference frequency. The method breaks free of the limitations of the vector space model and performs feature extraction with difference frequency, an improvement on DF, to achieve binary classification. Experimental results show that the improved method is effective: precision, recall, and the F1 measure of the classification results all improve, raising classification accuracy. The method may also be a useful reference for binary classification in other domains.
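The abstract does not define difference frequency, so the following Python sketch is only one plausible reading, not the authors' method. It assumes the score of a term is the absolute difference between its document frequency in the positive class and in the negative class, each normalized by class size. The function names (`document_frequency`, `difference_frequency`, `select_features`, `precision_recall_f1`) and the toy data are hypothetical. The precision, recall, and F1 computations follow the standard definitions.

```python
from collections import Counter


def document_frequency(docs):
    """Count, for each term, how many documents contain it at least once."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    return df


def difference_frequency(pos_docs, neg_docs):
    """Assumed score: |DF_pos / N_pos - DF_neg / N_neg| for every term."""
    df_pos = document_frequency(pos_docs)
    df_neg = document_frequency(neg_docs)
    n_pos = max(len(pos_docs), 1)  # guard against an empty class
    n_neg = max(len(neg_docs), 1)
    terms = set(df_pos) | set(df_neg)
    return {t: abs(df_pos[t] / n_pos - df_neg[t] / n_neg) for t in terms}


def select_features(pos_docs, neg_docs, k):
    """Keep the k terms with the highest assumed difference-frequency score."""
    scores = difference_frequency(pos_docs, neg_docs)
    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]


def precision_recall_f1(y_true, y_pred):
    """Standard binary precision, recall, and F1 (positive label is truthy)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Toy tokenized documents, invented purely for illustration.
    pos = [["goal", "match", "team"], ["goal", "team"]]
    neg = [["stock", "market", "team"], ["stock", "price"]]
    print(select_features(pos, neg, k=2))
    print(precision_recall_f1([1, 1, 0, 0], [1, 0, 1, 0]))
```

Under this assumption, a term that is frequent in both classes (such as "team" in the toy data) scores lower than a term concentrated in one class, which plain DF cannot distinguish.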