合肥工业大学学报(自然科学版)
Journal of Hefei University of Technology (Natural Science)
2015
Issue 11
pp. 1488-1492
5 pages in total
cross-domain%feature selection%sentiment classification
Many existing cross-domain sentiment classification methods reduce the distribution difference between domains by extracting a common subspace or by establishing a mapping between domain-specific features, but they do not consider differences in the sentiment orientation of features. Features with low sentiment orientation can make the resulting subspace and mapping inaccurate. Features with high sentiment orientation are important for sentiment classification; however, existing feature selection methods are difficult to apply to unlabeled data. This paper proposes a fast cross-domain sentiment classification method based on feature selection (CSFS). First, a word co-occurrence matrix between the source-domain features and the target-domain features is constructed, the sentiment orientation of the target-domain features is evaluated from this matrix, and the words with high sentiment orientation are selected as the feature space of the target domain. Second, the target-domain features are labeled using the source-domain features, and a classifier is built from the labeled features. Experimental results show that CSFS greatly improves the time efficiency of cross-domain classification while maintaining classification accuracy.
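The abstract gives no formulas, so the following is only a minimal sketch of the two steps it describes, not the authors' CSFS implementation. Everything concrete here is an assumption made for illustration: source-word polarity is taken as a normalized difference of positive and negative document counts, co-occurrence is counted inside unlabeled target documents, a target feature's score is the co-occurrence-weighted mean polarity of the source words around it, and the final classifier simply sums the scores of the selected features. All function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def source_polarity(docs, labels):
    """Assumed polarity in [-1, 1]: (pos - neg) / (pos + neg) document counts."""
    pos, neg = Counter(), Counter()
    for tokens, label in zip(docs, labels):
        (pos if label == 1 else neg).update(set(tokens))
    return {w: (pos[w] - neg[w]) / (pos[w] + neg[w]) for w in set(pos) | set(neg)}


def cooccurrence(docs, source_vocab):
    """For each word, count how often it shares a document with each source word."""
    co = defaultdict(Counter)
    for tokens in docs:
        present = set(tokens)
        anchors = present & source_vocab
        for w in present:
            for a in anchors:
                if a != w:
                    co[w][a] += 1
    return co


def score_target_features(target_docs, polarity):
    """Assumed score: co-occurrence-weighted mean polarity of neighbouring source words."""
    co = cooccurrence(target_docs, set(polarity))
    return {
        w: sum(c * polarity[a] for a, c in row.items()) / sum(row.values())
        for w, row in co.items()
    }


def select_features(scores, k):
    """Keep the k features with the largest absolute score (ties broken alphabetically)."""
    return sorted(scores, key=lambda w: (-abs(scores[w]), w))[:k]


def classify(tokens, scores, selected):
    """Toy classifier: 1 (positive) if the summed scores of selected features are >= 0."""
    total = sum(scores[w] for w in set(tokens) if w in selected)
    return 1 if total >= 0 else 0


# Hypothetical toy data: labeled source reviews, unlabeled target reviews.
source_docs = [["great", "good"], ["good", "nice"], ["bad", "awful"], ["awful", "poor"]]
source_labels = [1, 1, 0, 0]
target_docs = [
    ["crisp", "good", "screen"],
    ["crisp", "great"],
    ["laggy", "bad", "screen"],
    ["laggy", "awful"],
]

polarity = source_polarity(source_docs, source_labels)
scores = score_target_features(target_docs, polarity)
selected = select_features(scores, k=2)
prediction = classify(["laggy", "screen"], scores, set(selected))
```

In this toy run the domain-neutral word "screen" co-occurs equally with positive and negative source words, so its score cancels out and it is not selected, which mirrors the motivation for filtering out features with low sentiment orientation.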