CAJ | Academic paper

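Most existing cross-domain sentiment classification methods reduce the differences between domains by extracting a common feature space or by establishing a mapping between domain-specific features. Because they ignore differences in the sentiment discriminative power of features, the resulting common feature space and feature mapping are often inaccurate. Features with high discriminative power matter greatly for text sentiment classification, but the lack of labels makes existing feature selection methods hard to apply. Building on feature selection, this paper proposes a fast cross-domain sentiment classification method (cross-domain sentiment classification based on feature selection, CSFS). It constructs a word co-occurrence matrix between source-domain and target-domain features, uses that matrix to evaluate the sentiment discriminative power of target-domain features, and selects the target-domain features with high sentiment discriminative power. It then uses source-domain information to compute the sentiment semantic magnitude of the target-domain features and builds the target-domain classifier from them. Experimental results show that the method greatly improves the efficiency of cross-domain classification while preserving accuracy.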
Many existing cross-domain sentiment classification methods reduce the distribution difference between domains by extracting a common subspace or by establishing a mapping between domain-specific features, but they do not consider differences in the sentiment orientation of features. Features with lower sentiment orientation distort the resulting subspace and mapping, whereas features with higher sentiment orientation are important for sentiment classification. However, existing feature selection methods are difficult to apply to unlabeled data. This paper proposes a fast method, cross-domain sentiment classification based on feature selection (CSFS). First, a word co-occurrence matrix between the source features and the target features is constructed, the sentiment orientation of the target-domain features is evaluated from it, and the words with higher sentiment orientation are selected as the feature space of the target domain. Second, the target-domain features are labeled using the source features, and a classifier is built from the labeled features. The empirical results show that CSFS greatly improves the time efficiency of cross-domain classification while maintaining classification accuracy.
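The two steps in the abstract can be illustrated with a small sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration and not the paper's actual CSFS definition: the polarity ratio for source words, the document-level co-occurrence counts, the co-occurrence-weighted average used as a target feature's score, the top-k selection rule, and the sum-of-weights classifier. All function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict

def source_polarity(labeled_docs):
    """Assumed polarity in [-1, 1] per source word: (pos - neg) / (pos + neg) document counts."""
    pos, neg = Counter(), Counter()
    for tokens, label in labeled_docs:
        (pos if label > 0 else neg).update(set(tokens))
    return {w: (pos[w] - neg[w]) / (pos[w] + neg[w]) for w in set(pos) | set(neg)}

def cooccurrence(target_docs, source_vocab):
    """cooc[t][s] = number of unlabeled target documents containing both t and source word s."""
    cooc = defaultdict(Counter)
    for tokens in target_docs:
        words = set(tokens)
        anchors = words & source_vocab
        for t in words:
            for s in anchors:
                if s != t:
                    cooc[t][s] += 1
    return cooc

def score_target_features(cooc, polarity):
    """Assumed score: co-occurrence-weighted mean of the source polarities."""
    scores = {}
    for t, row in cooc.items():
        total = sum(row.values())
        if total:
            scores[t] = sum(n * polarity[s] for s, n in row.items()) / total
    return scores

def select_features(scores, k):
    """Keep the k features with the largest absolute score (ties broken alphabetically)."""
    ranked = sorted(scores, key=lambda w: (-abs(scores[w]), w))
    return {w: scores[w] for w in ranked[:k]}

def classify(tokens, weights):
    """Label a document +1 or -1 from the summed weights of its selected features."""
    total = sum(weights.get(w, 0.0) for w in set(tokens))
    return 1 if total >= 0 else -1

# Toy data (hypothetical): labeled book reviews as source, unlabeled electronics reviews as target.
source = [
    ("great plot excellent".split(), 1),
    ("boring plot terrible".split(), -1),
    ("excellent characters".split(), 1),
    ("terrible ending".split(), -1),
]
target = [
    "excellent sound durable".split(),
    "great screen durable".split(),
    "terrible sound flimsy".split(),
    "terrible screen flimsy".split(),
]

polarity = source_polarity(source)
scores = score_target_features(cooccurrence(target, set(polarity)), polarity)
weights = select_features(scores, k=2)
print(weights)
print(classify("durable screen".split(), weights))
print(classify("flimsy sound".split(), weights))
```

In this toy setting, "durable" and "flimsy" never occur in the source domain, yet they receive strong scores because they co-occur only with source words of one polarity. "sound" and "screen" co-occur with both polarities and are filtered out by the selection step. Once the co-occurrence counts are gathered, scoring and selection involve no iterative optimization, which is consistent with the abstract's emphasis on speed.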