电子科技
IT AGE
2012
Issue 7
pp. 69-71, 75
4 pages in total
HowNet%Chinese word segmentation%sentence similarity%maximum matching
Word-based sentence similarity calculation suffers from interference by redundant information and from locally optimal matches. To address these defects, this paper proposes an improved sentence similarity calculation method based on HowNet. The method adds a candidate-sentence screening step to reduce the interference of redundant information on accuracy and, building on word segmentation and part-of-speech tagging, uses an improved maximum weighted bipartite matching algorithm to obtain a globally optimal matching. Experimental results show that the proposed method effectively improves the accuracy of sentence similarity calculation.