电子学报
ACTA ELECTRONICA SINICA
2015
Issue 8
pp. 1481-1487 (7 pages)
尹存燕%黄书剑%戴新宇%陈家骏
word segmentation%named entity recognition%bilingual alignment%machine translation
Chinese word segmentation results directly affect Chinese-English named entity recognition and alignment. This paper proposes a method for optimizing Chinese word segmentation within named entity recognition and alignment. Using the alignment information of entity words, the method first corrects the named entity recognition results, and then adjusts segmentation granularity and corrects segmentation errors according to the entity alignment results. After optimization, as many bilingual named entities as possible correspond one-to-one, which in turn improves Chinese-English named entity translation extraction and statistical machine translation. Experimental results demonstrate the effectiveness of the proposed optimization method.
Bilingual named entity (NE) recognition and alignment are important for many natural language processing tasks. Named entity translation can substantially improve the performance of systems such as statistical machine translation and cross-language information retrieval. The quality of Chinese word segmentation has a large impact on NE recognition and bilingual NE extraction, and bilingual alignment information in turn provides cues for both NE recognition and word segmentation. Accordingly, based on the characteristics of NE recognition, NE alignment, and word segmentation, this paper proposes an optimization algorithm for Chinese word segmentation. By correcting word segmentation errors and adjusting word segmentation granularity, the algorithm improves the extraction of Chinese-English NE translations and the performance of statistical machine translation. Experimental results on a Chinese-English news corpus show the effectiveness of the algorithm.
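The granularity adjustment described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details are not given here. It only shows one plausible reading of one step: when several adjacent Chinese tokens are all aligned to the same English named entity, they are merged into a single token so that the entity pair corresponds one-to-one. The function name `merge_entity_tokens`, the span format (start inclusive, end exclusive), and the alignment format (a set of `(zh_index, en_index)` pairs) are all assumptions made for illustration. Correcting recognition errors and splitting over-merged tokens are not covered.

```python
from typing import List, Set, Tuple


def merge_entity_tokens(
    zh_tokens: List[str],
    en_entity_spans: List[Tuple[int, int]],
    alignment: Set[Tuple[int, int]],
) -> List[str]:
    """Merge adjacent Chinese tokens aligned to the same English entity.

    Hypothetical helper: en_entity_spans holds (start, end) English token
    ranges with end exclusive; alignment holds (zh_index, en_index) pairs.
    """
    merge_ranges = []
    for start, end in en_entity_spans:
        # Chinese token indices aligned to any English token in the entity.
        zh_idx = sorted({z for z, e in alignment if start <= e < end})
        if len(zh_idx) < 2:
            continue  # already one-to-one, or unaligned
        lo, hi = zh_idx[0], zh_idx[-1]
        if hi - lo + 1 != len(zh_idx):
            continue  # non-contiguous alignment: leave segmentation alone
        merge_ranges.append((lo, hi))

    # Apply the merges left to right, skipping any overlapping range.
    merge_ranges.sort()
    result: List[str] = []
    i = 0
    for lo, hi in merge_ranges:
        if lo < i:
            continue
        result.extend(zh_tokens[i:lo])
        result.append("".join(zh_tokens[lo:hi + 1]))
        i = hi + 1
    result.extend(zh_tokens[i:])
    return result


if __name__ == "__main__":
    zh = ["他", "访问", "了", "北京", "大学"]
    # English side: ["He", "visited", "Peking", "University"]; entity = tokens 2..3
    spans = [(2, 4)]
    links = {(0, 0), (1, 1), (3, 2), (4, 3)}
    print(merge_entity_tokens(zh, spans, links))
    # ['他', '访问', '了', '北京大学']
```

Skipping non-contiguous alignments is a conservative choice in this sketch: a gap between aligned tokens more likely signals an alignment or recognition error than a segmentation that should be merged.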