中文信息学报
中文信息學報
중문신식학보
JOURNAL OF CHINESE INFORMATION PROCESSING
2010
Issue 2
pp. 91-95, 121
6 pages in total
朱聪慧%赵铁军%韩习武%郑德权
朱聰慧%趙鐵軍%韓習武%鄭德權
주총혜%조철군%한습무%정덕권
人工智能%机器翻译%动词次范畴化%跨语言论元对应关系%自动获取%统计机器翻译
人工智能%機器翻譯%動詞次範疇化%跨語言論元對應關係%自動獲取%統計機器翻譯
인공지능%궤기번역%동사차범주화%과어언론원대응관계%자동획취%통계궤기번역
artificial intelligence%machine translation%verb subcategorization%cross-lingual argument correspondence%automatic acquisition%statistical machine translation
动词次范畴是根据句法行为对动词的进一步划分,它是由核心动词和一系列论元组成.其相关研究在英汉等多种语言方面都取得了较好的成果,但跨语言之间的研究还很少.该文提出了一种基于主动学习策略的英汉动词次范畴论元对应关系自动获取方法,这种方法可以在双语平行语料上,几乎不需要任何先验的语言学知识的情况下,自动获取英汉论元的对应关系.然后我们将这些对应关系加入了统计机器翻译系统.实验结果表明,融合了英汉动词次范畴论元对应关系的SMT系统在性能上有明显的提升,证明了自动抽取的对应关系的有效性,也为SMT提供了新的研究方向.
動詞次範疇是根據句法行為對動詞的進一步劃分,它是由核心動詞和一系列論元組成.其相關研究在英漢等多種語言方面都取得了較好的成果,但跨語言之間的研究還很少.該文提出了一種基於主動學習策略的英漢動詞次範疇論元對應關係自動獲取方法,這種方法可以在雙語平行語料上,幾乎不需要任何先驗的語言學知識的情況下,自動獲取英漢論元的對應關係.然後我們將這些對應關係加入了統計機器翻譯系統.實驗結果表明,融合了英漢動詞次範疇論元對應關係的SMT系統在性能上有明顯的提升,證明了自動抽取的對應關係的有效性,也為SMT提供了新的研究方向.
동사차범주시근거구법행위대동사적진일보화분,타시유핵심동사화일계렬론원조성.기상관연구재영한등다충어언방면도취득료교호적성과,단과어언지간적연구환흔소.해문제출료일충기우주동학습책략적영한동사차범주론원대응관계자동획취방법,저충방법가이재쌍어평행어료상,궤호불수요임하선험적어언학지식적정황하,자동획취영한론원적대응관계.연후아문장저사대응관계가입료통계궤기번역계통.실험결과표명,융합료영한동사차범주론원대응관계적SMT계통재성능상유명현적제승,증명료자동추취적대응관계적유효성,야위SMT제공료신적연구방향.
Verb subcategorization is a finer-grained classification of verbs according to their syntactic behavior; a subcategorization frame (SCF) consists of a head verb and a series of arguments. Subcategorization has been studied with good results for individual languages such as English and Chinese, but cross-lingual work remains scarce. We present a method, based on an active learning strategy, for automatically acquiring the correspondences between English and Chinese SCF arguments. The method extracts these correspondences from a bilingual parallel corpus with almost no prior linguistic knowledge. We then integrate the acquired correspondences into a statistical machine translation (SMT) system. Experimental results show that the SMT system augmented with English-Chinese argument correspondences performs clearly better, which demonstrates the validity of the automatically extracted correspondences and suggests a new research direction for SMT.