计算机工程与应用
COMPUTER ENGINEERING AND APPLICATIONS
2015
Issue 1
pp. 125-129, 171
6 pages
赵奇猛%王裴岩%冯好国%蔡东风
中文专利依存树库%开放式实体关系抽取%Markov逻辑网
Chinese patent dependency treebank%open entity relation extraction%Markov Logic Networks (MLN)
针对传统实体关系抽取需要预先指定关系类型和制定抽取规则等无法胜任大规模文本的情况,开放式信息抽取(Open Information Extraction,OIE)在以英语为代表的西方语言中取得了重大进展,但对于汉语的研究却显得不足。为此,研究了在组块层次标注基础上应用马尔可夫逻辑网分层次进行中文专利开放式实体关系抽取的方法。实验表明:以组块为出发点降低了对句子理解的难度,外层和内层组块可以统一处理,减少了工程代价;而且在相同特征条件下与支持向量机相比,基于马尔可夫逻辑网的关系抽取效果更理想,外层和内层识别结果的F值分别可达到77.92%和69.20%。
The main goal of information extraction is to transform unstructured or semi-structured text into structured information, and entity relation extraction is one of its major tasks. Traditional methods require relation types to be specified in advance, and their predefined rules and manual labels do not scale to massive text collections. Open Information Extraction (OIE) addresses these problems and has made significant progress for English and other Western languages, but research on Chinese open relation extraction remains scarce. This paper proposes a hierarchical approach to open entity relation extraction from Chinese patents that applies Markov Logic Networks (MLN) on top of chunk-level annotation of external and internal chunks. Experimental results show that taking chunks as the starting point reduces the difficulty of sentence understanding, and that external and internal chunks can be handled uniformly, which reduces engineering cost. With the same features, MLN outperforms a support vector machine (SVM), with F-scores of 77.92% and 69.20% on the external and internal layers, respectively.