情报学报
2009
Issue 6
pp. 857-863
7 pages
史树敏%冯冲%黄河燕%刘东升%王树梅
领域实体%领域命名实体识别%本体%词性规则模板%CRFs
DNE%DNER%domain ontology%POS-Rule Template%CRFs
命名实体识别是众多自然语言处理任务的核心内容之一,也是近年来的领域研究热点.本文将命名实体分为两大类:常规命名实体和领域命名实体.基于已经构建的领域本体MPO,本文提出一种基于本体知识规则与统计方法相结合的领域命名实体识别方法.该方法通过本体化实例,获取实体构成词性规则模板,结合CRFs机器学习模型,进行领域命名实体识别.实验结果表明:相比运用单一统计方法而言,该方法能使领域实体的识别性能显著提高,F值达到92.36%.同时表明本体化知识规则的有效运用,能够在领域实体边界和特殊形式领域实体识别的准确率上发挥积极作用.
Named Entity Recognition (NER) is a core task in many Natural Language Processing (NLP) applications and has recently become a research hotspot. In this paper, named entities are classified into General Named Entities (GNEs) and Domain Named Entities (DNEs). Building on the previously constructed domain ontology MPO, we propose a method for Chinese Domain Named Entity Recognition (DNER) that combines Conditional Random Fields (CRFs) with POS rule templates derived from formalized ontology instances. Experimental results show that, compared with using a statistical method alone, this approach significantly improves DNER performance, reaching an F-measure of 92.36%. The results also show that ontological knowledge rules help improve accuracy in recognizing DNE boundaries and DNEs with special forms.
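The abstract does not spell out how the POS rule templates are combined with the CRF model. One plausible reading, sketched below purely as an illustration, is that POS sequences observed in ontology instances become templates, and template matches in a sentence are turned into a per-token feature that a CRF could consume alongside word and POS features. The function names, the `B-TPL`/`I-TPL`/`O` feature labels, and the toy data are all assumptions made for this sketch and are not taken from the paper.

```python
from collections import Counter


def extract_templates(tagged_instances, min_count=1):
    """Collect POS-sequence templates from POS-tagged entity instances.

    Each instance is a list of (token, pos) pairs. This is a hypothetical
    stand-in for the paper's ontology-derived rule templates.
    """
    counts = Counter(tuple(pos for _, pos in inst) for inst in tagged_instances)
    return {tpl for tpl, c in counts.items() if c >= min_count}


def template_features(tagged_sentence, templates):
    """Label each token B-TPL / I-TPL / O using greedy longest-first matching."""
    pos_seq = [pos for _, pos in tagged_sentence]
    labels = ["O"] * len(pos_seq)
    # Try longer templates first so the longest match at a position wins.
    lengths = sorted({len(t) for t in templates}, reverse=True)
    i = 0
    while i < len(pos_seq):
        for n in lengths:
            fits = i + n <= len(pos_seq)
            if fits and tuple(pos_seq[i:i + n]) in templates:
                labels[i] = "B-TPL"
                for j in range(i + 1, i + n):
                    labels[j] = "I-TPL"
                i += n
                break
        else:
            # No template starts at this position; move on one token.
            i += 1
    return labels


# Toy data (invented for illustration, not from the paper's corpus).
instances = [
    [("digital", "a"), ("camera", "n")],
    [("mobile", "n"), ("phone", "n")],
]
sentence = [("I", "r"), ("bought", "v"), ("digital", "a"),
            ("camera", "n"), ("yesterday", "t")]

templates = extract_templates(instances)
features = template_features(sentence, templates)
```

In a setup like this, the resulting labels would be one extra feature column per token; the CRF would still decide the final entity boundaries, so a spurious template match can be overridden by the statistical model.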