计算机应用研究
Application Research of Computers
2015
Issue 11
pp. 3319-3323
5 pages
王娜娜%黄运有%唐素勤%王石%曹存根
term%relationship between terms%axiom of relationships%knowledge acquisition from text%term relationship acquisition%term relationship verification
To acquire knowledge from massive data, this paper proposes a method for extracting relationships between terms. It first defines the hyponymy and part-whole relationships and, on that basis, adds N new relationships. Starting from each relationship's domain and range, its constraints, and its axioms, the paper precisely defines the connotation of each relationship and derives the relationship's semantic features from that connotation. To handle the flexibility with which relationships are expressed in text, it then summarizes the grammatical features and the expression grammar of relationship statements. Combining these grammatical and semantic features, the authors wrote an executable knowledge extraction program, OMKast, and used it to extract relationships from raw text corpora. The extracted relationships are verified using the semantic features together with statistical methods. Experimental results show that the method is effective.
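As an illustration only, the pipeline summarized in the abstract (pattern-based extraction followed by semantic and statistical verification) might be sketched in Python as below. This is not the OMKast program: the English lexical patterns, the relation names `hyponym_of` and `part_of`, the `min_count` threshold, and the two axioms checked (irreflexivity and asymmetry) are all assumptions made for the example, since the abstract does not give the paper's actual grammar or verification statistics.

```python
import re
from collections import Counter

# Hypothetical lexical patterns standing in for the paper's "expression grammar".
PATTERNS = {
    "hyponym_of": re.compile(r"(\w+) is a kind of (\w+)"),
    "part_of": re.compile(r"(\w+) is (?:a )?part of (?:an? |the )?(\w+)"),
}


def extract(text):
    """Return every (relation, term_a, term_b) triple matched by a pattern."""
    lowered = text.lower()
    triples = []
    for relation, pattern in PATTERNS.items():
        for match in pattern.finditer(lowered):
            triples.append((relation, match.group(1), match.group(2)))
    return triples


def verify(triples, min_count=2):
    """Keep triples that pass a frequency check and two assumed axioms."""
    counts = Counter(triples)
    accepted = set()
    for (relation, a, b), n in counts.items():
        if n < min_count:
            continue  # statistical check: too rarely observed
        if a == b:
            continue  # assumed axiom: the relation is irreflexive
        if counts[(relation, b, a)] >= n:
            continue  # assumed axiom: asymmetric, reverse is at least as frequent
        accepted.add((relation, a, b))
    return accepted


SAMPLE = (
    "A sparrow is a kind of bird. The sparrow is a kind of bird. "
    "A wing is part of a bird. A wing is part of the bird. "
    "A bird is a kind of bird. A bird is a kind of bird. "
    "A fish is a kind of animal."
)

if __name__ == "__main__":
    for triple in sorted(verify(extract(SAMPLE))):
        print(triple)
```

On the sample text, the sparrow/bird and wing/bird triples survive verification: the bird/bird candidate is rejected by the irreflexivity check, and the fish/animal candidate is rejected because it occurs only once. Checks on each relationship's domain and range, which the abstract also mentions, are omitted here because they would need a term-type lexicon that the record does not describe.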