计算机工程
COMPUTER ENGINEERING
2014
Issue 12
pp. 126-131
6 pages in total
knowledge discovery; pattern discovery; natural language processing; terms of algorithmic knowledge; Chinese word segmentation; part-of-speech tagging
Many programming resources on the Internet are conceptually related, yet no hyperlinks connect them. If the terms of algorithmic knowledge are extracted from these resources and organized into an expert library file, that file can be used to identify the knowledge points each programming resource contains, so that the resources can be linked to one another by knowledge point. To obtain these terms automatically, this paper proposes a method for discovering terms of algorithmic knowledge based on natural language processing. The method discovers the string patterns of sentences that contain terms of algorithmic knowledge, uses those patterns to extract strings from programming resources that are likely to contain such terms, identifies the word segments most likely to appear in the terms, and derives the terms from those segments. Experimental results show that, compared with the previously hand-compiled set of terms, the method adds 11.2% more algorithmic knowledge points and 13.6% more terms of algorithmic knowledge.
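The four steps named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the two regular-expression patterns are hand-written here (the paper discovers its patterns from data), whitespace splitting stands in for Chinese word segmentation and part-of-speech tagging, and the names `PATTERNS`, `extract_candidates`, `key_tokens`, and `discover_terms`, along with the sample sentences and seed terms, are hypothetical.

```python
import re
from collections import Counter

# Hypothetical sentence patterns that tend to surround an algorithm name.
# Assumption: the paper learns such patterns; these two are hand-written.
PATTERNS = [
    re.compile(r"solved (?:with|using) ([a-z\- ]+?)[.,]"),
    re.compile(r"a classic ([a-z\- ]+?) problem"),
]


def extract_candidates(texts, patterns=PATTERNS):
    """Apply each pattern and collect the captured candidate strings."""
    found = []
    for text in texts:
        for pattern in patterns:
            found.extend(m.strip() for m in pattern.findall(text.lower()))
    return found


def key_tokens(seed_terms, min_count=1):
    """Tokens that occur in already-known terms at least min_count times.

    Assumption: whitespace splitting replaces real word segmentation.
    """
    counts = Counter(tok for term in seed_terms for tok in term.split())
    return {tok for tok, n in counts.items() if n >= min_count}


def discover_terms(texts, seed_terms):
    """Keep candidates that share a key token with the seed terms."""
    keys = key_tokens(seed_terms)
    known = set(seed_terms)
    new_terms = []
    for cand in extract_candidates(texts):
        if cand in known or cand in new_terms:
            continue
        if any(tok in keys for tok in cand.split()):
            new_terms.append(cand)
    return new_terms


texts = [
    "This task is solved with binary search.",
    "It is a classic topological sort problem.",
    "It can be solved using brute force, as usual.",
    "This is solved with depth-first search.",
]
seed_terms = ["breadth-first search", "bubble sort", "segment tree"]

print(discover_terms(texts, seed_terms))
# ['binary search', 'topological sort', 'depth-first search']
```

In this toy run "brute force" is extracted by a pattern but rejected, because none of its tokens appears in a known term. That filtering role is what the abstract assigns to the word segments most likely to appear in terms of algorithmic knowledge.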