计算机系统应用
APPLICATIONS OF THE COMPUTER SYSTEMS
2012
Issue 12
pp. 203-205 (3 pages)
Keywords: Chinese deep web; segmentation algorithm; lexical dictionary; schema matching; interface integration
Most research on the deep web, in China and abroad, targets English-language sources, and no work has yet addressed the Chinese deep web. This paper presents a method for schema matching and interface integration on the Chinese deep web. We first create a dictionary that stores synonyms, hypernyms, and hyponyms. After extracting the query interfaces, we use a rule-based segmentation algorithm to split each attribute into words. Then, for each attribute, we look up the dictionary to find all of its synonyms, hypernyms, and hyponyms, form a record from them, and store the record in a list. We also keep a counter for each attribute in the list, recording how many times it appears in the local interfaces. Finally, from each record we choose the attribute with the largest count as the attribute of the unified interface.
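The record-and-count step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the dictionary contents, the sample interfaces, and the names `RELATED_GROUPS`, `find_group`, and `integrate` are all hypothetical. It also skips the segmentation step and assumes each attribute label is already a single dictionary entry.

```python
from collections import Counter

# Hypothetical toy dictionary: each set groups attribute labels that the
# dictionary would relate as synonyms, hypernyms, or hyponyms.
RELATED_GROUPS = [
    {"作者", "著者", "编者"},
    {"书名", "题名", "标题"},
    {"出版社", "出版单位"},
]


def find_group(attribute, groups):
    """Return the index of the first group containing the attribute, or None."""
    for index, group in enumerate(groups):
        if attribute in group:
            return index
    return None


def integrate(local_interfaces, groups):
    """Pick one attribute per record for the unified interface.

    A record collects all related attributes seen across the local
    interfaces, with a counter of how often each one appears.
    """
    records = {}  # record key -> Counter of attribute occurrences
    for interface in local_interfaces:
        for attribute in interface:
            index = find_group(attribute, groups)
            # Attributes missing from the dictionary form their own record.
            key = index if index is not None else attribute
            records.setdefault(key, Counter())[attribute] += 1
    # From each record, keep the attribute with the largest count.
    return [counter.most_common(1)[0][0] for counter in records.values()]


# Hypothetical local interfaces, e.g. search forms of three book sites.
interfaces = [
    ["书名", "作者", "出版社"],
    ["题名", "作者", "出版单位"],
    ["书名", "著者", "出版社", "价格"],
]

unified = integrate(interfaces, RELATED_GROUPS)
print(unified)
```

In this toy data each record has a clear majority label, so the unified interface keeps the most frequent form from each group, plus the ungrouped attribute as its own record.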