计算机工程与应用
COMPUTER ENGINEERING AND APPLICATIONS
2015
Issue 3
pp. 117-123 (7 pages)
Deep Web; schema matching; dual correlation mining; sampling
The Dual Correlation Mining (DCM) framework has low precision when the input schema set contains certain special schemas. Inspired by the bagging method in machine learning, this paper proposes a sampling-based Deep Web schema matching framework. The framework randomly samples several subsets from the input schema set, runs complex matching with the DCM matcher on each subset, and combines the per-subset results to improve overall matching precision. Analysis and experiments show that, on special schema sets, the framework improves precision by 41.2% on average.
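The sample-match-combine loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how subset results are combined, so majority voting is assumed here, and subsets are drawn without replacement. The names `sampled_matching`, `toy_matcher`, `num_trials`, `sample_size`, and `min_vote_ratio` are hypothetical, and `toy_matcher` is only a stand-in for the DCM matcher, which is not reproduced.

```python
import random
from collections import Counter
from typing import Callable, FrozenSet, Iterable, List, Optional, Sequence, Set

Schema = FrozenSet[str]    # a query interface, modeled as a set of attribute names
Matching = FrozenSet[str]  # a group of attributes judged to be synonyms


def sampled_matching(
    schemas: Sequence[Schema],
    matcher: Callable[[List[Schema]], Iterable[Matching]],
    num_trials: int = 50,
    sample_size: Optional[int] = None,
    min_vote_ratio: float = 0.5,
    seed: Optional[int] = None,
) -> Set[Matching]:
    """Run `matcher` on random subsets and keep matchings found in most trials.

    Assumption: results are combined by majority vote; the source only says
    the per-subset results are synthesized.
    """
    rng = random.Random(seed)
    if sample_size is None:
        sample_size = max(1, len(schemas) // 2)

    votes: Counter = Counter()
    for _ in range(num_trials):
        subset = rng.sample(list(schemas), sample_size)  # without replacement
        votes.update(set(matcher(subset)))               # one vote per trial

    threshold = min_vote_ratio * num_trials
    return {m for m, count in votes.items() if count > threshold}


def toy_matcher(subset: List[Schema]) -> Set[Matching]:
    """Hypothetical stand-in for DCM: one stable matching, plus a spurious
    matching whenever the single 'special' schema is in the subset."""
    found = {frozenset({"author", "writer"})}
    if any("noise" in schema for schema in subset):
        found.add(frozenset({"title", "price"}))
    return found


# 19 ordinary schemas and one special schema that misleads the matcher.
schemas = [frozenset({"author", "title"})] * 19 + [frozenset({"noise", "price"})]
result = sampled_matching(schemas, toy_matcher, num_trials=50, sample_size=4, seed=0)
```

In this toy setup the special schema appears in only about one subset in five, so the spurious matching collects far fewer than half of the votes and is filtered out, while the stable matching is found in every trial. Running the matcher once on the full schema set would have returned both.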