电子学报
Acta Electronica Sinica
2015
Issue 9
pp. 1745-1749
5 pages in total
章登义%吴文李%欧阳黜霏
uncertain RDF graph%query processing%selectivity estimation%query optimization
In RDF (Resource Description Framework) graph querying, accurately estimating the size of the query result is a crucial step for the query optimizer. Previous methods ignore both the uncertainty of the RDF graph itself and the correlations between subqueries, so they cannot estimate result sizes effectively. To solve this problem, this paper proposes a cardinality estimation method based on a Bayesian model. The method introduces a Bayesian network model to mine the dependencies between properties within each subquery. Building on these dependencies, it then proposes a subnet connection approach to compute the impact factors between subqueries. Finally, it uses this information to estimate the cardinality of the result set of an arbitrary query. Experiments indicate that, compared with previous methods, the accuracy of the estimates improves by over 15% without a significant increase in query run-time.
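The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea it relies on: correlated properties make an independence-based cardinality estimate inaccurate, while conditioning one property on another (the simplest possible Bayesian network, with a single edge) recovers the true count. The toy data, the function names `estimate_independent` and `estimate_conditional`, and the two-property setting are all assumptions made for illustration; the sketch does not model triple uncertainty or the subnet connection step described in the paper.

```python
# Hypothetical toy data: each tuple is (subject, type, city) for one entity.
# All values are invented for illustration; they do not come from the paper.
ROWS = [
    ("s1", "Professor", "Wuhan"),
    ("s2", "Professor", "Wuhan"),
    ("s3", "Professor", "Beijing"),
    ("s4", "Student", "Beijing"),
    ("s5", "Student", "Beijing"),
    ("s6", "Student", "Beijing"),
    ("s7", "Student", "Wuhan"),
    ("s8", "Student", "Beijing"),
]


def estimate_independent(rows, type_value, city_value):
    """Estimate the result size assuming the two properties are independent."""
    n = len(rows)
    p_type = sum(1 for _, t, _ in rows if t == type_value) / n
    p_city = sum(1 for _, _, c in rows if c == city_value) / n
    return n * p_type * p_city


def estimate_conditional(rows, type_value, city_value):
    """Estimate the result size with the chain rule P(type) * P(city | type)."""
    n = len(rows)
    type_count = sum(1 for _, t, _ in rows if t == type_value)
    if type_count == 0:
        return 0.0
    both = sum(1 for _, t, c in rows if t == type_value and c == city_value)
    p_type = type_count / n
    p_city_given_type = both / type_count
    return n * p_type * p_city_given_type


if __name__ == "__main__":
    print(estimate_independent(ROWS, "Professor", "Wuhan"))
    print(estimate_conditional(ROWS, "Professor", "Wuhan"))
```

In this toy data two entities are professors located in Wuhan. The independence-based estimate gives about 1.1, while the conditional estimate gives 2, because professors are over-represented in Wuhan relative to the whole data set.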