计算机科学与探索
JOURNAL OF FRONTIERS OF COMPUTER SCIENCE & TECHNOLOGY
2015
Issue 1
pp. 24-35
12 pages in total
姚华传 (YAO Huachuan)%王丽珍 (WANG Lizhen)%陈红梅 (CHEN Hongmei)%邹目权 (ZOU Muquan)
grid differential algorithm%centroid%σ2 differential grid%compression ratio of spatial instances
Spatial co-location pattern mining is an important task in spatial data mining. Existing algorithms, whether they mine certain or uncertain data, are inefficient in both running time and memory, which makes them impractical for massive data sets. Based on an in-depth analysis of why traditional mining approaches consume excessive time and space, this paper proposes a grid differential algorithm for mining spatial co-location patterns. The algorithm subdivides the cells of the traditional grid into finer differential cells, computes for each differential cell the centroid of the instances that belong to the same feature, and then mines co-location patterns from these centroids with multiresolution pruning. While maintaining a high accuracy rate, the algorithm largely resolves the efficiency problems of traditional approaches and thereby makes co-location pattern mining on massive data sets feasible. Extensive experiments show that the grid differential algorithm is efficient, robust, and accurate.
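The centroid step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the differential cells are sized or how instances are stored, so the `(feature, x, y)` tuple layout, the `cell_size` parameter, and the function name `cell_centroids` are assumptions made purely for illustration. The multiresolution pruning stage is not shown.

```python
from collections import defaultdict
from math import floor


def cell_centroids(instances, cell_size):
    """Replace the instances of each feature in each grid cell by their centroid.

    instances: iterable of (feature, x, y) tuples (assumed layout).
    cell_size: side length of a differential cell (hypothetical parameter).
    Returns a dict mapping (feature, col, row) -> (cx, cy, count).
    """
    # Running sums of x, y and the instance count per (feature, cell) key.
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for feature, x, y in instances:
        key = (feature, floor(x / cell_size), floor(y / cell_size))
        acc = sums[key]
        acc[0] += x
        acc[1] += y
        acc[2] += 1
    # Divide the sums by the counts to obtain one centroid per key.
    return {k: (sx / n, sy / n, n) for k, (sx, sy, n) in sums.items()}


# Four instances of two features; two "A" instances share cell (0, 0).
points = [
    ("A", 0.25, 0.25),
    ("A", 0.75, 0.5),
    ("B", 0.5, 0.5),
    ("A", 1.5, 0.5),
]
centroids = cell_centroids(points, cell_size=1.0)
```

In this toy input the four instances collapse to three centroids, which hints at why mining on centroids rather than raw instances could reduce time and memory use. Keeping the per-centroid count is a design choice of this sketch, not something the abstract states.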