计算机科学与探索
JOURNAL OF FRONTIERS OF COMPUTER SCIENCE & TECHNOLOGY
2008
Issue 1
60-76
17 pages in total
常雷%杨冬青%王腾蛟%唐世渭
data mining%sequential pattern compression%SP-Feature
This paper examines how to compress sequential patterns using SP-Features (Sequential Pattern Features). An SP-Feature is a novel structure for succinctly representing a set of sequential patterns. A new similarity measure is proposed for clustering SP-Features, and a method for combining SP-Features is designed. Based on a hierarchical clustering framework, an effective algorithm, CSP, is developed to mine compressed sequential patterns. Extensive experiments on both real and synthetic datasets show that CSP compresses sequential patterns efficiently and effectively, with low restoration error (less than 4% on dense datasets).