计算机工程与应用
COMPUTER ENGINEERING AND APPLICATIONS
2014
Issue 20
pp. 1-4, 19
5 pages in total
时间序列%关键点%数据挖掘%相似性%不同长度
time series%key point%data mining%similarity%different length
目前,时间序列的相似性大多是在原始序列上进行判断和比较的,原始序列维度较高,计算量大,不利于相似性比较。提出了新的关键点(转折点或极值点)算法,除利用常用的极值法求非单调序列的关键点外,还提出了求单调序列关键点的新算法,利用该算法可以压缩时间序列,降低维度,又能保持序列的轮廓。在关键点时间序列上提出了新的相似性判定算法,利用该算法可计算任意两序列的相似度,并且提高了相似性判定的鲁棒性,减少人为干预设置阈值带来的影响。实验结果表明,基于时间序列关键点的相似性算法能很好地判定任意两序列的相似性,减少了计算量,提高了鲁棒性及减少人为干扰,对时间序列数据挖掘中的聚类与预测有很好的帮助作用。
At present, the similarity of time series is mostly judged and compared on the raw series. Because raw series are high-dimensional, the computation is expensive, which hinders similarity comparison. This paper presents a new key point (turning point or extreme point) algorithm: in addition to using the common extreme-value method to find the key points of non-monotonic sequences, it proposes a new algorithm for finding the key points of monotonic sequences. The algorithm compresses a time series and reduces its dimensionality while preserving the contour of the sequence. Key points are the most important points for characterizing a time series, since they reflect its contour; locating them accurately plays a key role in time series similarity matching and compression. A new similarity-determination algorithm is then proposed on the key-point time series. It can compute the similarity of any two sequences, improves the robustness of the similarity decision, and reduces the influence of manually set thresholds. Experimental results show that the key-point-based similarity algorithm effectively determines the similarity of any two sequences, reduces computation, improves robustness, reduces human intervention, and helps clustering and prediction in time series data mining.
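The "common extreme-value method" mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `extreme_key_points`, the choice to always keep both endpoints, and the use of strict comparisons (so flat plateaus are skipped) are all assumptions made here for illustration. The paper's new algorithm for monotonic sequences and its similarity measure are not specified in the abstract, so they are not reproduced.

```python
from typing import List, Sequence, Tuple


def extreme_key_points(series: Sequence[float]) -> List[Tuple[int, float]]:
    """Return (index, value) pairs for the two endpoints and every strict
    local maximum or minimum in between.

    Assumption: endpoints are always kept so the compressed series spans
    the same time range. Plateaus (equal neighbours) are not treated as
    extrema in this sketch.
    """
    n = len(series)
    if n <= 2:
        return list(enumerate(series))

    points = [(0, series[0])]
    for i in range(1, n - 1):
        prev, cur, nxt = series[i - 1], series[i], series[i + 1]
        is_peak = cur > prev and cur > nxt
        is_valley = cur < prev and cur < nxt
        if is_peak or is_valley:
            points.append((i, cur))
    points.append((n - 1, series[-1]))
    return points


if __name__ == "__main__":
    zigzag = [1, 3, 2, 2.5, 0, 4, 5]
    print(extreme_key_points(zigzag))

    # A monotonic series has no interior extrema, so only the endpoints
    # survive; this is the case for which the abstract says a separate
    # algorithm is needed.
    print(extreme_key_points([1, 2, 3, 4]))
```

On a monotonic input this sketch keeps only the first and last points, which loses any change in slope inside the segment; that limitation is presumably what motivates the separate monotonic-sequence algorithm described in the abstract.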