计算机研究与发展
計算機研究與髮展
계산궤연구여발전
Journal of Computer Research and Development
2015年
9期
1976-1991
,共16页
数据流%相似性查询%数据畸变%最长公共子序列%动态规划方法
數據流%相似性查詢%數據畸變%最長公共子序列%動態規劃方法
수거류%상사성사순%수거기변%최장공공자서렬%동태규화방법
data stream%similarity query%data distortion%longest common subsequence (LCSS )%dynamic programming method
数据流相似性查询广泛应用于智能家居、环境监测等领域.当前以 LCSS (longest common subsequence)作为相似性测度函数的研究并不多.NAIVE算法使用基本动态规划方法计算测度函数值,通过该值与相似阈值的比较得到查询结果,对基于LCSS的数据流相似性查询问题进行研究.针对NAIVE算法必须在动态规划矩阵所有成员取值的计算完成后才能得到查询结果的缺点,提出了一种基于PS(possible solution)‐CC(column critical)域优化策略的数据流相似性查询处理算法.该算法划定了每个窗口上动态规划矩阵的PS域和CC域,很好地利用了这2个域中成员所具有的性质和相似性查询的特点,无须获得测度函数的最终值便可得到查询结果,省略了很多矩阵成员的计算.实验部分证明了该算法的有效性,与同类算法相比,在处理具有更高精度结果要求的查询时效果更好.
數據流相似性查詢廣汎應用于智能傢居、環境鑑測等領域.噹前以 LCSS (longest common subsequence)作為相似性測度函數的研究併不多.NAIVE算法使用基本動態規劃方法計算測度函數值,通過該值與相似閾值的比較得到查詢結果,對基于LCSS的數據流相似性查詢問題進行研究.針對NAIVE算法必鬚在動態規劃矩陣所有成員取值的計算完成後纔能得到查詢結果的缺點,提齣瞭一種基于PS(possible solution)‐CC(column critical)域優化策略的數據流相似性查詢處理算法.該算法劃定瞭每箇窗口上動態規劃矩陣的PS域和CC域,很好地利用瞭這2箇域中成員所具有的性質和相似性查詢的特點,無鬚穫得測度函數的最終值便可得到查詢結果,省略瞭很多矩陣成員的計算.實驗部分證明瞭該算法的有效性,與同類算法相比,在處理具有更高精度結果要求的查詢時效果更好.
수거류상사성사순엄범응용우지능가거、배경감측등영역.당전이 LCSS (longest common subsequence)작위상사성측도함수적연구병불다.NAIVE산법사용기본동태규화방법계산측도함수치,통과해치여상사역치적비교득도사순결과,대기우LCSS적수거류상사성사순문제진행연구.침대NAIVE산법필수재동태규화구진소유성원취치적계산완성후재능득도사순결과적결점,제출료일충기우PS(possible solution)‐CC(column critical)역우화책략적수거류상사성사순처리산법.해산법화정료매개창구상동태규화구진적PS역화CC역,흔호지이용료저2개역중성원소구유적성질화상사성사순적특점,무수획득측도함수적최종치편가득도사순결과,성략료흔다구진성원적계산.실험부분증명료해산법적유효성,여동류산법상비,재처리구유경고정도결과요구적사순시효과경호.
Nowadays similarity query about data stream has been essential in many applications ,like smart home and environmental monitoring . However ,few of the current relevant researches take LCSS (longest common subsequence) as the similarity measurement function .The NAIVE algorithm gets the query results by comparing the threshold and the value of measurement function w hich is obtained based on the basic dynamic programming method . T he similarity query over data stream based on the LCSS is considered in this paper .The D2S‐PC algorithm is proposed to overcome the draw back that the query result cannot be gotten until the calculations on all the elements in the full dynamic programming matrix are finished .It defines the PS and CC domains of the matrix over every window ,and utilizes the characteristics of the similarity query and matrix members in these two domains effectively .By taking this algorithm ,the similarity query results can be obtained before the final length of LCSS is calculated .Compared with the original algorithm ,it reduces the computations about the members in the matrix greatly .Extensive experiments on real and synthetic datasets show that the D2S‐PC algorithm is effective in handling the similarity query over data stream based on the LCSS in the condition of more precise query results ,and can meet the requirements of practical applications .