系统工程理论与实践
Systems Engineering—Theory & Practice
2012
Issue 12
pp. 2764-2773
data stream; frequent itemset; degree of interest; MIFS-HT
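Frequent pattern mining underlies many data stream mining tasks. Existing algorithms can mine approximate frequent patterns from data streams, but because stream data is uncertain, continuous, and massive, they have not been able to keep time and space costs within an acceptable range. This paper uses a hash table as the storage structure for the synopsis data and introduces the notion of the degree of interest of association rules, proposing the data stream frequent pattern mining algorithm MIFS-HT (mining interesting frequent itemsets with hash table), which both reduces the time and space complexity of existing algorithms and increases the algorithm's practical value. Experimental results show that MIFS-HT is an efficient algorithm for mining frequent patterns in data streams, that it outperforms algorithms such as FP-Stream and Lossy Counting, and that its mining results are more meaningful in practice.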
Frequent itemset mining, a foundation of data stream mining, has received increasing attention from researchers. Because data streams are uncertain, continuous, and large in volume, many mining algorithms have difficulty handling such dynamic data. In this paper, a hash table and the degree of interest of association rules are introduced: the former serves as the synopsis data structure, and the latter incorporates what customers pay attention to. On this basis, a new frequent itemset mining algorithm named MIFS-HT (mining interesting frequent itemsets with hash table) is proposed. Compared with Lossy Counting and a similar algorithm called mining frequent item sets over data streams by matrix (MISM for short), the results show that MIFS-HT is more efficient in both time and space.
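
For orientation, the Lossy Counting baseline named in the abstract can be sketched with a hash table as its synopsis structure. This is a minimal single-item version of the standard Lossy Counting scheme, not MIFS-HT itself: the abstract gives no algorithmic details of MIFS-HT or of its degree-of-interest measure, so neither is modelled here. The class name `LossyCounter`, its methods, and the parameter names are illustrative assumptions.

```python
import math


class LossyCounter:
    """Approximate item frequencies over a stream (single-item Lossy Counting).

    Illustrative sketch only; names are hypothetical and this is not MIFS-HT.
    """

    def __init__(self, epsilon):
        if not 0 < epsilon < 1:
            raise ValueError("epsilon must be in (0, 1)")
        self.epsilon = epsilon
        self.width = math.ceil(1 / epsilon)  # bucket width
        self.n = 0                           # items seen so far
        self.table = {}                      # item -> [count, max undercount]

    def add(self, item):
        self.n += 1
        bucket = math.ceil(self.n / self.width)  # id of the current bucket
        entry = self.table.get(item)
        if entry is None:
            # A new entry may have been missed in up to bucket - 1 earlier buckets.
            self.table[item] = [1, bucket - 1]
        else:
            entry[0] += 1
        if self.n % self.width == 0:
            # At a bucket boundary, prune entries that cannot be frequent.
            self.table = {
                key: value
                for key, value in self.table.items()
                if value[0] + value[1] > bucket
            }

    def frequent(self, support):
        """Return items whose stored count is at least (support - epsilon) * n."""
        threshold = (support - self.epsilon) * self.n
        return {
            key: value[0]
            for key, value in self.table.items()
            if value[0] >= threshold
        }


counter = LossyCounter(epsilon=0.25)
for symbol in "abacadae":
    counter.add(symbol)
print(counter.frequent(support=0.5))
```

The dictionary keeps memory bounded by discarding low-count entries at each bucket boundary, at the cost of undercounting any item by at most `epsilon * n`.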