江西师范大学学报(自然科学版)
JOURNAL OF JIANGXI NORMAL UNIVERSITY (NATURAL SCIENCES EDITION)
2014
Issue 5
pp. 449-453
5 pages
high-dimensional data stream%sliding window%attribute reduction%K-means%micro-clustering%information entropy%outlier detection
Existing clustering-based outlier detection algorithms suffer from low efficiency and accuracy when processing high-dimensional data streams. To address this problem, a clustering-based outlier detection algorithm for high-dimensional data streams (CODHD-Stream) is proposed. The algorithm first partitions the data stream with a sliding window and then reduces the dimensionality of the high-dimensional data set with an attribute reduction algorithm. Next, a K-means clustering algorithm with a distance-based information entropy filtering mechanism divides the data set into micro-clusters, and outliers are detected within those micro-clusters. Experimental results show that the algorithm effectively improves both the efficiency and the accuracy of outlier detection in high-dimensional data streams.
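The pipeline outlined in the abstract (sliding window, attribute reduction, K-means micro-clusters, outlier check) can be illustrated with a minimal Python sketch. This is not the CODHD-Stream implementation: the abstract does not specify the attribute reduction algorithm or the entropy-based filter, so the sketch assumes a variance-based column selection and a plain distance threshold as stand-ins. All function names and parameters (`keep`, `k`, `factor`) are hypothetical.

```python
import math
from collections import deque


def sliding_windows(stream, size, step):
    """Yield successive windows of `size` items, advancing by `step` items."""
    window = deque(maxlen=size)
    for i, point in enumerate(stream, start=1):
        window.append(point)
        if i >= size and (i - size) % step == 0:
            yield list(window)


def reduce_attributes(window, keep):
    """Stand-in attribute reduction (assumption): keep the `keep`
    highest-variance columns. Returns (reduced points, kept column indices)."""
    n = len(window)

    def variance(j):
        mean = sum(p[j] for p in window) / n
        return sum((p[j] - mean) ** 2 for p in window) / n

    ranked = sorted(range(len(window[0])), key=variance, reverse=True)
    cols = sorted(ranked[:keep])
    return [[p[j] for j in cols] for p in window], cols


def kmeans(points, k, iters=20):
    """Plain Lloyd's K-means with deterministic, evenly spaced initial centroids."""
    n = len(points)
    centroids = [list(points[i * n // k]) for i in range(k)]
    labels = [0] * n
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def detect_outliers(points, centroids, labels, factor=3.0):
    """Stand-in for the entropy filter (assumption): flag a point when its
    distance to its own centroid exceeds `factor` times the mean distance
    of that cluster's members."""
    dists = [math.dist(p, centroids[lab]) for p, lab in zip(points, labels)]
    flagged = []
    for c in range(len(centroids)):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        if not idx:
            continue
        mean = sum(dists[i] for i in idx) / len(idx)
        flagged.extend(i for i in idx if dists[i] > factor * mean)
    return sorted(flagged)


def detect_in_window(window, keep, k):
    """Run reduction, clustering, and the outlier check on one window."""
    reduced, cols = reduce_attributes(window, keep)
    centroids, labels = kmeans(reduced, k)
    return cols, detect_outliers(reduced, centroids, labels)


def demo_window():
    """Hypothetical data: two tight groups plus one distant point (index 20).
    Columns 1 and 3 are constant and carry no information."""
    a = [(0.5 * math.cos(i), 7.0, 0.5 * math.sin(i), 1.0) for i in range(10)]
    b = [(10 + 0.5 * math.cos(i), 7.0, 10 + 0.5 * math.sin(i), 1.0) for i in range(10)]
    return a + b + [(5.0, 7.0, 30.0, 1.0)]
```

In a streaming setting, `detect_in_window` would be called on each window produced by `sliding_windows`, so that only the most recent `size` points are clustered at a time. The threshold `factor` is a tuning choice made here for illustration and has no counterpart stated in the abstract.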