Pattern Recognition and Artificial Intelligence (模式识别与人工智能, Moshi Shibie yu Rengong Zhineng)
2015, Issue 4, pp. 327-334 (8 pages)
Keywords: Massive Data; Granular Computing; Attribute Reduction; Stratified Sampling; Discernibility
Traditional attribute reduction algorithms load the entire dataset into main memory at once, which makes them poorly suited to big data analysis. To address this problem, an attribute reduction algorithm based on granular computing and discernibility is proposed. Stratified sampling from statistics is used to divide the original large-scale dataset into multiple sample subsets (granules), and attribute reduction is then performed on each granule according to the discernibility of its attributes. Finally, the reductions obtained on the individual granules are fused by weighting to yield the attribute reduction of the original dataset. Experimental results show that the proposed algorithm is feasible and efficient for attribute reduction on massive datasets.
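The three stages named in the abstract (stratified split into granules, per-granule reduction by discernibility, weighted fusion) can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact discernibility measure, search strategy, or fusion weights, so everything below is an assumption made for illustration: discernibility is taken as the number of object pairs with different decision values that an attribute set tells apart, the per-granule reduct is found by greedy forward selection, and fusion is a size-weighted vote with a hypothetical `threshold` of 0.5. All function names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_granules(labels, k, seed=0):
    """Split row indices into k granules, roughly preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    granules = [[] for _ in range(k)]
    for idx in by_class.values():
        rng.shuffle(idx)
        for pos, i in enumerate(idx):  # deal each class round-robin
            granules[pos % k].append(i)
    return granules


def discernibility(rows, labels, idx, attrs):
    """Assumed measure: pairs in idx with different labels that attrs distinguish."""
    n = len(idx)
    label_counts = Counter(labels[i] for i in idx)
    diff_pairs = (n * n - sum(c * c for c in label_counts.values())) // 2
    # Group rows that look identical on attrs, then count the
    # different-label pairs left undistinguished inside each group.
    blocks = defaultdict(Counter)
    for i in idx:
        blocks[tuple(rows[i][a] for a in attrs)][labels[i]] += 1
    undiscerned = 0
    for counts in blocks.values():
        m = sum(counts.values())
        undiscerned += (m * m - sum(c * c for c in counts.values())) // 2
    return diff_pairs - undiscerned


def greedy_reduct(rows, labels, idx, n_attrs):
    """Add the attribute with the largest gain until the full set's score is matched."""
    target = discernibility(rows, labels, idx, list(range(n_attrs)))
    chosen, current = [], 0
    while current < target:
        best, best_score = None, current
        for a in range(n_attrs):
            if a in chosen:
                continue
            score = discernibility(rows, labels, idx, chosen + [a])
            if score > best_score:  # ties keep the lower attribute index
                best, best_score = a, score
        chosen.append(best)
        current = best_score
    return sorted(chosen)


def fuse(reducts, sizes, threshold=0.5):
    """Assumed fusion rule: keep attributes whose size-weighted vote reaches threshold."""
    total = sum(sizes)
    weight = Counter()
    for reduct, size in zip(reducts, sizes):
        for a in reduct:
            weight[a] += size / total
    return sorted(a for a, w in weight.items() if w >= threshold)


def granular_reduction(rows, labels, k=3, seed=0, threshold=0.5):
    """Run the three stages: split, reduce each granule, fuse."""
    n_attrs = len(rows[0])
    granules = stratified_granules(labels, k, seed)
    reducts = [greedy_reduct(rows, labels, g, n_attrs) for g in granules]
    return fuse(reducts, [len(g) for g in granules], threshold)


if __name__ == "__main__":
    # Toy data: attribute 0 decides the label, attribute 1 is constant,
    # attribute 2 is unrelated to the label.
    rows = [(i % 2, 0, i % 3) for i in range(24)]
    labels = [r[0] for r in rows]
    print(granular_reduction(rows, labels, k=3))  # prints [0]
```

In this sketch only one granule needs to be processed at a time, which is the memory advantage the abstract attributes to the granular approach. A real implementation would stream each granule from disk instead of holding `rows` in a list.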