计算机工程与应用
COMPUTER ENGINEERING AND APPLICATIONS
2010
Issue 5
pp. 123-125
3 pages in total
李宏%阿玛尼%李平%吴敏
missing value imputation%parameter updater%Expectation-Maximization (EM)%Bayesian network
Datasets with missing values are common in real applications, and handling missing values has become an active research topic in classification. This paper analyzes and compares several widely used missing value imputation algorithms and proposes a new imputation algorithm based on Expectation-Maximization (EM) and Bayesian networks. The algorithm uses naive Bayes to estimate the initial values for EM, then iterates EM together with the Bayesian network to determine the final updater, producing the completed dataset at the same time. Experimental results show that, compared with classical imputation algorithms, the proposed algorithm achieves higher classification accuracy at substantially lower cost.
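The abstract gives no implementation detail, so the following is only a minimal sketch of the general idea (a naive Bayes estimate as the EM starting point, followed by EM iterations that refine the parameters and fill the gaps), not the authors' algorithm. It assumes categorical attributes, fully observed class labels, and a fixed naive Bayes structure standing in for the Bayesian network. The function name `em_impute`, the smoothing constant `alpha`, and the iteration count are hypothetical choices made for illustration.

```python
from collections import defaultdict


def em_impute(rows, labels, n_iter=20, alpha=1.0):
    """Fill None cells in categorical rows with EM over a naive Bayes model.

    Illustrative assumptions: class labels are fully observed, values in a
    column are mutually comparable (needed for sorting), and `alpha` is a
    Laplace smoothing constant.
    """
    n_cols = len(rows[0])
    # Domain of each column, taken from the observed values only.
    domains = [sorted({r[j] for r in rows if r[j] is not None})
               for j in range(n_cols)]

    def m_step(weights):
        # Re-estimate P(x_j = v | class) from (expected) counts.
        probs = {}
        for j in range(n_cols):
            counts = defaultdict(lambda: {v: alpha for v in domains[j]})
            for i, c in enumerate(labels):
                cell_counts = counts[c]  # make sure every class gets an entry
                for v, w in weights[i][j].items():
                    cell_counts[v] += w
            for c, cv in counts.items():
                total = sum(cv.values())
                probs[(j, c)] = {v: n / total for v, n in cv.items()}
        return probs

    # Observed cells carry weight 1 on their value; missing cells carry none.
    observed = [[{r[j]: 1.0} if r[j] is not None else {}
                 for j in range(n_cols)] for r in rows]

    # Initial values: naive Bayes estimate from the observed cells alone.
    probs = m_step(observed)

    for _ in range(n_iter):
        # E-step: with the class observed, the posterior of a missing
        # attribute under naive Bayes is simply P(x_j | class).
        weights = [[observed[i][j] or dict(probs[(j, labels[i])])
                    for j in range(n_cols)] for i in range(len(rows))]
        # M-step: update the parameters from the expected counts.
        probs = m_step(weights)

    # Final fill: most probable value for each missing cell.
    filled = []
    for i, r in enumerate(rows):
        new_row = []
        for j, value in enumerate(r):
            if value is None:
                p = probs[(j, labels[i])]
                value = max(domains[j], key=lambda v: p[v])
            new_row.append(value)
        filled.append(new_row)
    return filled, probs


rows = [["sunny", "hot"], ["sunny", "hot"], [None, "hot"],
        ["rain", "cool"], ["rain", None], ["rain", "cool"]]
labels = ["a", "a", "a", "b", "b", "b"]
filled, probs = em_impute(rows, labels)
print(filled[2], filled[4])
```

With observed labels and this fixed structure the iteration settles almost immediately. A learned Bayesian network, as named in the abstract, would let the E-step condition on other attributes as well, which is where the iterations would do more work.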