计算机仿真
COMPUTER SIMULATION
2010, Issue 3, pp. 94-97 (4 pages)
Continuous variables; conditional probability density; genetic algorithm
In soft-sensor modeling, the original data set is usually partitioned into classes so that multiple sub-models can be built, which improves estimation accuracy. The naive Bayesian classifier is used for this classification because of its simplicity and efficiency. The continuous class variable is first divided into category ranges, and the "3σ" rule from probability theory is then used to discretize the continuous attribute variables. To remove the influence of interfering data in the training samples, a genetic algorithm selects an optimal subset of the training sample set. Discretization of the continuous variables and sample selection together form the data preprocessing, and the preprocessed training samples are used to build the Bayesian classifier. Experiments on UCI data sets and on on-line monitoring data from a bisphenol-A (BPA) production process show that the "3σ"-rule naive Bayesian classification method with genetic-algorithm sample selection achieves higher classification accuracy than the other methods compared.
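The "3σ" discretization step can be illustrated with a minimal sketch. The abstract does not state how the intervals are laid out, so the cut points used here (μ + kσ for k = -3, ..., 3, giving eight intervals) are an assumption, and the function names `three_sigma_edges` and `discretize` are hypothetical, not taken from the paper.

```python
from bisect import bisect_right
from statistics import mean, pstdev


def three_sigma_edges(values):
    """Return cut points at mu + k*sigma for k = -3..3.

    ASSUMPTION: this interval layout is illustrative only; the
    abstract does not specify the exact partition used by the authors.
    """
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return [mu]  # constant attribute: a single cut point
    return [mu + k * sigma for k in range(-3, 4)]


def discretize(values, edges):
    """Map each continuous value to the index of the interval it falls in.

    Index 0 is the interval below the first cut point; index len(edges)
    is the interval at or above the last cut point.
    """
    return [bisect_right(edges, v) for v in values]


# Usage: learn the cut points on training data, then reuse them elsewhere.
train = [1.0, 2.0, 3.0, 4.0, 5.0]
edges = three_sigma_edges(train)
codes = discretize(train, edges)
```

Under this assumed layout, values more than three standard deviations from the mean land in the two outermost intervals (index 0 or 7). The discrete codes can then be counted per class to estimate the conditional probabilities a naive Bayesian classifier needs.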