光谱学与光谱分析
SPECTROSCOPY AND SPECTRAL ANALYSIS
2014
Issue 4
pp. 947-951
5 pages in total
刘伟 (Liu Wei)%赵众 (Zhao Zhong)%袁洪福 (Yuan Hongfu)%宋春风 (Song Chunfeng)%李效玉 (Li Xiaoyu)
Sample subset partitioning%PLS regression%Kennard-Stone algorithm%NIR spectrometry%IR spectrometry
This paper analyzes the adverse effects on spectral multivariate calibration of an uneven distribution of sample numbers over the property range in the calibration and validation sets, and demonstrates the "averaging" phenomenon, in which samples with small property values are predicted too high and samples with large property values are predicted too low. Considering the distribution of samples in both spectral space and property space, a new sample selection method named Rank-KS is proposed to improve the uniformity of the calibration and validation sets. The property space (Y-space) is divided evenly into several intervals, and within each interval the calibration and validation samples are selected by the Kennard-Stone (KS) and Random-Select (RS) algorithms, respectively, so that the calibration set is evenly distributed and representative. The proposed method was applied to the determination of dimethyl carbonate (DMC) content in gasoline by infrared spectroscopy and of dimethyl sulfoxide concentration in aqueous solution by near-infrared spectroscopy. The "averaging" phenomenon observed in the predictions of the multiple linear regression (MLR) model for dimethyl sulfoxide was effectively weakened by Rank-KS. For comparison, MLR and PLS1 models for DMC and dimethyl sulfoxide were built with calibration sets selected by RS, KS, Rank-Select (the concentration gradient method), sample set partitioning based on joint X- and Y-blocks (SPXY), and the proposed Rank-KS. The results show that Rank-KS gave the best prediction, with the smallest root mean square error of prediction. In particular, for sample sets with many samples in the middle of the property range and few at the ends, or with no samples in some local regions, calibration sets selected by Rank-KS clearly improve the predictive ability of the resulting model, whether MLR or PLS1 is used.
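The selection procedure summarized in the abstract (equal-width intervals in the property space, Kennard-Stone for calibration samples and random selection for validation samples inside each interval) might be sketched roughly as follows. This is only an illustrative reading of the abstract, not the authors' implementation: the function names `kennard_stone` and `rank_ks_split`, the per-interval quotas `n_cal_per_bin` and `n_val_per_bin`, the Euclidean distance, and the rule used when only one sample is requested are all assumptions.

```python
import numpy as np


def kennard_stone(X, n_select):
    """Return indices of up to n_select rows of X picked by the Kennard-Stone rule."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    n_select = min(n_select, n)
    if n_select <= 0:
        return []
    if n_select == 1:
        # Assumption: a single pick is the sample closest to the mean spectrum.
        center_dist = np.linalg.norm(X - X.mean(axis=0), axis=1)
        return [int(np.argmin(center_dist))]
    # Pairwise Euclidean distances between all spectra.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Start from the two most distant samples.
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    if i == j:  # all samples identical; fall back to the first two
        i, j = 0, 1
    selected = [int(i), int(j)]
    remaining = [k for k in range(n) if k not in selected]
    while len(selected) < n_select:
        # Distance from each remaining sample to its nearest selected sample.
        nearest = dist[np.ix_(remaining, selected)].min(axis=1)
        # Add the sample that is farthest from everything selected so far.
        pick = remaining[int(np.argmax(nearest))]
        selected.append(pick)
        remaining.remove(pick)
    return selected


def rank_ks_split(X, y, n_bins=5, n_cal_per_bin=4, n_val_per_bin=2, seed=0):
    """Hypothetical Rank-KS-style split: returns (calibration, validation) indices."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    # Divide the property range into n_bins equal-width intervals.
    edges = np.linspace(y.min(), y.max(), n_bins + 1)
    bin_of = np.digitize(y, edges[1:-1])  # values 0 .. n_bins - 1
    cal, val = [], []
    for b in range(n_bins):
        idx = np.flatnonzero(bin_of == b)
        if idx.size == 0:
            continue  # an interval with no samples contributes nothing
        # Calibration samples: Kennard-Stone in spectral space within the interval.
        local = np.asarray(kennard_stone(X[idx], n_cal_per_bin), dtype=int)
        cal_idx = idx[local]
        # Validation samples: random draw from what is left in the interval.
        rest = np.setdiff1d(idx, cal_idx)
        n_val = min(n_val_per_bin, rest.size)
        if n_val > 0:
            val.extend(rng.choice(rest, size=n_val, replace=False).tolist())
        cal.extend(cal_idx.tolist())
    return np.array(cal, dtype=int), np.array(val, dtype=int)
```

Capping the number of samples taken from each interval is what evens out the distribution in this sketch: a crowded middle interval contributes no more calibration samples than a sparse interval at either end of the property range.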