分析化学
CHINESE JOURNAL OF ANALYTICAL CHEMISTRY
2014
Issue 11
1687-1691
5 pages in total
梁淼%蔡嘉月%杨凯%束茹欣%赵龙莲%张录达%李军会
Semi-supervised learning%Partial least squares%Near-infrared spectroscopy%Tobacco%Sensory quality
Semi-supervised learning can make full use of large numbers of unlabeled samples to compensate for a shortage of labeled samples. When near-infrared (NIR) spectroscopy is used to build analytical models for complex systems such as agricultural products, large numbers of accurately labeled samples are difficult to obtain, and models built from a small number of labeled samples or from a large number of inaccurately labeled samples perform poorly. To address this problem, a semi-supervised partial least squares (SS-PLS) method, based on the semi-supervised self-training concept, is proposed to optimize the model. Using the NIR spectra and corresponding sensory evaluation data of 211 raw tobacco leaf samples of different origins and grades from across China as an example, the SS-PLS method was applied to optimize the model, and model performance improved significantly over the original model. For the optimized SS-PLS model, the coefficient of determination (R²) reached about 90%, the ratio of performance to deviation (RPD, the ratio of the standard deviation of the calibration reference values to the standard deviation of the fitted values) exceeded 3.0, and the standard error of cross validation (SECV) and the standard error of prediction (SEP) fell below 1.0. The original sensory evaluation data and the SS-PLS-optimized data were each divided into three grades (excellent, medium, and poor) according to fixed thresholds. Analysis with a projection model based on principal component and Fisher criterion (PPF) showed that the classification obtained from the SS-PLS-optimized data was also significantly better than that obtained from the original sensory evaluation data. SS-PLS can address the data-representativeness problem of modeling with small sample sets, and it provides a new chemometric method for building NIR spectroscopy models when large numbers of accurately labeled samples are difficult to obtain.
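The figures of merit quoted in the abstract (R2, RPD, SECV, SEP) can be illustrated with a minimal sketch. The formulas below follow conventions that are common in NIR chemometrics (bias-corrected SEP, and RPD taken as the standard deviation of the reference values divided by SEP); the record does not give the authors' exact formulas, so these definitions, the function names, and the sample numbers are assumptions made for illustration only.

```python
import math
from statistics import mean, stdev


def r_squared(y_ref, y_fit):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    y_bar = mean(y_ref)
    ss_res = sum((r - f) ** 2 for r, f in zip(y_ref, y_fit))
    ss_tot = sum((r - y_bar) ** 2 for r in y_ref)
    return 1.0 - ss_res / ss_tot


def standard_error(y_ref, y_fit):
    """Bias-corrected standard error of the residuals.

    Applied to held-out predictions this is SEP; applied to
    cross-validated predictions it is SECV (assumed convention).
    """
    residuals = [f - r for r, f in zip(y_ref, y_fit)]
    bias = mean(residuals)
    n = len(residuals)
    return math.sqrt(sum((e - bias) ** 2 for e in residuals) / (n - 1))


def rpd(y_ref, y_fit):
    """Ratio of performance to deviation: SD of reference values / SEP."""
    return stdev(y_ref) / standard_error(y_ref, y_fit)


# Hypothetical reference scores and model outputs, not data from the paper.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
fitted = [1.1, 1.9, 3.2, 3.8, 5.0]
print(r_squared(reference, fitted), standard_error(reference, fitted), rpd(reference, fitted))
```

Under these conventions, a higher RPD means the prediction error is small relative to the natural spread of the reference values, which is why the abstract reports RPD alongside SECV and SEP rather than the error values alone.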