Transactions of the Chinese Society of Agricultural Engineering (农业工程学报)
2013, Issue z1, pp. 270-274 (5 pages)
near infrared spectroscopy; models; support vector machines; least squares method; strawberry; SSC-to-TA ratio; TA; latent variables
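To improve the performance of near-infrared (NIR) quantitative models for the soluble-solids-to-acid ratio (SSC-to-TA) and titratable acidity (TA) of strawberry, this paper used latent variables extracted by partial least squares (PLS) as the input variables of least squares support vector machine (LS-SVM) models, built NIR quantitative models for both indices, and compared the results with those of PLS models. The spectral range used for modeling was 6000 to 12500 cm⁻¹. For the PLS models, the calibration correlation coefficient and the root mean square errors of calibration and prediction were 0.430, 0.096% and 0.096% for TA, and 0.688, 0.926 and 1.190 for SSC-to-TA. The LS-SVM models that took the scores of the first 10 latent variables as input far outperformed the PLS models on every measure: their calibration and prediction correlation coefficients, root mean square errors of calibration and prediction, and residual predictive deviation were 0.965, 0.967, 0.028%, 0.027% and 3.881 for TA, and 0.980, 0.973, 0.258, 0.373 and 3.111 for SSC-to-TA. The study shows that using latent variables as the input of LS-SVM models can substantially improve the predictive performance and stability of NIR quantitative models for strawberry TA and SSC-to-TA.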
To improve the performance of near-infrared (NIR) spectroscopy models for quantitative analysis of the soluble-solids-content-to-titratable-acidity ratio (SSC-to-TA) and titratable acidity (TA) in fresh strawberry, least squares support vector machine (LS-SVM) models that take latent variables (LVs) extracted by partial least squares (PLS) as input were used to establish calibration models, and their performance was compared with that of PLS models. Three hundred and eighteen fresh strawberry samples of three varieties, "Tianbao" (n=100), "Fengxiang" (n=100) and "Mingxing" (n=118), were analyzed. The spectral region used was 6000 to 12500 cm⁻¹. Spectra were collected with a PbS detector, 64 scans and a resolution of 8 cm⁻¹. The internal gold background was scanned as the reference spectrum before the sample spectra were collected. Reference SSC values were measured with a digital refractometer with an accuracy of 0.02 °Brix and temperature correction from 10 to 60 °C, and TA values were obtained by acid-base titration according to the National Standard of the People's Republic of China. Before model construction, the Chauvenet rule was used to detect spectral outliers, which were removed from the sample set; concentration outliers were then removed based on studentized residuals and leverage values. Several mathematical signal pretreatments were compared when the PLS models were constructed: Savitzky-Golay (SG) smoothing (5, 15 and 25 points), first and second derivatives, multiplicative scatter correction (MSC) and standard normal variate (SNV). For both SSC-to-TA and TA, however, every pretreatment degraded the PLS models. The best PLS model was established using the full-band raw spectra, with a correlation coefficient of calibration (rc), root mean square error of calibration (RMSEC) and root mean square error of prediction (RMSEP) of 0.430, 0.096% and 0.096% for TA and of 0.688, 0.926 and 1.190 for SSC-to-TA, which indicates poor predictive accuracy.
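Two of the pretreatments named above, SNV and MSC, can be sketched in a few lines. This is a minimal illustration assuming spectra are stored as rows of a NumPy array and that MSC uses the mean spectrum as its reference; the function names `snv` and `msc` are hypothetical and are not taken from the paper or from any particular chemometrics package.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row)
    by its own mean and standard deviation."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def msc(spectra, reference=None):
    """Multiplicative scatter correction: regress each spectrum on a
    reference (assumed here to be the mean spectrum) and remove the
    fitted offset and slope."""
    ref = spectra.mean(axis=0) if reference is None else reference
    corrected = np.empty_like(spectra, dtype=float)
    for i, s in enumerate(spectra):
        slope, intercept = np.polyfit(ref, s, 1)  # s ~ intercept + slope * ref
        corrected[i] = (s - intercept) / slope
    return corrected
```

Both functions remove additive and multiplicative baseline effects per spectrum. If the variation between samples is dominated by such effects rather than by the analyte, pretreatment helps; the abstract reports that it did not help for this data set.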
Ten LVs were extracted from the full-band raw spectra by PLS. LS-SVM models with 1 to 10 LVs as input were compared, and the best performance was obtained when the first 10 LVs were used. A two-step grid search with leave-one-out cross-validation was used to optimize the regularization parameter gamma (γ) and the radial basis function (RBF) kernel parameter sig2 (σ²). The best LS-SVM model was far superior to the best PLS model: with the first 10 LVs as input, rc, the correlation coefficient of prediction (rp), RMSEC, RMSEP and the residual predictive deviation (RPD) were 0.965, 0.967, 0.028%, 0.027% and 3.881 for TA, and 0.980, 0.973, 0.258, 0.373 and 3.111 for SSC-to-TA. The results indicate that, with LVs as input, the nonlinear LS-SVM method offers more effective quantitative capability for SSC-to-TA and TA in strawberry. Further studies with more samples and more strawberry varieties are needed to improve the specificity, prediction accuracy and robustness of the models.
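The modeling chain described above (PLS latent-variable scores fed into an RBF-kernel LS-SVM, with γ and σ² chosen by grid search under leave-one-out cross-validation) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the data are synthetic, the grid is a single coarse pass rather than a two-step search, the kernel is taken as exp(-||x - z||² / σ²) (toolboxes differ in this convention), PLS scores are computed once on the calibration set rather than inside each cross-validation fold, and all function names are hypothetical.

```python
import numpy as np

def pls1_scores(X, y, n_lv):
    """NIPALS PLS1. Returns calibration scores T, a projection R with
    T = (X - x_mean) @ R, and the calibration mean spectrum x_mean."""
    x_mean = X.mean(axis=0)
    E = X - x_mean                     # deflated spectra
    f = y - y.mean()                   # deflated response
    W, P = [], []
    for _ in range(n_lv):
        w = E.T @ f
        w /= np.linalg.norm(w)         # weight vector
        t = E @ w                      # score vector
        p = E.T @ t / (t @ t)          # X loading
        f = f - t * (f @ t) / (t @ t)
        E = E - np.outer(t, p)
        W.append(w)
        P.append(p)
    W, P = np.column_stack(W), np.column_stack(P)
    R = W @ np.linalg.inv(P.T @ W)
    return (X - x_mean) @ R, R, x_mean

def rbf_kernel(A, B, sig2):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / sig2)          # assumed kernel convention

def lssvm_fit(T, y, gamma, sig2):
    """Solve the LS-SVM linear system [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(T, T, sig2) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]             # bias b, dual weights alpha

def lssvm_predict(T_new, T_train, b, alpha, sig2):
    return rbf_kernel(T_new, T_train, sig2) @ alpha + b

def loo_rmse(T, y, gamma, sig2):
    """Leave-one-out RMSE of the LS-SVM on fixed scores T."""
    errs = []
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        b, alpha = lssvm_fit(T[keep], y[keep], gamma, sig2)
        errs.append(lssvm_predict(T[i:i + 1], T[keep], b, alpha, sig2)[0] - y[i])
    return float(np.sqrt(np.mean(np.square(errs))))

def grid_search(T, y, gammas, sig2s):
    """Return the (gamma, sig2) pair with the lowest leave-one-out RMSE."""
    return min((loo_rmse(T, y, g, s), g, s) for g in gammas for s in sig2s)[1:]

# Synthetic demonstration data (hypothetical, not the strawberry spectra).
rng = np.random.default_rng(0)
n, n_wl = 60, 120
conc = rng.uniform(0.5, 1.5, n)
band = np.exp(-0.5 * ((np.arange(n_wl) - 60) / 10.0) ** 2)
X = np.outer(conc ** 2, band) + 0.01 * rng.standard_normal((n, n_wl))
cal, val = np.arange(n) % 3 != 0, np.arange(n) % 3 == 0

T_cal, R, x_mean = pls1_scores(X[cal], conc[cal], n_lv=3)
T_val = (X[val] - x_mean) @ R
gamma, sig2 = grid_search(T_cal, conc[cal], [1, 10, 100], [0.1, 1, 10])
b, alpha = lssvm_fit(T_cal, conc[cal], gamma, sig2)
pred = lssvm_predict(T_val, T_cal, b, alpha, sig2)
rmsep = float(np.sqrt(np.mean((pred - conc[val]) ** 2)))
rpd = float(np.std(conc[val], ddof=1) / rmsep)  # SD of reference values / RMSEP
print("gamma:", gamma, "sig2:", sig2, "RMSEP:", rmsep, "RPD:", rpd)
```

RPD is computed here as the standard deviation of the prediction-set reference values divided by RMSEP, which is one common definition; the abstract does not state which variant the authors used.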