物理化学学报
ACTA PHYSICO-CHIMICA SINICA
2014
Issue 5
803-810
8 pages in total
Viscosity%ISODATA%Ant colony algorithm%Multiple linear regression%Support vector machine
The aim of this study was to construct a quantitative structure-property relationship (QSPR) model relating the molecular structures of 310 organic compounds to their viscosities, and to identify the structural factors that affect the viscosity of organic liquids. The sample set was first classified using the iterative self-organizing data analysis technique (ISODATA) and divided into a training set and a test set. Molecular structure descriptors for the 310 compounds were then calculated with DRAGON 2.1 and screened with an ant colony algorithm (ACO), which selected five descriptors. Multiple linear regression (MLR) and support vector machine (SVM) methods were then used to build the ACO-MLR and ACO-SVM models, respectively. The non-linear ACO-SVM model (squared correlation coefficient $R^2_{\mathrm{train}} = 0.9013$, $R^2_{\mathrm{test}} = 0.9026$) outperformed the linear ACO-MLR model ($R^2_{\mathrm{train}} = 0.7680$, $R^2_{\mathrm{test}} = 0.8725$). For the test set, the correlation coefficients between predicted and experimental values were 0.934 for the ACO-MLR model and 0.950 for the ACO-SVM model, so the predictive performance of both models was judged satisfactory. The applicability domain of the models was also examined using a Williams plot. The resulting models offer an effective engineering method for predicting the viscosity of organic compounds from their molecular structure.
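
The test-set figures quoted in the abstract can be cross-checked against each other. The short sketch below assumes that the reported R² values are squares of the Pearson correlation coefficient between predicted and experimental viscosities; the abstract does not state this explicitly, so the check only shows that the numbers are consistent with that reading. The function names `implied_r` and `is_consistent` are illustrative and do not come from the paper.

```python
import math

# Test-set values exactly as reported in the abstract.
reported = {
    "ACO-MLR": {"r2_test": 0.8725, "r_test": 0.934},
    "ACO-SVM": {"r2_test": 0.9026, "r_test": 0.950},
}


def implied_r(r2: float) -> float:
    """Correlation coefficient implied by a squared correlation coefficient.

    Assumption: r2 is the square of a non-negative Pearson r.
    """
    return math.sqrt(r2)


def is_consistent(r2: float, r: float, tol: float = 5e-4) -> bool:
    """True if sqrt(r2) agrees with r to within rounding at three decimals."""
    return abs(implied_r(r2) - r) <= tol


for name, values in reported.items():
    r = implied_r(values["r2_test"])
    ok = is_consistent(values["r2_test"], values["r_test"])
    print(f"{name}: sqrt(R2_test) = {r:.3f}, matches reported r: {ok}")
```

Under this assumption, both models' test-set R² values reproduce the reported correlation coefficients to three decimal places.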