The purpose of this paper is to present a novel approach to building quantitative structure-property relationship (QSPR) models for predicting the gas-to-benzene solvation enthalpy (ΔHSolv) of 158 organic compounds from molecular descriptors calculated from the structure alone. Different kinds of descriptors were calculated for each compound using the Dragon package. The enhanced replacement method (ERM), a variable selection technique, was employed to select the optimal subset of descriptors. Our investigation reveals that the dependence of solvation enthalpy on physico-chemical properties is nonlinear and that the ERM method is unable to model the solvation enthalpy accurately. The standard error of the prediction set is 1.681 kJ·mol^(-1) for the support vector machine (SVM), compared with 4.624 kJ·mol^(-1) for ERM. The results established that the ΔHSolv values calculated by SVM were in good agreement with the experimental ones, and that the performance of the SVM models was superior to that of the ERM model. This indicates that SVM can be used as an alternative modeling tool for QSPR studies.
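To address the sensitivity of the least squares twin support vector machine (LSTSVM) to noise and outliers, and its neglect of the intrinsic structural information of the data, an intuition fuzzy and structural least squares twin support vector machine (IF-SLSTSVM) is proposed. First, isolation forest is used to preprocess the input samples. Then, using the concept of intuitionistic fuzzy numbers, the input samples are assigned different weights to reduce the influence of noise and outliers on the classification hyperplane. Finally, the K-Means algorithm is used to capture the structural information among the input samples in the form of covariance. Building on LSTSVM, IF-SLSTSVM takes into account both the distribution of the input samples in the feature space and the relationships among them, which improves the robustness of the model. Experiments on UCI datasets validate the effectiveness of the IF-SLSTSVM algorithm under noise ratios of 0%, 5%, 10% and 20%. The results show that IF-SLSTSVM is more robust than the six algorithms it is compared with.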
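The sample-weighting step described above can be illustrated with a minimal sketch. The abstract does not say which formulas the authors use, so everything below is an assumption: membership is taken from the distance to the class center, non-membership from the share of opposite-class neighbors within a radius, and the two are combined with a score function that is common in the intuitionistic fuzzy SVM literature. The names `if_scores`, `alpha` and `delta` are hypothetical, and the isolation forest and K-Means steps are not shown.

```python
import math


def dist(a, b):
    """Euclidean distance between two points given as sequences of floats."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def if_scores(X, y, alpha=1.0, delta=1e-4):
    """Return one weight in [0, 1] per sample (assumed scoring scheme).

    alpha: neighborhood radius used to count opposite-class neighbors.
    delta: small positive constant that keeps the membership well defined.
    """
    # Class centers and radii (distance from the center to the farthest member).
    centers, radii = {}, {}
    for c in set(y):
        pts = [x for x, label in zip(X, y) if label == c]
        centers[c] = [sum(col) / len(pts) for col in zip(*pts)]
        radii[c] = max(dist(p, centers[c]) for p in pts)

    scores = []
    for x, c in zip(X, y):
        # Membership: close to 1 near the class center, close to 0 at the edge.
        mu = 1.0 - dist(x, centers[c]) / (radii[c] + delta)
        # Share of opposite-class samples among neighbors within radius alpha
        # (the sample itself always counts as a neighbor).
        neighbors = [label for xj, label in zip(X, y) if dist(x, xj) <= alpha]
        rho = sum(1 for label in neighbors if label != c) / len(neighbors)
        # Non-membership grows when the sample is far out and surrounded
        # by the other class.
        nu = (1.0 - mu) * rho
        if nu == 0.0:
            score = mu
        elif mu <= nu:
            score = 0.0          # treated as noise: no influence on the fit
        else:
            score = (1.0 - nu) / (2.0 - mu - nu)
        scores.append(score)
    return scores
```

In a weighted least squares formulation, such scores would multiply each sample's squared-error term, so a point that sits far from its own class and among points of the other class contributes little or nothing to the hyperplane.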
Machine learning techniques are finding more and more applications in the field of load forecasting. In this paper, a novel regression technique called the support vector machine (SVM), based on statistical learning theory, is applied to the prediction of natural gas demand. The least squares support vector machine (LS-SVM) is a kind of SVM that uses a different cost function from the standard SVM. SVM is based on the principle of structural risk minimization, as opposed to the principle of empirical risk minimization used by conventional regression techniques. The prediction results show that the prediction accuracy of SVM is better than that of a neural network. Thus, SVM appears to be a very promising prediction tool. The software package NGPSLF, based on SVM prediction, has been put into practical business application.
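To make the remark about the different cost function concrete, here is a minimal sketch of LS-SVM regression in its standard textbook form. It is not the NGPSLF implementation, which the abstract does not describe. Replacing the ε-insensitive loss of the standard SVM with a squared-error loss turns training into a single linear system instead of a quadratic program. The RBF kernel, the parameter names `gamma` and `sigma`, and all function names are assumptions made for illustration.

```python
import math


def rbf(a, b, sigma):
    """Gaussian (RBF) kernel between two points given as lists of floats."""
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq / (2.0 * sigma ** 2))


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def lssvm_fit(X, y, gamma=100.0, sigma=1.0):
    """Fit LS-SVM regression; returns (bias, alphas).

    Solves [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y].
    """
    n = len(X)
    A = [[0.0] + [1.0] * n]          # first row enforces sum(alpha) = 0
    for i in range(n):
        row = [1.0] + [rbf(X[i], X[j], sigma) for j in range(n)]
        row[i + 1] += 1.0 / gamma    # ridge term from the squared-error cost
        A.append(row)
    sol = solve(A, [0.0] + list(y))
    return sol[0], sol[1:]


def lssvm_predict(X_train, b, alphas, x, sigma=1.0):
    """Evaluate f(x) = b + sum_i alpha_i * k(x_i, x)."""
    return b + sum(a * rbf(xi, x, sigma) for a, xi in zip(alphas, X_train))
```

Because every training point gets a nonzero coefficient, the LS-SVM solution is not sparse, unlike the standard SVM, where only the support vectors contribute. In exchange, training needs only a linear solver.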