This paper presents a novel approach to building quantitative structure-property relationship (QSPR) models for predicting the gas-to-benzene solvation enthalpy (ΔH_Solv) of 158 organic compounds from molecular descriptors calculated from structure alone. Several kinds of descriptors were calculated for each compound using the Dragon package. The enhanced replacement method (ERM), a variable selection technique, was used to select an optimal subset of descriptors. The investigation shows that the relationship between solvation enthalpy and the physicochemical properties is nonlinear, and that ERM cannot model the solvation enthalpy accurately. The standard error for the prediction set is 1.681 kJ·mol^(-1) for the support vector machine (SVM) model, compared with 4.624 kJ·mol^(-1) for ERM. The ΔH_Solv values calculated by SVM agree well with the experimental values, and the SVM models outperform the ERM model. This indicates that SVM can be used as an alternative modeling tool for QSPR studies.
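The comparison above rests on a standard error computed over the prediction set. The abstract does not give the exact formula, so the sketch below assumes the common root-mean-square form (no degrees-of-freedom correction). The function name and the sample values are hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence


def standard_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root-mean-square error between experimental and predicted values.

    Assumption: the paper's "standard error" is taken here as
    sqrt(sum((y - y_hat)^2) / n); the source does not state the formula.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Hypothetical solvation enthalpies in kJ/mol, for illustration only.
    experimental = [-30.0, -42.5, -25.1, -51.3]
    model_a = [-29.0, -43.0, -26.0, -50.5]   # stand-in for a nonlinear model
    model_b = [-34.0, -38.0, -29.5, -46.0]   # stand-in for a linear model
    print("model A:", standard_error(experimental, model_a))
    print("model B:", standard_error(experimental, model_b))
```

The same function applied to both models' predictions on one held-out set gives directly comparable figures, which is the kind of comparison the abstract reports for SVM and ERM.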
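Based on the electronic, steric, and other structural properties of the compounds, eight descriptors were calculated using the Gaussian98 and Cerius2 packages: dipole moment (Dipole), highest occupied molecular orbital energy (E_HOMO), lowest unoccupied molecular orbital energy (E_LUMO), total molecular energy (E), rotatable bonds (Rotlbonds), the length of the weakest R-NO2 bond (R-NO2 bond length, where R is C or N), hydrogen bond donors (Hbond donor), and midpoint potential (Vmid). The QSPR method in the Cerius2 package was used to establish a structure-property relationship between the density of aromatic explosives and the eight descriptors, with a correlation coefficient R of 0.909. The mean errors between predicted and measured densities were 3.33% for the training set of 30 compounds and 2.94% for the prediction set of 15 compounds.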
In this paper, based on the number of nuclear magnetic resonance peaks and the Randić branching degree ($\delta_i$) of carbon atom $i$, the environment valence $g_i$ of a carbon atom is defined as $g_i = (t_i + \delta_i)/2$. The value $g_i$ reflects both the character of each carbon atom and the details of its connection to other carbon atoms, so it distinguishes the chemical environment of each carbon atom in the molecule better than $\delta_i$ does. A connectivity index of environment valence (${}^mS$) and its athwart index (${}^mS'$) are proposed based on the adjacency matrix and $g_i$. Among them, ${}^0S$ and ${}^0S'$ capture the character and connectivity of each carbon atom, while ${}^1S$ and ${}^1S'$ reflect the second conjunction between carbon atoms. Based on ${}^0S'$ and $N$ (the number of carbon atoms), a new structural parameter, the symmetry degree $N_{ec}$, is defined as $N_{ec} = [({}^0S'_S/{}^0S'_C)\,N]^{2/3}$; it reflects both the size and the symmetry of the molecule. $N_{ec}$, ${}^0S$, and $R_n$ (the number of edges of the largest ring in cycloalkanes) were calculated for 474 saturated hydrocarbons (216 paraffins and 258 cycloalkanes) and correlated with their boiling points. The best regression equation obtained was $\ln(1056 - T_b) = 6.9480 - 0.1040\,N_{ec} - 0.008689\,{}^0S - 0.009614\,R_n + 0.01998\,R_n^{0.5}$, with $n = 474$, $R = 0.9989$, $F = 52627$, and $S = 5.63$ K. The model was validated by the jackknife method. It shows good overall stability and can be used to predict the boiling points of saturated hydrocarbons.
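The regression equation above can be inverted to give the boiling point directly: $T_b = 1056 - \exp(\cdot)$. The sketch below does that, reading the garbled coefficient term as $0.008689\,{}^0S$ and the last term as $0.01998\,R_n^{0.5}$. The function names are hypothetical, the input values are placeholders rather than descriptors of any real molecule, and using $R_n = 0$ for acyclic paraffins is an assumption the abstract does not state.

```python
import math


def environment_valence(t_i: float, delta_i: float) -> float:
    """g_i = (t_i + delta_i) / 2, as defined in the abstract."""
    return (t_i + delta_i) / 2.0


def predict_boiling_point(n_ec: float, s0: float, r_n: float = 0.0) -> float:
    """Return T_b in kelvin from the reported regression equation.

    n_ec : symmetry degree N_ec
    s0   : zeroth-order connectivity index of environment valence (0S)
    r_n  : edge count of the largest ring (assumed 0 for acyclic molecules)
    """
    if r_n < 0:
        raise ValueError("r_n must be non-negative")
    ln_term = (
        6.9480
        - 0.1040 * n_ec
        - 0.008689 * s0
        - 0.009614 * r_n
        + 0.01998 * math.sqrt(r_n)
    )
    # ln(1056 - T_b) = ln_term  =>  T_b = 1056 - exp(ln_term)
    return 1056.0 - math.exp(ln_term)


if __name__ == "__main__":
    # Placeholder descriptor values, not computed from a real structure.
    print(predict_boiling_point(n_ec=5.0, s0=4.0))
    print(predict_boiling_point(n_ec=5.0, s0=4.0, r_n=6.0))
```

Because every descriptor coefficient inside the exponential is applied to a non-negative quantity, the predicted $T_b$ stays below 1056 K, which acts as the upper limit built into the functional form.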