The distribution of data has a significant impact on the results of classification.When the distribution of one class is insignificant compared to the distribution of another class,data imbalance occurs.This will resu...The distribution of data has a significant impact on the results of classification.When the distribution of one class is insignificant compared to the distribution of another class,data imbalance occurs.This will result in rising outlier values and noise.Therefore,the speed and performance of classification could be greatly affected.Given the above problems,this paper starts with the motivation and mathematical representing of classification,puts forward a new classification method based on the relationship between different classification formulations.Combined with the vector characteristics of the actual problem and the choice of matrix characteristics,we firstly analyze the orderly regression to introduce slack variables to solve the constraint problem of the lone point.Then we introduce the fuzzy factors to solve the problem of the gap between the isolated points on the basis of the support vector machine.We introduce the cost control to solve the problem of sample skew.Finally,based on the bi-boundary support vector machine,a twostep weight setting twin classifier is constructed.This can help to identify multitasks with feature-selected patterns without the need for additional optimizers,which solves the problem of large-scale classification that can’t deal effectively with the very low category distribution gap.展开更多
In this paper we apply the nonlinear time series analysis method to small-time scale traffic measurement data. The prediction-based method is used to determine the embedding dimension of the traffic data. Based on the...In this paper we apply the nonlinear time series analysis method to small-time scale traffic measurement data. The prediction-based method is used to determine the embedding dimension of the traffic data. Based on the reconstructed phase space, the local support vector machine prediction method is used to predict the traffic measurement data, and the BIC-based neighbouring point selection method is used to choose the number of the nearest neighbouring points for the local support vector machine regression model. The experimental results show that the local support vector machine prediction method whose neighbouring points are optimized can effectively predict the small-time scale traffic measurement data and can reproduce the statistical features of real traffic measurements.展开更多
In this paper a new continuous variable called core-ratio is defined to describe the probability for a residue to be in a binding site, thereby replacing the previous binary description of the interface residue using ...In this paper a new continuous variable called core-ratio is defined to describe the probability for a residue to be in a binding site, thereby replacing the previous binary description of the interface residue using 0 and 1. So we can use the support vector machine regression method to fit the core-ratio value and predict the protein binding sites. We also design a new group of physical and chemical descriptors to characterize the binding sites. The new descriptors are more effective, with an averaging procedure used. Our test shows that much better prediction results can be obtained by the support vector regression (SVR) method than by the support vector classification method.展开更多
Due to its complicated matrix effects, rapid quantitative analysis of chromium in agricultural soils is difficult without the concentration gradient samples by laser-induced breakdown spectroscopy. To improve the anal...Due to its complicated matrix effects, rapid quantitative analysis of chromium in agricultural soils is difficult without the concentration gradient samples by laser-induced breakdown spectroscopy. To improve the analysis speed and accuracy, two calibration models are built with the support vector machine method: one considering the whole spectra and the other based on the segmental spectra input. Considering the results of the multiple linear regression analysis, three segmental spectra are chosen as the input variables of the support vector regression (SVR) model. Compared with the results of the SVR model with the whole spectra input, the relative standard error of prediction is reduced from 3.18% to 2.61% and the running time is saved due to the decrease in the number of input variables, showing the robustness in rapid soil analysis without the concentration gradient samples.展开更多
Support vector regression (SVR) combined with particle swarm optimization for its parameter optimization is employed to establish a model for predicting the Henry constants of multi-walled carbon nanotubes (MWNTs)...Support vector regression (SVR) combined with particle swarm optimization for its parameter optimization is employed to establish a model for predicting the Henry constants of multi-walled carbon nanotubes (MWNTs) for adsorption of volatile organic compounds (VOCs). The prediction performance of SVR is compared with those of the model of theoretical linear salvation energy relationship (TLSER). By using leave-one-out cross validation of SVR test Henry constants for adsorption of 35 VOCs on MWNTs, the root mean square error is 0.080, the mean absolute percentage error is only 1.19~, and the correlation coefficient (R2) is as high as 0.997. Compared with the results of the TLSER model, it is shown that the estimated errors by SVR are ali smaller than those achieved by TLSER. It reveals that the generalization ability of SVR is superior to that of the TLSER model Meanwhile, multifactor analysis is adopted for investigation of the influences of each molecular structure descriptor on the Henry constants. According to the TLSER model, the adsorption mechanism of adsorption of carbon nanotubes of VOCs is mainly a result of van der Waals and interactions of hydrogen bonds. These can provide the theoretical support for the application of carbon nanotube adsorption of VOCs and can make up for the lack of experimental data.展开更多
A method of multiple outputs least squares support vector regression (LS-SVR) was developed and described in detail, with the radial basis function (RBF) as the kernel function. The method was applied to predict t...A method of multiple outputs least squares support vector regression (LS-SVR) was developed and described in detail, with the radial basis function (RBF) as the kernel function. The method was applied to predict the future state of the power-shift steering transmission (PSST). A prediction model of PSST was gotten with multiple outputs LS-SVR. The model performance was greatly influenced by the penalty parameter γ and kernel parameter σ2 which were optimized using cross validation method. The training and prediction of the model were done with spectrometric oil analysis data. The predictive and actual values were compared and a fault in the second PSST was found. The research proved that this method had good accuracy in PSST fault prediction, and any possible problem in PSST could be found through a comparative analysis.展开更多
Unmanned aerial vehicles(UAVs)have been widely used in military,medical,wireless communications,aerial surveillance,etc.One key topic involving UAVs is pose estimation in autonomous navigation.A standard procedure for...Unmanned aerial vehicles(UAVs)have been widely used in military,medical,wireless communications,aerial surveillance,etc.One key topic involving UAVs is pose estimation in autonomous navigation.A standard procedure for this process is to combine inertial navigation system sensor information with the global navigation satellite system(GNSS)signal.However,some factors can interfere with the GNSS signal,such as ionospheric scintillation,jamming,or spoofing.One alternative method to avoid using the GNSS signal is to apply an image processing approach by matching UAV images with georeferenced images.But a high effort is required for image edge extraction.Here a support vector regression(SVR)model is proposed to reduce this computational load and processing time.The dynamic partial reconfiguration(DPR)of part of the SVR datapath is implemented to accelerate the process,reduce the area,and analyze its granularity by increasing the grain size of the reconfigurable region.Results show that the implementation in hardware is 68 times faster than that in software.This architecture with DPR also facilitates the low power consumption of 4 mW,leading to a reduction of 57%than that without DPR.This is also the lowest power consumption in current machine learning hardware implementations.Besides,the circuitry area is 41 times smaller.SVR with Gaussian kernel shows a success rate of 99.18%and minimum square error of 0.0146 for testing with the planning trajectory.This system is useful for adaptive applications where the user/designer can modify/reconfigure the hardware layout during its application,thus contributing to lower power consumption,smaller hardware area,and shorter execution time.展开更多
The application of carbon dioxide(CO_(2)) in enhanced oil recovery(EOR) has increased significantly, in which CO_(2) solubility in oil is a key parameter in predicting CO_(2) flooding performance. Hydrocarbons are the...The application of carbon dioxide(CO_(2)) in enhanced oil recovery(EOR) has increased significantly, in which CO_(2) solubility in oil is a key parameter in predicting CO_(2) flooding performance. Hydrocarbons are the major constituents of oil, thus the focus of this work lies in investigating the solubility of CO_(2) in hydrocarbons. However, current experimental measurements are time-consuming, and equations of state can be computationally complex. To address these challenges, we developed an artificial intelligence-based model to predict the solubility of CO_(2) in hydrocarbons under varying conditions of temperature, pressure, molecular weight, and density. Using experimental data from previous studies,we trained and predicted the solubility using four machine learning models: support vector regression(SVR), extreme gradient boosting(XGBoost), random forest(RF), and multilayer perceptron(MLP).Among four models, the XGBoost model has the best predictive performance, with an R^(2) of 0.9838.Additionally, sensitivity analysis and evaluation of the relative impacts of each input parameter indicate that the prediction of CO_(2) solubility in hydrocarbons is most sensitive to pressure. Furthermore, our trained model was compared with existing models, demonstrating higher accuracy and applicability of our model. The developed machine learning-based model provides a more efficient and accurate approach for predicting CO_(2) solubility in hydrocarbons, which may contribute to the advancement of CO_(2)-related applications in the petroleum industry.展开更多
基金Hebei Province Key Research and Development Project(No.20313701D)Hebei Province Key Research and Development Project(No.19210404D)+13 种基金Mobile computing and universal equipment for the Beijing Key Laboratory Open Project,The National Social Science Fund of China(17AJL014)Beijing University of Posts and Telecommunications Construction of World-Class Disciplines and Characteristic Development Guidance Special Fund “Cultural Inheritance and Innovation”Project(No.505019221)National Natural Science Foundation of China(No.U1536112)National Natural Science Foundation of China(No.81673697)National Natural Science Foundation of China(61872046)The National Social Science Fund Key Project of China(No.17AJL014)“Blue Fire Project”(Huizhou)University of Technology Joint Innovation Project(CXZJHZ201729)Industry-University Cooperation Cooperative Education Project of the Ministry of Education(No.201902218004)Industry-University Cooperation Cooperative Education Project of the Ministry of Education(No.201902024006)Industry-University Cooperation Cooperative Education Project of the Ministry of Education(No.201901197007)Industry-University Cooperation Collaborative Education Project of the Ministry of Education(No.201901199005)The Ministry of Education Industry-University Cooperation Collaborative Education Project(No.201901197001)Shijiazhuang science and technology plan project(236240267A)Hebei Province key research and development plan project(20312701D)。
文摘The distribution of data has a significant impact on the results of classification.When the distribution of one class is insignificant compared to the distribution of another class,data imbalance occurs.This will result in rising outlier values and noise.Therefore,the speed and performance of classification could be greatly affected.Given the above problems,this paper starts with the motivation and mathematical representing of classification,puts forward a new classification method based on the relationship between different classification formulations.Combined with the vector characteristics of the actual problem and the choice of matrix characteristics,we firstly analyze the orderly regression to introduce slack variables to solve the constraint problem of the lone point.Then we introduce the fuzzy factors to solve the problem of the gap between the isolated points on the basis of the support vector machine.We introduce the cost control to solve the problem of sample skew.Finally,based on the bi-boundary support vector machine,a twostep weight setting twin classifier is constructed.This can help to identify multitasks with feature-selected patterns without the need for additional optimizers,which solves the problem of large-scale classification that can’t deal effectively with the very low category distribution gap.
基金Project supported by the National Natural Science Foundation of China (Grant No 60573065)the Natural Science Foundation of Shandong Province,China (Grant No Y2007G33)the Key Subject Research Foundation of Shandong Province,China(Grant No XTD0708)
文摘In this paper we apply the nonlinear time series analysis method to small-time scale traffic measurement data. The prediction-based method is used to determine the embedding dimension of the traffic data. Based on the reconstructed phase space, the local support vector machine prediction method is used to predict the traffic measurement data, and the BIC-based neighbouring point selection method is used to choose the number of the nearest neighbouring points for the local support vector machine regression model. The experimental results show that the local support vector machine prediction method whose neighbouring points are optimized can effectively predict the small-time scale traffic measurement data and can reproduce the statistical features of real traffic measurements.
基金Project supported by the National Natural Science Foundation of China (Grant Nos. 10674172 and 10874229)
文摘In this paper a new continuous variable called core-ratio is defined to describe the probability for a residue to be in a binding site, thereby replacing the previous binary description of the interface residue using 0 and 1. So we can use the support vector machine regression method to fit the core-ratio value and predict the protein binding sites. We also design a new group of physical and chemical descriptors to characterize the binding sites. The new descriptors are more effective, with an averaging procedure used. Our test shows that much better prediction results can be obtained by the support vector regression (SVR) method than by the support vector classification method.
基金Supported by the National High-Technology Research and Development Program of China under Grant Nos 2014AA06A513 and 2013AA065502the National Natural Science Foundation of China under Grant No 61378041the Anhui Province Outstanding Youth Science Fund of China under Grant No 1508085JGD02
文摘Due to its complicated matrix effects, rapid quantitative analysis of chromium in agricultural soils is difficult without the concentration gradient samples by laser-induced breakdown spectroscopy. To improve the analysis speed and accuracy, two calibration models are built with the support vector machine method: one considering the whole spectra and the other based on the segmental spectra input. Considering the results of the multiple linear regression analysis, three segmental spectra are chosen as the input variables of the support vector regression (SVR) model. Compared with the results of the SVR model with the whole spectra input, the relative standard error of prediction is reduced from 3.18% to 2.61% and the running time is saved due to the decrease in the number of input variables, showing the robustness in rapid soil analysis without the concentration gradient samples.
基金Supported by the Innovative Talent Funds for Project 985 under Grant No WLYJSBJRCTD201102the Fundamental Research Funds for the Central Universities under Grant No CQDXWL-2013-014+1 种基金the Natural Science Foundation of Chongqing under Grant No CSTC2006BB5240the Program for New Century Excellent Talents in Universities of China under Grant No NCET-07-0903
文摘Support vector regression (SVR) combined with particle swarm optimization for its parameter optimization is employed to establish a model for predicting the Henry constants of multi-walled carbon nanotubes (MWNTs) for adsorption of volatile organic compounds (VOCs). The prediction performance of SVR is compared with those of the model of theoretical linear salvation energy relationship (TLSER). By using leave-one-out cross validation of SVR test Henry constants for adsorption of 35 VOCs on MWNTs, the root mean square error is 0.080, the mean absolute percentage error is only 1.19~, and the correlation coefficient (R2) is as high as 0.997. Compared with the results of the TLSER model, it is shown that the estimated errors by SVR are ali smaller than those achieved by TLSER. It reveals that the generalization ability of SVR is superior to that of the TLSER model Meanwhile, multifactor analysis is adopted for investigation of the influences of each molecular structure descriptor on the Henry constants. According to the TLSER model, the adsorption mechanism of adsorption of carbon nanotubes of VOCs is mainly a result of van der Waals and interactions of hydrogen bonds. These can provide the theoretical support for the application of carbon nanotube adsorption of VOCs and can make up for the lack of experimental data.
基金Supported by the Ministerial Level Advanced Research Foundation(3031030)the"111"Project(B08043)
文摘A method of multiple outputs least squares support vector regression (LS-SVR) was developed and described in detail, with the radial basis function (RBF) as the kernel function. The method was applied to predict the future state of the power-shift steering transmission (PSST). A prediction model of PSST was gotten with multiple outputs LS-SVR. The model performance was greatly influenced by the penalty parameter γ and kernel parameter σ2 which were optimized using cross validation method. The training and prediction of the model were done with spectrometric oil analysis data. The predictive and actual values were compared and a fault in the second PSST was found. The research proved that this method had good accuracy in PSST fault prediction, and any possible problem in PSST could be found through a comparative analysis.
基金financially supported by the National Council for Scientific and Technological Development(CNPq,Brazil),Swedish-Brazilian Research and Innovation Centre(CISB),and Saab AB under Grant No.CNPq:200053/2022-1the National Council for Scientific and Technological Development(CNPq,Brazil)under Grants No.CNPq:312924/2017-8 and No.CNPq:314660/2020-8.
文摘Unmanned aerial vehicles(UAVs)have been widely used in military,medical,wireless communications,aerial surveillance,etc.One key topic involving UAVs is pose estimation in autonomous navigation.A standard procedure for this process is to combine inertial navigation system sensor information with the global navigation satellite system(GNSS)signal.However,some factors can interfere with the GNSS signal,such as ionospheric scintillation,jamming,or spoofing.One alternative method to avoid using the GNSS signal is to apply an image processing approach by matching UAV images with georeferenced images.But a high effort is required for image edge extraction.Here a support vector regression(SVR)model is proposed to reduce this computational load and processing time.The dynamic partial reconfiguration(DPR)of part of the SVR datapath is implemented to accelerate the process,reduce the area,and analyze its granularity by increasing the grain size of the reconfigurable region.Results show that the implementation in hardware is 68 times faster than that in software.This architecture with DPR also facilitates the low power consumption of 4 mW,leading to a reduction of 57%than that without DPR.This is also the lowest power consumption in current machine learning hardware implementations.Besides,the circuitry area is 41 times smaller.SVR with Gaussian kernel shows a success rate of 99.18%and minimum square error of 0.0146 for testing with the planning trajectory.This system is useful for adaptive applications where the user/designer can modify/reconfigure the hardware layout during its application,thus contributing to lower power consumption,smaller hardware area,and shorter execution time.
基金supported by the Fundamental Research Funds for the National Major Science and Technology Projects of China (No. 2017ZX05009-005)。
文摘The application of carbon dioxide(CO_(2)) in enhanced oil recovery(EOR) has increased significantly, in which CO_(2) solubility in oil is a key parameter in predicting CO_(2) flooding performance. Hydrocarbons are the major constituents of oil, thus the focus of this work lies in investigating the solubility of CO_(2) in hydrocarbons. However, current experimental measurements are time-consuming, and equations of state can be computationally complex. To address these challenges, we developed an artificial intelligence-based model to predict the solubility of CO_(2) in hydrocarbons under varying conditions of temperature, pressure, molecular weight, and density. Using experimental data from previous studies,we trained and predicted the solubility using four machine learning models: support vector regression(SVR), extreme gradient boosting(XGBoost), random forest(RF), and multilayer perceptron(MLP).Among four models, the XGBoost model has the best predictive performance, with an R^(2) of 0.9838.Additionally, sensitivity analysis and evaluation of the relative impacts of each input parameter indicate that the prediction of CO_(2) solubility in hydrocarbons is most sensitive to pressure. Furthermore, our trained model was compared with existing models, demonstrating higher accuracy and applicability of our model. The developed machine learning-based model provides a more efficient and accurate approach for predicting CO_(2) solubility in hydrocarbons, which may contribute to the advancement of CO_(2)-related applications in the petroleum industry.