Funding: Project supported by the National Natural Science Foundation of China (Grant No. 60573065), the Natural Science Foundation of Shandong Province, China (Grant No. Y2007G33), and the Key Subject Research Foundation of Shandong Province, China (Grant No. XTD0708).
Abstract: In this paper we apply nonlinear time series analysis to small-time-scale traffic measurement data. A prediction-based method is used to determine the embedding dimension of the traffic data. Based on the reconstructed phase space, a local support vector machine prediction method is used to predict the traffic measurements, and a BIC-based neighbouring-point selection method chooses the number of nearest neighbouring points for the local support vector machine regression model. The experimental results show that the local support vector machine prediction method with an optimized number of neighbouring points can effectively predict small-time-scale traffic measurement data and reproduce the statistical features of real traffic measurements.
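The phase-space reconstruction and neighbour-based prediction described in this abstract can be illustrated with a minimal sketch. It makes several assumptions that are not in the source: the function names `delay_embed` and `local_predict` are hypothetical, the delay `tau` defaults to 1, and the local support vector regression is replaced by a plain average over the successors of the `k` nearest neighbours, so this shows only the local-prediction idea, not the paper's model or its BIC-based choice of `k`.

```python
import math

def delay_embed(series, dim, tau=1):
    """Build delay vectors [x(t), x(t - tau), ..., x(t - (dim - 1) * tau)]."""
    start = (dim - 1) * tau
    return [[series[t - j * tau] for j in range(dim)]
            for t in range(start, len(series))]

def local_predict(series, dim, k, tau=1):
    """One-step-ahead forecast from the k nearest neighbours (k >= 1) of the
    latest delay vector. Assumption: a neighbour average stands in for the
    local SVR model named in the abstract."""
    vectors = delay_embed(series, dim, tau)
    query = vectors[-1]
    start = (dim - 1) * tau
    candidates = []
    # Every earlier delay vector has a known successor value series[t + 1].
    for i, vec in enumerate(vectors[:-1]):
        t = start + i
        candidates.append((math.dist(vec, query), series[t + 1]))
    candidates.sort(key=lambda pair: pair[0])
    nearest = candidates[:k]
    return sum(nxt for _, nxt in nearest) / len(nearest)

# Usage: a strictly periodic toy series, where the next value after 3 is 0.
toy_series = [0, 1, 2, 3] * 5
forecast = local_predict(toy_series, dim=2, k=3)
```

In a fuller treatment, `dim` and `k` would be chosen by the prediction-based and BIC-based procedures the abstract mentions; here they are simply passed in.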
Funding: Hebei Province Key Research and Development Project (No. 20313701D); Hebei Province Key Research and Development Project (No. 19210404D); Mobile Computing and Universal Equipment for the Beijing Key Laboratory Open Project; The National Social Science Fund of China (17AJL014); Beijing University of Posts and Telecommunications Construction of World-Class Disciplines and Characteristic Development Guidance Special Fund “Cultural Inheritance and Innovation” Project (No. 505019221); National Natural Science Foundation of China (No. U1536112); National Natural Science Foundation of China (No. 81673697); National Natural Science Foundation of China (61872046); The National Social Science Fund Key Project of China (No. 17AJL014); “Blue Fire Project” (Huizhou) University of Technology Joint Innovation Project (CXZJHZ201729); Industry-University Cooperation Cooperative Education Project of the Ministry of Education (No. 201902218004); Industry-University Cooperation Cooperative Education Project of the Ministry of Education (No. 201902024006); Industry-University Cooperation Cooperative Education Project of the Ministry of Education (No. 201901197007); Industry-University Cooperation Collaborative Education Project of the Ministry of Education (No. 201901199005); The Ministry of Education Industry-University Cooperation Collaborative Education Project (No. 201901197001); Shijiazhuang Science and Technology Plan Project (236240267A); Hebei Province Key Research and Development Plan Project (20312701D).
Abstract: The distribution of data has a significant impact on classification results. When one class is heavily under-represented relative to another, data imbalance occurs. This increases the number of outliers and the amount of noise, so the speed and performance of classification can be greatly affected. To address these problems, this paper starts from the motivation and mathematical representation of classification and puts forward a new classification method based on the relationship between different classification formulations. Combining the vector characteristics of the actual problem with the choice of matrix characteristics, we first analyze the orderly regression and introduce slack variables to solve the constraint problem of the lone point. We then introduce fuzzy factors to solve the problem of the gap between isolated points on the basis of the support vector machine, and introduce cost control to solve the problem of sample skew. Finally, based on the bi-boundary support vector machine, a two-step weight-setting twin classifier is constructed. This helps to identify multitasks with feature-selected patterns without the need for additional optimizers, which addresses the inability of large-scale classification to deal effectively with a very low category distribution gap.
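The idea of cost control for sample skew can be sketched with a common heuristic: weight each class inversely to its frequency and apply those weights in a hinge-type loss. This is only an illustration under stated assumptions; the function names are hypothetical, labels are assumed to be +1 and -1 in the loss, and this is not the two-step weight setting of the twin classifier, whose details the abstract does not give.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count),
    so rare classes receive proportionally larger misclassification costs."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}

def weighted_hinge_loss(labels, scores, weights):
    """Cost-sensitive hinge loss: sum of w[y] * max(0, 1 - y * f(x)).
    Assumption: labels are +1 / -1 and scores are decision values f(x)."""
    return sum(weights[y] * max(0.0, 1.0 - y * s)
               for y, s in zip(labels, scores))

# Usage: a 90/10 skew gives the minority class a weight nine times larger.
skewed_labels = [-1] * 90 + [1] * 10
class_weights = balanced_class_weights(skewed_labels)
```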
Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 10674172 and 10874229).
Abstract: In this paper a new continuous variable called core-ratio is defined to describe the probability for a residue to be in a binding site, replacing the previous binary description of the interface residue using 0 and 1. We can therefore use the support vector machine regression method to fit the core-ratio value and predict protein binding sites. We also design a new group of physical and chemical descriptors to characterize the binding sites; with an averaging procedure applied, the new descriptors are more effective. Our test shows that much better prediction results can be obtained with the support vector regression (SVR) method than with the support vector classification method.
Funding: Supported by the National Natural Science Foundation of China (11072035).
Abstract: To reduce the deficiencies of traditional fire pre-warning algorithms and of intelligent fire pre-warning algorithms such as artificial neural networks, and thereby improve the accuracy of fire pre-warning for high-rise buildings, a composite fire pre-warning controller is designed according to the characteristics of the problem (nonlinear, little historical data, many influencing factors), and a high-rise building fire pre-warning model is set up based on support vector regression (SVR). Standard historical wood-fire data are then used for an empirical analysis. The research results can provide a reliable decision support framework for high-rise building fire pre-warning.
Funding: Support from the "973 Project" (Contract No. 2010CB226706).
Abstract: Eight casing failure modes and 32 risk factors in oil and gas wells are given in this paper. From a quantitative analysis of the influence degree and occurrence probability of the risk factors, the Borda counts for the failure modes are obtained with the Borda method. The risk indexes of the failure modes are derived from the Borda matrix. Based on the support vector machine (SVM), a casing life prediction model is established. In the prediction model, the eight risk indexes are defined as input vectors and casing life is defined as the output vector. The ideal model parameters are determined with a training set from 19 wells with casing failure. Casing life prediction software is developed with the SVM model as the predictor. The residual life of 60 wells with casing failure is predicted with the software and compared with the actual casing life. The comparison shows that the casing life prediction software with the SVM model has high accuracy.
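As a rough illustration of the Borda step, the sketch below uses the form of the Borda method commonly applied to risk matrices: for each item and each criterion, count how many items score strictly higher, then sum N minus that count over the criteria. The abstract does not spell out how its Borda matrix is built, so this formula, the function name `borda_counts`, and the toy scores are all assumptions.

```python
def borda_counts(scores):
    """Borda count per item: sum over criteria of (N - number of items
    that score strictly higher on that criterion).

    scores: dict mapping item name -> tuple of criterion scores,
            e.g. (influence degree, occurrence probability); higher = worse.
    """
    names = list(scores)
    n_items = len(names)
    n_criteria = len(next(iter(scores.values())))
    counts = {}
    for name in names:
        total = 0
        for k in range(n_criteria):
            higher = sum(1 for other in names
                         if scores[other][k] > scores[name][k])
            total += n_items - higher
        counts[name] = total
    return counts

# Usage with three hypothetical items scored on (influence, probability).
toy_scores = {"A": (3, 2), "B": (1, 3), "C": (2, 1)}
toy_borda = borda_counts(toy_scores)
```

A larger count marks an item that ranks high on more criteria, which is the ordering a risk index would then be derived from.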
Funding: Sponsored by the "985" Philosophy and Social Science Innovation Base of the Ministry of Education of China (107008200400024).
Abstract: A forecasting system for patent application counts is studied in this paper. The optimization model proposed in the research is based on support vector machines (SVM), in which a cross-validation algorithm is used for preferences selection. Results of data simulation show that the proposed method has higher forecasting precision and stronger generalization ability than the BP neural network and the RBF neural network. In addition, it is feasible and effective in forecasting patent application counts.
Abstract: The use of support vector machines (SVMs) for watermarking of 3D mesh models is investigated. SVMs have been widely explored for image, audio, and video watermarking, but to date their potential has not been explored in the 3D watermarking domain. The proposed approach uses an SVM as a binary classifier to select vertices for watermark embedding. The SVM is trained with feature vectors derived from the angular difference between the eigen normal and the surface normals of a 1-ring neighborhood of vertices taken from normalized 3D mesh models. The SVM learns to classify vertices as appropriate or inappropriate candidates for modification in order to accommodate the watermark. Experimental results verify that the proposed algorithm is imperceptible and robust against attacks such as mesh smoothing, cropping, and noise addition.
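The angular-difference features mentioned in this abstract rest on the angle between two normals, which can be sketched as follows. The abstract does not say how the eigen normal is computed, so here it is simply passed in as `reference_normal`; that parameter, the function names, and the toy normals are assumptions for illustration.

```python
import math

def angle_between(u, v):
    """Angle in radians between two non-zero 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against rounding just outside acos's domain.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)

def angular_features(reference_normal, ring_normals):
    """One angle per surface normal in a vertex's 1-ring neighbourhood."""
    return [angle_between(reference_normal, n) for n in ring_normals]

# Usage: a reference normal along z against two hypothetical face normals.
features = angular_features((0.0, 0.0, 1.0),
                            [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)])
```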
Funding: Supported by the Fundamental Research Funds for the National Major Science and Technology Projects of China (No. 2017ZX05009-005).
Abstract: The application of carbon dioxide (CO2) in enhanced oil recovery (EOR) has increased significantly, and CO2 solubility in oil is a key parameter in predicting CO2 flooding performance. Hydrocarbons are the major constituents of oil, so the focus of this work lies in investigating the solubility of CO2 in hydrocarbons. However, current experimental measurements are time-consuming, and equations of state can be computationally complex. To address these challenges, we developed an artificial intelligence-based model to predict the solubility of CO2 in hydrocarbons under varying conditions of temperature, pressure, molecular weight, and density. Using experimental data from previous studies, we trained and predicted the solubility using four machine learning models: support vector regression (SVR), extreme gradient boosting (XGBoost), random forest (RF), and multilayer perceptron (MLP). Among the four models, the XGBoost model has the best predictive performance, with an R² of 0.9838. Additionally, sensitivity analysis and evaluation of the relative impacts of each input parameter indicate that the prediction of CO2 solubility in hydrocarbons is most sensitive to pressure. Furthermore, our trained model was compared with existing models, demonstrating higher accuracy and applicability of our model. The developed machine learning-based model provides a more efficient and accurate approach for predicting CO2 solubility in hydrocarbons, which may contribute to the advancement of CO2-related applications in the petroleum industry.
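The R² figure quoted for the XGBoost model (0.9838) is a coefficient of determination. A minimal sketch of how such a score is computed from measured and predicted values is given below; the function name `r_squared` and the toy numbers are illustrative assumptions and do not come from the study's data.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.
    Assumes y_true is not constant, so SS_tot is non-zero."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Usage with hypothetical solubility values (arbitrary units).
measured = [1.0, 2.0, 3.0]
perfect_fit = r_squared(measured, [1.0, 2.0, 3.0])
mean_only_fit = r_squared(measured, [2.0, 2.0, 2.0])
```

A model that always predicts the mean scores 0, and a perfect model scores 1, which is why values close to 1 are read as a good fit when comparing the four models.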
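Abstract: An indoor intrusion detection system based on Wi-Fi sensing can detect a moving entity without attaching any device to it. Because current detection methods ignore the potential influence of complex amplitude and phase variations, a new indoor intrusion detection method, LSID (Long Short-Term Memory and Support Vector Machine Intrusion Detection), is proposed, which fuses a long short-term memory network with a support vector machine. LSID adopts a new way of modelling feature values: since a long short-term memory network can learn temporal features and capture long-term dependencies in time-series signals, the difference between the true channel state information values and the values predicted by the long short-term memory network is taken as the feature value, which captures the intruder's influence on the signal state information more accurately. The method was validated through repeated experiments in a university laboratory environment, and the final detection accuracy reached 99.21%. Comparison across multiple groups of experiments shows that LSID is effective and feasible, and that its accuracy is clearly higher than that of other intrusion detection methods.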
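Abstract: Objective: To construct clinical risk prediction models for preoperative malnutrition in patients with colorectal cancer using four machine learning algorithms, and to explore their predictive value. Methods: Preoperative data of 412 colorectal cancer patients treated in the Department of Gastrointestinal Surgery, Affiliated Tumor Hospital of Xinjiang Medical University, from January 2023 to May 2024 were collected retrospectively. The patients were randomly divided at a ratio of 7:3 into a training set (n=288) and a validation set (n=124). Univariate analysis and binary logistic regression analysis were used to screen predictors of preoperative malnutrition. Risk prediction models for preoperative malnutrition were constructed with four machine learning algorithms: logistic regression (LR), support vector machine (SVM), light gradient boosting machine (LightGBM), and multilayer perceptron (MLP). ROC curves were plotted to evaluate the predictive performance of the four models, and the DeLong test was used to compare their AUCs. The best model was selected and validated with a calibration curve and a clinical decision curve (DCA curve). Results: (1) The incidence of preoperative malnutrition in colorectal cancer patients was 33.7%; age and Braden score were independent risk factors. (2) In the training set, the AUC of the LightGBM model for predicting preoperative malnutrition was higher than those of the LR, SVM, and MLP models (0.941 vs 0.874, 0.830, and 0.831). (3) The ROC curve showed that the AUC of the LightGBM model in the validation set was 0.926 (95% CI: 0.882 to 0.969); the calibration curve showed good agreement between the malnutrition predicted by the LightGBM model and the malnutrition actually observed; the DCA curve showed that the LightGBM model provides a significant clinical net benefit for threshold probabilities between 0.16 and 0.79. Conclusion: The clinical prediction model built on the LightGBM algorithm has high value in predicting preoperative malnutrition in colorectal cancer patients and can serve as a reference for clinical staff in implementing nutritional management.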
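Two steps from the Methods can be sketched in a few lines: the random 7:3 split and the AUC used to compare models. The sketch below is an illustration under assumptions, not the study's pipeline: the function names and the seed are hypothetical, the AUC is computed by direct pairwise comparison (the Mann-Whitney interpretation) rather than by plotting an ROC curve, and the toy labels and scores are invented.

```python
import random

def split_7_3(items, seed=0):
    """Shuffle and split into 70% training / 30% validation.
    The seed is an arbitrary assumption, used only for reproducibility."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.7)
    return shuffled[:cut], shuffled[cut:]

def roc_auc(labels, scores):
    """AUC as the probability that a random positive case (label 1)
    receives a higher score than a random negative case (label 0);
    ties count as one half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Usage: 412 patient indices split 7:3, and a toy AUC calculation.
train_ids, valid_ids = split_7_3(range(412))
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

With 412 records, truncating 412 × 0.7 gives 288 training cases and leaves 124 for validation, matching the set sizes reported in the abstract.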