Funding: Supported by the China Postdoctoral Science Foundation (2019M651240) and the National Natural Science Foundation of China (31670559).
Abstract: As an important material for the resonant components of musical instruments, Paulownia wood strongly influences the sound quality of the Ruan. In this paper, a model for evaluating the sound quality of the Ruan based on the vibration characteristics of the wood is developed using machine learning methods. Traditionally, material selection for Ruan manufacturing relies primarily on the instrument technician manually weighing, observing, striking, and listening to the wood. This lack of a scientific basis has limited the quality of the finished Ruan. In this study, nine Ruans were manufactured, and a prediction model of Ruan sound quality was proposed based on the raw material information of the Ruans. Out of a total of 180 data sets, 145 and 45 sets were chosen for training and validation, respectively. Canonical correlation analysis was used to determine the correlation between individual indicators of the measured objects in adjacent stages of the Ruan production process, taken pairwise. The vibration characteristics of the wood were tested, and a model for predicting the evaluation of the Ruan's acoustic qualities was developed from the measured vibration characteristics of the resonating plate material. The acoustic quality of the Ruan soundboard wood was evaluated and predicted using a generalized regression neural network (GRNN). The results show that Ruan sound quality can be predicted by Matlab simulation based on the vibration characteristics of the soundboard wood. When the model-predicted values were compared with the traditionally predicted results, the generalized regression neural network performed well, achieving an accuracy of 93.8%, which was highly consistent with the experimental results. It was concluded that the model can accurately predict the acoustic quality of the Ruan based on the vibration performance of the soundboards.
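The abstract does not give the network's configuration, so the following is only a minimal sketch of how a generalized regression neural network produces a prediction: it returns a Gaussian-kernel-weighted average of the training targets. The feature vectors, scores, and the smoothing width `sigma` are hypothetical illustrations, not values from the study.

```python
import math


def grnn_predict(train_x, train_y, query, sigma=0.5):
    """Return the GRNN output: a kernel-weighted average of training targets."""
    weights = []
    for x in train_x:
        # Squared Euclidean distance between the query and one training pattern.
        dist_sq = sum((a - b) ** 2 for a, b in zip(x, query))
        weights.append(math.exp(-dist_sq / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        # Every kernel underflowed; fall back to the plain mean of the targets.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total


# Hypothetical data: two vibration features per soundboard and a sound-quality score.
train_x = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
train_y = [1.0, 2.0, 2.0, 3.0]

# A query equidistant from all four patterns gets the average of their scores.
center = grnn_predict(train_x, train_y, [0.5, 0.5])
```

The only free parameter is `sigma`: a small value makes the prediction follow the nearest training pattern, while a large value pulls it toward the overall mean.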
Funding: Supported by the Ministry of Education (MOE), Singapore, Tier 1 (RG8/20).
Abstract: Machine learning (ML) needs a large database to make accurate predictions of materials' physicochemical properties from their molecular structure. When a large database is not available, developing a featurization method suited to the physicochemical nature of the target properties can improve the predictive power of ML models trained on a smaller database. In this work, we show that two new featurization methods, the volume occupation spatial matrix and the heat contribution spatial matrix, improve the accuracy of predicting the crystal density (ρ_crystal) and solid-phase enthalpy of formation (H_f,solid) of energetic materials using a database of 451 energetic molecules. The mean absolute errors are reduced from 0.048 g/cm³ and 24.67 kcal/mol to 0.035 g/cm³ and 9.66 kcal/mol, respectively. Leave-one-out cross-validation indicates that the newly developed ML models can be used to determine the performance of most kinds of energetic materials except cubanes. The ML models were applied to predict ρ_crystal and H_f,solid of the CHON-based molecules in the PubChem database of 150 million entries, and 56 candidates with competitive detonation performance and reasonable chemical structures were screened out. With further improvement, spatial matrices have the potential to become multifunctional ML simulation tools that provide even better predictions in wider fields of materials science.
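Leave-one-out cross-validation and the mean absolute error reported above can be sketched as follows. The sketch uses a one-feature least-squares line purely as a stand-in model; the authors' spatial-matrix features and ML models are not reproduced here, and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loocv_mae(xs, ys):
    """Hold out each sample once, refit on the rest, and average the absolute errors."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        errors.append(abs(slope * xs[i] + intercept - ys[i]))
    return sum(errors) / len(errors)


# Hypothetical descriptor values and target property values.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]
mae_exact = loocv_mae(xs, ys)
```

Because every sample is predicted by a model that never saw it, the resulting error is a less optimistic estimate than the training error, which matters most when the database is small.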
Funding: Supported by the National Science and Technology Major Project of China (2016ZX05066005-001), the Zhejiang Province Key Research and Development Plan (2021C03152), and the Zhoushan Science and Technology Project (2021C21011).
Abstract: Liquid loading is one of the most frequently encountered phenomena in gas pipeline transportation; it reduces transmission efficiency and threatens flow assurance. However, most traditional mechanistic models are semi-empirical and must be re-solved for each working condition through a complex calculation process. The development of big data technology and artificial intelligence makes it possible to establish data-driven models. This paper aims to establish a machine-learning-based liquid loading prediction model for natural gas pipelines with high generalization ability. First, according to the characteristics of actual gas pipelines, a variety of reasonable combinations of working conditions, such as different gas velocities, pipe diameters, water contents, and outlet pressures, were set, and multiple undulating pipeline topographies with different elevation differences were established. Then a large number of simulations were performed with the OLGA simulator to obtain the data required for machine learning. After data preprocessing, six supervised learning algorithms were compared on liquid loading prediction: support vector machine (SVM), decision tree (DT), random forest (RF), artificial neural network (ANN), naive Bayes classifier (NBC), and K-nearest neighbor (KNN). Finally, RF and KNN, which performed better, were selected for parameter tuning and then applied to an actual pipeline to predict the liquid loading location. Compared with OLGA simulation, the established data-driven model improves calculation efficiency, reduces workload, and can provide technical support for gas pipeline flow assurance.
Abstract: Reservoir identification and production prediction are two of the most important tasks in petroleum exploration and development. Machine learning (ML) methods are used in petroleum-related studies, but have not been applied to reservoir identification combined with production prediction based on the identification results. Production forecasting studies are typically based on overall reservoir thickness and lack accuracy when reservoirs contain a water or dry layer without oil production. In this paper, a systematic ML method was developed using classification models for reservoir identification and regression models for production prediction. The production models are based on the reservoir identification results. For reservoir identification, seven optimized ML methods were used: four typical single ML methods and three ensemble ML methods. These methods classify the reservoir into five types of layers: water, dry, and three levels of oil (I oil layer, II oil layer, III oil layer). The validation and test results suggest that the three ensemble methods perform better than the four single ML methods in reservoir identification. XGBoost produced the model with the highest accuracy, up to 99%. The effective thickness of the I and II oil layers determined during reservoir identification was fed into the models for predicting production. Effective thickness accounts for the distribution of water and oil, resulting in a more reasonable production prediction than predictions based on overall reservoir thickness. To validate the superiority of the ML methods, reference models using overall reservoir thickness were built for comparison. The models based on effective thickness outperformed the reference models in every evaluation metric. The prediction accuracy of the ML models using effective thickness was 10% higher than that of the reference model. Free of the personal error and data distortion found in traditional methods, this system analyzes data rapidly while reducing the time required for reservoir classification and production prediction. The ML models using the effective thickness obtained from reservoir identification predicted oil production more accurately than previous studies, which use overall reservoir thickness.
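The difference between overall and effective thickness can be illustrated with a short sketch. It assumes each layer already carries a predicted class label and a thickness; the labels, thicknesses, and names below are hypothetical, and only the I and II oil layers are counted, as the abstract describes.

```python
# Layer classes whose thickness contributes to production (per the abstract).
PRODUCTIVE_CLASSES = {"I oil", "II oil"}


def effective_thickness(layers):
    """Sum the thickness of layers classified as I or II oil layers."""
    return sum(thickness for label, thickness in layers if label in PRODUCTIVE_CLASSES)


# Hypothetical classifier output for one well: (predicted class, thickness in metres).
layers = [
    ("water", 3.0),
    ("I oil", 2.5),
    ("dry", 1.0),
    ("II oil", 4.0),
    ("III oil", 1.5),
]

overall = sum(thickness for _, thickness in layers)
effective = effective_thickness(layers)
```

In this example roughly half of the overall thickness comes from water, dry, and III oil layers, which is why a regression model fed the overall figure would tend to overestimate production.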
Funding: Supported by the National Key Research and Development Program of China under Grant Nos 2016YFA0401000 and 2017YFA0302901, the National Basic Research Program of China under Grant No 2015CB921000, the National Natural Science Foundation of China under Grant Nos 11574371, 11774399 and 11774398, the Beijing Natural Science Foundation (Z180008), and the Strategic Priority Research Program of Chinese Academy of Sciences under Grant No XDB28000000.
Abstract: The growth of high-quality single crystals is of great significance for research in condensed matter physics. Exploring suitable growth conditions for single crystals is expensive and time-consuming, especially for ternary compounds, because ternary phase diagrams are lacking. Here we use machine learning (ML) trained on our experimental data to predict and guide the growth. Four ML methods are adopted: support vector machine (SVM), decision tree, random forest, and gradient boosting decision tree. The SVM method is relatively stable and works well, with an accuracy of 81% in predicting experimental results. By comparison, the accuracy achieved in the laboratory is 36%. The decision tree model is also used to reveal which features play critical roles in the growth process.
Abstract: Corrosion defects are recognized as one of the most severe problems for high-pressure pipelines, especially those that have been in service for a long time. The finite-element method and empirical formulas are commonly used to predict the strength of such corroded pipes. However, the finite-element method is time-consuming, and empirical formulas have a limited range of application. To improve strength prediction, this paper uses data-driven methods to investigate the burst pressure of line pipes with a single corrosion defect subjected to internal pressure. Three supervised machine learning (ML) algorithms, the artificial neural network (ANN), the support vector machine (SVM), and linear regression (LR), are used to train models on experimental data. Data analysis is first conducted to determine suitable pipe features for training. Hyperparameter tuning is then performed to fit the best strength models for corroded pipelines. Among the proposed data-driven models, the ANN model with three neural layers has the highest training accuracy but also the largest variance. The SVM model provides both high training accuracy and high validation accuracy. The LR model has the best generalization ability. In future research, these models can serve as surrogate models, updated by transfer learning as new data arrive, supporting sustainable and intelligent decision-making for corroded pipelines.
Funding: Sponsored by the National Natural Science Foundation of China (51006052).
Abstract: The extreme learning machine (ELM) has attracted much attention in recent years due to its fast convergence and good performance. Merging ELM with the support vector machine is an important trend, and it yields the ELM kernel. Unlike commonly used kernels such as the Gaussian kernel, ELM-kernel-based methods solve nonlinear problems by inducing an explicit mapping. In this paper, the ELM kernel is extended to least squares support vector regression (LSSVR), giving the proposed ELM-LSSVR. ELM-LSSVR reduces training and test time simultaneously without extra techniques such as sequential minimal optimization or pruning mechanisms. Moreover, the memory required for training and testing is reduced. Experiments are reported to confirm the efficacy and feasibility of the proposed ELM-LSSVR; they show that it offers shorter training and test times with accuracy comparable to other algorithms.
Funding: This work was supported by the National Natural Science Foundation of China (Nos. 11875027, 11975096).
Abstract: The extended kernel ridge regression (EKRR) method with odd-even effects was adopted to improve the description of the nuclear charge radius given by five commonly used nuclear models: (i) the isospin-dependent A^(1/3) formula, (ii) relativistic continuum Hartree-Bogoliubov (RCHB) theory, (iii) the Hartree-Fock-Bogoliubov (HFB) model HFB25, (iv) the Weizsäcker-Skyrme (WS) model WS*, and (v) the HFB25* model. In the last two models, the charge radii were calculated using a five-parameter formula with the nuclear shell corrections and deformations obtained from the WS and HFB25 models, respectively. For each model, the resulting root-mean-square deviation for the 1014 nuclei with proton number Z≥8 is significantly reduced, to 0.009-0.013 fm, after applying the EKRR correction. The best result was obtained with the RCHB model, with a root-mean-square deviation of 0.0092 fm. The extrapolation abilities of the KRR and EKRR methods in the neutron-rich region were examined, and including the odd-even effects was found to improve the extrapolation power relative to the original KRR method. The strong odd-even staggering of the charge radii of Ca and Cu isotopes and the abrupt kinks across the N=126 and N=82 neutron shell closures were also calculated and are reproduced quite well by the EKRR method.
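For orientation, the standard kernel ridge regression that EKRR extends can be written as below. This is the textbook form with a Gaussian kernel, not a formula taken from the abstract; the assumption that the regression is trained on the residual between experimental and model radii, the input x = (Z, N), and the symbols σ and λ are illustrative, and the specific odd-even term of EKRR is not reproduced because the abstract does not state it.

```latex
\begin{align}
  K(x_i, x_j) &= \exp\!\left(-\frac{\lVert x_i - x_j \rVert^{2}}{2\sigma^{2}}\right), \\
  \boldsymbol{\alpha} &= \left(K + \lambda I\right)^{-1} \mathbf{y}, \\
  \hat{y}(x) &= \sum_{i=1}^{m} \alpha_i \, K(x, x_i), \\
  R^{\mathrm{corr}}(x) &= R^{\mathrm{model}}(x) + \hat{y}(x).
\end{align}
```

Here y collects the m training residuals, λ is the ridge penalty that limits overfitting, and σ sets how far the influence of each training nucleus extends across the nuclear chart.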
Funding: Supported by the Integrated Rail Transit Dispatch Control and Intermodal Transport Service Technology Project (Grant No. 2022YFB4300500).
Abstract: The Balise Transmission Module (BTM) unit is a crucial component of the on-board train control system. Due to its unique installation position and complex environment, this unit has a relatively high fault rate within the on-board train control system. To predict BTM unit faults from actual fault data, this study proposes a prediction method that combines reliability statistics and machine learning, and fuses prediction results from different dimensions through multi-method interactive validation. First, a method for predicting fault time for batches of equipment is introduced. This method uses reliability statistics to construct a model of the distribution of remaining fault-free operating time that accounts for uncertainty, and thereby predicts the probability that the BTM unit continues to operate without faults. Second, considering the complexity of the BTM unit's fault mechanism, the small sample size of fault cases, and the potential presence of multiple fault features in fault text records, an individual-oriented fault prediction method based on a Bayesian-optimized Gradient Boosting Regression Tree (Bayes-GBRT) is proposed. This method achieves better prediction results than linear regression and random forest regression, with a mean absolute error of only 0.224 years when predicting the fault time of this type of equipment. Finally, a multi-method interactive validation approach is proposed, enabling the fusion and validation of multi-dimensional results. The results indicate that both the predicted and the actual fault times follow a log-normal distribution and that the parameter estimates are basically consistent, verifying the accuracy and effectiveness of the predictions. These findings can provide technical support for the maintenance and modification of BTM units, effectively reducing maintenance costs and ensuring the safe operation of high-speed railways, and thus have practical engineering value for preventive maintenance.
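The consistency check described above, fitting a log-normal distribution to predicted and actual fault times and comparing the parameters, can be sketched as follows. The fault times are hypothetical, and the maximum-likelihood fit shown here is an assumption; the abstract does not say which estimator the authors used.

```python
import math
import statistics


def fit_lognormal(times):
    """Maximum-likelihood log-normal fit: mean and population std of log(t)."""
    logs = [math.log(t) for t in times]
    mu = statistics.fmean(logs)
    sigma = statistics.pstdev(logs)
    return mu, sigma


def lognormal_survival(t, mu, sigma):
    """Probability that a unit is still fault-free at time t under the fitted model."""
    z = (math.log(t) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * math.erfc(z)


# Hypothetical fault times in years: observed values and model predictions.
actual_years = [2.1, 3.4, 4.0, 5.2, 6.8]
predicted_years = [2.3, 3.1, 4.2, 5.0, 6.5]

mu_actual, sigma_actual = fit_lognormal(actual_years)
mu_pred, sigma_pred = fit_lognormal(predicted_years)

# Small gaps between the two parameter pairs suggest the fits are consistent.
parameter_gap = (abs(mu_actual - mu_pred), abs(sigma_actual - sigma_pred))
```

The same fitted parameters give the fault-free probability at any operating time through `lognormal_survival`, which is the batch-level quantity the reliability-statistics step of the method targets.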