Deep multi-modal learning, a rapidly growing field with a wide range of practical applications, aims to effectively utilize and integrate information from multiple sources, known as modalities. Despite its impressive empirical performance, the theoretical foundations of deep multi-modal learning have yet to be fully explored. In this paper, we undertake a comprehensive survey of recent developments in multi-modal learning theories, focusing on the fundamental properties that govern this field. Our goal is to provide a thorough collection of current theoretical tools for analyzing multi-modal learning, to clarify their implications for practitioners, and to suggest future directions for establishing a solid theoretical foundation for deep multi-modal learning.
Laser cleaning is a highly nonlinear physical process, and single-modal (e.g., acoustic or vision) detection of it suffers from poor performance and low utilization of information across modalities. In this study, a multi-modal feature fusion network model was constructed based on a laser paint removal experiment. The alignment of heterogeneous data from different modalities was solved by combining piecewise aggregate approximation with the Gramian angular field. Moreover, an attention mechanism was introduced to optimize the dual-path network and the dense connection network, enabling the sampling characteristics to be extracted and integrated. Consequently, multi-modal discriminant detection of laser paint removal was realized. According to the experimental results, the verification accuracy of the constructed model on the experimental dataset was 99.17%, which is 5.77% higher than the best single-modal detection result for laser paint removal. Optimizing the feature extraction network with the attention mechanism increased the model accuracy by 3.3%. The results verify the improved classification performance of the constructed multi-modal feature fusion model in detecting laser paint removal, the effective integration of acoustic data and visual image data, and the accurate detection of laser paint removal.
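The two named transforms can be illustrated with a minimal sketch. It assumes the common textbook definitions: piecewise aggregate approximation (PAA) as segment means, and the Gramian angular summation field as cos(phi_i + phi_j) after rescaling the series to [-1, 1]. The function names `paa` and `gasf` and the sample values are hypothetical, and the abstract does not say which Gramian variant or segment count the authors used.

```python
import math

def paa(series, n_segments):
    """Piecewise aggregate approximation: mean of each of n_segments chunks."""
    n = len(series)
    if not 0 < n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(series)")
    out = []
    for k in range(n_segments):
        # Integer boundaries keep every chunk non-empty when n_segments <= n.
        start = k * n // n_segments
        end = (k + 1) * n // n_segments
        chunk = series[start:end]
        out.append(sum(chunk) / len(chunk))
    return out

def gasf(series):
    """Gramian angular summation field of a 1-D series (assumed variant)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("a constant series cannot be rescaled to [-1, 1]")
    # Rescale to [-1, 1], then encode each value as an angle.
    scaled = [(2 * v - hi - lo) / (hi - lo) for v in series]
    phi = [math.acos(max(-1.0, min(1.0, v))) for v in scaled]
    return [[math.cos(a + b) for b in phi] for a in phi]

# Hypothetical 8-sample signal reduced to 4 points, then turned into a 4x4 image.
reduced = paa([0, 1, 2, 3, 4, 5, 6, 7], 4)
image = gasf(reduced)
```

Reducing a long 1-D signal with PAA before building the Gramian image keeps the image small enough to pair with camera frames of a fixed size, which is one plausible reading of the alignment step described above.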
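As the greenhouse effect caused by the excessive consumption of fossil fuels has become a focus of global attention, China is vigorously promoting power generation technologies that use renewable clean energy such as solar energy, thereby contributing to the low-carbon operation of the power grid. Grid connection and large-scale deployment will be the main development trends of photovoltaic (PV) generation, and both improving PV generation efficiency and increasing grid-connected capacity help develop a low-carbon grid. Because PV generation is highly random, a PV power station model was built in Matlab/Simulink, the PV output characteristics on typical days of a year were derived from real environmental data and analyzed, and a low-carbon dispatch scheme for large PV power stations was proposed. In addition, as the capacity of PV power stations keeps growing, more negative effects must be considered and resolved at grid connection while the standards for low-carbon operation must still be met, which places more requirements on the grid-connected inverter. A multi-mode control strategy for large PV power station inverters was therefore proposed that combines maximum power point tracking (MPPT), reactive power and harmonic current compensation, and active power control, and the feasibility and advantages of the strategy were verified by modeling and simulation in Matlab/Simulink.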
An indoor location system based on a multilayer artificial neural network (ANN) with area division is proposed. The characteristics of the recorded signal strength (RSS) or signal-to-noise ratio (SNR) from each available access point (AP) are used to establish the radio map in the off-line phase. In the on-line phase, the two- or three-dimensional coordinates of mobile terminals (MTs) are estimated according to the similarity between the newly recorded RSS or SNR and the fingerprints pre-stored in the radio map. Although a feed-forward ANN with three layers is sufficient to describe any nonlinear mapping between inputs and outputs with a finite number of discontinuous points, the inputs that give better training performance are difficult to determine because of the complex and dynamic indoor environment. The distance relativity of different signal characteristics and the optimal strategies for avoiding the multi-mode phenomenon are then discussed. Finally, the feasibility and effectiveness of the method are verified by experimental comparison with a normal ANN without area division, K-nearest neighbor (KNN), and probability methods in a typical office environment.
With the development of operationally responsive space (ORS) and on-board processing techniques, end users can receive observation data from the ORS satellite directly. To satisfy the demand for reducing the requirements-tasking-effects cycle from one day to hours, the various resources of the whole data acquisition chain (including satellites, ground stations, data processing centers, users, etc.) should be taken into overall consideration, and the traditional batch task planning mode should be transformed into a user-oriented task planning mode. Considering that there are many approaches to data acquisition owing to the new techniques of the ORS satellite, the data acquisition chain task planning problem for the ORS satellite can be seen as a multi-modal route planning problem. A framework is therefore presented that uses the label-constrained shortest path technique with conflict resolution. To apply this framework to the ORS satellite task planning problem, the preprocessing and conflict resolution strategies are discussed in detail. Based on the above work, a user-oriented data acquisition chain task planning algorithm for the ORS satellite is proposed. The exact solution can be obtained in polynomial time using the proposed algorithm. The simulation experiments validate the feasibility and adaptability of the proposed approach.
A self-adaptive learning based immune algorithm (SALIA) is proposed to tackle diverse optimization problems, such as complex multi-modal and ill-conditioned problems, with high robustness. SALIA adopts a mutation strategy pool consisting of four effective mutation strategies to generate new antibodies. A self-adaptive learning framework selects among the mutation strategies by learning from their previous performance in generating promising solutions. Twenty-six state-of-the-art optimization problems with different characteristics, such as uni-modality, multi-modality, rotation, ill-conditioning, mis-scaling, and noise, are used to verify the validity of SALIA. Experimental results show that SALIA achieves higher universality and robustness than the clonal selection algorithm (CLONALG), and that the mean error index of each test function in SALIA decreases by a factor of at least 1.0×10^7 on average.
As a key link in human-computer interaction, emotion recognition enables robots to correctly perceive user emotions and provide dynamic and adjustable services according to the emotional needs of different users, which is the key to improving the cognitive level of robot service. Emotion recognition based on facial expression and electrocardiogram (ECG) has numerous industrial applications. First, a three-dimensional convolutional neural network deep learning architecture is used to extract spatial and temporal features from facial expression video data and ECG data, and emotion classification is carried out. Then the two modalities are fused at the data level and at the decision level, respectively, and the emotion recognition results are given. Finally, the emotion recognition results of the single-modality and multi-modality approaches are compared and analyzed. The comparison under the two fusion methods shows that the accuracy of multi-modal emotion recognition is greatly improved over that of single-modal emotion recognition, and that decision-level fusion is easier to operate and more effective than data-level fusion.
A new multi-modal optimization algorithm called the self-organizing worm algorithm (SOWA) is presented for the optimization of multi-modal functions. The main idea of the algorithm is as follows: some worms are dispersed evenly in the domain; the worms exchange information with each other and creep toward the nearest high point; finally, they stop on the nearest high point. All peaks of a multi-modal function can be found rapidly through learning and chasing among the worms. In contrast with classical multi-modal optimization algorithms, SOWA offers simple calculation, strong convergence, and high precision, and does not need any prior knowledge. Several simulation experiments are performed for SOWA, and its complexity is analyzed in detail. The results show that SOWA is very effective in the optimization of multi-modal functions.
Abstract: [Objective] Accurate prediction of tomato growth height is crucial for optimizing production environments in smart farming. However, current prediction methods predominantly rely on empirical, mechanistic, or learning-based models that use either image data or environmental data. These methods fail to fully leverage multi-modal data to capture the diverse aspects of plant growth. [Methods] To address this limitation, a two-stage phenotypic feature extraction (PFE) model based on the deep learning algorithms of recurrent neural network (RNN) and long short-term memory (LSTM) was developed. The model integrated environmental and plant information to provide a holistic view of the growth process, and employed phenotypic and temporal feature extractors to capture both types of features. This enabled a deeper understanding of the interaction between tomato plants and their environment, ultimately leading to highly accurate predictions of growth height. [Results and Discussions] The experimental results showed the model's effectiveness: when predicting the next two days based on the past five days, the PFE-based RNN and LSTM models achieved mean absolute percentage errors (MAPE) of 0.81% and 0.40%, respectively, which were significantly lower than the 8.00% MAPE of the large language model (LLM) and the 6.72% MAPE of the Transformer-based model. In longer-term predictions (4 days ahead from the past 10 days, and 12 days ahead from the past 30 days), the PFE-RNN model continued to outperform the other two baseline models, with MAPE of 2.66% and 14.05%, respectively. [Conclusions] The proposed method, which leverages phenotypic-temporal collaboration, shows great potential for intelligent, data-driven management of tomato cultivation, making it a promising approach for enhancing the efficiency and precision of smart tomato planting management.
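For readers unfamiliar with the error measure quoted above, MAPE is the mean of |actual - predicted| / |actual|, expressed in percent. The following is a minimal sketch under that standard definition; the function name `mape` and the plant heights are hypothetical and are not data from the study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (standard definition)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)

# Hypothetical two-day horizon: measured vs. forecast plant height in cm.
measured_cm = [100.0, 102.0]
forecast_cm = [101.0, 101.0]
error_percent = mape(measured_cm, forecast_cm)  # roughly 0.99 percent
```

Because each term is divided by the measured height, the same absolute error counts for more on a short plant than on a tall one, which is worth keeping in mind when comparing MAPE across growth stages.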
Funding: supported by the National Natural Science Foundation of China (61371172), the International S&T Cooperation Program of China (2015DFR10220), the Ocean Engineering Project of National Key Laboratory Foundation (1213), and the Fundamental Research Funds for the Central Universities (HEUCF1608).
Abstract: For a multi-mode radar working in the modern electronic battlefield, different working states of a single radar are prone to being classified as multiple emitters when traditional classification methods are used to process intercepted signals, which has a negative effect on signal classification. A classification method based on spatial data mining is presented to address this challenge. Inspired by the idea of spatial data mining, the method applies a nuclear field to depict the distribution of pulse samples in feature space, and uncovers hidden cluster information by analyzing the distribution characteristics. In addition, a membership-degree criterion that quantifies the correlation among all classes is established, which ensures the classification accuracy of signal samples. Numerical experiments show that the presented method can effectively prevent different working states of a multi-mode emitter from being classified as several emitters, and achieves higher classification accuracy.
Funding: Project (61374140) supported by the National Natural Science Foundation of China.
Abstract: Real industrial processes have multiple operating modes, and the collected data follow a complex multimodal distribution, so most traditional process monitoring methods are no longer applicable because they presume that the sampled data obey a single Gaussian or non-Gaussian distribution. To solve these problems, a novel weighted local standardization (WLS) strategy is proposed to standardize the multimodal data, which eliminates the multi-mode characteristics of the collected data and normalizes them into a unimodal distribution. After a detailed analysis of the proposed data preprocessing strategy, a new algorithm combining the WLS strategy with support vector data description (SVDD) is put forward for multi-mode process monitoring. Unlike the strategy of building multiple local models, the developed method contains only a single model and requires no prior knowledge of the multi-mode process. To demonstrate its validity, the proposed method is applied to a numerical example and to the Tennessee Eastman (TE) process. The simulation results show that the WLS strategy is very effective at standardizing multimodal data, and that the WLS-SVDD monitoring method has great advantages over traditional SVDD and over PCA combined with a local standardization strategy (LNS-PCA) in multi-mode process monitoring.
Funding: Project (LQ12E05008) supported by the Natural Science Foundation of Zhejiang Province, China; Project (201708330107) supported by the China Scholarship Council.
Abstract: Classic multi-mode input shapers (MMISs) are able to decrease the multi-mode residual vibration of manipulators or robots simultaneously. However, these input shapers cannot suppress more residual vibration with a quick response time when the frequency bandwidths of the vibration modes are very different. The methodologies and various types of multi-mode classic and hybrid input shaping control schemes with positive impulses were introduced in this paper. Six types of two-mode hybrid input shapers with positive impulses were established for a robot with 3 degrees of freedom. The ability and robustness of these two-mode hybrid input shapers to suppress residual vibration were analyzed with vibration response curves and sensitivity curves obtained by numerical simulation. The zero vibration-zero vibration and derivative (ZV-ZVD) input shaper has the fastest response time but the lowest robustness. The zero vibration and derivative-extra insensitive (ZVD-EI) input shaper has the best robustness but the longest response time. According to the frequency bandwidth of each mode and the required system response time, the most appropriate multi-mode hybrid input shaper (MMHIS) can be selected to improve the response time as much as possible while suppressing more residual vibration.
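As an illustration of how a two-mode hybrid shaper with positive impulses can be assembled, the sketch below builds a ZV shaper for one mode and a ZVD shaper for another from their textbook closed forms and convolves the two impulse sequences. Convolution is one standard construction and is an assumption here, since the abstract does not state how the six hybrid shapers were derived. The mode frequencies, damping ratios, and all function names are hypothetical.

```python
import math

def zv(omega, zeta):
    """Zero-vibration shaper: two impulses as (amplitude, time) pairs."""
    k = math.exp(-zeta * math.pi / math.sqrt(1.0 - zeta ** 2))
    half_period = math.pi / (omega * math.sqrt(1.0 - zeta ** 2))
    return [(1.0 / (1.0 + k), 0.0), (k / (1.0 + k), half_period)]

def zvd(omega, zeta):
    """Zero-vibration-and-derivative shaper: three impulses."""
    k = math.exp(-zeta * math.pi / math.sqrt(1.0 - zeta ** 2))
    half_period = math.pi / (omega * math.sqrt(1.0 - zeta ** 2))
    d = 1.0 + 2.0 * k + k ** 2
    return [(1.0 / d, 0.0), (2.0 * k / d, half_period), (k ** 2 / d, 2.0 * half_period)]

def convolve(a, b):
    """Convolve two impulse sequences: multiply amplitudes, add times."""
    pairs = [(aa * ab, ta + tb) for aa, ta in a for ab, tb in b]
    return sorted(pairs, key=lambda p: p[1])

def residual(shaper, omega, zeta):
    """Residual vibration amplitude ratio of a shaper at one mode."""
    wd = omega * math.sqrt(1.0 - zeta ** 2)
    c = sum(a * math.exp(zeta * omega * t) * math.cos(wd * t) for a, t in shaper)
    s = sum(a * math.exp(zeta * omega * t) * math.sin(wd * t) for a, t in shaper)
    t_end = max(t for _, t in shaper)
    return math.exp(-zeta * omega * t_end) * math.hypot(c, s)

# Hypothetical modes: 1.0 Hz with 2% damping and 3.5 Hz with 5% damping.
W1, Z1 = 2.0 * math.pi * 1.0, 0.02
W2, Z2 = 2.0 * math.pi * 3.5, 0.05
shaper = convolve(zv(W1, Z1), zvd(W2, Z2))  # ZV on mode 1, ZVD on mode 2
```

The shaper duration is the sum of the two component durations, which is why pairing the shorter ZV with the lower-frequency mode keeps the response fast, at the cost of less tolerance to frequency error on that mode.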
Funding: supported by the National Natural Science Foundation of China (71171038).
Abstract: A memetic algorithm (MA) for the multi-mode resource-constrained project scheduling problem (MRCPSP) is proposed. We use a new fitness function and two very effective local search procedures in the proposed MA. The fitness function makes use of a mechanism called "strategic oscillation" to give the search process a higher probability of visiting solutions around a "feasible boundary". One of the local search procedures aims at improving the lower bound of the project makespan to be less than a known upper bound, and the other aims at improving a solution of an MRCPSP instance while accepting infeasible solutions based on the new fitness function in the search process. A detailed computational experiment is set up using instances from the problem instance library PSPLIB. Computational results show that the proposed MA is very competitive with state-of-the-art algorithms. The MA obtains improved solutions for one instance of set J30.
Funding: Project (2023JH26-10100002) supported by the Liaoning Science and Technology Major Project, China; Projects (U21A20117, 52074085) supported by the National Natural Science Foundation of China; Project (2022JH2/101300008) supported by the Liaoning Applied Basic Research Program Project, China; Project (22567612H) supported by the Hebei Provincial Key Laboratory Performance Subsidy Project, China.
Abstract: Mill vibration is a common problem in rolling production. It directly affects the thickness accuracy of the strip and may even lead to strip fracture accidents in serious cases. Existing vibration prediction models do not consider the features contained in the data, which limits improvements in model accuracy. To address these challenges, this paper proposes a multi-dimensional multi-modal cold rolling vibration time series prediction model (MDMMVPM) based on the deep fusion of multi-level networks. The model considers the long-term and short-term modal features of multi-dimensional data and selects appropriate prediction algorithms for different data features. Based on the established prediction model, the effects of tension and rolling force on mill vibration are analyzed. Taking the 5th stand of a cold mill in a steel mill as the research object, the model is applied to predict mill vibration for the first time. The experimental results show that the correlation coefficient (R^2) of the proposed model is 92.5% and the root-mean-square error (RMSE) is 0.0011, a significant improvement in modeling accuracy over existing models. The proposed model is also suitable for the hot rolling process and provides a new method for predicting strip rolling vibration.
Abstract: Intelligent personal assistants play a pivotal role in in-vehicle systems, significantly enhancing life efficiency, driving safety, and decision-making support. In this study, the multi-modal design elements of intelligent personal assistants were discussed in the context of visual, auditory, and somatosensory interactions with drivers. Their impact on the driver's psychological state through modes such as visual imagery, voice interaction, and gesture interaction was explored. The study also introduced innovative designs for in-vehicle intelligent personal assistants, incorporating design principles such as driver-centricity, prioritizing passenger safety, and using timely feedback as a criterion. Additionally, the study employed design methods such as driver behavior research and driving situation analysis to enhance the emotional connection between drivers and their vehicles, ultimately improving driver satisfaction and trust.
Funding: Project (50275150) supported by the National Natural Science Foundation of China; Projects (20040533035, 20070533131) supported by the National Research Foundation for the Doctoral Program of Higher Education of China
Abstract: A novel immune genetic algorithm with elitist selection and elitist crossover, called the immune genetic algorithm with elitism (IGAE), was proposed. In IGAE, new methods for computing antibody similarity, expected reproduction probability, and clonal selection probability were given. IGAE has three features. First, the similarities of two antibodies in structure and in quality are both defined as percentages, which helps to describe the similarity of two antibodies more accurately and to reduce the computational burden effectively. Second, with the elitist selection and elitist crossover strategy, IGAE is able to find the globally optimal solution of a given problem. Third, the formula for the expected reproduction probability of an antibody can be adjusted through a parameter r, which helps to balance population diversity against convergence speed so that IGAE can find the globally optimal solution more rapidly. Two different complex multi-modal functions were selected to test the validity of IGAE. The experimental results show that IGAE can rapidly find the global maximum/minimum values of the two functions. The results also confirm that IGAE performs better in convergence speed, solution variation behavior, and computational efficiency than the canonical genetic algorithm with elitism and the immune genetic algorithm with information entropy and elitism.
Funding: Project (61240010) supported by the National Natural Science Foundation of China; Project (20070007070) supported by the Specialized Research Fund for the Doctoral Program of Higher Education of China
Abstract: A new coarse-to-fine strategy was proposed for nonrigid registration of computed tomography (CT) and magnetic resonance (MR) images of the liver. This hierarchical framework consisted of an affine transformation and a B-spline free-form deformation (FFD). The affine transformation performed a rough registration targeting the mismatch between the CT and MR images. The B-spline FFD transformation performed a finer registration by correcting local motion deformation. In the registration algorithm, normalized mutual information (NMI) was used as the similarity measure, and the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) method was applied for optimization. The algorithm was applied to the fully automated registration of liver CT and MR images in three subjects. The results demonstrate that the proposed method not only significantly improves the registration accuracy but also reduces the running time, making it effective and efficient for nonrigid registration.
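The similarity measure named in this abstract can be illustrated with a minimal sketch. It assumes the common definition NMI = (H(A) + H(B)) / H(A, B) and images that have already been flattened into equal-length lists of binned integer intensities; the function names `entropy` and `nmi` are hypothetical, and the abstract does not state which NMI variant or binning the authors used.

```python
import math
from collections import Counter

def entropy(counts, total):
    """Shannon entropy (natural log) of a histogram given as raw counts."""
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def nmi(a, b):
    """Normalized mutual information (H(A) + H(B)) / H(A, B).

    Assumption: a and b are equal-length sequences of already-binned
    integer intensities taken at corresponding voxel positions.
    """
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    h_a = entropy(Counter(a).values(), n)
    h_b = entropy(Counter(b).values(), n)
    h_ab = entropy(Counter(zip(a, b)).values(), n)  # joint histogram
    if h_ab == 0:
        raise ValueError("both inputs are constant; NMI is undefined")
    return (h_a + h_b) / h_ab

aligned = nmi([0, 0, 1, 1], [0, 0, 1, 1])      # identical images
unrelated = nmi([0, 0, 1, 1], [0, 1, 0, 1])    # statistically independent
```

Under this definition the value ranges from 1 for independent intensities to 2 for a perfect one-to-one match, so an optimizer such as L-BFGS would typically minimize the negative of this quantity over the transformation parameters.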
Funding: Supported by the Technology and Innovation Major Project of the Ministry of Science and Technology of China (2020AAA0108400, 2020AAA0108403); Tsinghua Precision Medicine Foundation (10001020109).
Abstract: Deep multi-modal learning, a rapidly growing field with a wide range of practical applications, aims to effectively utilize and integrate information from multiple sources, known as modalities. Despite its impressive empirical performance, the theoretical foundations of deep multi-modal learning have yet to be fully explored. In this paper, we undertake a comprehensive survey of recent developments in multi-modal learning theories, focusing on the fundamental properties that govern this field. Our goal is to provide a thorough collection of current theoretical tools for analyzing multi-modal learning, to clarify their implications for practitioners, and to suggest future directions for establishing a solid theoretical foundation for deep multi-modal learning.
Funding: Project (51875491) supported by the National Natural Science Foundation of China; Project (2021T3069) supported by the Fujian Science and Technology Plan STS Project, China.
Abstract: Laser cleaning is a highly nonlinear physical process, and single-modal (e.g., acoustic or visual) detection performs poorly and makes little use of cross-modal information. To address this, a multi-modal feature fusion network model was constructed in this study based on a laser paint removal experiment. The alignment of heterogeneous data from different modalities was solved by combining piecewise aggregate approximation with the Gramian angular field. Moreover, an attention mechanism was introduced to optimize the dual-path network and the dense connection network, enabling the sampling characteristics to be extracted and integrated. Consequently, multi-modal discriminant detection of laser paint removal was realized. According to the experimental results, the verification accuracy of the constructed model on the experimental dataset was 99.17%, which is 5.77% higher than the best single-modal detection result for laser paint removal. Optimizing the feature extraction network with the attention mechanism increased the model accuracy by 3.3%. The results verify the improved classification performance of the constructed multi-modal feature fusion model in detecting laser paint removal, the effective integration of acoustic data and visual image data, and the accurate detection of laser paint removal.
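As a rough illustration of how a one-dimensional signal can be turned into an image-like array, the sketch below applies piecewise aggregate approximation followed by a Gramian angular field. It assumes the summation variant (GASF) and a series length divisible by the segment count; the function names `paa` and `gasf` are hypothetical, and the abstract does not say which variant or parameters the authors used.

```python
import math

def paa(series, n_segments):
    """Piecewise aggregate approximation: the mean of each equal-width chunk."""
    n = len(series)
    if n_segments < 1 or n % n_segments != 0:
        raise ValueError("this sketch assumes len(series) is divisible by n_segments")
    width = n // n_segments
    return [sum(series[i * width:(i + 1) * width]) / width for i in range(n_segments)]

def gasf(series):
    """Gramian angular summation field: G[i][j] = cos(phi_i + phi_j).

    The series is first rescaled to [-1, 1], then each value x is mapped
    to the angle phi = arccos(x).
    """
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("a constant series cannot be rescaled")
    scaled = [2 * (x - lo) / (hi - lo) - 1 for x in series]
    # Clamp to guard against rounding slightly outside [-1, 1].
    phi = [math.acos(max(-1.0, min(1.0, x))) for x in scaled]
    return [[math.cos(a + b) for b in phi] for a in phi]

signal = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]   # hypothetical acoustic samples
reduced = paa(signal, 3)                   # shorter series of chunk means
image = gasf(reduced)                      # 3 x 3 matrix usable as an image
```

Reducing the series first keeps the resulting matrix small, which is one plausible way to bring a long acoustic recording to the same spatial size as a visual input.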
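Abstract: As the greenhouse effect caused by the excessive consumption of fossil fuels has become a focus of global concern, China is vigorously promoting power generation technologies based on renewable clean energy such as solar energy, thereby contributing to the low-carbon operation of the power grid. Grid connection and large scale will inevitably be the main development trends of photovoltaic (PV) power generation, and both improving PV generation efficiency and increasing grid-connected capacity help to develop a low-carbon grid. Because PV generation is highly random, a model of a PV power station was built in Matlab/Simulink, the PV output characteristics of typical days in a year were obtained from real environmental data and analyzed, and a low-carbon dispatch scheme for large PV power stations was then proposed. On the other hand, as the capacity of PV power stations keeps increasing, more negative impacts must be considered and resolved at grid connection while still meeting the standard for low-carbon operation, which places more demands on the grid-connected inverter. Therefore, a multi-mode control strategy for the inverters of large PV power stations was proposed that combines maximum power point tracking (MPPT), reactive power and harmonic current compensation, and active power control. The feasibility and advantages of this control strategy were verified by modeling and simulation in Matlab/Simulink.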
Funding: Supported by the National High Technology Research and Development Program of China (863 Program) (2008AA12Z305)
Abstract: An indoor location system based on a multilayer artificial neural network (ANN) with area division is proposed. The characteristics of the received signal strength (RSS) or signal-to-noise ratio (SNR) from each available access point (AP) are used to establish the radio map in the off-line phase. In the on-line phase, the two- or three-dimensional coordinates of mobile terminals (MTs) are estimated according to the similarity between the newly recorded RSS or SNR and the fingerprints pre-stored in the radio map. Although a feed-forward ANN with three layers is sufficient to describe any nonlinear mapping between inputs and outputs with a finite number of discontinuities, efficient inputs for better training performance are difficult to determine because of the complex and dynamic indoor environment. The distance relativity of different signal characteristics and optimal strategies for avoiding the multi-mode phenomenon are then discussed. The feasibility and effectiveness of this method are also verified by experimental comparison with a normal ANN without area division, K-nearest neighbor (KNN), and probability methods in a typical office environment.
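The fingerprint-matching idea behind the on-line phase can be sketched with the KNN baseline that the abstract lists as a comparison method, not with the proposed ANN. The sketch assumes a radio map stored as (RSS vector, coordinates) pairs, Euclidean distance in signal space, and an unweighted average of the k nearest coordinates; the function name `knn_locate` and all values are hypothetical.

```python
import math

def knn_locate(radio_map, rss, k=3):
    """Estimate a position as the mean coordinate of the k nearest fingerprints.

    radio_map: list of (rss_vector, coordinates) pairs from the off-line phase.
    rss:       RSS vector measured in the on-line phase, one entry per AP.
    """
    if k < 1 or k > len(radio_map):
        raise ValueError("k must be between 1 and the number of fingerprints")
    # Rank fingerprints by Euclidean distance in signal space.
    ranked = sorted(radio_map, key=lambda entry: math.dist(entry[0], rss))
    nearest = ranked[:k]
    dims = len(nearest[0][1])
    return tuple(sum(entry[1][d] for entry in nearest) / k for d in range(dims))

# Hypothetical radio map: RSS in dBm from two APs, positions in metres.
radio_map = [
    ((-40.0, -70.0), (0.0, 0.0)),
    ((-45.0, -65.0), (2.0, 0.0)),
    ((-70.0, -40.0), (10.0, 10.0)),
]
position = knn_locate(radio_map, (-42.0, -68.0), k=2)
```

The same function works for three-dimensional coordinates, since the averaging loop runs over however many coordinate components the radio map stores.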
Funding: Supported by the National Natural Science Foundation of China (61101184, 61174159)
Abstract: With the development of operationally responsive space (ORS) and on-board processing techniques, end users can receive observation data from the ORS satellite directly. To meet the demand for shortening the requirements-tasking-effects cycle from one day to hours, the various resources of the whole data acquisition chain (including satellites, ground stations, data processing centers, users, etc.) should be considered as a whole, and the traditional batch task planning mode should be transformed into a user-oriented task planning mode. Because the new techniques of the ORS satellite allow many approaches to data acquisition, the data acquisition chain task planning problem for the ORS satellite can be seen as a multimodal route planning problem. Thereby, a framework is presented using the label-constrained shortest path technique with conflict resolution. To apply this framework to the ORS satellite task planning problem, the preprocessing and conflict resolution strategies are discussed in detail. Based on the above work, a user-oriented data acquisition chain task planning algorithm for the ORS satellite is proposed. The exact solution can be obtained in polynomial time using the proposed algorithm. The simulation experiments validate the feasibility and adaptability of the proposed approach.
Funding: Project (2010ZC13012) supported by the Aviation Science Funds of China
Abstract: A self-adaptive learning based immune algorithm (SALIA) is proposed to tackle diverse optimization problems, such as complex multi-modal and ill-conditioned problems, with high robustness. SALIA adopts a mutation strategy pool consisting of four effective mutation strategies to generate new antibodies. A self-adaptive learning framework is implemented to select among the mutation strategies by learning from their previous performance in generating promising solutions. Twenty-six state-of-the-art optimization problems with different characteristics, such as uni-modality, multi-modality, rotation, ill-conditioning, mis-scaling, and noise, are used to verify the validity of SALIA. Experimental results show that SALIA achieves higher universality and robustness than clonal selection algorithms (CLONALG), and the mean error index of each test function in SALIA decreases by a factor of at least 1.0×10^7 on average.
Funding: Supported by the Open Funding Project of the National Key Laboratory of Human Factors Engineering (Grant No. 6142222190309).
Abstract: As a key link in human-computer interaction, emotion recognition enables robots to correctly perceive user emotions and to provide dynamic, adjustable services according to the emotional needs of different users, which is key to improving the cognitive level of robot services. Emotion recognition based on facial expression and electrocardiogram (ECG) has numerous industrial applications. First, a three-dimensional convolutional neural network deep learning architecture is used to extract spatial and temporal features from facial expression video data and ECG data, and emotion classification is carried out. Then the two modalities are fused at the data level and at the decision level, respectively, and the emotion recognition results are given. Finally, the single-modality and multi-modality emotion recognition results are compared and analyzed. The comparative analysis of the single-modality and multi-modality results under the two fusion methods shows that the accuracy of multi-modal emotion recognition is greatly improved compared with that of single-modal emotion recognition, and that decision-level fusion is easier to operate and more effective than data-level fusion.
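Decision-level fusion, as contrasted with data-level fusion in this abstract, can be sketched as combining the per-class outputs of two already-trained classifiers. The example below assumes a weighted average of class probabilities, which is only one common fusion rule; the abstract does not state the rule the authors used, and the function name `fuse_decisions`, the weight, and the probabilities are all hypothetical.

```python
def fuse_decisions(prob_a, prob_b, weight_a=0.5):
    """Fuse two classifiers' class probabilities by a weighted average.

    prob_a, prob_b: dicts mapping the same class labels to probabilities,
                    e.g. from a facial-expression model and an ECG model.
    weight_a:       weight given to the first modality (assumed, in [0, 1]).
    Returns the winning label and the fused probabilities.
    """
    if prob_a.keys() != prob_b.keys():
        raise ValueError("both modalities must score the same set of labels")
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    fused = {
        label: weight_a * prob_a[label] + (1.0 - weight_a) * prob_b[label]
        for label in prob_a
    }
    return max(fused, key=fused.get), fused

face = {"happy": 0.6, "sad": 0.4}   # hypothetical facial-expression output
ecg = {"happy": 0.2, "sad": 0.8}    # hypothetical ECG output
label, fused = fuse_decisions(face, ecg)
```

Because each modality is classified separately before fusion, the two models can be trained and replaced independently, which is consistent with the abstract's remark that decision-level fusion is easier to operate.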
Funding: Supported by the National Natural Science Foundation of China (70572045).
Abstract: A new multi-modal optimization algorithm called the self-organizing worm algorithm (SOWA) is presented for the optimization of multi-modal functions. The main idea of the algorithm is as follows: some worms are dispersed evenly over the domain; the worms exchange information with each other and creep toward the nearest high point; finally, they stop at the nearest high point. All peaks of a multi-modal function can be found rapidly through learning and chasing among the worms. In contrast with classical multi-modal optimization algorithms, SOWA offers simple computation, strong convergence, and high precision, and does not need any prior knowledge. Several simulation experiments are performed for SOWA, and its complexity is analyzed in detail. The results show that SOWA is very effective in the optimization of multi-modal functions.
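The dispersal-and-climb idea described above can be illustrated with a simplified stand-in: evenly spaced starting points that each perform an independent one-dimensional hill climb, followed by merging of coincident end points. This is not SOWA itself, since it omits the information exchange among worms, and the function names `climb` and `find_peaks`, the step sizes, and the test function are all assumptions.

```python
import math

def climb(f, x, lo, hi, step=0.1, tol=1e-6):
    """Move x uphill within [lo, hi]; halve the step when neither neighbor improves."""
    while step > tol:
        left, right = max(lo, x - step), min(hi, x + step)
        best = max((left, right), key=f)
        if f(best) > f(x):
            x = best
        else:
            step /= 2
    return x

def find_peaks(f, lo, hi, n_worms=6, merge_tol=1e-3):
    """Start n_worms evenly spaced climbers and merge those that end together."""
    width = (hi - lo) / n_worms
    ends = sorted(climb(f, lo + (i + 0.5) * width, lo, hi) for i in range(n_worms))
    peaks = []
    for x in ends:
        if not peaks or x - peaks[-1] > merge_tol:
            peaks.append(x)
    return peaks

# sin(x) on [0, 3*pi] has two local maxima, at pi/2 and 5*pi/2.
peaks = find_peaks(math.sin, 0.0, 3 * math.pi)
```

Each climber stops at the high point of the basin it started in, so the number of distinct end points gives the number of peaks found, provided there are enough starting points to cover every basin.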