This work proposes a recorded recurrent twin delayed deep deterministic (RRTD3) policy gradient algorithm to solve the challenge of constructing guidance laws for intercepting endoatmospheric maneuvering missiles with uncertainties and observation noise. The attack-defense engagement scenario is modeled as a partially observable Markov decision process (POMDP). Given the benefits of recurrent neural networks (RNNs) in processing sequence information, an RNN layer is incorporated into the agent's policy network to alleviate the bottleneck of traditional deep reinforcement learning methods while dealing with POMDPs. The measurements from the interceptor's seeker during each guidance cycle are combined into one sequence as the input to the policy network since the detection frequency of an interceptor is usually higher than its guidance frequency. During training, the hidden states of the RNN layer in the policy network are recorded to overcome the partially observable problem that this RNN layer causes inside the agent. The training curves show that the proposed RRTD3 successfully enhances data efficiency, training speed, and training stability. The test results confirm the advantages of the RRTD3-based guidance laws over some conventional guidance laws.
A novel variable step-size modified super-exponential iteration (MSEI) decision feedback blind equalization (DFE) algorithm with a second-order digital phase-locked loop is put forward to improve the convergence performance of the super-exponential iteration DFE algorithm. Based on the MSEI-DFE algorithm, a new error function is first developed as an improvement to the error function of MSEI, which effectively speeds up the convergence of the algorithm. Subsequently, a hyperbolic tangent function variable step-size algorithm is developed, exploiting the high variation rate of the hyperbolic tangent function around zero, to further improve the convergence speed. Finally, a second-order digital phase-locked loop is introduced into the decision feedback equalizer to track and compensate for the phase rotation of the equalizer input signals. For the multipath underwater acoustic channel with mixed phase and phase rotation, quadrature phase shift keying (QPSK) and 16 quadrature amplitude modulation (16QAM) modulated signals are used in computer simulations of the algorithm's convergence and carrier recovery performance. The results show that the proposed algorithm can considerably improve convergence speed and steady-state error, compensate effectively for phase rotation, and efficiently facilitate carrier recovery.
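The idea of tying the step size to the error through a hyperbolic tangent can be illustrated with a minimal sketch. The abstract does not give the step-size formula, so the mapping `beta * tanh(alpha * |e|)`, the parameter names `alpha` and `beta`, and the scalar gain-tracking loop below are all assumptions chosen for illustration; this is not the MSEI-DFE equalizer itself.

```python
import math

def tanh_step_size(error, alpha=2.0, beta=0.5):
    """Hypothetical error-dependent step size: beta * tanh(alpha * |error|).

    Large errors give a step close to beta (fast initial convergence);
    errors near zero give a step close to zero (small steady-state jitter).
    """
    return beta * math.tanh(alpha * abs(error))

def adapt_gain(true_gain, iterations=200, alpha=2.0, beta=0.5):
    """Track a scalar gain with an LMS-style update and the step size above.

    The input sample is fixed at 1.0, so the error is simply the gap
    between the true gain and the current estimate.
    """
    estimate = 0.0
    for _ in range(iterations):
        error = true_gain - estimate
        estimate += tanh_step_size(error, alpha, beta) * error
    return estimate

if __name__ == "__main__":
    for e in (1.0, 0.1, 0.01):
        print(f"error={e:5.2f} -> step={tanh_step_size(e):.4f}")
    print("estimated gain:", adapt_gain(1.0))
```

The steep slope of tanh around zero is what lets the step size shrink quickly once the error becomes small, while saturation at `beta` caps the step for large errors.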
To handle the uncertain and fuzzy information that arises in weapon equipment system selection, a multi-attribute decision-making (MADM) method based on the probabilistic hesitant fuzzy set (PHFS) is proposed. Firstly, we introduce the concepts of probability and fuzzy entropy to measure the ambiguity, hesitation, and uncertainty of probabilistic hesitant fuzzy elements (PHFEs). Subsequently, the expert trust network is constructed, and the importance of each expert in the network is obtained by calculating the cumulative trust value under multiple trust propagation paths, which yields the expert weight vector. Finally, we put forward an MADM method combining the probabilistic hesitant fuzzy entropy and the grey relational analysis (GRA) model, and an illustrative case is used to demonstrate the feasibility and effectiveness of the method for the weapon system selection decision-making problem.
Abstract: A multi-stage influence diagram is used to model the pilot's sequential decision making in one-on-one air combat. The model based on the multi-stage influence diagram graphically describes the elements of the decision process, contains a point-mass model for the dynamics of an aircraft, and takes into account the decision maker's preferences under uncertain conditions. Because the opponent is active, the opponent's maneuvers are modeled stochastically. The solution of the multi-stage influence diagram is obtained by converting it into a two-level optimization problem. The simulation results show that the model is effective.
Funding: the National Natural Science Foundation of China (71971075, 71871079, 71671059); the Anhui Provincial Natural Science Foundation (1808085MG213).
Abstract: Air-to-air combat tactical decisions for multiple unmanned aerial vehicles (ACTDMU) are a key decision-making step in beyond-visual-range combat. Complex influencing factors, strong antagonism, and real-time requirements need to be considered in the ACTDMU problem. In this paper, we propose a multicriteria game approach to ACTDMU. This approach consists of a multicriteria game model and a Pareto Nash equilibrium algorithm. In this model, we form the strategy profiles by integrating air-to-air combat tactics and weapon target assignment strategies, taking the correlation between them into account, and we design the vector payoff functions based on predominance factors. We propose an algorithm of Pareto Nash equilibrium based on preference relations using threshold constraints (PNE-PRTC), and we prove that the solutions obtained by this algorithm are refinements of Pareto Nash equilibrium solutions. The numerical experiments indicate that the PNE-PRTC algorithm is considerably faster than the baseline algorithms and performs better. In particular, on large-scale instances the PNE-PRTC algorithm can calculate the Pareto Nash equilibrium solutions in a matter of seconds. The simulation experiments show that the multicriteria game approach is more effective than one-sided decision approaches such as multiple-attribute decision-making and randomly chosen decisions.
Funding: supported by the National Natural Science Foundation of China (61070241), the Natural Science Foundation of Shandong Province (ZR2010FM035), and the Science Research Foundation of University of Jinan (XKY0808).
Abstract: A new approach to knowledge acquisition in incomplete information systems with fuzzy decisions is proposed. In such an incomplete information system, the universe of discourse is classified by the maximal tolerance classes, and fuzzy approximations are defined based on them. Three types of relative reducts of maximal tolerance classes are then proposed, and three types of fuzzy decision rules based on the proposed attribute description are defined. The judgment theorems and approximation discernibility functions with respect to them are presented to compute the relative reducts by using Boolean reasoning techniques, from which optimal fuzzy decision rules can be derived from the systems. Finally, three types of relative reducts of the system and their computing methods are given.
Funding: supported by the National Social Science Fund of China (Grant No. 21XTJ001).
Abstract: Receiver Operating Characteristic (ROC) analysis of power is important and useful in clinical trials. The Classical Conditional Power (CCP) is the probability of a classical rejection region given values of the true treatment effect and the interim result. For hypotheses and reversed hypotheses under normal models, we obtain analytical expressions of the ROC curves of the CCP, find optimal ROC curves of the CCP, investigate the superiority of the ROC curves of the CCP, calculate critical values of the False Positive Rate (FPR), the True Positive Rate (TPR), and the cutoff of the optimal CCP, and give go/no go decisions at the interim of the optimal CCP. In addition, extensive numerical experiments are carried out to exemplify our theoretical results. Finally, a real data example is presented to illustrate the go/no go decisions of the optimal CCP.
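To make the notion of conditional power concrete, the sketch below evaluates the textbook one-sided conditional power under the Brownian-motion (normal) approximation and turns it into a go/no go call. It is a minimal illustration only: the function names, the parameterization by information fraction and drift, and the cutoff passed to `go_no_go` are assumptions, and the paper's ROC-optimal cutoff is not reproduced here.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def conditional_power(z_interim, info_frac, drift, alpha=0.025):
    """One-sided conditional power under the Brownian-motion approximation.

    z_interim : z-statistic observed at the interim analysis
    info_frac : fraction of total information accrued, 0 < info_frac < 1
    drift     : assumed expected value of the final z-statistic
    alpha     : one-sided significance level of the final test
    """
    if not 0.0 < info_frac < 1.0:
        raise ValueError("info_frac must lie strictly between 0 and 1")
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha)
    b_t = z_interim * sqrt(info_frac)             # B(t) = Z(t) * sqrt(t)
    mean_final = b_t + drift * (1.0 - info_frac)  # E[B(1) | B(t)]
    sd_final = sqrt(1.0 - info_frac)              # SD[B(1) | B(t)]
    return 1.0 - _STD_NORMAL.cdf((z_crit - mean_final) / sd_final)

def go_no_go(cp, cutoff):
    """Return 'go' when the conditional power reaches the (hypothetical) cutoff."""
    return "go" if cp >= cutoff else "no go"

if __name__ == "__main__":
    cp = conditional_power(z_interim=1.2, info_frac=0.5, drift=2.8)
    print(round(cp, 3), go_no_go(cp, cutoff=0.5))
```

Sweeping the cutoff from 0 to 1 and recording how often "go" is returned under a null and under an alternative effect would trace out an empirical ROC curve for this decision rule.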
Abstract: The developmental and reproductive toxicity (DART) endpoint entails a toxicological assessment of all developmental stages and reproductive cycles of an organism. In silico tools to predict DART will provide a method to assess this complex toxicity endpoint and will be valuable for screening emerging pollutants as well as for managing new chemicals in China. Currently, there are few published DART prediction models in China, but many related research and development projects are in progress. In 2013, WU et al. published an expert rule-based DART decision tree (DT). This DT relies on known chemical structures linked to DART to forecast the DART potential of a given chemical. Within this procedure, accurate DART data interpretation is the foundation for building and expanding the DT. This paper presents excerpted case studies demonstrating DART data curation and interpretation for four chemicals (8-hydroxyquinoline, 3,5,6-trichloro-2-pyridinol, thiacloprid, and imidacloprid) to expand the existing DART DT. Chemicals were first selected from the database of the Solid Waste and Chemicals Management Center, Ministry of Ecology and Environment (MEESCC) in China. The structures of these four chemicals were analyzed and preliminarily grouped by chemists based on core structural features, functional groups, receptor binding properties, metabolism, and possible modes of action. Then, the DART conclusion was derived by the toxicologists, who collected chemical information and searched, integrated, and interpreted DART data. Finally, these chemicals were classified into either an existing category or a new category by integrating their chemical features, DART conclusions, and biological properties. The results showed that 8-hydroxyquinoline impacted estrous cyclicity, sexual organ weights, and embryonal development, and 3,5,6-trichloro-2-pyridinol caused central nervous system (CNS) malformations; both were added to an existing subcategory 8e (aromatic compounds with multi-halogen and nitro groups) of the DT. Thiacloprid caused dystocia and fetal skeletal malformation, and imidacloprid disrupted the endocrine system and male fertility. Both contain a 2-chloro-5-methylpyridine substituted imidazolidine cyclic ring and are expected to form a new category of neonicotinoids. The current work delineates a transparent process of curating toxicological data for the purpose of DART data interpretation. In the presence of sufficient related structures and DART data, the DT can be expanded by iteratively adding chemicals within the applicable domain of each category or subcategory. This DT can potentially serve as a tool for screening emerging pollutants and assessing new chemicals in China.
Funding: supported by the Major Projects for Science and Technology Innovation 2030 (2018AAA0100805).
Abstract: To address the confrontation decision-making issues in multi-round air combat, a dynamic game decision method based on a decision tree is proposed for unmanned aerial vehicle (UAV) air combat. Based on game theory and the confrontation characteristics of air combat, a dynamic game process is constructed that includes the strategy sets, the situation information, and the maneuver decisions for both sides of the air combat. By analyzing the UAV's flight dynamics and both sides' information, a payoff matrix is established through the situation advantage function, the performance advantage function, and the profit function. Furthermore, the dynamic game decision problem is solved with the linear induction method to obtain the Nash equilibrium solution, where the decision tree method is introduced to obtain the optimal maneuver decision, thereby improving the situation advantage in the next round of confrontation. Simulation results for confrontation scenarios of multi-round air combat are presented to verify the effectiveness and advantages of the proposed method.
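As a small illustration of reading an equilibrium off a payoff matrix, the sketch below finds pure-strategy saddle points of a zero-sum matrix game, which are its pure Nash equilibria. This is a generic textbook check, not the paper's induction or decision-tree procedure; the matrix values and the interpretation of rows and columns as maneuver options are invented for the example.

```python
def pure_saddle_points(payoff):
    """Return all (row, col) pure-strategy equilibria of a zero-sum matrix game.

    payoff[i][j] is the amount the row player (maximizer) receives from the
    column player (minimizer). A cell is a saddle point when it is the
    minimum of its row and the maximum of its column, so neither side can
    gain by deviating alone.
    """
    saddles = []
    for i, row in enumerate(payoff):
        row_min = min(row)
        for j, value in enumerate(row):
            col_max = max(r[j] for r in payoff)
            if value == row_min and value == col_max:
                saddles.append((i, j))
    return saddles

if __name__ == "__main__":
    # Hypothetical situation-advantage payoffs: rows are own maneuvers,
    # columns are opponent maneuvers.
    advantage = [
        [3, 1, 4],
        [2, 0, 1],
        [5, 2, 6],
    ]
    print(pure_saddle_points(advantage))
```

When the function returns an empty list, as it does for matching pennies, the game has no pure equilibrium and a mixed strategy would have to be computed instead, for example by linear programming.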
Abstract: System upgrades in unmanned systems have made Unmanned Aerial Vehicle (UAV)-based patrolling and monitoring a preferred solution for ocean surveillance. However, dynamic environments and large-scale deployments pose significant challenges for efficient decision-making, necessitating a modular multiagent control system. Deep Reinforcement Learning (DRL) and Decision Tree (DT) have been utilized for these complex decision-making tasks, but each has its limitations: DRL is highly adaptive but lacks interpretability, while DT is inherently interpretable but has limited adaptability. To overcome these challenges, we propose the Adaptive Interpretable Decision Tree (AIDT), an evolutionary-based algorithm that is both adaptable to diverse environmental settings and highly interpretable in its decision-making processes. We first construct a Markov decision process (MDP)-based simulation environment using the Cooperative Submarine Search task as a representative scenario for training and testing the proposed method. Specifically, we use the heat map as a state variable to address the issue of multi-agent input state proliferation. Next, we introduce the curiosity-guiding intrinsic reward to encourage comprehensive exploration and enhance algorithm performance. Additionally, we incorporate decision tree size as an influence factor in the adaptation process to balance task completion with computational efficiency. To further improve the generalization capability of the decision tree, we apply a normalization method to ensure consistent processing of input states. Finally, we validate the proposed algorithm in different environmental settings, and the results demonstrate both its adaptability and interpretability.
Funding: supported by the National Natural Science Foundation of China (72101025, 72271049), the Interdisciplinary Research Project for Young Teachers of USTB (Fundamental Research Funds for the Central Universities, FRF-IDRY-24-024), the Hebei Natural Science Foundation (F2023501011), the Fundamental Research Funds for the Central Universities (FRF-TP-20-073A1), and the R&D Program of Beijing Municipal Education Commission (KM202411232015).
Abstract: This paper proposes a reliability evaluation model for a multi-dimensional network system, which has the potential to be applied to the internet of things or other practical networks. A multi-dimensional network system with one source element and multiple sink elements is considered first. Each element can connect with other elements within a stochastic connection range. The system is regarded as successful as long as the source element remains connected with all sink elements. An importance measure is proposed to evaluate the performance of non-source elements. Furthermore, to calculate the system reliability and the element importance measure, a multi-valued decision diagram based approach is constructed and its complexity is analyzed. Finally, a numerical example of a signal transfer station system is used to analyze the system reliability and the element importance measure.
Funding: supported by the Natural Science Foundation of Hunan Province (2023JJ50047, 2023JJ40306), the Research Foundation of Education Bureau of Hunan Province (23A0494, 20B260), and the Key R&D Projects of Hunan Province (2019SK2331).
Abstract: For the triangular fuzzy (TF) multi-attribute decision making (MADM) problem in which the decision maker (DM) has a preference for the distribution density of attribute (DDA), a decision making method with a TF number two-dimensional density (TFTD) operator is proposed based on density operator theory. Firstly, a simple TF vector clustering method is proposed, which considers the features of TF numbers and the geometric distance of vectors. Secondly, the least deviation sum of squares method is used in the programming model to obtain the density weight vector. Then, two TFTD operators are defined, and the MADM method based on the TFTD operator is proposed. Finally, a numerical example is given to illustrate the superiority of this method, which can not only solve the TF MADM problem with a preference for the DDA but also help the DM make an overall comparison.
Funding: supported in part by the National Key Laboratory of Air-based Information Perception and Fusion and the Aeronautical Science Foundation of China (Grant No. 20220001068001), the National Natural Science Foundation of China (Grant No. 61673327), the Natural Science Basic Research Plan in Shaanxi Province, China (Grant No. 2023-JC-QN-0733), and the China Industry-University-Research Innovation Foundation (Grant No. 2022IT188).
Abstract: To address the problem of multi-UAV pursuit-evasion confrontation, a UAV cooperative maneuver method based on improved multi-agent deep reinforcement learning (MADRL) is proposed. In this method, an improved CommNet network based on a communication mechanism is introduced into a deep reinforcement learning algorithm to solve the multi-agent problem. A gated recurrent unit (GRU) layer is added to the actor-network structure to remember historical environmental states. Subsequently, another GRU is designed as a communication channel in the CommNet core network layer to refine the information communicated between UAVs. Finally, simulation results of the algorithm in two sets of scenarios are given, and the results show that the method is effective and applicable.
Funding: National Natural Science Foundation of China (61973037); National 173 Program Project (2019-JCJQ-ZD-324).
Abstract: To solve the problem of the low interference success rate against air defense missile radio fuzes caused by the single interference form of traditional fuze interference systems, an interference decision method based on the Q-learning algorithm is proposed. First, the distance between the missile and the target is divided into multiple states to enlarge the state space. Second, a multidimensional motion space, whose search range changes with the projectile distance, is used to select parameters and minimize the number of ineffective interference parameters. The interference effect is determined by detecting whether the fuze signal disappears. Finally, a weighted reward function is used to determine the reward value based on the range state, output power, and parameter quantity information of the interference form. The effectiveness of the proposed method in selecting the range of motion space parameters and in designing the discrimination degree of the reward function has been verified through offline experiments involving full-range missile rendezvous. The optimal interference form for each distance state has been obtained. Compared with the single-interference decision method, the proposed decision method can effectively improve the success rate of interference.
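For readers unfamiliar with Q-learning, the following is a minimal sketch of the standard tabular update and an epsilon-greedy action choice. It is not the authors' implementation: the table layout (`q` as a mapping from a distance-state index to a list of action values), the function names, and the values of `alpha`, `gamma`, and `epsilon` are all assumptions made for illustration, and the reward passed in stands in for the weighted reward the abstract describes.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-learning update and return the new value.

    q maps each state (here, a hypothetical distance-bin index) to a list
    of action values, one per candidate interference form.
    """
    best_next = max(q[next_state])
    td_error = reward + gamma * best_next - q[state][action]
    q[state][action] += alpha * td_error
    return q[state][action]


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    values = q[state]
    return values.index(max(values))


# Hypothetical example: two distance states, two interference forms.
q_table = {0: [0.0, 0.0], 1: [1.0, 2.0]}
new_value = q_update(q_table, state=0, action=1, reward=1.0,
                     next_state=1, alpha=0.5, gamma=0.5)
greedy_action = epsilon_greedy(q_table, 1, epsilon=0.0, rng=random.Random(0))
```

In this sketch the reward would be 1.0 when the fuze signal disappears and lower otherwise; how the paper weights range state, output power, and parameter quantity is not given in the abstract.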
Funding: Supported by the National Natural Science Foundation of China (Grant No. 12072090).
Abstract: This work proposes a recorded recurrent twin delayed deep deterministic (RRTD3) policy gradient algorithm to address the challenge of constructing guidance laws for intercepting endoatmospheric maneuvering missiles under uncertainties and observation noise. The attack-defense engagement scenario is modeled as a partially observable Markov decision process (POMDP). Given the benefits of recurrent neural networks (RNNs) in processing sequence information, an RNN layer is incorporated into the agent's policy network to alleviate the bottleneck of traditional deep reinforcement learning methods when dealing with POMDPs. Because the detection frequency of an interceptor is usually higher than its guidance frequency, the measurements from the interceptor's seeker during each guidance cycle are combined into one sequence as the input to the policy network. During training, the hidden states of the RNN layer in the policy network are recorded to overcome the partially observable problem that this RNN layer causes inside the agent. The training curves show that the proposed RRTD3 successfully enhances data efficiency, training speed, and training stability. The test results confirm the advantages of the RRTD3-based guidance laws over some conventional guidance laws.
Funding: Supported by the National Natural Science Foundation of China (61671461).
Abstract: A novel variable step-size modified super-exponential iteration (MSEI) decision feedback blind equalization (DFE) algorithm with a second-order digital phase-locked loop is put forward to improve the convergence performance of the super-exponential iteration DFE algorithm. Based on the MSEI-DFE algorithm, a new error function is first developed as an improvement on the error function of MSEI, which effectively increases the convergence speed of the algorithm. Subsequently, a hyperbolic tangent variable step-size algorithm is developed, exploiting the high rate of variation of the hyperbolic tangent function around zero, to further improve the convergence speed. Finally, a second-order digital phase-locked loop is introduced into the decision feedback equalizer to track and compensate for the phase rotation of the equalizer input signals. For a multipath underwater acoustic channel with mixed phase and phase rotation, quadrature phase shift keying (QPSK) and 16 quadrature amplitude modulation (16QAM) signals are used in computer simulations of the algorithm's convergence and carrier recovery performance. The results show that the proposed algorithm can considerably improve convergence speed and steady-state error, effectively compensate for phase rotation, and efficiently facilitate carrier recovery.
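The abstract does not give the step-size formula. As a rough illustration of the general idea only, the sketch below uses one commonly seen form, mu = beta * tanh(alpha * |e|), in which the step size grows quickly for small errors and saturates at beta for large ones. The function name and the parameters `alpha` (slope) and `beta` (upper bound) are assumptions, not the paper's definitions.

```python
import math


def tanh_step_size(error, alpha=2.0, beta=0.01):
    """Return an error-dependent step size mu = beta * tanh(alpha * |error|).

    Assumed illustrative form: alpha controls how fast the step size rises
    near zero error, beta is the maximum step size.
    """
    return beta * math.tanh(alpha * abs(error))


# Step sizes for a shrinking error magnitude (hypothetical values).
step_sizes = [tanh_step_size(e) for e in (1.0, 0.5, 0.1, 0.0)]
```

With this form a large error early in adaptation gives a step size close to `beta`, and the step size falls toward zero as the error shrinks, which is the usual motivation for variable step-size schemes.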
Funding: Supported by the National Natural Science Foundation of China (71901214).
Abstract: To handle the uncertain and fuzzy information involved in weapon equipment system selection, a multi-attribute decision-making (MADM) method based on the probabilistic hesitant fuzzy set (PHFS) is proposed. Firstly, we introduce the concepts of probability and fuzzy entropy to measure the ambiguity, hesitation, and uncertainty of probabilistic hesitant fuzzy elements (PHFEs). Subsequently, an expert trust network is constructed, and the importance of each expert in the network is obtained by calculating the cumulative trust value under multiple trust propagation paths, which yields the expert weight vector. Finally, we put forward an MADM method combining probabilistic hesitant fuzzy entropy with the grey relation analysis (GRA) model, and an illustrative case is used to demonstrate the feasibility and effectiveness of the method for the weapon system selection decision-making problem.
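As background on the GRA step, here is a minimal sketch of classical grey relational analysis on crisp (ordinary numeric) scores. It does not handle PHFEs or the entropy and trust-network weighting described in the abstract; the function name, the equal default weights, and the distinguishing coefficient `rho = 0.5` are assumptions chosen for illustration.

```python
def grey_relational_grades(reference, alternatives, weights=None, rho=0.5):
    """Return the grey relational grade of each alternative to the reference.

    reference: list of ideal attribute values.
    alternatives: list of lists, one normalized score per attribute.
    weights: attribute weights summing to 1 (equal weights if omitted).
    rho: distinguishing coefficient, conventionally 0.5.
    """
    deltas = [[abs(r - x) for r, x in zip(reference, alt)]
              for alt in alternatives]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        # Every alternative coincides with the reference.
        return [1.0] * len(alternatives)
    n = len(reference)
    if weights is None:
        weights = [1.0 / n] * n
    grades = []
    for row in deltas:
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades.append(sum(w * c for w, c in zip(weights, coeffs)))
    return grades


# Hypothetical example: two candidate systems scored on two attributes.
grades = grey_relational_grades([1.0, 1.0], [[1.0, 1.0], [0.0, 0.0]])
```

The alternative with the highest grade is the one closest to the reference sequence; in the paper's setting the attribute weights would come from the probabilistic hesitant fuzzy entropy rather than being equal.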