Ground-to-air confrontation involves large-scale task assignment that must handle many concurrent assignments and random events. When existing task assignment methods are applied to ground-to-air confrontation, they handle complex tasks inefficiently and suffer from interaction conflicts in multiagent systems. This study proposes a multiagent architecture based on one general agent with multiple narrow agents (OGMN) to reduce task assignment conflicts. To address the slow speed of traditional dynamic task assignment algorithms, it also proposes the proximal policy optimization for task assignment of general and narrow agents (PPO-TAGNA) algorithm. Building on the idea of the optimal assignment strategy algorithm and on the training framework of deep reinforcement learning (DRL), the algorithm adds a multihead attention mechanism and a stage reward mechanism to the bilateral band clipping PPO algorithm to address low training efficiency. Finally, simulation experiments are carried out in a digital battlefield. The OGMN-based multiagent architecture combined with the PPO-TAGNA algorithm obtains higher rewards faster and achieves a higher win ratio. Analysis of agent behavior verifies the efficiency, superiority, and rational resource utilization of the method.
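For readers unfamiliar with PPO, the following is a minimal sketch of the standard single-sided clipped surrogate objective that PPO variants build on. It is not the bilateral band clipping variant or the PPO-TAGNA algorithm described in the abstract, whose details are not given here. The function name, the default `eps=0.2`, and the sample values are assumptions chosen for illustration.

```python
def ppo_clipped_objective(ratios, advantages, eps=0.2):
    """Standard PPO clipped surrogate, averaged over samples.

    ratios:     probability ratios pi_new(a|s) / pi_old(a|s), one per sample
    advantages: advantage estimates, one per sample
    eps:        clipping half-width (hypothetical default, not from the paper)
    """
    if len(ratios) != len(advantages) or not ratios:
        raise ValueError("ratios and advantages must be non-empty and equal length")
    total = 0.0
    for r, a in zip(ratios, advantages):
        # Restrict the ratio to the band [1 - eps, 1 + eps].
        clipped_r = max(1.0 - eps, min(1.0 + eps, r))
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(r * a, clipped_r * a)
    return total / len(ratios)


if __name__ == "__main__":
    # A large ratio with a positive advantage is capped at (1 + eps) * A.
    print(ppo_clipped_objective([1.5], [1.0]))
```

The objective is maximized during training; taking the minimum removes the incentive to push the ratio far outside the clipping band in a single update.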
Live-virtual-constructive (LVC) tactical confrontation (TC) requires the virtual entity (VE) decision model to provide graded combat capability, diversified actions, real-time decision-making, and generalization to the enemy. To meet these requirements, the confrontation process is modeled as a zero-sum stochastic game (ZSG). Introducing the theory of the dynamic relative power potential field addresses reward sparsity in the model, and reward shaping addresses credit assignment between agents. Based on the idea of meta-learning, an extensible multi-agent deep reinforcement learning (EMADRL) framework and solution method are proposed to improve the effectiveness and efficiency of solving the model. Experiments show that the model meets the requirements well and that the algorithm learns efficiently.
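As a rough illustration of how a potential function can densify a sparse reward, the sketch below uses generic potential-based reward shaping, where the shaping term is gamma * Phi(s') - Phi(s). The abstract does not specify the dynamic relative power potential field, so the potential used here (a normalized difference of two scalar "power" values) and all names and numbers are hypothetical.

```python
def relative_power_potential(own_power, enemy_power):
    """Hypothetical potential: normalized power difference in [-1, 1]."""
    total = own_power + enemy_power
    if total == 0:
        return 0.0
    return (own_power - enemy_power) / total


def shaped_reward(env_reward, phi_s, phi_next, gamma=0.99):
    """Potential-based shaping: r' = r + gamma * Phi(s') - Phi(s)."""
    return env_reward + gamma * phi_next - phi_s


if __name__ == "__main__":
    # The environment gives no reward this step, but the power balance
    # improved, so the shaped reward is positive.
    phi_before = relative_power_potential(own_power=2.0, enemy_power=2.0)
    phi_after = relative_power_potential(own_power=3.0, enemy_power=1.0)
    print(shaped_reward(0.0, phi_before, phi_after))
```

Shaping of this form gives intermediate feedback on steps where the environment reward is zero, which is the usual motivation for using a potential field against reward sparsity.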
The multi-target assignment (MTA) problem is a crucial challenge in command and control and in mission planning, and a fundamental research focus in military operations that has attracted significant attention over the years. It has been studied extensively across domains such as land, sea, air, space, and electronics, which has produced numerous models and algorithms. This paper begins with a bibliometric analysis of 463 papers from the Scopus database using CiteSpace software. The analysis examines keyword clustering, co-occurrence, and burst, and presents the results visually. The paper then gives an overview of current classification and modeling techniques for the MTA problem, distinguishing between static multi-target assignment (SMTA) and dynamic multi-target assignment (DMTA). Next, it reviews existing solution algorithms for the MTA problem, which generally fall into three categories: exact algorithms, heuristic algorithms, and machine learning algorithms. Finally, it proposes a development framework based on the "HIGH" model (high-speed, integrated, great, harmonious) to guide future research and intelligent weapon system development concerning the MTA problem. The framework emphasizes application scenarios, modeling mechanisms, solution algorithms, and system efficiency, offering a roadmap for future exploration in this area.
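To make the "exact algorithm" category concrete, here is a minimal sketch of exhaustive search on a tiny static assignment instance. It assumes a common textbook formulation in which each weapon is assigned to exactly one target and the objective is to minimize the expected surviving target value; the abstract does not state which formulation the review uses, and the kill probabilities and target values below are invented for illustration.

```python
from itertools import product


def expected_surviving_value(assignment, kill_prob, target_value):
    """Sum over targets of value times probability of surviving all shots.

    assignment[w] is the index of the target engaged by weapon w.
    kill_prob[w][t] is the (hypothetical) kill probability of weapon w on target t.
    """
    survive = list(target_value)
    for weapon, target in enumerate(assignment):
        survive[target] *= 1.0 - kill_prob[weapon][target]
    return sum(survive)


def solve_smta_exhaustive(kill_prob, target_value):
    """Enumerate every weapon-to-target mapping and return the best one.

    Exact but exponential: n_targets ** n_weapons candidates.
    """
    n_weapons = len(kill_prob)
    n_targets = len(target_value)
    best = min(
        product(range(n_targets), repeat=n_weapons),
        key=lambda a: expected_surviving_value(a, kill_prob, target_value),
    )
    return best, expected_surviving_value(best, kill_prob, target_value)


if __name__ == "__main__":
    kill_prob = [[0.9, 0.1], [0.1, 0.9]]  # illustrative values only
    target_value = [1.0, 1.0]
    print(solve_smta_exhaustive(kill_prob, target_value))
```

The exponential growth of the search space is the reason heuristic and learning-based methods are considered for larger instances.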
A differential game guidance scheme with obstacle avoidance, based on a combined linear quadratic and norm-bounded differential game formulation, is designed for a three-player engagement scenario comprising a pursuer, an interceptor, and an evader. By introducing a switching time, the confrontation between the players is divided into four phases (P1-P4), and different guidance strategies are proposed according to the phase in which the static obstacle is located. When the obstacle is located in phase P1 or P3, the linear quadratic game method is used to devise a guidance scheme that optimizes energy. When the obstacle is located in phase P2 or P4, a norm-bounded differential game guidance strategy is used to satisfy the acceleration constraint. Furthermore, the radii of the static obstacle and the interceptor are taken as design parameters to derive the combined guidance strategy through a dead-zone function, which guarantees that the pursuer avoids both the static obstacle and the interceptor and attacks the evader. Finally, nonlinear numerical simulations verify the performance of the game guidance strategy.
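The abstract does not define its dead-zone function. As a point of reference only, the sketch below shows the common textbook form, which returns zero inside a band of half-width `width` and the excess beyond the band outside it. Treating the band width as an obstacle or interceptor radius is an assumption made here for illustration, not the paper's construction.

```python
def dead_zone(x, width):
    """Textbook dead-zone: 0 for |x| <= width, otherwise the excess beyond the band.

    width is a non-negative half-width; using a radius for it is a
    hypothetical choice for this illustration.
    """
    if width < 0:
        raise ValueError("width must be non-negative")
    if x > width:
        return x - width
    if x < -width:
        return x + width
    return 0.0


if __name__ == "__main__":
    radius = 1.0  # illustrative value only
    for miss in (-3.0, -0.5, 0.0, 0.5, 3.0):
        print(miss, dead_zone(miss, radius))
```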
The direct cause of the ineffectiveness of economic policy is that the government has ignored the game behavior of the microeconomy. Only when policy is formulated and implemented strictly according to the rules of the game can it achieve its desired objective.
Funding (OGMN and PPO-TAGNA task assignment study): the National Natural Science Foundation of China (Grant No. 62106283); the National Natural Science Foundation of China (Grant No. 72001214), which funded the experiments; and the Natural Science Foundation of Shaanxi Province (Grant No. 2020JQ-484).
Funding (LVC tactical confrontation study): the Military Scientific Research Project (41405030302, 41401020301).
Funding (multi-target assignment review): the National Natural Science Foundation of China (NSFC) (Grant No. 62173274); the National Key R&D Program of China (Grant No. 2019YFA0405300); the Natural Science Foundation of Hunan Province of China (Grant No. 2021JJ10045); the Practice and Innovation Funds for Graduate Students of Northwestern Polytechnical University (Grant No. PF2023046); the Open Research Subject of State Key Laboratory of Intelligent Game (Grant No. ZBKF-24-01); the Postdoctoral Fellowship Program of CPSF (No. GZB20240989); and the China Postdoctoral Science Foundation (Grant No. 2024M754304).
Funding (differential game guidance study): the National Natural Science Foundation (NNSF) of China (Grant No. 62273119).