Funding: Supported by the National Natural Science Foundation of China (72001209, 72231011, 72071206), the Science and Technology Innovative Research Team in Higher Educational Institutions of Hunan Province (2020RC4046), and the Science Foundation for Outstanding Youth Scholars of Hunan Province (2022JJ20047).
Abstract: The rapid development of military technology has prompted different types of equipment to break the limits of operational domains and, through complex interactions, to form a vast combat system of systems (CSoS), which can be abstracted as a heterogeneous combat network (HCN). Studying disintegration strategies for combat networks is of great military significance for breaking down the enemy’s CSoS. To this end, this paper proposes an integrated framework called HCN disintegration based on double deep Q-learning (HCN-DDQL). First, the enemy’s CSoS is abstracted as an HCN, an evaluation index based on the capability and attack costs of nodes is proposed, and a mathematical optimization model for HCN disintegration is established. Second, the learning environment and double deep Q-network model of HCN-DDQL are established to train the HCN’s disintegration strategy. Then, based on the learned HCN-DDQL model, an algorithm for calculating the HCN’s optimal disintegration strategy under different states is proposed. Finally, a case study demonstrates the reliability and effectiveness of HCN-DDQL, and the results show that it disintegrates HCNs more effectively than baseline methods.
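To make the "double" in double deep Q-learning concrete, the following minimal sketch computes the training target for a single transition: the online network chooses the next action and the target network scores it. This is the generic double Q-learning update, not the HCN-DDQL implementation; the function name, the Q-value lists, and the interpretation of actions as candidate nodes to attack are assumptions made for illustration.

```python
from typing import Sequence


def double_q_target(reward: float,
                    next_q_online: Sequence[float],
                    next_q_target: Sequence[float],
                    gamma: float = 0.99,
                    done: bool = False) -> float:
    """Double Q-learning target for one transition (illustrative only)."""
    if done:
        # No bootstrapping from a terminal state.
        return reward
    # The online network selects the greedy next action ...
    best_action = max(range(len(next_q_online)),
                      key=next_q_online.__getitem__)
    # ... and the target network evaluates that same action.
    return reward + gamma * next_q_target[best_action]


# Hypothetical Q-values for three candidate nodes to remove next.
y = double_q_target(reward=1.0,
                    next_q_online=[0.2, 0.9, 0.5],
                    next_q_target=[0.3, 0.4, 0.8],
                    gamma=0.5)
print(y)
```

Decoupling action selection from action evaluation is what distinguishes this target from the plain deep Q-network target, which would take the maximum of the target network's own values (0.8 here rather than 0.4).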
Abstract: To extract and display the significant information of combat systems, this paper introduces the methodology of functional cartography into combat networks and proposes an integrated framework named “functional cartography of heterogeneous combat networks based on the operational chain” (FCBOC). In this framework, a functional module detection algorithm named the operational chain-based label propagation algorithm (OCLPA), which considers the cooperation and interactions among combat entities and can thus naturally handle network heterogeneity, is proposed to identify the functional modules of the network. The nodes and their modules are then classified into different roles according to their properties. A case study shows that FCBOC can provide a simplified description of the disorderly information in combat networks and makes it possible to identify their functional and structural characteristics. The results provide useful information to help commanders make precise decisions regarding the protection, disintegration, or optimization of combat networks. Three algorithms are also compared with OCLPA to show that FCBOC can most effectively find functional modules with practical meaning.
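For readers unfamiliar with label propagation, the sketch below shows the plain algorithm on a small undirected graph. It is not OCLPA: it ignores operational chains and node types, and it uses a synchronous update with a deterministic tie-break (smallest label) in place of the usual random one so that the result is reproducible. The graph and all names are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def label_propagation(adj: Dict[int, List[int]],
                      max_iters: int = 20) -> Dict[int, int]:
    """Plain synchronous label propagation (illustrative, not OCLPA)."""
    labels = {v: v for v in adj}          # every node starts in its own module
    for _ in range(max_iters):
        new_labels = {}
        for v, nbrs in adj.items():
            if not nbrs:                  # isolated node keeps its label
                new_labels[v] = labels[v]
                continue
            counts = Counter(labels[u] for u in nbrs)
            top = max(counts.values())
            # Adopt the most frequent neighbor label; ties go to the smallest.
            new_labels[v] = min(l for l, c in counts.items() if c == top)
        if new_labels == labels:          # converged
            break
        labels = new_labels
    return labels


# Two triangles joined by a single bridge edge (2-3).
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
         3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
modules = label_propagation(graph)
print(modules)
```

On this toy graph the two triangles end up with two different labels. Synchronous updates can oscillate on some graphs, which is why the loop is capped by `max_iters`.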
Funding: National Natural Science Foundation of China (62006193, 62103338); Aeronautical Science Foundation of China (2022Z023053001); Key Research and Development Program of Shaanxi Province (2024GX-YBXM-115); Fundamental Research Funds for the Central Universities (D5000230150).
Abstract: Beyond-visual-range (BVR) air combat threat assessment has attracted wide attention as a foundation for situation awareness and autonomous decision-making. However, the traditional threat assessment method fails to consider the intention and events of the target, which leads to inaccurate assessment results. In view of this, an integrated threat assessment method is proposed to address existing problems such as overly subjective determination of index weights and situation imbalance. The process and characteristics of BVR air combat are analyzed to establish a threat assessment model in terms of target intention, event, situation, and capability. On this basis, a distributed weight-solving algorithm is proposed to determine the index weights and attribute weights separately. Variable weights and game theory are then introduced to deal with the situation imbalance and to combine subjective and objective weights. The performance of the model and algorithm is evaluated through multiple simulation experiments. The assessment results demonstrate the accuracy of the proposed method in BVR air combat, indicating its potential practical significance in real air combat scenarios.
Abstract: Reinforcement learning has been applied to air combat problems in recent years, and curriculum learning is often used alongside it, but traditional curriculum learning suffers from plasticity loss in neural networks. Plasticity loss is the difficulty of learning new knowledge after the network has converged. To this end, we propose a motivational curriculum learning distributed proximal policy optimization (MCLDPPO) algorithm, through which trained agents can significantly outperform the predictive game tree and mainstream reinforcement learning methods. The motivational curriculum learning is designed to help the agent gradually improve its combat ability by observing the agent's unsatisfactory performance and providing appropriate rewards as a guide. Furthermore, complete tactical maneuvers are encapsulated based on existing air combat knowledge, and through the flexible use of these maneuvers, some tactics beyond human knowledge can be realized. In addition, we design an interruption mechanism that increases the agent's decision-making frequency when it faces an emergency. When the number of threats received by the agent changes, the current action is interrupted so that the agent can reacquire observations and decide again. Using the interruption mechanism significantly improves the performance of the agent. To simulate actual air combat better, we use digital twin technology to simulate real air battles and propose a parallel battlefield mechanism that can run multiple simulation environments simultaneously, effectively improving data throughput. The experimental results demonstrate that the agent can fully utilize the situational information to make reasonable decisions and adapt its tactics in air combat, verifying the effectiveness of the algorithmic framework proposed in this paper.
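The interruption mechanism described above can be pictured with a small scheduling sketch: a maneuver normally runs for a fixed number of ticks, but a change in the observed threat count forces an immediate new decision. The tick-based loop, the fixed maneuver length, and the function name are assumptions made for illustration; the abstract does not specify how the mechanism is implemented.

```python
from typing import List, Optional, Sequence


def decision_ticks(threat_counts: Sequence[int],
                   maneuver_length: int = 5) -> List[int]:
    """Return the ticks at which the agent makes a new decision."""
    decisions: List[int] = []
    steps_left = 0                         # ticks remaining in current maneuver
    last_threats: Optional[int] = None
    for tick, threats in enumerate(threat_counts):
        # Interrupt when the number of threats differs from the previous tick.
        interrupted = last_threats is not None and threats != last_threats
        if steps_left == 0 or interrupted:
            decisions.append(tick)         # re-observe and choose a maneuver
            steps_left = maneuver_length
        steps_left -= 1
        last_threats = threats
    return decisions


# Hypothetical threat counts: a missile appears at tick 3 and is gone at tick 9.
print(decision_ticks([0, 0, 0, 1, 1, 1, 1, 1, 1, 0]))   # [0, 3, 8, 9]
# With a constant threat count the agent decides only when a maneuver ends.
print(decision_ticks([0] * 10))                          # [0, 5]
```

The comparison of the two runs shows the intended effect: decisions become more frequent exactly when the threat picture changes.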
Funding: This work was supported by the National Natural Science Foundation of China (62003359).
Abstract: Today’s air combat has reached a high level of uncertainty where continuous or discrete variables with crisp values cannot be properly represented using fuzzy sets. With a set of membership functions, fuzzy logic is well-suited to tackle such complex states and actions. However, it is not necessary to fuzzify variables that have definite discrete semantics. Hence, the aim of this study is to improve the level of model abstraction by proposing multiple levels of cascaded hierarchical structures from the perspective of function, namely, the functional decision tree. This method is developed to represent behavioral modeling of air combat systems, and its metamodel, execution mechanism, and code generation provide a sound basis for function-based behavioral modeling. As a proof of concept, an air combat simulation is developed to validate this method, and the results show that the fighter Alpha built using the proposed framework performs better than the one using default scripts.
Abstract: The continuous growth in scale, topological complexity, mission phases, and mission diversity poses challenges for the efficient capability evaluation of modern combat systems. To address the insufficient consideration of missions and the single evaluation dimension of existing approaches, this study proposes a mission-oriented capability evaluation method for combat systems based on the operation loop. First, a combat network model is given that takes into account the capability properties of combat nodes. Then, based on the transition matrix between combat nodes, an efficient operation loop identification algorithm built on breadth-first search is proposed. Given the mission-capability satisfaction of nodes, effectiveness evaluation indexes for operation loops and the combat network are proposed, followed by a node importance measure. The effectiveness of the proposed method is verified through a case study of a combat scenario involving space-based support against surface ships under different strategies. The results indicate that the ROI-priority attack method has a notable impact on reducing the overall efficiency of the network, whereas the O-L betweenness-priority attack is more effective in obstructing the successful execution of enemy attack missions.
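As a rough illustration of breadth-first operation loop identification, the sketch below enumerates the simple directed cycles through a target node, given a 0/1 transition (adjacency) matrix. Treating an operation loop as a simple cycle that starts and ends at the target, as well as the node roles in the example, are assumptions; the paper's actual algorithm may differ.

```python
from collections import deque
from typing import List


def operation_loops(adj: List[List[int]], target: int) -> List[List[int]]:
    """Enumerate simple directed cycles that start and end at `target`.

    adj[i][j] == 1 means node i can pass information or effect to node j.
    Partial paths are expanded breadth-first, so shorter loops come first.
    """
    n = len(adj)
    loops: List[List[int]] = []
    queue = deque([[target]])
    while queue:
        path = queue.popleft()
        last = path[-1]
        for nxt in range(n):
            if not adj[last][nxt]:
                continue
            if nxt == target and len(path) > 1:
                loops.append(path + [target])    # the path closes a loop
            elif nxt not in path:
                queue.append(path + [nxt])       # extend a simple path
    return loops


# Hypothetical network: 0 = target, 1 and 2 = sensors, 3 = decider, 4 = influencer.
transition = [
    [0, 1, 1, 0, 0],   # target is observed by both sensors
    [0, 0, 0, 1, 0],   # sensor 1 reports to the decider
    [0, 0, 0, 1, 0],   # sensor 2 reports to the decider
    [0, 0, 0, 0, 1],   # decider commands the influencer
    [1, 0, 0, 0, 0],   # influencer engages the target
]
print(operation_loops(transition, target=0))
# [[0, 1, 3, 4, 0], [0, 2, 3, 4, 0]]
```

Path enumeration grows exponentially on dense networks, so a practical implementation would likely bound the loop length or prune by node type.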
Funding: National Key R&D Program of China (Grant No. 2021YFA1000402) and National Natural Science Foundation of China (Grant No. 72071159), which funded the experiments.
Abstract: In air combat, the confrontation position is the critical factor that determines the confrontation situation, attack effect, and escape probability of UAVs. Selecting the optimal confrontation position is therefore the primary goal of maneuver decision-making. Taking the position as the UAV’s maneuver strategy, this paper constructs the optimal confrontation position selecting games (OCPSGs) model. In the OCPSGs model, the payoff function of each UAV is defined by the difference between the comprehensive advantages of the two sides, and the strategy space of each UAV at every step is defined by its accessible space, which is determined by its maneuverability. We then design the limit approximation of mixed strategy Nash equilibrium (LAMSNQ) algorithm, which provides a method for determining the optimal probability distribution over positions in the strategy space. In the simulation phase, we assume that the motions in the three directions are independent and that the strategy space is a cuboid in order to simplify the model. Several simulations are performed to verify the feasibility, effectiveness, and stability of the algorithm.
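Because each payoff is the difference between the two sides' advantages, a single step of such a game can be viewed as a zero-sum matrix game over candidate positions. The sketch below approximates a mixed-strategy equilibrium of a small zero-sum game with fictitious play. This is a standard textbook method substituted here for illustration; it is not the LAMSNQ algorithm, whose details the abstract does not give, and the payoff matrix is hypothetical.

```python
from typing import List, Tuple


def fictitious_play(payoff: List[List[float]],
                    rounds: int = 20000) -> Tuple[List[float], List[float]]:
    """Approximate equilibrium mixes of a zero-sum matrix game.

    payoff[i][j] is the row player's gain (and the column player's loss)
    when the row player picks position i and the column player position j.
    """
    m, n = len(payoff), len(payoff[0])
    row_counts = [0] * m
    col_counts = [0] * n
    row_counts[0] = col_counts[0] = 1          # arbitrary opening moves
    for _ in range(rounds):
        # Each side best-responds to the opponent's empirical frequencies.
        row_vals = [sum(payoff[i][j] * col_counts[j] for j in range(n))
                    for i in range(m)]
        col_vals = [sum(payoff[i][j] * row_counts[i] for i in range(m))
                    for j in range(n)]
        row_counts[max(range(m), key=row_vals.__getitem__)] += 1
        col_counts[min(range(n), key=col_vals.__getitem__)] += 1
    row_total, col_total = sum(row_counts), sum(col_counts)
    return ([c / row_total for c in row_counts],
            [c / col_total for c in col_counts])


# Matching-pennies style payoff: two candidate positions per side.
row_mix, col_mix = fictitious_play([[1.0, -1.0], [-1.0, 1.0]])
print(row_mix, col_mix)
```

For this payoff matrix the equilibrium is for both sides to choose each position with probability 0.5, and the empirical frequencies returned by the sketch approach that value as the number of rounds grows.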
Abstract: Future unmanned battles urgently require intelligent combat policies, and multi-agent reinforcement learning offers a promising solution. However, owing to the complexity of combat operations and the large size of the combat group, this task suffers from the credit assignment problem more than other reinforcement learning tasks. This study uses reward shaping to relieve the credit assignment problem and improve policy training for the new generation of large-scale unmanned combat operations. We first prove that multiple reward shaping functions do not change the Nash equilibrium in stochastic games, providing theoretical support for their use. According to the characteristics of combat operations, we propose tactical reward shaping (TRS), which comprises maneuver shaping advice and threat assessment-based attack shaping advice. We then investigate the effects of different types and combinations of shaping advice on combat policies through experiments. The results show that TRS improves both the efficiency and the attack accuracy of combat policies, with the combination of maneuver reward shaping advice and ally-focused attack shaping advice achieving the best performance relative to the baseline strategy.
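One common way to obtain shaping terms that leave optimal behavior unchanged is potential-based shaping, where each bonus has the form γΦ(s′) − Φ(s). The sketch below sums two such terms and checks that the discounted bonuses telescope along a trajectory, which is the intuition behind the invariance result. Whether TRS uses exactly this form is an assumption, and the two potentials and the `(distance, threat)` state layout are hypothetical.

```python
from typing import Callable, Sequence, Tuple

State = Tuple[float, float]            # hypothetical: (distance_to_target, threat)
Potential = Callable[[State], float]


def shaping_bonus(potentials: Sequence[Potential],
                  s: State, s_next: State, gamma: float) -> float:
    """Sum of potential-based shaping terms gamma * phi(s') - phi(s)."""
    return sum(gamma * phi(s_next) - phi(s) for phi in potentials)


def total_potential(potentials: Sequence[Potential], s: State) -> float:
    return sum(phi(s) for phi in potentials)


# Two illustrative pieces of advice: close the distance, reduce the threat.
potentials = [lambda s: -s[0], lambda s: -s[1]]
gamma = 0.9
trajectory = [(10.0, 3.0), (7.0, 2.0), (4.0, 2.0), (1.0, 0.0)]

# Discounted sum of all shaping bonuses along the trajectory.
total = sum(gamma ** t * shaping_bonus(potentials, s, s_next, gamma)
            for t, (s, s_next) in enumerate(zip(trajectory, trajectory[1:])))

# Telescoping: only the first and last potentials survive.
steps = len(trajectory) - 1
expected = (gamma ** steps * total_potential(potentials, trajectory[-1])
            - total_potential(potentials, trajectory[0]))
print(total, expected)
```

Because the accumulated bonus depends only on the start and end states, it shifts every policy's return from a given start state by the same amount, so the ranking of policies is preserved.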