Abstract: This paper addresses the time-varying formation-containment (FC) problem for nonholonomic multi-agent systems with a desired trajectory constraint, where only the leaders can acquire information about the desired trajectory. The leaders are given a fixed time-varying formation template and must execute it while tracking the desired trajectory, and the followers must converge to the convex hull spanned by the leaders. First, the dynamic models of the nonholonomic systems are linearized to second-order dynamics. Then, based on the desired trajectory and the formation template, the FC control protocols are proposed. Sufficient conditions for achieving FC are introduced, and an algorithm is proposed that determines the control parameters by solving an algebraic Riccati equation. The system is shown to achieve FC, with the average position and velocity of the leaders converging asymptotically to the desired trajectory. Finally, the theoretical results are verified in simulations of a multi-agent system composed of virtual human individuals.
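As a minimal illustration of how a feedback gain can follow from an algebraic Riccati equation, the sketch below solves the scalar continuous-time equation 2ap - (b^2/r)p^2 + q = 0 in closed form. The scalar plant, the weights q and r, and both function names are assumptions made for illustration; the abstract does not state the paper's actual matrices or parameter-selection algorithm.

```python
import math


def solve_scalar_are(a: float, b: float, q: float, r: float) -> float:
    """Return the positive root p of 2*a*p - (b**2 / r) * p**2 + q = 0.

    Hypothetical scalar stand-in for a matrix Riccati equation; requires
    b != 0, q > 0 and r > 0 so that the stabilizing root exists.
    """
    if b == 0 or q <= 0 or r <= 0:
        raise ValueError("need b != 0, q > 0 and r > 0")
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


def feedback_gain(a: float, b: float, q: float, r: float) -> float:
    """State-feedback gain k = b * p / r for the control law u = -k * x."""
    return b * solve_scalar_are(a, b, q, r) / r


if __name__ == "__main__":
    a, b, q, r = 1.0, 1.0, 3.0, 1.0  # open-loop unstable plant x' = a*x + b*u
    p = solve_scalar_are(a, b, q, r)
    k = feedback_gain(a, b, q, r)
    print("p =", p, "k =", k, "closed-loop pole =", a - b * k)
```

With this gain the closed-loop pole is a - bk = -sqrt(a^2 + b^2 q / r), which is negative for any admissible q and r. That is the property a Riccati-based gain design relies on.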
Funding: Projects (61075065, 60774045) supported by the National Natural Science Foundation of China; Project supported by the Graduate Degree Thesis Innovation Foundation of Central South University, China
Abstract: Consensus tracking control problems for multi-agent systems with single-integrator dynamics and switching topology are investigated. To design effective consensus tracking protocols for a more general class of networks, such that the states of the agents converge to a constant or time-varying reference state, new consensus tracking protocols are proposed for the constant and the time-varying reference state, respectively. In particular, an improved condition on the switching interaction topology, in contrast to the spanning-tree condition, is presented. The convergence of the two consensus tracking protocols is then analyzed using Lyapunov stability theory. Moreover, the consensus tracking protocol with a time-varying reference state is extended to achieve formation control. By introducing a formation structure set, each agent obtains its individual desired trajectory. Finally, several simulations illustrate the effectiveness of the theoretical results. The test results show that the states of the agents converge to the desired constant or time-varying reference state. In addition, by selecting an appropriate structure set, the agents can maintain the expected formation under randomly switching interaction topologies.
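The idea of consensus tracking under a switching topology can be sketched with a textbook single-integrator protocol, x_i <- x_i + eps * (sum_j (x_j - x_i) + b_i (ref - x_i)). This is not the protocol proposed in the paper, whose exact form the abstract does not give. The edge lists, the pinned agent, the step size eps, and the function names are all assumptions chosen for illustration.

```python
def consensus_step(x, edges, pinned, ref, eps):
    """One Euler step of x_i += eps * (sum_j (x_j - x_i) + b_i * (ref - x_i))."""
    u = [0.0] * len(x)
    for i, j in edges:      # undirected edge: both endpoints react
        u[i] += x[j] - x[i]
        u[j] += x[i] - x[j]
    for i in pinned:        # agents that can measure the reference state
        u[i] += ref - x[i]
    return [xi + eps * ui for xi, ui in zip(x, u)]


def simulate(x0, topologies, pinned, ref, eps=0.2, steps=5000):
    """Cycle through the given edge lists, using one topology per step."""
    x = list(x0)
    for k in range(steps):
        x = consensus_step(x, topologies[k % len(topologies)], pinned, ref, eps)
    return x


if __name__ == "__main__":
    # Neither graph is connected on its own; their union is the path 0-1-2-3.
    topologies = [[(0, 1), (2, 3)], [(1, 2)]]
    final = simulate([0.0, 2.0, 8.0, 10.0], topologies, pinned=[0], ref=5.0)
    print([round(v, 4) for v in final])
```

In this toy setup only agent 0 sees the reference, and no single topology contains a spanning tree, yet the alternation lets the reference information reach every agent over time.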
Funding: Projects (60474029, 60774045, 60604005) supported by the National Natural Science Foundation of China; Project supported by the Graduate Degree Thesis Innovation Foundation of Central South University, China
Abstract: Consensus problems for discrete-time multi-agent systems are studied. To design effective consensus protocols that ensure the states of the agents converge to a common value, a new consensus protocol for general discrete-time multi-agent systems is proposed based on Lyapunov stability theory. For discrete-time multi-agent systems with a desired trajectory, trajectory tracking and formation control problems are studied. The main idea of the trajectory tracking problem is to design a trajectory controller such that each agent tracks the desired trajectory. For a type of formation problem with a fixed formation structure, the formation structure set is introduced. According to the formation structure set, each agent can track its individual desired trajectory. Finally, simulations are provided to demonstrate the effectiveness of the theoretical results. The numerical results show that, under the consensus protocol, the states of the agents converge to zero, that is, consensus is achieved asymptotically. In addition, the simulation results show that, with appropriately designed trajectory controllers, the agents converge to the desired trajectory asymptotically and can form different formations.
Funding: Supported by the National Natural Science Foundation of China (60934003, 61074065); the National Basic Research Program of China (973 Program) (2010CB731800); the Key Project for Natural Science Research of Hebei Education Department (ZD200908)
Abstract: Two protocols are presented that enable the agents of a multi-agent network to reach consensus while achieving and preserving the desired formation under a fixed topology, with and without communication time delay. First, the protocol that does not consider communication time delay is presented, and a sufficient condition for the stability of the multi-agent system is derived using Lyapunov stability theory. Then, taking the communication time delay into account, the effectiveness of the protocol is demonstrated based on a Lyapunov-Krasovskii functional. The main contribution of the proposed protocols is that they address formation control in addition to velocity consensus for multi-agent systems described by second-order equations. Finally, numerical examples are presented to illustrate the effectiveness of the proposed protocols.
Funding: Supported by the National Natural Science Foundation of China (60574088, 60874053)
Abstract: Delayed-state-derivative feedback (DSDF) is introduced into the existing consensus protocol to simultaneously improve the robustness to communication delay and accelerate convergence to consensus. Frequency-domain analysis, together with algebraic graph theory, is employed to derive the necessary and sufficient condition guaranteeing average consensus. It is shown that introducing the DSDF with the proper intensity into the existing consensus protocol can improve the robustness to communication delay. By analyzing the effect of the DSDF on the closed-loop poles, this paper proves that, for a supercritical-delay multi-agent system, this strategy can also accelerate convergence to consensus, provided that the DSDF has the proper intensity. Simulations are provided to demonstrate the effectiveness of the theoretical results.
Funding: Supported by the National Natural Science Foundation of China (60574088).
Abstract: Based on the strategy of information feedback from the followers to the leader, flocking control of a group of agents with a leader is studied. The leader tracks a pre-defined trajectory and, at the same time, uses the feedback information from the followers to modify its motion. The advantage of this control scheme is that it reduces the tracking errors and improves the robustness of the team cohesion to followers' faults. Simulation results are provided to illustrate that information feedback can improve the performance of the system.
Abstract: Future unmanned battles urgently require intelligent combat policies, and multi-agent reinforcement learning offers a promising solution. However, owing to the complexity of combat operations and the large size of the combat group, this task suffers from the credit assignment problem more than other reinforcement learning tasks. This study uses reward shaping to relieve the credit assignment problem and improve policy training for the new generation of large-scale unmanned combat operations. We first prove that multiple reward shaping functions do not change the Nash equilibrium in stochastic games, which provides theoretical support for their use. According to the characteristics of combat operations, we propose tactical reward shaping (TRS), which comprises maneuver shaping advice and threat assessment-based attack shaping advice. We then investigate, through experiments, the effects of different types and combinations of shaping advice on combat policies. The results show that TRS improves both the efficiency and the attack accuracy of combat policies, with the combination of maneuver reward shaping advice and ally-focused attack shaping advice achieving the best performance relative to the baseline strategy.
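One commonly used form of reward shaping that leaves optimal behavior unchanged is potential-based shaping, F(s, s') = gamma * Phi(s') - Phi(s). The abstract does not say whether TRS takes this form, so the sketch below is only an illustration under that assumption; the potential values and function names are hypothetical. It shows numerically why such a term cannot change which trajectories are preferred: its discounted sum telescopes to a quantity that depends only on the first and last states.

```python
def shaping_reward(phi_s, phi_next, gamma):
    """Potential-based shaping term F(s, s') = gamma * Phi(s') - Phi(s)."""
    return gamma * phi_next - phi_s


def discounted_shaping_return(potentials, gamma):
    """Discounted sum of shaping terms along a trajectory of potentials."""
    total = 0.0
    for t in range(len(potentials) - 1):
        total += gamma ** t * shaping_reward(potentials[t], potentials[t + 1], gamma)
    return total


if __name__ == "__main__":
    gamma = 0.9
    potentials = [1.0, 4.0, 2.5, 0.0, 3.0]  # hypothetical Phi(s_0), ..., Phi(s_T)
    steps = len(potentials) - 1
    total = discounted_shaping_return(potentials, gamma)
    telescoped = gamma ** steps * potentials[-1] - potentials[0]
    print(total, telescoped)  # the two values agree up to rounding
```

Several potential-based advice terms can be summed into one potential, so the same telescoping argument applies to combinations of shaping functions.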
Funding: Supported by the Key Program of the National Science Foundation of China (50830201); the Aviation Research Foundation (20060952); the National High Technology Research and Development of China (2007AA03Z117); the Natural Science Foundation of Jiangsu Province (08kjd560009)
Abstract: Health monitoring of large-scale structures must resolve a number of difficulties, such as data transmission and distributed information handling. Multi-agent technology is a good candidate for solving these problems in the field of structural health monitoring. A structural health monitoring system architecture based on multi-agent technology is proposed. A measurement system for an aircraft airfoil is designed with FBG sensors, strain gauges, and the corresponding signal processing circuit. An experiment to determine the location of a concentrated load on the structure is carried out with the system, which combines pattern recognition and multi-agent technologies. The results show that the system can locate the concentrated load on the aircraft airfoil with an accuracy of 91.2%.
Abstract: As an important mechanism in multi-agent interaction, communication enables agents to form complex team relationships rather than constitute a simple set of independent agents. However, existing communication schemes can introduce considerable timing redundancy and irrelevant messages, which seriously limits their practical application. To solve this problem, this paper proposes a targeted multi-agent communication algorithm based on state control (SCTC). The SCTC uses a gating mechanism based on state control to reduce the timing redundancy of communication between agents, and it determines the interaction relationship between agents and the importance weight of a communication message through a series connection of hard- and self-attention mechanisms, thereby realizing targeted processing of communication messages. In addition, the correctness of each agent's final action choice is ensured by minimizing the difference between the fusion message generated from the agent's real communication message and the fusion message generated from the buffered message. Our evaluation on a challenging set of StarCraft II benchmarks indicates that the SCTC can significantly improve learning performance and reduce the communication overhead between agents, thus ensuring better cooperation between agents.
Funding: This work was supported by the Tianjin Natural Science Foundation of China (20JCYBJC01060, 20JCQNJC01450), the National Natural Science Foundation of China (61973175), and the Tianjin Postgraduate Scientific Research and Innovation Project (2020YJSZXB03, 2020YJSZXB12).
Abstract: In this paper, the distributed fuzzy fault-tolerant tracking consensus problem of leader-follower multi-agent systems (MASs) is studied. The system under study includes actuator faults, mismatched parameter uncertainties, nonlinear functions, and exogenous disturbances under switching communication topologies. To solve this problem, a distributed fuzzy fault-tolerant controller with adaptive mechanisms is proposed for each follower to track the state of the leader. Furthermore, a fuzzy logic system is utilized to approximate the unknown nonlinear dynamics. An error estimator is introduced between the mismatched parameter matrix and the input matrix. Then, a selective adaptive law with relative state information is adopted and applied. When calculating the derivative of the Lyapunov function, the coupling terms related to the consensus error and the mismatched parameter uncertainties can be eliminated. Finally, a numerical simulation is given to validate the effectiveness of the proposed protocol.
Abstract: To address the problem of cooperative air defense of a surface warship formation, this paper maps the cooperative air-defense system of systems (SoS) for surface warship formation (CASoSSWF) to the biological immune system (BIS), according to the similarity of the defense mechanisms and characteristics of the CASoSSWF and the BIS. It then designs the component models and the architecture for a monitoring agent, a regulating agent, a killer agent, a pre-warning agent and a communicating agent, making use of the theories and methods of the artificial immune system, the multi-agent system (MAS), the vaccine and the danger theory (DT). Moreover, a new immune multi-agent model using vaccine based on DT (IMMUVBDT) for the cooperative air-defense SoS is put forward. The immune response and immune mechanism of the CASoSSWF are analyzed. The model has the capabilities of memory, evolution, good adaptability to dynamic environments and self-learning, and it adequately embodies the cooperative air-defense mechanism of the CASoSSWF. It therefore offers a novel idea for the CASoSSWF and can provide conceptual models for a surface warship formation operation simulation system.
Funding: Financial support from the National Natural Science Foundation of China (Grant No. 61601491), the Natural Science Foundation of Hubei Province, China (Grant No. 2018CFC865), and the Military Research Project of China (Grant No. YJ2020B117).
Abstract: To solve the problem of multi-target hunting by an unmanned surface vehicle (USV) fleet, a hunting algorithm based on multi-agent reinforcement learning is proposed. First, the hunting environment and a kinematic model without boundary constraints are built, and the criteria for successful target capture are given. Then, the cooperative hunting problem of a USV fleet is modeled as a decentralized partially observable Markov decision process (Dec-POMDP), and a distributed partially observable multi-target hunting Proximal Policy Optimization (DPOMH-PPO) algorithm applicable to USVs is proposed. In addition, an observation model, a reward function and an action space applicable to multi-target hunting tasks are designed. To deal with the dynamically changing dimension of the observational features in partially observable systems, a feature embedding block is proposed. By combining two feature compression methods, column-wise max pooling (CMP) and column-wise average pooling (CAP), an observational feature encoding is established. Finally, the centralized training and decentralized execution framework is adopted to train the hunting strategy. Each USV in the fleet shares the same policy and performs actions independently. Simulation experiments verify the effectiveness of the DPOMH-PPO algorithm in test scenarios with different numbers of USVs. Moreover, the advantages of the proposed model are comprehensively analyzed in terms of algorithm performance, migration effect across task scenarios and self-organization capability after damage, and the potential for deploying and applying DPOMH-PPO in real environments is verified.
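The role of a feature embedding that combines column-wise max pooling and column-wise average pooling can be sketched as follows: however many entities an agent currently observes, the pooled vector has a fixed length. The row layout (one feature list per observed entity), the concatenation order, and the function name are assumptions for illustration; the abstract does not describe the paper's actual embedding block in this detail.

```python
def embed_observations(rows):
    """Concatenate the column-wise max and column-wise mean of a feature matrix.

    rows: one feature list per observed entity; all rows share the same width.
    Returns a list of length 2 * width, independent of the number of rows.
    """
    if not rows:
        raise ValueError("at least one observed entity is required")
    width = len(rows[0])
    if any(len(r) != width for r in rows):
        raise ValueError("all feature rows must have the same width")
    columns = list(zip(*rows))                    # transpose: one tuple per feature
    col_max = [max(c) for c in columns]           # column-wise max pooling
    col_avg = [sum(c) / len(c) for c in columns]  # column-wise average pooling
    return col_max + col_avg


if __name__ == "__main__":
    two_entities = [[1.0, 2.0], [3.0, 0.0]]
    three_entities = [[1.0, 2.0], [3.0, 0.0], [5.0, 4.0]]
    print(embed_observations(two_entities))
    print(len(embed_observations(three_entities)))  # same length as above
```

Because both pooling operations are symmetric in the rows, the encoding is also unaffected by the order in which the entities are listed.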
Funding: Projects (61173026, 61373045, 61202039) supported by the National Natural Science Foundation of China; Project (2012AA02A603) supported by the National High Technology Research and Development Program of China; Projects (K5051223008, K5051223002) supported by the Fundamental Research Funds for the Central Universities of China; Project (513***103E) supported by the Pre-Research Project of the "Twelfth Five-Year-Plan" of China
Abstract: In multi-agent systems (MAS), finding agents that can provide proper service in an open and dynamic environment is the key issue in problem solving. However, because the classical agent model lacks clear semantic information, it is difficult to find agent resources quickly, position agents accurately and complete the system integration using keyword matching. A semantic-based agent dynamic positioning mechanism is proposed to assist in dynamic system integration. Based on the semantic agent model and the description method, a two-stage process comprising a domain positioning stage and a service semantic matching positioning stage is discussed. With this mechanism, proper agents that provide appropriate services for the assigned sub-tasks can be found quickly and accurately. Finally, the effectiveness of the positioning mechanism is validated through an in-depth performance analysis in simulation experiments on dynamic system integration.