Abstract: This paper addresses the time-varying formation-containment (FC) problem for nonholonomic multi-agent systems with a desired trajectory constraint, where only the leaders can acquire information about the desired trajectory. The leaders are given a fixed time-varying formation template and must form it while tracking the desired trajectory, and the followers must converge to the convex hull spanned by the leaders. First, the dynamic models of the nonholonomic systems are linearized to second-order dynamics. Then, based on the desired trajectory and the formation template, FC control protocols are proposed. Sufficient conditions for achieving FC are given, and an algorithm is proposed that determines the control parameters by solving an algebraic Riccati equation. The system is shown to achieve FC, with the average position and velocity of the leaders converging asymptotically to the desired trajectory. Finally, the theoretical results are verified in simulations of a multi-agent system composed of virtual human individuals.
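The containment part of this abstract can be illustrated with a minimal sketch. It is not the paper's protocol: it assumes double-integrator agents in one dimension, stationary leaders, weights Q = I and R = 1, a coupling gain c = 1, and a hand-picked neighbour graph, all chosen here for illustration. For that model the algebraic Riccati equation has a closed-form solution, which the code checks before using the resulting gain K to drive two followers into the interval spanned by the leaders.

```python
import math

# Assumed model (not from the abstract): each agent is a double integrator
# x' = v, v' = u, i.e. A = [[0, 1], [0, 0]], B = [0, 1]^T, with Q = I, R = 1.
SQRT3 = math.sqrt(3.0)
P = ((SQRT3, 1.0), (1.0, SQRT3))  # closed-form ARE solution for this A, B, Q, R
K = (P[1][0], P[1][1])            # K = R^{-1} B^T P, the second row of P


def are_residual(P):
    """Entries of A^T P + P A - P B B^T P + Q for the double integrator above."""
    (p11, p12), (p21, p22) = P
    return (
        (0.0 - p12 * p21 + 1.0, p11 - p12 * p22),
        (p11 - p22 * p21, (p12 + p21) - p22 * p22 + 1.0),
    )


def simulate_containment(leaders, followers, neighbours, c=1.0, dt=0.01, steps=5000):
    """Forward-Euler simulation of 1-D followers under a containment law.

    leaders: fixed leader positions (stationary, so leader velocity is 0).
    followers: initial follower positions (initial velocity 0).
    neighbours[i]: list of ("L", j) or ("F", j) neighbours of follower i.
    """
    x = list(followers)
    v = [0.0] * len(x)
    for _ in range(steps):
        u = []
        for i, nbrs in enumerate(neighbours):
            acc = 0.0
            for kind, j in nbrs:
                xj, vj = (leaders[j], 0.0) if kind == "L" else (x[j], v[j])
                acc += K[0] * (x[i] - xj) + K[1] * (v[i] - vj)
            u.append(-c * acc)  # pull towards neighbours in position and velocity
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        v = [vi + dt * ui for vi, ui in zip(v, u)]
    return x


leaders = [0.0, 10.0]
final = simulate_containment(
    leaders,
    followers=[-5.0, 20.0],  # both followers start outside the leaders' hull
    neighbours=[[("L", 0), ("F", 1)], [("L", 1), ("F", 0)]],
)
```

With this graph the followers settle at a convex combination of the leader positions, so `final` lies inside the interval between the leaders. Time-varying formations, moving leaders, and the nonholonomic linearization described in the abstract are outside the scope of this sketch.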
Abstract: To address cooperative air defense for a surface warship formation, this paper maps the cooperative air-defense system of systems (SoS) for surface warship formation (CASoSSWF) to the biological immune system (BIS), based on the similarity of their defense mechanisms and characteristics. It then designs the component models and the architecture for a monitoring agent, a regulating agent, a killer agent, a pre-warning agent, and a communicating agent, drawing on the theories and methods of the artificial immune system, the multi-agent system (MAS), vaccines, and danger theory (DT). A new immune multi-agent model using vaccine based on DT (IMMUVBDT) is then proposed for the cooperative air-defense SoS, and the immune response and immune mechanism of the CASoSSWF are analyzed. The model is capable of memory, evolution, adaptation to dynamic environments, and self-learning, and it adequately embodies the cooperative air-defense mechanism of the CASoSSWF. It therefore offers a novel approach for the CASoSSWF and can provide conceptual models for a surface warship formation operation simulation system.
Funding: Supported by the National Natural Science Foundation of China (61503407, 61806219, 61703426, 61876189, 61703412) and the China Postdoctoral Science Foundation (2016M602996).
Abstract: Multi-agent systems are well suited to complex intelligent problems. Drawing on game theory, the concept of loyalty is introduced to analyze the relationship between agents' individual income and global benefits and to build the logical architecture of the multi-agent system. To verify the feasibility of the method, the recurrent neural network is optimized, a bi-directional coordination network is built as the training network for deep learning, and specific training scenes are simulated as the training background. After a certain number of training iterations, the model can learn simple strategies autonomously, and as the training time increases, the complexity of the learned strategies rises gradually. The model is verified to be realizable through the examples of obstacle avoidance, firepower distribution, and cooperative cover. Given the same resources, the model converges better than other deep learning training networks and is less prone to falling into local endless loops. Furthermore, its learned strategies are stronger than those of rule-based training models, which is of great practical value.
Abstract: Future unmanned battles desperately require intelligent combat policies, and multi-agent reinforcement learning offers a promising solution. However, due to the complexity of combat operations and the large size of the combat group, this task suffers from the credit assignment problem more than other reinforcement learning tasks. This study uses reward shaping to relieve the credit assignment problem and improve policy training for the new generation of large-scale unmanned combat operations. We first prove that multiple reward shaping functions would not change the Nash equilibrium in stochastic games, providing theoretical support for their use. According to the characteristics of combat operations, we propose tactical reward shaping (TRS), which comprises maneuver shaping advice and threat assessment-based attack shaping advice. Then, we investigate the effects of different types and combinations of shaping advice on combat policies through experiments. The results show that TRS improves both the efficiency and attack accuracy of combat policies, with the combination of maneuver reward shaping advice and ally-focused attack shaping advice achieving the best performance compared with that of the baseline strategy.
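One way to see why several shaping functions can be combined without altering the underlying game is the standard potential-based form F(s, s') = γΦ(s') − Φ(s). The sketch below assumes that form, which the abstract itself does not state; the two potentials, the state layout (distance to target, threat level), and the discount factor are hypothetical. Summed along any trajectory, the shaping terms telescope to a quantity that depends only on the first and last states, which is the intuition behind policy-invariance results of this kind.

```python
GAMMA = 0.95  # illustrative discount factor, not taken from the paper


def maneuver_potential(state):
    """Hypothetical potential: closer to the target is better."""
    distance, _threat = state
    return -distance


def threat_potential(state):
    """Hypothetical potential: lower threat exposure is better."""
    _distance, threat = state
    return -2.0 * threat


def shaping_reward(potentials, state, next_state, gamma=GAMMA):
    """Sum of potential-based terms F_k(s, s') = gamma * phi_k(s') - phi_k(s)."""
    return sum(gamma * phi(next_state) - phi(state) for phi in potentials)


def discounted_shaping_sum(potentials, trajectory, gamma=GAMMA):
    """Discounted sum of shaping rewards along a list of visited states."""
    steps = zip(trajectory, trajectory[1:])
    return sum(
        gamma ** t * shaping_reward(potentials, s, s_next, gamma)
        for t, (s, s_next) in enumerate(steps)
    )


def total_potential(potentials, state):
    """Combined potential: several shaping functions act as one summed potential."""
    return sum(phi(state) for phi in potentials)


potentials = [maneuver_potential, threat_potential]
trajectory = [(10.0, 0.2), (7.0, 0.5), (4.0, 0.4), (1.0, 0.1)]

total = discounted_shaping_sum(potentials, trajectory)
horizon = len(trajectory) - 1
telescoped = (
    GAMMA ** horizon * total_potential(potentials, trajectory[-1])
    - total_potential(potentials, trajectory[0])
)
```

Here `total` and `telescoped` agree up to floating-point error, so the intermediate states contribute nothing to the shaped return beyond the original rewards.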
Abstract: As an important mechanism in multi-agent interaction, communication can make agents form complex team relationships rather than constitute a simple set of multiple independent agents. However, the existing communication schemes can bring much timing redundancy and many irrelevant messages, which seriously affects their practical application. To solve this problem, this paper proposes a targeted multi-agent communication algorithm based on state control (SCTC). The SCTC uses a gating mechanism based on state control to reduce the timing redundancy of communication between agents, and it determines the interaction relationship between agents and the importance weight of a communication message through a series connection of hard- and self-attention mechanisms, realizing targeted communication message processing. In addition, by minimizing the difference between the fusion message generated from the real communication message of each agent and the fusion message generated from the buffered message, the correctness of the final action choice of the agent is ensured. Our evaluation using a challenging set of StarCraft II benchmarks indicates that the SCTC can significantly improve the learning performance and reduce the communication overhead between agents, thus ensuring better cooperation between agents.
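The pipeline described above (a state-controlled gate, then hard attention to pick partners, then self-attention to weight their messages) can be sketched in a few lines. This is an illustration under strong assumptions, not the SCTC algorithm: the gate is a fixed distance threshold on the agent's own state, hard attention is replaced by a distance cutoff, and the attention scores use raw states as queries and keys. In the paper these components are presumably learned. All function names and parameters are hypothetical.

```python
import math


def should_send(state, last_sent_state, threshold=0.5):
    """State-controlled gate: send only if the state moved enough since the last message."""
    return math.dist(state, last_sent_state) > threshold


def hard_attention(receiver, senders, radius=5.0):
    """Stand-in for hard attention: keep only senders within a fixed radius."""
    return [i for i, s in enumerate(senders) if math.dist(receiver, s) <= radius]


def soft_attention(query, keys):
    """Scaled dot-product scores followed by a numerically stable softmax."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_messages(receiver, senders, messages):
    """Hard attention selects partners; soft attention weights their messages."""
    kept = hard_attention(receiver, senders)
    if not kept:
        return [0.0] * len(messages[0]), []
    weights = soft_attention(receiver, [senders[i] for i in kept])
    fused = [
        sum(w * messages[i][d] for w, i in zip(weights, kept))
        for d in range(len(messages[0]))
    ]
    return fused, weights


receiver = (1.0, 1.0)
senders = [(1.0, 0.0), (0.0, 2.0), (9.0, 9.0)]    # the third agent is out of range
messages = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
fused, weights = fuse_messages(receiver, senders, messages)
```

The gate reduces how often an agent transmits, while the two attention stages decide whose messages are read and how strongly, which mirrors the split between timing redundancy and irrelevant messages in the abstract.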
Abstract: We have examined the theoretical implications of combining two main and three auxiliary ligands to form several Ir(III) complexes featuring a transition metal as their core atom to identify some appropriate organic light-emitting diode (OLED) materials. By utilizing electronic structure, frontier molecular orbitals, minimum single-line absorption, triplet excited states, and emission spectral data derived from density functional theory, the usefulness of these Ir(III) complexes in OLEDs was examined. The complexes are (piq)₂Ir(acac), (piq)₂Ir(tmd), (piq)₂Ir(tpip), (fpiq)₂Ir(acac), (fpiq)₂Ir(tmd), and (fpiq)₂Ir(tpip), where piq = 1-phenylisoquinoline, fpiq = 1-(4-fluorophenyl)isoquinoline, acac = (3Z)-4-hydroxypent-3-en-2-one, tmd = (4Z)-5-hydroxy-2,2,6,6-tetramethylhept-4-en-3-one, and tpip = tetraphenylimido-diphosphonate. These complexes all exhibit low efficiency roll-off, especially (fpiq)₂Ir(tpip). Some researchers have successfully synthesized complexes extremely similar to (piq)₂Ir(acac) through the Suzuki-Miyaura coupling reaction.
Funding: Project (52204164) supported by the National Natural Science Foundation of China; Project (2021QNRC001) supported by the Young Elite Scientists Sponsorship Program by CAST, China.
Abstract: The intersection is a widely used traffic line structure from the shallow tunnel to the deep roadway, and determining the area of the roof at risk of subsidence is the key to its stability control. However, applying the traditional maximum equivalent span beam (MESB) theory to determine the deformation range, the peak point, and the influence of the intersection angle poses a challenge. Considering the overall structure of the intersection roof, the maximum equivalent triangular plate (METP) theory is proposed, and formulas for its geometric parameters and its deflection are obtained. The application of the two theories to 18 models with different intersection angles, roadway types, and surrounding rock lithology is verified by numerical analysis. The results show that: 1) the METP structure of the intersection roof, established from the simulation results of each model, successfully determines the location of the roof's high displacement zone; 2) the area comparison method of the METP theory reasonably explains that ① the roof subsidence of the intersection decreases as the intersection angle increases; ② the roof subsidence at the intersection ranks by roadway type as rectangular > arch > circular; ③ the roof subsidence of an intersection with weak surrounding rock is significantly larger than that of an intersection with hard surrounding rock. Based on the application results of the two theories, four advantages of the METP theory are compared and clarified in terms of basic assumptions, mechanical models, main viewpoints, and mechanism analysis. The cause of the large deformation of the intersection roof is then explored. The J₂ peak area of the roof drives the large deformation of the area, and its peak point is consistent with the center of gravity of the METP. Furthermore, the change in the range of this peak is consistent with the change law of the METP's area. Hence, this theory clarifies the large deformation area of the intersection roof, which provides a clear guiding basis for its initial support design, mid-term monitoring, and late local reinforcement.