Search results - VIP Chinese Journal Service Platform

1. A novel trajectories optimizing method for dynamic soaring based on deep reinforcement learning
Authors: Wanyong Zou, Ni Li, Fengcheng An, Kaibo Wang, Changyin Dong
Affiliation: School of Aeronautics; National Key Laboratory of Aircraft Configuration Design
Source: Defence Technology (防务技术), 2025, No. 4, pp. 99-108 (10 pages)
Funding: Support received from the National Natural Science Foundation of China (Grant Nos. 52372398 and 62003272).
Abstract: Dynamic soaring, inspired by the wind-riding flight of birds such as albatrosses, is a biomimetic technique that leverages wind fields to enhance the endurance of unmanned aerial vehicles (UAVs). Achieving a precise soaring trajectory is crucial for maximizing energy efficiency during flight. Existing nonlinear programming methods depend heavily on the choice of initial values, which are hard to determine. Therefore, this paper introduces a deep reinforcement learning method based on a differentially flat model for dynamic soaring trajectory planning and optimization. Initially, the gliding trajectory is parameterized using Fourier basis functions, achieving a flexible trajectory representation with a minimal number of hyperparameters. Subsequently, the trajectory optimization problem is formulated as a dynamic interactive process of Markov decision-making. The hyperparameters of the trajectory are optimized using the Proximal Policy Optimization (PPO2) algorithm from deep reinforcement learning (DRL), reducing the strong reliance on initial value settings in the optimization process. Finally, a comparison between the proposed method and the nonlinear programming method reveals that the trajectory generated by the proposed approach is smoother while meeting the same performance requirements. Specifically, the proposed method achieves a 34% reduction in maximum thrust, a 39.4% decrease in maximum thrust difference, and a 33% reduction in maximum airspeed difference.
Keywords: Dynamic soaring; Differential flatness; Trajectory optimization; Proximal policy optimization
Classification: V279 [Aerospace Science and Technology - Aircraft Design]
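The abstract of record 1 says the trajectory is parameterized with Fourier basis functions. A minimal sketch of what such a parameterization can look like follows. The number of harmonics, the coefficient values, the axis names and both function names are hypothetical illustrations and are not taken from the paper.

```python
import math
from typing import Dict, List, Sequence, Tuple

# (a0, cosine coefficients, sine coefficients) for one coordinate axis.
Coeffs = Tuple[float, Sequence[float], Sequence[float]]


def fourier_series(t: float, period: float, a0: float,
                   a: Sequence[float], b: Sequence[float]) -> float:
    """Evaluate a truncated Fourier series at time t."""
    if len(a) != len(b):
        raise ValueError("cosine and sine coefficient lists must match in length")
    omega = 2.0 * math.pi / period  # fundamental angular frequency
    value = a0
    for k, (ak, bk) in enumerate(zip(a, b), start=1):
        value += ak * math.cos(k * omega * t) + bk * math.sin(k * omega * t)
    return value


def sample_trajectory(axes: Dict[str, Coeffs], period: float,
                      n_samples: int) -> List[Dict[str, float]]:
    """Sample every axis at n_samples + 1 evenly spaced times over one period."""
    points = []
    for i in range(n_samples + 1):
        t = period * i / n_samples
        point = {"t": t}
        for name, (a0, a, b) in axes.items():
            point[name] = fourier_series(t, period, a0, a, b)
        points.append(point)
    return points


# Hypothetical coefficients: a circle in x-y with a twice-per-cycle height change.
axes = {
    "x": (0.0, [10.0], [0.0]),
    "y": (0.0, [0.0], [10.0]),
    "z": (20.0, [0.0, 5.0], [0.0, 0.0]),
}
path = sample_trajectory(axes, period=8.0, n_samples=16)
```

Because every term is periodic, the sampled path closes on itself after one period, and the whole shape is controlled by a small coefficient vector that an optimizer can adjust.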

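2. UAV path planning based on multi-agent deep reinforcement learning (基于多智能体深度强化学习的无人机路径规划)
Cited by: 10
Authors: Si Pengbo (司鹏搏), Wu Bing (吴兵), Yang Ruizhe (杨睿哲), Li Meng (李萌), Sun Yanhua (孙艳华)
Affiliation: Faculty of Information Technology, Beijing University of Technology (北京工业大学信息学部)
Source: Journal of Beijing University of Technology (《北京工业大学学报》), CAS, CSCD, PKU Core (北大核心), 2023, No. 4, pp. 449-458 (10 pages)
Funding: National Natural Science Foundation of China (61901011); Science and Technology Project of the Beijing Municipal Education Commission (KM202010005017, KM202110005021).
Abstract: To address path planning for multiple unmanned aerial vehicles (UAVs) in complex environments, a multi-agent deep reinforcement learning framework for UAV path planning is proposed. The framework first models the path planning problem as a partially observable Markov process and extends the proximal policy optimization algorithm to the multi-agent setting; by designing the UAV state observation space, action space, and reward function, it achieves obstacle-free path planning for multiple UAVs. Second, to suit the limited computing resources carried by UAVs, a network pruning-based multi-agent proximal policy optimization (NP-MAPPO) algorithm is further proposed, which improves training efficiency. Simulation results verify the effectiveness of the proposed multi-UAV path planning framework under each parameter configuration and the superiority of the NP-MAPPO algorithm in training time.
Keywords: unmanned aerial vehicle (UAV); complex environment; path planning; Markov decision process; multi-agent proximal policy optimization (MAPPO) algorithm; network pruning (NP)
Classification: U461 [Mechanical Engineering - Vehicle Engineering]; TP308 [Automation and Computer Technology - Computer System Architecture]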

3. Task assignment in ground-to-air confrontation based on multiagent deep reinforcement learning
Cited by: 4
Authors: Jia-yi Liu, Gang Wang, Qiang Fu, Shao-hua Yue, Si-yuan Wang
Affiliation: Air and Missile Defense College, Air Force Engineering University
Source: Defence Technology (防务技术), SCIE, EI, CAS, CSCD, 2023, No. 1, pp. 210-219 (10 pages)
Funding: National Natural Science Foundation of China (Grant No. 62106283); National Natural Science Foundation of China (Grant No. 72001214), which funded the experiments; Natural Science Foundation of Shaanxi Province (Grant No. 2020JQ-484).
Abstract: Ground-to-air confrontation involves large-scale task assignment with many concurrent assignments and random events. When existing task assignment methods are applied to ground-to-air confrontation, they handle complex tasks inefficiently and suffer from interactive conflicts in multiagent systems. This study proposes a multiagent architecture based on one general agent with multiple narrow agents (OGMN) to reduce task assignment conflicts. Considering the slow speed of traditional dynamic task assignment algorithms, this paper proposes the proximal policy optimization for task assignment of general and narrow agents (PPO-TAGNA) algorithm. Based on the idea of the optimal assignment strategy algorithm and combined with the training framework of deep reinforcement learning (DRL), the algorithm adds a multihead attention mechanism and a stage reward mechanism to the bilateral band clipping PPO algorithm to solve the problem of low training efficiency. Finally, simulation experiments are carried out in the digital battlefield. The multiagent architecture based on OGMN combined with the PPO-TAGNA algorithm can obtain higher rewards faster and has a higher win ratio. By analyzing agent behavior, the efficiency, superiority and rationality of resource utilization of this method are verified.
Keywords: Ground-to-air confrontation; Task assignment; General and narrow agents; Deep reinforcement learning; Proximal policy optimization (PPO)
Classification: E91 [Military Science]
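All four records build on proximal policy optimization. For orientation, here is a minimal sketch of the standard single-agent PPO clipped surrogate objective. It is not the bilateral band clipping variant or the PPO-TAGNA algorithm named in record 3, whose details the abstract does not give. The function name and the default `clip_eps=0.2` are assumptions chosen for illustration.

```python
import math
from typing import Sequence


def ppo_clip_objective(logp_new: Sequence[float], logp_old: Sequence[float],
                       advantages: Sequence[float], clip_eps: float = 0.2) -> float:
    """Mean clipped surrogate objective over a batch (to be maximized)."""
    if not (len(logp_new) == len(logp_old) == len(advantages)) or not advantages:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(ln - lo)  # pi_new(a|s) / pi_old(a|s)
        clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Taking the minimum removes the incentive to push the ratio
        # outside the interval [1 - clip_eps, 1 + clip_eps].
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# With an unchanged policy the ratio is 1, so the objective is the mean advantage.
unchanged = ppo_clip_objective([-1.0, -2.0], [-1.0, -2.0], [0.5, 1.5])
```

With a positive advantage the gain is capped once the ratio exceeds `1 + clip_eps`; with a negative advantage the unclipped (more pessimistic) term is kept, which is what limits the size of each policy update.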

4. Cooperative multi-target hunting by unmanned surface vehicles based on multi-agent reinforcement learning
Cited by: 2
Authors: Jiawei Xia, Yasong Luo, Zhikun Liu, Yalun Zhang, Haoran Shi, Zhong Liu
Affiliation: College of Weaponry Engineering; Institute of Vibration and Noise
Source: Defence Technology (防务技术), SCIE, EI, CAS, CSCD, 2023, No. 11, pp. 80-94 (15 pages)
Funding: Financial support from the National Natural Science Foundation of China (Grant No. 61601491), the Natural Science Foundation of Hubei Province, China (Grant No. 2018CFC865), and the Military Research Project of China (Grant No. YJ2020B117).
Abstract: To solve the problem of multi-target hunting by an unmanned surface vehicle (USV) fleet, a hunting algorithm based on multi-agent reinforcement learning is proposed. Firstly, the hunting environment and kinematic model without boundary constraints are built, and the criteria for successful target capture are given. Then, the cooperative hunting problem of a USV fleet is modeled as a decentralized partially observable Markov decision process (Dec-POMDP), and a distributed partially observable multitarget hunting Proximal Policy Optimization (DPOMH-PPO) algorithm applicable to USVs is proposed. In addition, an observation model, a reward function and the action space applicable to multi-target hunting tasks are designed. To deal with the dynamic change of observational feature dimension input by partially observable systems, a feature embedding block is proposed. By combining the two feature compression methods of column-wise max pooling (CMP) and column-wise average pooling (CAP), observational feature encoding is established. Finally, the centralized training and decentralized execution framework is adopted to complete the training of the hunting strategy. Each USV in the fleet shares the same policy and performs actions independently. Simulation experiments have verified the effectiveness of the DPOMH-PPO algorithm in test scenarios with different numbers of USVs. Moreover, the advantages of the proposed model are comprehensively analyzed in terms of algorithm performance, migration effect in task scenarios and self-organization capability after being damaged, and the potential deployment and application of DPOMH-PPO in the real environment is verified.
Keywords: Unmanned surface vehicles; Multi-agent deep reinforcement learning; Cooperative hunting; Feature embedding; Proximal policy optimization
Classification: TP3 [Automation and Computer Technology - Computer Science and Technology]
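Record 4 describes a feature embedding block that combines column-wise max pooling (CMP) and column-wise average pooling (CAP) to cope with a changing number of observed entities. The sketch below illustrates the general idea only. Joining the two pooled vectors by concatenation, the function name and the sample numbers are assumptions, since the abstract does not say how the two are combined.

```python
from typing import List, Sequence


def embed_features(rows: Sequence[Sequence[float]]) -> List[float]:
    """Compress an N x D observation matrix into a fixed-length 2*D vector.

    Each row holds the D features of one observed entity; N may vary
    between time steps, but the output length depends only on D.
    """
    if not rows:
        raise ValueError("at least one observed entity is required")
    width = len(rows[0])
    if any(len(r) != width for r in rows):
        raise ValueError("all rows must have the same number of features")
    columns = list(zip(*rows))                      # transpose to D columns
    cmp_part = [max(col) for col in columns]        # column-wise max pooling
    cap_part = [sum(col) / len(col) for col in columns]  # column-wise average pooling
    return cmp_part + cap_part                      # assumed: simple concatenation


# Two and three observed entities give embeddings of the same length.
two = embed_features([[1.0, 4.0], [3.0, 2.0]])
three = embed_features([[1.0, 4.0], [3.0, 2.0], [5.0, 0.0]])
```

Both pooling operations are also invariant to the order of the rows, so the embedding does not depend on how the observed entities happen to be listed.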

No.	Title	Authors	Source	Year	Cited by
1	A novel trajectories optimizing method for dynamic soaring based on deep reinforcement learning	Wanyong Zou, Ni Li, Fengcheng An, Kaibo Wang, Changyin Dong	Defence Technology (防务技术)	2025	0
2	UAV path planning based on multi-agent deep reinforcement learning (基于多智能体深度强化学习的无人机路径规划)	Si Pengbo (司鹏搏), Wu Bing (吴兵), Yang Ruizhe (杨睿哲), Li Meng (李萌), Sun Yanhua (孙艳华)	Journal of Beijing University of Technology (《北京工业大学学报》), CAS, CSCD, PKU Core (北大核心)	2023	10
3	Task assignment in ground-to-air confrontation based on multiagent deep reinforcement learning	Jia-yi Liu, Gang Wang, Qiang Fu, Shao-hua Yue, Si-yuan Wang	Defence Technology (防务技术), SCIE, EI, CAS, CSCD	2023	4
4	Cooperative multi-target hunting by unmanned surface vehicles based on multi-agent reinforcement learning	Jiawei Xia, Yasong Luo, Zhikun Liu, Yalun Zhang, Haoran Shi, Zhong Liu	Defence Technology (防务技术), SCIE, EI, CAS, CSCD	2023	2