期刊文献+
共找到6篇文章
< 1 >
每页显示 20 50 100
改进Deep Q Networks的交通信号均衡调度算法
1
作者 贺道坤 《机械设计与制造》 北大核心 2025年第4期135-140,共6页
为进一步缓解城市道路高峰时段十字路口的交通拥堵现象,实现路口各道路车流均衡通过,基于改进Deep Q Networks提出了一种的交通信号均衡调度算法。提取十字路口与交通信号调度最相关的特征,分别建立单向十字路口交通信号模型和线性双向... 为进一步缓解城市道路高峰时段十字路口的交通拥堵现象,实现路口各道路车流均衡通过,基于改进Deep Q Networks提出了一种的交通信号均衡调度算法。提取十字路口与交通信号调度最相关的特征,分别建立单向十字路口交通信号模型和线性双向十字路口交通信号模型,并基于此构建交通信号调度优化模型;针对Deep Q Networks算法在交通信号调度问题应用中所存在的收敛性、过估计等不足,对Deep Q Networks进行竞争网络改进、双网络改进以及梯度更新策略改进,提出相适应的均衡调度算法。通过与经典Deep Q Networks仿真比对,验证论文算法对交通信号调度问题的适用性和优越性。基于城市道路数据,分别针对两种场景进行仿真计算,仿真结果表明该算法能够有效缩减十字路口车辆排队长度,均衡各路口车流通行量,缓解高峰出行方向的道路拥堵现象,有利于十字路口交通信号调度效益的提升。 展开更多
关键词 交通信号调度 十字路口 deep q networks 深度强化学习 智能交通
在线阅读 下载PDF
基于Deep Q Networks的机械臂推动和抓握协同控制 被引量:3
2
作者 贺道坤 《现代制造工程》 CSCD 北大核心 2021年第7期23-28,共6页
针对目前机械臂在复杂场景应用不足以及推动和抓握自主协同控制研究不多的现状,发挥深度Q网络(Deep Q Networks)无规则、自主学习优势,提出了一种基于Deep Q Networks的机械臂推动和抓握协同控制方法。通过2个完全卷积网络将场景信息映... 针对目前机械臂在复杂场景应用不足以及推动和抓握自主协同控制研究不多的现状,发挥深度Q网络(Deep Q Networks)无规则、自主学习优势,提出了一种基于Deep Q Networks的机械臂推动和抓握协同控制方法。通过2个完全卷积网络将场景信息映射至推动或抓握动作,经过马尔可夫过程,采取目光长远奖励机制,选取最佳行为函数,实现对复杂场景机械臂推动和抓握动作的自主协同控制。在仿真和真实场景实验中,该方法在复杂场景中能够通过推动和抓握自主协同操控实现对物块的快速抓取,并获得更高的动作效率和抓取成功率。 展开更多
关键词 机械臂 抓握 推动 深度q网络(deep q networks) 协同控制
在线阅读 下载PDF
基于文件工作流和强化学习的工程项目文件管理优化方法
3
作者 司鹏搏 庞睿 +2 位作者 杨睿哲 孙艳华 李萌 《北京工业大学学报》 北大核心 2025年第10期1162-1170,共9页
为了解决大型工程项目中文件的传输时间与成本问题,提出一个基于文件工作流的工程项目文件管理优化方法。首先,构建了工程项目文件管理环境和具有逻辑顺序的文件工作流模型,分析了文件的传输和缓存。在此基础上,将文件管理优化问题建模... 为了解决大型工程项目中文件的传输时间与成本问题,提出一个基于文件工作流的工程项目文件管理优化方法。首先,构建了工程项目文件管理环境和具有逻辑顺序的文件工作流模型,分析了文件的传输和缓存。在此基础上,将文件管理优化问题建模为马尔可夫过程,通过设计状态空间、动作空间及奖励函数等实现文件工作流的任务完成时间与缓存成本的联合优化。其次,采用对抗式双重深度Q网络(dueling double deep Q network,D3QN)来降低训练时间,提高训练效率。仿真结果验证了提出方案在不同参数配置下文件传输的有效性,并且在任务体量增大时仍能保持较好的优化能力。 展开更多
关键词 文件工作流 传输时间 马尔可夫过程 对抗式双重深度q网络(dueling double deep q network D3qN) 文件管理 联合优化
在线阅读 下载PDF
Deep reinforcement learning for UAV swarm rendezvous behavior 被引量:2
4
作者 ZHANG Yaozhong LI Yike +1 位作者 WU Zhuoran XU Jialin 《Journal of Systems Engineering and Electronics》 SCIE EI CSCD 2023年第2期360-373,共14页
The unmanned aerial vehicle(UAV)swarm technology is one of the research hotspots in recent years.With the continuous improvement of autonomous intelligence of UAV,the swarm technology of UAV will become one of the mai... The unmanned aerial vehicle(UAV)swarm technology is one of the research hotspots in recent years.With the continuous improvement of autonomous intelligence of UAV,the swarm technology of UAV will become one of the main trends of UAV development in the future.This paper studies the behavior decision-making process of UAV swarm rendezvous task based on the double deep Q network(DDQN)algorithm.We design a guided reward function to effectively solve the problem of algorithm convergence caused by the sparse return problem in deep reinforcement learning(DRL)for the long period task.We also propose the concept of temporary storage area,optimizing the memory playback unit of the traditional DDQN algorithm,improving the convergence speed of the algorithm,and speeding up the training process of the algorithm.Different from traditional task environment,this paper establishes a continuous state-space task environment model to improve the authentication process of UAV task environment.Based on the DDQN algorithm,the collaborative tasks of UAV swarm in different task scenarios are trained.The experimental results validate that the DDQN algorithm is efficient in terms of training UAV swarm to complete the given collaborative tasks while meeting the requirements of UAV swarm for centralization and autonomy,and improving the intelligence of UAV swarm collaborative task execution.The simulation results show that after training,the proposed UAV swarm can carry out the rendezvous task well,and the success rate of the mission reaches 90%. 展开更多
关键词 double deep q network(DDqN)algorithms unmanned aerial vehicle(UAV)swarm task decision deep reinforcement learning(DRL) sparse returns
在线阅读 下载PDF
Real-time UAV path planning based on LSTM network 被引量:2
5
作者 ZHANG Jiandong GUO Yukun +3 位作者 ZHENG Lihui YANG Qiming SHI Guoqing WU Yong 《Journal of Systems Engineering and Electronics》 SCIE CSCD 2024年第2期374-385,共12页
To address the shortcomings of single-step decision making in the existing deep reinforcement learning based unmanned aerial vehicle(UAV)real-time path planning problem,a real-time UAV path planning algorithm based on... To address the shortcomings of single-step decision making in the existing deep reinforcement learning based unmanned aerial vehicle(UAV)real-time path planning problem,a real-time UAV path planning algorithm based on long shortterm memory(RPP-LSTM)network is proposed,which combines the memory characteristics of recurrent neural network(RNN)and the deep reinforcement learning algorithm.LSTM networks are used in this algorithm as Q-value networks for the deep Q network(DQN)algorithm,which makes the decision of the Q-value network has some memory.Thanks to LSTM network,the Q-value network can use the previous environmental information and action information which effectively avoids the problem of single-step decision considering only the current environment.Besides,the algorithm proposes a hierarchical reward and punishment function for the specific problem of UAV real-time path planning,so that the UAV can more reasonably perform path planning.Simulation verification shows that compared with the traditional feed-forward neural network(FNN)based UAV autonomous path planning algorithm,the RPP-LSTM proposed in this paper can adapt to more complex environments and has significantly improved robustness and accuracy when performing UAV real-time path planning. 展开更多
关键词 deep q network path planning neural network unmanned aerial vehicle(UAV) long short-term memory(LSTM)
在线阅读 下载PDF
Situational continuity-based air combat autonomous maneuvering decision-making 被引量:5
6
作者 Jian-dong Zhang Yi-fei Yu +3 位作者 Li-hui Zheng Qi-ming Yang Guo-qing Shi Yong Wu 《Defence Technology(防务技术)》 SCIE EI CAS CSCD 2023年第11期66-79,共14页
In order to improve the performance of UAV's autonomous maneuvering decision-making,this paper proposes a decision-making method based on situational continuity.The algorithm in this paper designs a situation eval... In order to improve the performance of UAV's autonomous maneuvering decision-making,this paper proposes a decision-making method based on situational continuity.The algorithm in this paper designs a situation evaluation function with strong guidance,then trains the Long Short-Term Memory(LSTM)under the framework of Deep Q Network(DQN)for air combat maneuvering decision-making.Considering the continuity between adjacent situations,the method takes multiple consecutive situations as one input of the neural network.To reflect the difference between adjacent situations,the method takes the difference of situation evaluation value as the reward of reinforcement learning.In different scenarios,the algorithm proposed in this paper is compared with the algorithm based on the Fully Neural Network(FNN)and the algorithm based on statistical principles respectively.The results show that,compared with the FNN algorithm,the algorithm proposed in this paper is more accurate and forwardlooking.Compared with the algorithm based on the statistical principles,the decision-making of the algorithm proposed in this paper is more efficient and its real-time performance is better. 展开更多
关键词 UAV Maneuvering decision-making Situational continuity Long short-term memory(LSTM) deep q network(DqN) Fully neural network(FNN)
在线阅读 下载PDF
上一页 1 下一页 到第
使用帮助 返回顶部