
Motion control method of two-link manipulator based on deep reinforcement learning (cited by: 21)
Abstract: To address the motion control problem of a two-link manipulator, a control method based on deep reinforcement learning is proposed. First, a simulation environment is built that contains the two-link manipulator, a target, and an obstacle. Then, three deep reinforcement learning models are established and trained according to the target setting, state variables, and reward and punishment mechanism of the environment model, and motion control of the two-link manipulator is achieved. After the three models are compared and analyzed, the Deep Deterministic Policy Gradient (DDPG) algorithm is selected for further study to improve its applicability, so as to shorten the debugging time of the manipulator model and allow the manipulator to avoid the obstacle and reach the target smoothly. Experimental results show that the proposed deep reinforcement learning method can effectively control the motion of the two-link manipulator, and that the control model based on the improved DDPG algorithm has its convergence speed increased by two times and is more stable after convergence. Compared with traditional control methods, the proposed deep reinforcement learning control method is more efficient and more widely applicable.
Authors: WANG Jianping (王建平); WANG Gang (王刚); MAO Xiaobin (毛晓彬); MA Enqi (马恩琪) (School of Mechanical and Precision Instrument Engineering, Xi'an University of Technology, Xi'an, Shaanxi 710048, China)
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2021, No. 6, pp. 1799-1804 (6 pages)
Keywords: deep reinforcement learning; two-link manipulator; motion control; reward and punishment mechanism; Deep Deterministic Policy Gradient (DDPG) algorithm
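About the authors: WANG Jianping (born 1970), male, from Dai County, Shanxi; associate professor, Ph.D.; research interests: nonlinear system dynamics and intelligent control. Corresponding author: WANG Gang (born 1996), male, from Baoji, Shaanxi; master's student; research interests: intelligent control and deep reinforcement learning; email: 1123016209@qq.com. MAO Xiaobin (born 1998), male, from Linfen, Shanxi; master's student; research interest: intelligent control. MA Enqi (born 1998), male, from Weinan, Shaanxi; master's student; research interest: intelligent control.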
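
The abstract describes the environment only in outline: a two-link manipulator, a target, an obstacle, and a reward and punishment mechanism. The Python sketch below is one hypothetical way such an environment's geometry and reward could look. It is not the authors' implementation: the link lengths, the obstacle and goal radii, the penalty and bonus values, and the function names `forward_kinematics` and `reward` are all assumptions made for illustration, and the collision check covers only the end effector.

```python
import math

# Hypothetical link lengths; the abstract does not give the arm's dimensions.
LINK1_LEN = 1.0
LINK2_LEN = 1.0


def forward_kinematics(theta1, theta2, l1=LINK1_LEN, l2=LINK2_LEN):
    """Return the (x, y) end-effector position of a planar two-link arm.

    theta1 is the shoulder angle and theta2 the elbow angle relative to
    the first link, both in radians.
    """
    elbow_x = l1 * math.cos(theta1)
    elbow_y = l1 * math.sin(theta1)
    x = elbow_x + l2 * math.cos(theta1 + theta2)
    y = elbow_y + l2 * math.sin(theta1 + theta2)
    return x, y


def reward(theta1, theta2, target, obstacle,
           obstacle_radius=0.2, goal_radius=0.05):
    """Illustrative reward and punishment signal (all constants assumed).

    - base term: negative distance from the end effector to the target
    - punishment: -10 if the end effector is inside the obstacle radius
    - reward: +10 if the end effector is within goal_radius of the target
    Only the end effector is checked against the obstacle, not the links.
    """
    x, y = forward_kinematics(theta1, theta2)
    dist_target = math.hypot(x - target[0], y - target[1])
    dist_obstacle = math.hypot(x - obstacle[0], y - obstacle[1])

    r = -dist_target
    if dist_obstacle < obstacle_radius:
        r -= 10.0
    if dist_target < goal_radius:
        r += 10.0
    return r
```

In a DDPG setup, a function of this kind would typically be called once per simulation step, with the actor network producing continuous joint commands and the critic learning from the returned scalar; how the paper actually shapes its reward is not stated in the abstract.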