
Q-Learning Algorithm Based on Incremental RBF Network

Cited by: 7
Abstract: To raise the behavioral intelligence of robots, a Q-learning algorithm based on an incremental radial basis function network (IRBFN), called IRBFN-QL, is proposed. Its core idea is to learn and store the Q-value function through adaptive growth of the network structure and online learning of the network parameters, so that a robot can learn behavioral strategies autonomously and incrementally in an unknown environment. First, the approximate linear independence (ALD) criterion is used to add network nodes online, so that the robot's memory capacity grows adaptively as the state space expands. Adding nodes also changes the internal connections of the network topology, so the kernel recursive least squares (KRLS) algorithm is used to update those connections and their parameters, allowing the robot to keep extending and optimizing its behavioral strategy. In addition, to avoid overfitting, an L2 regularization term is integrated into KRLS, yielding the L2-constrained KRLS (L2KRLS) algorithm. Experimental results show that IRBFN-QL enables a robot to interact autonomously with an unknown environment and gradually improves a mobile robot's navigation ability in corridor environments.
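The node-growth step described in the abstract can be illustrated with a minimal sketch. It assumes a Gaussian kernel and the usual form of the ALD test: a sample becomes a new node only when its kernel feature cannot be approximated well by the existing nodes. All names here (`gaussian_kernel`, `ald_residual`, `build_dictionary`, `fit_ridge_weights`, `predict`) and the parameters `width`, `threshold`, and `lam` are hypothetical and not taken from the paper. The weight fit is a plain batch ridge (L2-regularized) solve, used as a stand-in for the recursive L2KRLS update, whose details the abstract does not give.

```python
import numpy as np


def gaussian_kernel(x, y, width=1.0):
    """Gaussian (RBF) kernel between two state vectors (assumed kernel choice)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.exp(-np.sum((x - y) ** 2) / (2.0 * width ** 2)))


def ald_residual(dictionary, x, width=1.0, jitter=1e-10):
    """Squared error of approximating phi(x) by the span of the dictionary features."""
    if not dictionary:
        return gaussian_kernel(x, x, width)
    K = np.array([[gaussian_kernel(a, b, width) for b in dictionary]
                  for a in dictionary])
    k = np.array([gaussian_kernel(a, x, width) for a in dictionary])
    # Small jitter keeps the solve stable when dictionary members are close.
    coeffs = np.linalg.solve(K + jitter * np.eye(len(dictionary)), k)
    return gaussian_kernel(x, x, width) - float(k @ coeffs)


def build_dictionary(samples, threshold=0.1, width=1.0):
    """Add a sample as a new node only when its ALD residual exceeds the threshold."""
    dictionary = []
    for x in samples:
        if ald_residual(dictionary, x, width) > threshold:
            dictionary.append(np.asarray(x, dtype=float))
    return dictionary


def fit_ridge_weights(dictionary, samples, targets, lam=1e-2, width=1.0):
    """Batch L2-regularized least squares over the RBF features (not recursive)."""
    Phi = np.array([[gaussian_kernel(x, c, width) for c in dictionary]
                    for x in samples])
    A = Phi.T @ Phi + lam * np.eye(len(dictionary))
    return np.linalg.solve(A, Phi.T @ np.asarray(targets, dtype=float))


def predict(dictionary, weights, x, width=1.0):
    """Evaluate the RBF expansion at x, e.g. as a Q-value estimate for one action."""
    return float(sum(w * gaussian_kernel(x, c, width)
                     for w, c in zip(weights, dictionary)))


if __name__ == "__main__":
    states = [[0.0], [0.01], [1.0], [1.02], [2.0]]
    nodes = build_dictionary(states, threshold=0.1, width=0.5)
    weights = fit_ridge_weights(nodes, states, [0.0, 0.0, 1.0, 1.0, 2.0], width=0.5)
    print(len(nodes), predict(nodes, weights, [1.0], width=0.5))
```

In this sketch, near-duplicate states leave the node set unchanged, so the network grows only when the robot visits a genuinely new region of the state space. A recursive variant would update the inverse kernel matrix and the weights incrementally instead of re-solving from scratch.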
Authors: HU Yanming (胡艳明); LI Decai (李德才); HE Yuqing (何玉庆); HAN Jianda (韩建达) (The State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China; Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110016, China; University of Chinese Academy of Sciences, Beijing 100049, China; College of Artificial Intelligence, Nankai University, Tianjin 300071, China)
Source: Robot (《机器人》), 2019, Issue 5, pp. 562-573 (12 pages). Indexed in EI, CSCD, and the Peking University Core Journals list (北大核心).
Funding: National Natural Science Foundation of China (U1608253, 91748208)
Keywords: kernel method; least squares algorithm; incremental learning; mobile robot; Q-learning
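About the authors: HU Yanming (born 1991), male, Ph.D. candidate; research interests: robot learning, path planning. LI Decai (born 1983), male, Ph.D., associate researcher; research interests: unmanned surface vessels, unmanned ground vehicles. Corresponding author: HE Yuqing (born 1980), male, Ph.D., researcher; research interests: unmanned aerial vehicles, sea-land-air cooperation. heyuqing@sia.cn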
