Q-Learning Based on Experience Knowledge (基于经验知识的Q-学习算法) Cited by: 7
Abstract: To improve the learning speed and convergence rate of Q-learning, a typical reinforcement learning method in agent systems, and to let the learning process make full use of environmental information, this paper proposes a Q-learning algorithm based on experience knowledge. The algorithm uses a function that carries experience-knowledge information, so that the agent learns a model of the system while it performs model-free learning. This avoids relearning the environment model and thereby speeds up the agent's learning. Simulation results show that the algorithm starts the learning process from a better basis and so approaches the optimal state more quickly; its learning efficiency and convergence rate are clearly better than those of standard Q-learning.
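As a rough illustration of the idea in the abstract, the sketch below runs tabular Q-learning on a small corridor task and adds a weighted experience bonus when choosing actions. Everything specific here is an assumption made for illustration: the record does not give the paper's experience-knowledge function, so `experience`, the corridor environment, the `weight` parameter, and the constants are all hypothetical. The bonus only biases action selection; the update itself is the standard Q-learning rule.

```python
import random

N_STATES = 6          # hypothetical corridor with cells 0..5; the goal is the last cell
ACTIONS = (-1, +1)    # move left or move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1   # illustrative values, not taken from the paper


def step(state, action):
    """Deterministic corridor dynamics: reward 1 only on reaching the goal."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def experience(state, action):
    """Hypothetical experience-knowledge function: favours moving toward the goal."""
    return 1.0 if action > 0 else 0.0


def choose(q, state, rng, weight):
    """Epsilon-greedy over Q plus the weighted experience bonus; ties broken at random."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    scores = [q[state][i] + weight * experience(state, a)
              for i, a in enumerate(ACTIONS)]
    best = max(scores)
    return ACTIONS[rng.choice([i for i, s in enumerate(scores) if s == best])]


def train(episodes, weight, seed=0):
    """Run tabular Q-learning; return the Q-table and the steps taken per episode."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    steps_per_episode = []
    for _ in range(episodes):
        state, done, steps = 0, False, 0
        while not done:
            action = choose(q, state, rng, weight)
            nxt, reward, done = step(state, action)
            # Standard Q-learning target: no bootstrap from a terminal state.
            target = reward + (0.0 if done else GAMMA * max(q[nxt]))
            i = ACTIONS.index(action)
            q[state][i] += ALPHA * (target - q[state][i])
            state, steps = nxt, steps + 1
        steps_per_episode.append(steps)
    return q, steps_per_episode
```

Calling `train(episodes=200, weight=0.0)` gives plain Q-learning, while `train(episodes=200, weight=1.0)` applies the bonus; comparing the two `steps_per_episode` lists is one way to see how prior knowledge could shorten early episodes in a toy setting like this one. It says nothing about the results reported in the paper.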
Source: Techniques of Automation and Applications (《自动化技术与应用》), 2006, No. 11, pp. 10-12 (3 pages)
Keywords: reinforcement learning; Q-learning algorithm; agent; experience knowledge
About the author: Song Qingkun (宋清昆, born 1964), male, professor and graduate supervisor. Research interest: artificial intelligence.