
A Survey of Inverse Reinforcement Learning Algorithms, Theory and Applications (逆强化学习算法、理论与应用研究综述)

Cited by: 1
Abstract: With the development of high-dimensional feature representation and approximation capabilities, the application of reinforcement learning (RL) to real-world problems such as game playing, optimization and decision-making, and intelligent driving has made significant progress. However, reinforcement learning suffers from the difficulty of manually designing reward functions for the interaction between an agent and its environment, which motivated the research direction of inverse reinforcement learning (IRL). How to learn reward functions from expert demonstrations and perform policy optimization is an important research topic of great significance in the field of artificial intelligence. This paper presents a comprehensive overview of recent progress in inverse reinforcement learning algorithms. First, new advances in the theory of inverse reinforcement learning are introduced; then the challenges faced by inverse reinforcement learning and future development trends are analyzed; finally, the application progress and application prospects of inverse reinforcement learning are discussed.
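To make the reward-learning idea in the abstract concrete, below is a minimal sketch of linear inverse reinforcement learning via feature-expectation matching (in the spirit of apprenticeship learning) on a toy chain MDP. The tiny MDP, the one-hot state features, and the projection-style update are illustrative assumptions for this example only, not the formulation used in the survey.

```python
# Illustrative sketch: linear IRL by matching discounted feature expectations
# on a small deterministic chain MDP.  All modeling choices here are assumed
# for the example (toy MDP, one-hot features, projection-style update).
import numpy as np

N_STATES, N_ACTIONS, GAMMA = 5, 2, 0.9

# Deterministic chain MDP: action 0 moves left, action 1 moves right.
def step(s, a):
    return max(s - 1, 0) if a == 0 else min(s + 1, N_STATES - 1)

# One-hot state features phi(s); the reward is assumed linear: r(s) = w . phi(s).
PHI = np.eye(N_STATES)

def greedy_policy(w, n_iters=200):
    """Value iteration under the linear reward r(s) = w . phi(s)."""
    r = PHI @ w
    v = np.zeros(N_STATES)
    for _ in range(n_iters):
        q = np.array([[r[s] + GAMMA * v[step(s, a)] for a in range(N_ACTIONS)]
                      for s in range(N_STATES)])
        v = q.max(axis=1)
    return q.argmax(axis=1)  # deterministic greedy policy

def feature_expectations(policy, start=0, horizon=100):
    """Discounted feature expectations mu(pi) = sum_t gamma^t * phi(s_t)."""
    mu, s = np.zeros(N_STATES), start
    for t in range(horizon):
        mu += (GAMMA ** t) * PHI[s]
        s = step(s, policy[s])
    return mu

# "Expert" demonstrations: always move right, toward the last state.
expert_policy = np.ones(N_STATES, dtype=int)
mu_expert = feature_expectations(expert_policy)

# IRL loop: choose reward weights that separate the expert from the current
# learner, re-plan under that reward, and project toward the expert features.
mu_bar = feature_expectations(greedy_policy(np.random.randn(N_STATES)))
for i in range(10):
    w = mu_expert - mu_bar              # current estimate of reward weights
    policy = greedy_policy(w)           # inner RL (planning) step under estimated reward
    mu = feature_expectations(policy)
    d = mu - mu_bar
    alpha = np.clip(d @ (mu_expert - mu_bar) / (d @ d + 1e-12), 0.0, 1.0)
    mu_bar += alpha * d                 # projection-style update toward mu_expert
    print(f"iter {i}: distance to expert features = {np.linalg.norm(mu_expert - mu_bar):.4f}")
```

The deep and adversarial IRL families covered by the survey can be read as variations on the same alternation: the linear reward w · phi(s) is replaced by a learned (e.g., neural-network) reward, and the exact planning step is replaced by a policy-optimization learner.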
Authors: SONG Li, LI Da-Zi, XU Xin (College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029; College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073)
Source: Acta Automatica Sinica (自动化学报), 2024, No. 9, pp. 1704-1723 (20 pages). Indexed in EI, CAS, CSCD, Peking University Core Journals.
Funding: Supported by the National Natural Science Foundation of China (62273026).
Keywords: Reinforcement learning (RL), inverse reinforcement learning (IRL), linear inverse reinforcement learning, deep inverse reinforcement learning, adversarial inverse reinforcement learning
About the authors: SONG Li is a Ph.D. candidate at the College of Information Science and Technology, Beijing University of Chemical Technology. Research interests: reinforcement learning, deep learning, and inverse reinforcement learning. E-mail: slili516@foxmail.com. Corresponding author: LI Da-Zi is a professor at the College of Information Science and Technology, Beijing University of Chemical Technology. Research interests: machine learning and artificial intelligence, advanced control, fractional-order systems, and complex system modeling and optimization. E-mail: lidz@mail.buct.edu.cn. XU Xin is a professor at the College of Intelligence Science and Technology, National University of Defense Technology. Research interests: intelligent control, reinforcement learning, machine learning, robotics, and intelligent vehicles. E-mail: xinxu@nudt.edu.cn.
