In this paper, the optimal control of non-linear switching systems is investigated without knowledge of the system dynamics. First, the Hamilton-Jacobi-Bellman (HJB) equation is derived with consideration of the hybrid action space. Then, a novel data-based hybrid Q-learning (HQL) algorithm is proposed to find the optimal solution iteratively. In addition, a theoretical analysis is provided to demonstrate the convergence and optimality of the proposed algorithm. Finally, the algorithm is implemented with an actor-critic (AC) structure, in which two linear-in-parameter neural networks are used to approximate the functions. Simulation results validate the effectiveness of the data-driven method.
Funding: This work was supported by the National Key R&D Program of China (2018AAA0101400), the Natural Science Foundation of Jiangsu Province of China (BK20202006), and the National Natural Science Foundation of China (61921004, 62173251).
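The abstract's two central ideas, minimizing over a hybrid action space and approximating the Q-function with a linear-in-parameter structure, can be illustrated with a minimal sketch. This is not the paper's HQL algorithm. It assumes a scalar state and input, a quadratic basis, one weight vector per switching mode, an undiscounted stage cost, and a plain gradient step on the temporal-difference error. The names `features`, `greedy_hybrid_action`, and `critic_update` are hypothetical.

```python
from typing import List, Tuple


def features(x: float, u: float) -> List[float]:
    """Quadratic basis in state x and input u (an assumed choice of basis)."""
    return [x * x, x * u, u * u]


def q_value(w: List[float], x: float, u: float) -> float:
    """Linear-in-parameter Q estimate: inner product of weights and features."""
    return sum(wi * fi for wi, fi in zip(w, features(x, u)))


def greedy_hybrid_action(weights: List[List[float]], x: float) -> Tuple[int, float]:
    """Minimize Q jointly over the discrete mode and the continuous input."""
    best = None
    for mode, w in enumerate(weights):
        if w[2] <= 0.0:
            raise ValueError("Q must be convex in u (positive u*u weight)")
        # Q = w0*x^2 + w1*x*u + w2*u^2 has a closed-form minimizer in u.
        u = -w[1] * x / (2.0 * w[2])
        q = q_value(w, x, u)
        if best is None or q < best[0]:
            best = (q, mode, u)
    return best[1], best[2]


def critic_update(weights: List[List[float]], x: float, mode: int, u: float,
                  cost: float, x_next: float, lr: float = 0.1) -> float:
    """One gradient step on the TD error for the mode that was applied.

    Uses an undiscounted target (an assumption) and returns the TD error.
    """
    next_mode, next_u = greedy_hybrid_action(weights, x_next)
    target = cost + q_value(weights[next_mode], x_next, next_u)
    err = target - q_value(weights[mode], x, u)
    phi = features(x, u)
    weights[mode] = [wi + lr * err * fi for wi, fi in zip(weights[mode], phi)]
    return err


if __name__ == "__main__":
    w = [[1.0, 2.0, 1.0], [3.0, 0.0, 1.0]]
    print(greedy_hybrid_action(w, 2.0))
```

Only a measured transition (state, mode, input, cost, next state) enters `critic_update`, so no model of the dynamics is needed, which is the sense in which such a scheme is data-driven.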