8 articles found
1. Policy iteration optimal tracking control for chaotic systems by using an adaptive dynamic programming approach (cited by: 2)
Authors: 魏庆来, 刘德荣, 徐延才. Chinese Physics B (SCIE, EI, CAS, CSCD), 2015, Issue 3, pp. 87-94 (8 pages)
A policy iteration algorithm of adaptive dynamic programming (ADP) is developed to solve the optimal tracking control problem for a class of discrete-time chaotic systems. By system transformations, the optimal tracking problem is transformed into an optimal regulation one. The policy iteration algorithm for discrete-time chaotic systems is first described. Then, the convergence and admissibility properties of the developed policy iteration algorithm are presented, which show that the transformed chaotic system can be stabilized under an arbitrary iterative control law and that the iterative performance index function simultaneously converges to the optimum. By implementing the policy iteration algorithm via neural networks, the developed optimal tracking control scheme for chaotic systems is verified by a simulation.
Keywords: adaptive critic designs; adaptive dynamic programming; approximate dynamic programming; neuro-dynamic programming
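As an illustration of the evaluate-then-improve loop that the abstract above describes, here is a minimal sketch of policy iteration on a scalar linear system with a quadratic cost. This is not the paper's method: the linear model, the closed-form policy evaluation, and all names (`policy_iteration`, `a`, `b`, `q`, `r`, `k0`) are assumptions chosen so the example stays exact and needs no neural networks. The requirement that the initial gain be stabilizing mirrors the admissibility condition mentioned in the abstract.

```python
import math


def policy_iteration(a, b, q, r, k0, iterations=20):
    """Policy iteration for x[k+1] = a*x[k] + b*u[k], stage cost q*x^2 + r*u^2.

    Hypothetical stand-in model (assumption): scalar and linear, with
    control law u = -k*x and value function V(x) = p*x^2.
    """
    if abs(a - b * k0) >= 1.0:
        raise ValueError("initial gain must be admissible: |a - b*k0| < 1")
    k = k0
    history = []
    for _ in range(iterations):
        # Policy evaluation: p solves p = q + r*k^2 + (a - b*k)^2 * p.
        closed_loop = a - b * k
        p = (q + r * k * k) / (1.0 - closed_loop * closed_loop)
        history.append(p)
        # Policy improvement: minimize r*u^2 + p*(a*x + b*u)^2 over u.
        k = a * b * p / (r + b * b * p)
    return p, k, history


if __name__ == "__main__":
    p, k, history = policy_iteration(a=1.2, b=1.0, q=1.0, r=1.0, k0=1.0)
    print("value coefficient p:", p)
    print("feedback gain k:", k)
```

In this setting the sequence of value coefficients is non-increasing and every iterate keeps the closed loop stable, which is the scalar analogue of the convergence and admissibility properties claimed in the abstract.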
2. Chaotic system optimal tracking using data-based synchronous method with unknown dynamics and disturbances
Authors: 宋睿卓, 魏庆来. Chinese Physics B (SCIE, EI, CAS, CSCD), 2017, Issue 3, pp. 268-275 (8 pages)
We develop an optimal tracking control method for chaotic systems with unknown dynamics and disturbances. The method allows the optimal cost function and the corresponding tracking control to update synchronously. The augmented system is constructed from the tracking error and the reference dynamics, and the optimal tracking control problem is then defined. Policy iteration (PI) is introduced to solve the min-max optimization problem. An off-policy adaptive dynamic programming (ADP) algorithm is then proposed to find the solution of the tracking Hamilton-Jacobi-Isaacs (HJI) equation online, using only measured data and without any knowledge of the system dynamics. A critic neural network (CNN), an action neural network (ANN), and a disturbance neural network (DNN) are used to approximate the cost function, the control, and the disturbance, respectively. The weights of these networks compose the augmented weight matrix, whose uniform ultimate boundedness (UUB) is proven. The convergence of the tracking error system is also proven. Two examples are given to show the effectiveness of the proposed synchronous solution method for the chaotic system tracking problem.
Keywords: adaptive dynamic programming; approximate dynamic programming; chaotic system; zero-sum
3. A novel stable value iteration-based approximate dynamic programming algorithm for discrete-time nonlinear systems
Authors: 曲延华, 王安娜, 林盛. Chinese Physics B (SCIE, EI, CAS, CSCD), 2018, Issue 1, pp. 228-235 (8 pages)
The convergence and stability of a value-iteration-based adaptive dynamic programming (ADP) algorithm are considered for discrete-time nonlinear systems with a discounted quadratic performance index. Beyond achieving a good approximation structure, the iterative feedback control law must guarantee closed-loop stability. Specifically, it is first proved that the iterative value function sequence converges precisely to the optimum. Second, the necessary and sufficient condition for the optimal value function to serve as a Lyapunov function is investigated. We prove that, for the infinite-horizon case, there exists a finite horizon length for which the iterative feedback control law provides stability, which increases the practicability of the proposed value iteration algorithm. Neural networks (NNs) are employed to approximate the value functions and the optimal feedback control laws, and the approach allows the algorithm to be implemented without knowing the internal dynamics of the system. Finally, a simulation example is employed to demonstrate the effectiveness of the developed optimal control method.
Keywords: adaptive dynamic programming (ADP); convergence; stability; discounted quadratic performance index
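To make the value iteration with a discounted quadratic index in the abstract above concrete, the following is a minimal sketch on a scalar linear system. It is an assumption-laden stand-in, not the paper's neural-network algorithm: the linear model, the zero initial value function, and the names `value_iteration`, `a`, `b`, `q`, `r`, `gamma` are all hypothetical. The final check on the closed-loop factor `a - b*k` illustrates the kind of stability question the abstract raises, since a discounted optimum is not automatically stabilizing.

```python
def value_iteration(a, b, q, r, gamma, iterations=200):
    """Value iteration for x[k+1] = a*x[k] + b*u[k] with discounted cost.

    Hypothetical stand-in model (assumption): V_i(x) = p_i * x^2, V_0 = 0,
    and stage cost q*x^2 + r*u^2 discounted by gamma per step.
    """
    p = 0.0
    history = [p]
    for _ in range(iterations):
        # Bellman update: V_{i+1}(x) = min_u [q*x^2 + r*u^2 + gamma*V_i(a*x + b*u)].
        denom = r + gamma * b * b * p
        p = q + gamma * a * a * p - (gamma * a * b * p) ** 2 / denom
        history.append(p)
    # Greedy feedback gain u = -k*x associated with the final value function.
    k = gamma * a * b * p / (r + gamma * b * b * p)
    return p, k, history


if __name__ == "__main__":
    a, b = 1.2, 1.0
    p, k, history = value_iteration(a, b, q=1.0, r=1.0, gamma=0.95)
    print("value coefficient p:", p)
    print("closed loop stable:", abs(a - b * k) < 1.0)
```

Starting from a zero value function, the coefficients form a non-decreasing sequence that approaches the fixed point of the discounted Bellman equation. Stopping after a finite number of iterations and checking the resulting gain is a simple analogue of the finite-horizon stability result stated in the abstract.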
4. Two-Phase Rate Adaptation Strategy for Improving Real-Time Video QoE in Mobile Networks (cited by: 3)
Authors: Ailing Xiao, Jie Liu, Yizhe Li, Qiwei Song, Ning Ge. China Communications (SCIE, CSCD), 2018, Issue 10, pp. 12-24 (13 pages)
With the popularity of smart handheld devices, mobile streaming video has multiplied global network traffic in recent years. Strong concern for users' quality of experience (QoE) has made rate adaptation methods very attractive. In this paper, we propose a two-phase rate adaptation strategy to improve users' real-time video QoE. First, to measure and assess video QoE, we provide a continuous QoE prediction engine modeled by a recurrent neural network (RNN). Unlike traditional QoE models, which consider the QoE-aware factors separately or incompletely, our RNN-QoE model accounts for three descriptive factors (video quality, rebuffering, and rate change) and reflects the impact of cognitive memory and recency. In addition, video playback is separated into an initial startup phase and a steady playback phase, and we take a different optimization goal for each phase: the former aims at shortening the startup delay, while the latter improves video quality and reduces rebuffering. Simulation results show that RNN-QoE follows the subjective QoE quite well, and that the proposed strategy effectively reduces the rebuffering caused by the mismatch between the requested video rates and the fluctuating throughput, attaining standout performance on real-time QoE compared with classical rate adaptation methods.
Keywords: continuous quality of experience (QoE) model; recurrent neural network (RNN); real-time video QoE improving; dynamic adaptive streaming over HTTP (DASH)
5. Data-Based Feedback Relearning Algorithm for Robust Control of SGCMG Gimbal Servo System with Multi-source Disturbance (cited by: 3)
Authors: ZHANG Yong, MU Chaoxu, LU Ming. Transactions of Nanjing University of Aeronautics and Astronautics (EI, CSCD), 2021, Issue 2, pp. 225-236 (12 pages)
A single gimbal control moment gyroscope (SGCMG) with high precision and fast response is an important attitude control system for high-precision docking and for rapid-maneuvering navigation and guidance systems in the aerospace field. In this paper, considering the influence of multi-source disturbance, a data-based feedback relearning (FR) algorithm is designed for the robust control of the SGCMG gimbal servo system. Based on adaptive dynamic programming and the least-squares principle, the FR algorithm obtains the servo control strategy by collecting online operation data from the SGCMG system. This is a model-free learning strategy that requires no prior knowledge of the SGCMG model. Then, using the reinforcement learning mechanism, the servo control strategy interacts with the system dynamics of the SGCMG, which realizes adaptive evaluation and improvement of the servo control strategy against the multi-source disturbance. Meanwhile, a data redistribution method based on experience replay is designed to reduce data correlation and thereby improve algorithm stability and data utilization efficiency. Finally, the effectiveness of the proposed servo control strategy is verified by comparison with other methods on a simulation model of the SGCMG.
Keywords: control moment gyroscope; feedback relearning algorithm; servo control; reinforcement learning; multi-source disturbance; adaptive dynamic programming
6. A new approach of optimal control for a class of continuous-time chaotic systems by an online ADP algorithm
Authors: 宋睿卓, 肖文栋, 魏庆来. Chinese Physics B (SCIE, EI, CAS, CSCD), 2014, Issue 5, pp. 138-144 (7 pages)
We develop an online adaptive dynamic programming (ADP) based optimal control scheme for continuous-time chaotic systems. The idea is to use the ADP algorithm to obtain the optimal control input that makes the performance index function reach an optimum. The expression of the performance index function for the chaotic system is first presented. The online ADP algorithm is then presented to achieve optimal control. In the ADP structure, neural networks are used to construct a critic network and an action network, which approximate the performance index function and the control input, respectively. It is proven that the critic parameter error dynamics and the closed-loop chaotic systems are uniformly ultimately bounded exponentially. Our simulation results illustrate the performance of the established optimal control method.
Keywords: adaptive dynamic programming; adaptive critic designs; optimal control; continuous-time chaotic system
7. Off-policy integral reinforcement learning optimal tracking control for continuous-time chaotic systems
Authors: 魏庆来, 宋睿卓, 孙秋野, 肖文栋. Chinese Physics B (SCIE, EI, CAS, CSCD), 2015, Issue 9, pp. 147-152 (6 pages)
This paper presents an off-policy integral reinforcement learning (IRL) algorithm to obtain the optimal tracking control of unknown chaotic systems. Off-policy IRL can learn the solution of the Hamilton-Jacobi-Bellman (HJB) equation from system data generated by an arbitrary control. Moreover, off-policy IRL can be regarded as a direct learning method, which avoids the identification of system dynamics. In this paper, the performance index function is first given based on the system tracking error and control error. An off-policy IRL algorithm is then proposed for solving the HJB equation. It is proven that the iterative control makes the tracking error system asymptotically stable and that the iterative performance index function is convergent. A simulation study demonstrates the effectiveness of the developed tracking control method.
Keywords: adaptive dynamic programming; approximate dynamic programming; chaotic system; optimal tracking control
8. Approximation-error-ADP-based optimal tracking control for chaotic systems with convergence proof
Authors: 宋睿卓, 肖文栋, 孙长银, 魏庆来. Chinese Physics B (SCIE, EI, CAS, CSCD), 2013, Issue 9, pp. 305-311 (7 pages)
In this paper, an optimal tracking control scheme is proposed for a class of discrete-time chaotic systems using the approximation-error-based adaptive dynamic programming (ADP) algorithm. Via the system transformation, the optimal tracking problem is transformed into an optimal regulation problem, and then the novel optimal tracking control method is proposed. It is shown that, for the iterative ADP algorithm with finite approximation error, the iterative performance index functions can converge to a finite neighborhood of the greatest lower bound of all performance index functions under some convergence conditions. Two examples are given to demonstrate the validity of the proposed optimal tracking control scheme for chaotic systems.
Keywords: chaotic systems; approximation error; adaptive dynamic programming; optimal tracking control