Learning control for gradually varying references in the iteration domain was considered in this research, and a composite iterative learning control strategy was proposed to enable a plant to track unknown iteration-dependent trajectories. Specifically, by decoupling the current reference into the desired trajectory of the last trial and a disturbance signal of small magnitude, the learning and feedback parts were designed separately to ensure fine tracking performance. Through theoretical analysis, a condition was obtained for determining whether the composite iterative learning control approach achieves better control results than pure feedback control for varying references. The convergence property of the closed-loop system was rigorously studied, and the saturation problem was also addressed in the controller. The designed composite iterative learning control strategy was successfully applied to an atomic force microscope system, with both simulation and experimental results demonstrating its superior performance.
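The composite structure described above (a learned feedforward term updated between trials plus a feedback term acting within each trial) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and is not taken from the paper: the first-order plant `y[t+1] = a*y[t] + b*u[t]`, the proportional feedback gain `kp`, the P-type learning gain `gamma`, the slowly growing sinusoidal reference, and the simple clipping used to stand in for saturation handling.

```python
import numpy as np

def run_trial(ref, ff, kp, a=0.9, b=0.1, u_max=5.0):
    """Simulate one trial of the assumed plant y[t+1] = a*y[t] + b*u[t].

    ref : reference for this trial, length n_steps + 1
    ff  : learned feedforward input, length n_steps
    Returns the tracking error ref - y over the whole trial.
    """
    n_steps = len(ref) - 1
    y = np.zeros(n_steps + 1)
    for t in range(n_steps):
        # Composite input: learned feedforward plus proportional feedback.
        u = ff[t] + kp * (ref[t] - y[t])
        # Assumed saturation handling: plain clipping of the input.
        u = np.clip(u, -u_max, u_max)
        y[t + 1] = a * y[t] + b * u
    return ref - y

def simulate(n_trials=30, n_steps=100, kp=2.0, gamma=5.0, learn=True):
    """Return the RMS tracking error of each trial.

    With learn=False the feedforward term stays at zero, which gives
    the pure feedback baseline.
    """
    t = np.arange(n_steps + 1)
    ff = np.zeros(n_steps)
    rms = []
    for k in range(n_trials):
        # Hypothetical reference whose amplitude drifts slowly with trial k.
        ref = (1.0 + 0.01 * k) * np.sin(2 * np.pi * t / n_steps)
        err = run_trial(ref, ff, kp)
        rms.append(float(np.sqrt(np.mean(err ** 2))))
        if learn:
            # P-type learning update using the error one step ahead.
            ff = ff + gamma * err[1:]
    return rms
```

Calling `simulate(learn=True)` and `simulate(learn=False)` gives per-trial RMS errors for the composite scheme and the feedback-only baseline on this toy plant, which is one simple way to compare the two as the reference drifts between trials.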
In this paper, a reinforcement learning-based multibattery energy storage system (MBESS) scheduling policy is proposed to minimize consumers' electricity cost. The MBESS scheduling problem is modeled as a Markov decision process (MDP) with unknown transition probability. However, the optimal value function is time-dependent and difficult to obtain because of the periodicity of the electricity price and residential load. Therefore, a series of time-independent action-value functions is proposed, one for each period of the day. To approximate each action-value function, a corresponding critic network is established and cascaded with the other critic networks according to the time sequence. The continuous management strategy is then obtained from the related action network. Moreover, a two-stage learning protocol comprising offline and online learning stages is provided for detailed implementation in real-time battery management. Numerical examples are given to demonstrate the effectiveness of the developed algorithm.
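The idea of keeping one action-value function per period of the day, with each one bootstrapping from the function of the next period, can be sketched in a much simpler tabular form. This is not the paper's method: it replaces the cascaded critic and action networks with plain Q-learning tables, uses a single battery with three charge levels, and assumes a hypothetical four-period price profile (`PRICES`), a constant load (`LOAD`), and discrete actions (`ACTIONS`).

```python
import numpy as np

PRICES = np.array([1.0, 1.0, 3.0, 3.0])  # hypothetical price in each period
LOAD = 1.0             # hypothetical constant household load per period
MAX_SOC = 2            # battery charge levels are 0..MAX_SOC
ACTIONS = (-1, 0, 1)   # discharge, idle, charge (one level per period)

def step(period, soc, action):
    """Toy environment: returns (next period, next charge level, cost)."""
    new_soc = min(max(soc + action, 0), MAX_SOC)
    # Energy bought from the grid is the load plus the change in charge.
    cost = PRICES[period] * (LOAD + (new_soc - soc))
    return (period + 1) % len(PRICES), new_soc, cost

def train(n_steps=20000, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning with one Q-table per period of the day.

    q[p] plays the role of the action-value function for period p; its
    update target uses q[(p + 1) % n_periods], so the tables are chained
    in time order and wrap around at the end of the day.
    """
    rng = np.random.default_rng(seed)
    q = np.zeros((len(PRICES), MAX_SOC + 1, len(ACTIONS)))
    period, soc = 0, 0
    for _ in range(n_steps):
        a = int(rng.integers(len(ACTIONS)))  # purely random exploration
        nxt_period, nxt_soc, cost = step(period, soc, ACTIONS[a])
        # Costs are minimized, so the target takes the min over next actions.
        target = cost + gamma * q[nxt_period, nxt_soc].min()
        q[period, soc, a] += alpha * (target - q[period, soc, a])
        period, soc = nxt_period, nxt_soc
    return q
```

After training, `np.argmin(q[p, s])` gives the greedy action index for period `p` and charge level `s`. In this toy setting the learned policy charges when the price is low and discharges when it is high.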
Funding: Supported by the National Natural Science Foundation of P.R. China (60474038), the Science Research Foundation of Beijing Jiaotong University (2005SM005), and the Specialized Research Fund for the Doctoral Program of Higher Education (20060004002).
Funding: Projects (61127006, 61325017) supported by the National Natural Science Foundation of China.
Funding: Supported by the National Key R&D Program of China (2018AAA0101400), the National Natural Science Foundation of China (61921004, 62173251, U1713209, 62236002), the Fundamental Research Funds for the Central Universities, and the Guangdong Provincial Key Laboratory of Intelligent Decision and Cooperative Control.