To address the shortcomings of single-step decision making in existing deep reinforcement learning based real-time path planning for unmanned aerial vehicles (UAVs), a real-time UAV path planning algorithm based on a long short-term memory network (RPP-LSTM) is proposed, which combines the memory characteristics of recurrent neural networks (RNNs) with a deep reinforcement learning algorithm. The algorithm uses LSTM networks as the Q-value networks of the deep Q network (DQN) algorithm, which gives the decisions of the Q-value network some memory. Thanks to the LSTM network, the Q-value network can use previous environmental and action information, which effectively avoids the problem of single-step decisions that consider only the current environment. In addition, the algorithm introduces a hierarchical reward and punishment function tailored to UAV real-time path planning, so that the UAV can plan paths more reasonably. Simulations show that, compared with the traditional feed-forward neural network (FNN) based UAV autonomous path planning algorithm, the proposed RPP-LSTM adapts to more complex environments and achieves significantly improved robustness and accuracy in UAV real-time path planning.
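The idea of a Q-value network with memory can be illustrated with a small sketch. The code below is not the RPP-LSTM implementation: the class name `LSTMQNetwork`, the layer sizes, the random untrained weights, and the toy observation are all assumptions made for illustration, and DQN training is omitted. It only shows that, because the LSTM state is carried from step to step, the same observation can yield different Q-values depending on what was seen before.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Plain matrix-vector product on nested lists.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


class LSTMQNetwork:
    """Toy recurrent Q-network: one LSTM cell followed by a linear Q-value head."""

    def __init__(self, obs_dim, hidden_dim, n_actions, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        in_dim = obs_dim + hidden_dim
        # One weight matrix per gate: input (i), forget (f), output (o), candidate (g).
        self.W = {k: init(hidden_dim, in_dim) for k in "ifog"}
        self.b = {k: [0.0] * hidden_dim for k in "ifog"}
        self.W_q = init(n_actions, hidden_dim)
        self.hidden_dim = hidden_dim

    def initial_state(self):
        # Hidden state h and cell state c both start at zero.
        return [0.0] * self.hidden_dim, [0.0] * self.hidden_dim

    def step(self, obs, state):
        h, c = state
        x = list(obs) + h  # current observation concatenated with previous hidden state
        pre = {k: [a + b for a, b in zip(matvec(self.W[k], x), self.b[k])] for k in "ifog"}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        g = [math.tanh(v) for v in pre["g"]]
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        q_values = matvec(self.W_q, h_new)  # one Q-value per action
        return q_values, (h_new, c_new)


def greedy_action(q_values):
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Feed the same (hypothetical) observation twice while carrying the LSTM state.
net = LSTMQNetwork(obs_dim=3, hidden_dim=4, n_actions=5)
state = net.initial_state()
obs = [0.2, -0.1, 0.7]
q_first, state = net.step(obs, state)
q_second, state = net.step(obs, state)
print(greedy_action(q_first), greedy_action(q_second))
```

A feed-forward Q-network would return identical Q-values for both calls; here `q_second` differs from `q_first` because the second call also sees the state left by the first. Previous actions could be appended to the observation vector in the same way.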
In this paper, a filtering method is presented to estimate time-varying parameters of a missile dual control system with tail fins and reaction jets as control variables. In this method, a long short-term memory (LSTM) neural network is nested into the extended Kalman filter (EKF) to modify the Kalman gain, so that filtering performance improves in the presence of large model uncertainties. To avoid unstable network output caused by abrupt changes of the system states, an adaptive correction factor is introduced to correct the network output online. For training the network, a multi-gradient descent learning mode is proposed to better fit the internal state of the system, and rolling training is used to implement an online prediction logic. Based on the Lyapunov second method, we discuss the stability of the system; the result shows that when the training error of the neural network is sufficiently small, the system is asymptotically stable. Applied to the estimation of time-varying parameters of a missile dual control system, the LSTM-EKF shows better filtering performance than the EKF and the adaptive EKF (AEKF) when there are large uncertainties in the system model.
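As a rough illustration of modifying the Kalman gain inside an EKF, the scalar sketch below adds a scaled correction to the nominal gain. Everything specific here is an assumption rather than the paper's method: the random-walk parameter model, the measurement function `h`, the additive form `K + alpha * correction`, and the constant stand-in `learned` that takes the place of a trained LSTM. The rule that adapts `alpha` online is not reproduced; `alpha` is simply passed in.

```python
def h(x):
    # Hypothetical nonlinear measurement function.
    return x + 0.1 * x ** 3


def h_jacobian(x):
    # Derivative of h, used to linearize the measurement model.
    return 1.0 + 0.3 * x ** 2


def ekf_step(x, P, z, Q, R, gain_correction=None, alpha=1.0):
    """One scalar EKF step with an optional learned correction to the gain."""
    # Predict: the parameter is modeled as a random walk, so f(x) = x and F = 1.
    x_pred = x
    P_pred = P + Q

    # Update: nominal EKF gain from the linearized measurement model.
    H = h_jacobian(x_pred)
    innovation = z - h(x_pred)
    S = H * P_pred * H + R
    K = P_pred * H / S

    # Assumed form of the modification: add the network output scaled by alpha.
    if gain_correction is not None:
        K = K + alpha * gain_correction(innovation, K)

    x_new = x_pred + K * innovation
    # Joseph form of the covariance update, which holds for any gain K.
    P_new = (1.0 - K * H) ** 2 * P_pred + K ** 2 * R
    return x_new, P_new, K


def learned(innovation, K):
    # Stand-in for the LSTM output; a real network would depend on past data.
    return 0.05


plain = ekf_step(1.0, 1.0, 1.5, Q=0.01, R=0.1)
corrected = ekf_step(1.0, 1.0, 1.5, Q=0.01, R=0.1, gain_correction=learned, alpha=0.5)
muted = ekf_step(1.0, 1.0, 1.5, Q=0.01, R=0.1, gain_correction=learned, alpha=0.0)
print(plain[2], corrected[2], muted[2])
```

Setting `alpha` to zero recovers the plain EKF, which is one way a correction factor can suppress an unreliable network output. The Joseph form is used because the simpler update `(1 - K * H) * P_pred` is only valid for the optimal gain.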
Funding: supported by the Natural Science Basic Research Program of Shaanxi (2022JQ-593).