To address the shortcomings of single-step decision making in existing deep reinforcement learning based real-time path planning for unmanned aerial vehicles (UAVs), a real-time UAV path planning algorithm based on a long short-term memory network (RPP-LSTM) is proposed, which combines the memory characteristics of recurrent neural networks (RNNs) with a deep reinforcement learning algorithm. The algorithm uses LSTM networks as the Q-value networks of the deep Q network (DQN) algorithm, which gives the decisions of the Q-value network some memory. Thanks to the LSTM network, the Q-value network can use previous environmental information and action information, which avoids the problem of single-step decisions that consider only the current environment. In addition, the algorithm proposes a hierarchical reward and punishment function for the specific problem of UAV real-time path planning, so that the UAV can plan paths more reasonably. Simulation results show that, compared with the traditional feed-forward neural network (FNN) based UAV autonomous path planning algorithm, the proposed RPP-LSTM adapts to more complex environments and has significantly improved robustness and accuracy in UAV real-time path planning.
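The idea of a Q-value network with memory can be illustrated with a minimal sketch. It is not the RPP-LSTM implementation: the class name `LSTMQNetwork`, the layer sizes, the random untrained weights, and the choice to feed only observations (not previous actions) are all assumptions made for illustration. It shows only that a recurrent Q-network can return different Q-values for the same observation depending on what it has seen before.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMQNetwork:
    """Hypothetical tiny LSTM cell plus a linear head with one Q-value per action."""

    def __init__(self, obs_size, hidden_size, n_actions, seed=0):
        rng = random.Random(seed)
        in_size = obs_size + hidden_size

        def mat(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        # One weight matrix per gate: input (i), forget (f), output (o), candidate (g).
        self.w = {gate: mat(hidden_size, in_size) for gate in "ifog"}
        self.b = {gate: [0.0] * hidden_size for gate in "ifog"}
        self.w_q = mat(n_actions, hidden_size)  # linear head: hidden state -> Q-values
        self.hidden_size = hidden_size

    def initial_state(self):
        # (hidden state h, cell state c), both zero at the start of an episode.
        return [0.0] * self.hidden_size, [0.0] * self.hidden_size

    def step(self, obs, state):
        """Consume one observation; return Q-values and the updated memory."""
        h, c = state
        x = list(obs) + h  # the current observation joined with the previous hidden state

        def gate(name, activation):
            return [activation(sum(w * v for w, v in zip(row, x)) + b)
                    for row, b in zip(self.w[name], self.b[name])]

        i, f, o = gate("i", sigmoid), gate("f", sigmoid), gate("o", sigmoid)
        g = gate("g", math.tanh)
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        q = [sum(w * v for w, v in zip(row, h_new)) for row in self.w_q]
        return q, (h_new, c_new)


def greedy_action(q_values):
    """Index of the action with the largest Q-value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


net = LSTMQNetwork(obs_size=3, hidden_size=4, n_actions=3)
obs = [0.2, -0.1, 0.5]  # made-up observation, e.g. normalized sensor readings
q_fresh, memory = net.step(obs, net.initial_state())
q_with_history, _ = net.step(obs, memory)  # same observation, non-empty memory
print(q_fresh != q_with_history)
```

A feed-forward Q-network would return identical values for both calls, because its output depends on the current observation alone.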
In this paper, a filtering method is presented to estimate the time-varying parameters of a missile dual control system with tail fins and reaction jets as control variables. In this method, a long short-term memory (LSTM) neural network is nested into the extended Kalman filter (EKF) to modify the Kalman gain, so that filtering performance is improved in the presence of large model uncertainties. To avoid unstable network output caused by abrupt changes in the system states, an adaptive correction factor is introduced to correct the network output online. For training the network, a multi-gradient descent learning mode is proposed to better fit the internal state of the system, and rolling training is used to implement an online prediction logic. The stability of the system is discussed on the basis of Lyapunov's second method; the result shows that the system is asymptotically stable when the training error of the neural network is sufficiently small. When applied to the estimation of time-varying parameters of a missile dual control system, the LSTM-EKF shows better filtering performance than the EKF and the adaptive EKF (AEKF) when there are large uncertainties in the system model.
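As a rough illustration of modifying the Kalman gain, the sketch below runs one scalar EKF step and adds a correction term to the standard gain. It is not the LSTM-EKF from the abstract: `gain_correction` is a hypothetical stand-in for the network output, `alpha` is a fixed scaling factor rather than an adaptive one, and the additive form of the correction is an assumption. Because the corrected gain is no longer the optimal one, the covariance update uses the Joseph form, which holds for any gain.

```python
def ekf_step(x, p, z, f, f_jac, h, h_jac, q, r, gain_correction=None, alpha=1.0):
    """One scalar EKF step with an optional additive gain correction.

    x, p           : previous state estimate and its variance
    z              : new measurement
    f, f_jac       : process model and its derivative
    h, h_jac       : measurement model and its derivative
    q, r           : process and measurement noise variances
    gain_correction: hypothetical callable(innovation) -> correction to the gain
    alpha          : assumed fixed scaling applied to that correction
    """
    # Prediction
    x_pred = f(x)
    F = f_jac(x)
    p_pred = F * p * F + q

    # Standard EKF gain
    H = h_jac(x_pred)
    innovation = z - h(x_pred)
    s = H * p_pred * H + r
    k = p_pred * H / s

    # Assumed additive correction standing in for the network output
    if gain_correction is not None:
        k += alpha * gain_correction(innovation)

    # Update; the Joseph form remains valid for a non-optimal gain
    x_new = x_pred + k * innovation
    p_new = (1.0 - k * H) ** 2 * p_pred + k * k * r
    return x_new, p_new, k


def identity(v):
    return v


def one(v):
    return 1.0


# Random-walk parameter observed directly, with made-up numbers.
x1, p1, k1 = ekf_step(0.0, 1.0, 2.0, identity, one, identity, one, q=0.0, r=1.0)
x2, p2, k2 = ekf_step(0.0, 1.0, 2.0, identity, one, identity, one, q=0.0, r=1.0,
                      gain_correction=lambda innovation: 0.1)
print(k1, x1, p1)
print(k2, x2, p2)
```

In this toy case the uncorrected gain is 0.5 and the corrected gain is 0.6, so the corrected filter moves further toward the measurement and reports a slightly larger variance.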
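Accurate wind speed prediction along high-speed railway lines is a basic requirement of railway disaster early-warning systems. To improve the ability to respond to and handle emergencies caused by strong-wind disasters, a short-term wind speed prediction method for high-speed railway lines is proposed, in which a long short-term memory (LSTM) neural network is optimized by the subtraction average based optimizer (SABO) algorithm. First, in view of the nonlinear and non-stationary characteristics of wind speed, the min-max (MM) method is used to normalize the wind speed data. Second, the "-v" method of the SABO algorithm is used to search for optimal values of the key parameters of the LSTM model, and a wind speed prediction model is built. Finally, measured wind speed data from collection points along the Baolan (Baoji-Lanzhou) high-speed railway in China are used to test the validity of the model. The experimental results show that the SABO algorithm gives a better optimization effect and higher prediction accuracy: the mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE) of the model are only 11.96%, 1.23%, and 16.47%, respectively, and the coefficient of determination (R^2) is 0.995. Compared with other models, the LSTM neural network optimized by the SABO algorithm achieves a better fit and higher prediction accuracy in short-term wind speed prediction, and can provide a new method and approach for strong-wind prediction and early warning along high-speed railway lines.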
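The normalization step and the four evaluation metrics named above follow standard definitions, sketched here in plain Python. The function names and the sample wind speeds are made up for illustration; they are not the paper's code or data, and the printed values have no relation to the reported results.

```python
import math


def min_max_normalize(values):
    """Scale values linearly to [0, 1]; also return the bounds needed to invert."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max normalization needs at least two distinct values")
    return [(v - lo) / (hi - lo) for v in values], lo, hi


def denormalize(scaled, lo, hi):
    """Map normalized values back to the original scale."""
    return [s * (hi - lo) + lo for s in scaled]


def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up wind speeds in m/s, for illustration only.
actual = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

scaled, lo, hi = min_max_normalize(actual)
print(scaled)
print(mae(actual, predicted), rmse(actual, predicted))
print(mape(actual, predicted), r_squared(actual, predicted))
```

Metrics are usually computed after mapping predictions back to the original scale with `denormalize`, so that MAE and RMSE carry the units of the measured wind speed.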
Funding: Supported by the Natural Science Basic Research Program of Shaanxi (2022JQ-593).