Funding: Supported by the National Natural Science Foundation (61601491), the Natural Science Foundation of Hubei Province (2018CFC865), and the China Postdoctoral Science Foundation Funded Project (2016T45686).
Abstract: To solve the path-following control problem for unmanned surface vehicles (USVs), a control method based on deep reinforcement learning (DRL) with long short-term memory (LSTM) networks is proposed. A distributed proximal policy optimization (DPPO) algorithm, a modified actor-critic reinforcement learning algorithm, is adapted to improve controller performance over repeated trials. The LSTM network structure is introduced to handle the strong temporal correlation in the USV control problem. In addition, a specially designed path dataset, including straight and curved paths, is established to simulate various sailing scenarios so that the reinforcement learning controller can gain as much handling experience as possible. Extensive numerical simulation results demonstrate that the proposed method achieves better control performance in missions involving complex maneuvers than a controller trained with limited scenarios, and that it can potentially be applied in practice.
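The abstract does not say which PPO variant the DPPO controller optimizes. As a hedged illustration only, the sketch below computes the clipped surrogate objective commonly used in PPO; the function name, the `clip_eps` value, and the list-based inputs are assumptions made for this example, and the distributed data collection and the LSTM policy are deliberately left out.

```python
import math


def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. clip_eps is an assumed value.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps] to limit the update size.
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic choice between the raw and the clipped term.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # Hypothetical batch of three transitions.
    print(ppo_clipped_objective([-0.9, -1.2, -0.4], [-1.0, -1.0, -0.5], [1.0, -0.5, 2.0]))
```

In a distributed setting, several workers would typically collect trajectories in parallel and contribute gradients of an objective like this one to shared parameters; how the authors organize that step is not described in the abstract.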
Abstract: Load forecasting is of great significance to the development of new power systems. With the advancement of smart grids, the integration of distributed renewable energy sources and power electronics devices has made power load data increasingly complex and volatile, which places higher demands on the prediction and analysis of power loads. To improve the accuracy of short-term power load prediction, a CNN-BiLSTM-TPA short-term power prediction model based on an Improved Whale Optimization Algorithm (IWOA) with mixed strategies is proposed. First, the model combines a Convolutional Neural Network (CNN) with a Bidirectional Long Short-Term Memory network (BiLSTM) to fully extract the spatio-temporal characteristics of the load data. Then, the Temporal Pattern Attention (TPA) mechanism is introduced into the CNN-BiLSTM model to automatically assign weights to the hidden states of the BiLSTM, which allows the model to differentiate the importance of load sequences at different time intervals. At the same time, to address the difficulty of selecting the parameters of the temporal model and the poor global search ability of the whale optimization algorithm, which easily falls into local optima, the whale optimization algorithm is improved with a hybrid strategy of Tent chaotic mapping and Lévy flight (yielding the IWOA), so that the model parameters can be searched more effectively. In the experiment, real load data from a region in Zhejiang are analyzed as an example, and the prediction accuracy (R²) of the proposed method reaches 98.83%. Compared with prediction models such as BP, WOA-CNN-BiLSTM, SSA-CNN-BiLSTM, and CNN-BiGRU-Attention, the experimental results show that the proposed model has higher prediction accuracy.
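The two ingredients of the hybrid strategy named above can be sketched as follows. This is a minimal illustration, not the authors' IWOA: the skew parameter `a`, the seed value `x0`, the Lévy exponent `beta`, the bounds, and all function names are assumptions chosen for the example. The Lévy step uses Mantegna's method, a common way to draw heavy-tailed steps.

```python
import math
import random


def tent_map_sequence(n, x0=0.37, a=0.7):
    """Generate n chaotic values in [0, 1] with a skew tent map.

    A skew parameter a != 0.5 is assumed here because the symmetric map
    collapses to 0 quickly in binary floating point.
    """
    xs, x = [], x0
    for _ in range(n):
        x = x / a if x < a else (1.0 - x) / (1.0 - a)
        xs.append(x)
    return xs


def tent_init_population(n_agents, bounds, x0=0.37, a=0.7):
    """Map the chaotic sequence onto the search bounds, one row per agent."""
    dim = len(bounds)
    chaos = tent_map_sequence(n_agents * dim, x0, a)
    return [
        [lo + chaos[i * dim + j] * (hi - lo) for j, (lo, hi) in enumerate(bounds)]
        for i in range(n_agents)
    ]


def levy_step(beta=1.5, rng=random):
    """Draw one heavy-tailed step with Mantegna's method."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


if __name__ == "__main__":
    # Hypothetical search space: learning rate and number of hidden units.
    population = tent_init_population(5, [(1e-4, 1e-2), (16, 128)])
    print(population[0], levy_step(rng=random.Random(0)))
```

Chaotic initialization is generally used to spread the initial candidates more evenly over the search space, while occasional large Lévy steps help a candidate leave a local optimum; how the paper combines the two inside the whale update is not stated in the abstract.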
Funding: Supported by the Natural Science Basic Research Program of Shaanxi (2022JQ-593).
Abstract: To address the shortcomings of single-step decision making in existing deep reinforcement learning approaches to real-time path planning for unmanned aerial vehicles (UAVs), a real-time UAV path planning algorithm based on a long short-term memory network (RPP-LSTM) is proposed, which combines the memory characteristics of recurrent neural networks (RNNs) with a deep reinforcement learning algorithm. In this algorithm, LSTM networks serve as the Q-value networks of the deep Q network (DQN) algorithm, which gives the decisions of the Q-value network some memory. Thanks to the LSTM network, the Q-value network can use previous environmental and action information, which avoids the problem of single-step decisions that consider only the current environment. In addition, the algorithm introduces a hierarchical reward and punishment function tailored to UAV real-time path planning, so that the UAV can plan paths more reasonably. Simulation results show that, compared with a traditional UAV autonomous path planning algorithm based on a feed-forward neural network (FNN), the proposed RPP-LSTM adapts to more complex environments and achieves significantly improved robustness and accuracy in UAV real-time path planning.
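To make the idea of a Q-value network with memory concrete, the sketch below runs a single hand-written LSTM cell over a short observation history and reads Q-values from the hidden state. It is only an illustration under assumed sizes and random weights: the layer dimensions, the linear output head, and every name in it are hypothetical, and no training loop, replay buffer, or reward function from the paper is reproduced.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def lstm_step(x, h, c, params):
    """One LSTM step; params maps gate name -> (W, b) acting on [x; h]."""
    z = x + h  # concatenate input and previous hidden state
    gates = {}
    for name in ("i", "f", "o", "g"):
        W, b = params[name]
        pre = [p + bi for p, bi in zip(matvec(W, z), b)]
        act = math.tanh if name == "g" else sigmoid
        gates[name] = [act(p) for p in pre]
    # The cell state mixes the old memory (forget gate) with new content.
    c_new = [f * ck + i * g
             for f, ck, i, g in zip(gates["f"], c, gates["i"], gates["g"])]
    h_new = [o * math.tanh(ck) for o, ck in zip(gates["o"], c_new)]
    return h_new, c_new


def init_network(input_dim, hidden_dim, n_actions, seed=0):
    """Random weights for the cell and a linear Q-value head (assumed sizes)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    params = {name: (mat(hidden_dim, input_dim + hidden_dim), [0.0] * hidden_dim)
              for name in ("i", "f", "o", "g")}
    return params, mat(n_actions, hidden_dim)


def q_values_for_history(history, params, W_out, hidden_dim):
    """Feed the observations in order and return Q-values after the last one."""
    h, c = [0.0] * hidden_dim, [0.0] * hidden_dim
    for obs in history:
        h, c = lstm_step(obs, h, c, params)
    return matvec(W_out, h)


if __name__ == "__main__":
    params, W_out = init_network(input_dim=3, hidden_dim=4, n_actions=2)
    current = [0.2, -0.1, 0.5]
    print(q_values_for_history([current], params, W_out, 4))
    print(q_values_for_history([[1.0, 0.0, -1.0], current], params, W_out, 4))
```

The two calls end on the same observation but start from different histories, so the Q-values differ; a feed-forward Q-network given only the current observation would return the same values in both cases.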
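Abstract: Reconstructing missing response data in structural health monitoring systems is essential for accurately assessing the working condition of a structure. A network for reconstructing missing vibration responses is proposed, based on a bidirectional long short-term memory network and an attention mechanism: the sequence-to-sequence bidirectional long short-term memory attention model. Built on the sequence-to-sequence (Seq2Seq) architecture, the network formulates response reconstruction as a sequence generation problem and exploits the latent spatio-temporal relationships in the data to significantly improve reconstruction performance. In addition, a loss calculation method based on mean smoothing is proposed to evaluate the overall performance of the model. The robustness and accuracy of the proposed model are verified on a numerical example of an eight-degree-of-freedom vibration system and on measured monitoring data from the Dowling Hall footbridge. The experimental results show that the model handles the response reconstruction task under different noise conditions and retains excellent reconstruction performance even at low signal-to-noise ratios.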