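Abstract: In recent years, integrated energy systems, in which multiple energy forms and devices interact, have been widely studied and deployed. However, when faced with dynamic and complex multi-energy systems, traditional optimal scheduling methods often cannot meet the requirements for real-time performance and accuracy. This paper therefore designs an optimal scheduling method for integrated energy systems driven by the Soft Deep Deterministic Policy Gradient (Soft-DDPG) algorithm. With the objective of minimizing the total system operating cost over the scheduling horizon, a comprehensive energy-efficiency evaluation model of equipment operation is established, and the Soft-DDPG algorithm is then used to optimize and control the energy-efficiency scheduling action of each energy device. Soft-DDPG introduces the softmax operator into the computation of the action-value function, which effectively mitigates Q-value overestimation. In addition, the algorithm adds random noise to the action selection policy, which improves learning efficiency. Experimental results show that the proposed method resolves the bottleneck of poor real-time performance and low accuracy in energy-efficiency scheduling of integrated energy systems, achieves efficient and flexible scheduling, and reduces the total operating cost of the system.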
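The abstract does not give the exact form of the softmax operator, so the following is only a minimal sketch of one common choice, the Boltzmann softmax over a set of candidate Q-values. The function names, the `beta` inverse-temperature parameter, and the idea that `next_q_samples` come from evaluating a target critic at several noisy next actions are all assumptions made for illustration, not the paper's implementation.

```python
import math


def softmax_value(q_values, beta=1.0):
    """Boltzmann-softmax aggregate of candidate Q-values (assumed form).

    beta = 0 gives the plain mean; beta -> infinity approaches max(q_values).
    The result never exceeds the maximum, which is why it can temper
    overestimation compared with a hard max.
    """
    q_max = max(q_values)
    # Subtract the maximum before exponentiating for numerical stability.
    weights = [math.exp(beta * (q - q_max)) for q in q_values]
    total = sum(weights)
    return sum(w * q for w, q in zip(weights, q_values)) / total


def soft_td_target(reward, next_q_samples, gamma=0.99, beta=1.0, done=False):
    """One-step TD target bootstrapped with the softmax aggregate.

    next_q_samples: hypothetical Q-estimates of the target critic at
    several perturbed next actions.
    """
    bootstrap = 0.0 if done else softmax_value(next_q_samples, beta)
    return reward + gamma * bootstrap
```

With `beta=0.0` the aggregate reduces to the average of the samples, and larger `beta` moves it toward the largest sample, so `beta` controls how optimistic the bootstrap value is.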
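Abstract: To address the poor computational accuracy and insufficient adaptability to strong disturbances of existing reentry guidance methods based on the deep deterministic policy gradient (DDPG) algorithm, a reentry guidance method based on long short-term memory-DDPG (LSTM-DDPG) is proposed, built on the DDPG training framework. The method decouples longitudinal and lateral guidance. For longitudinal guidance, the state and action spaces required for reinforcement learning are first constructed for the reentry guidance problem; next, the decision points and the command computation strategy within a guidance cycle are determined, and a reward function that accounts for overall performance is designed; then an LSTM network is introduced to build the reinforcement learning training network, and an online policy update is used to improve the algorithm's applicability across multiple tasks. Lateral guidance uses a dynamic bank reversal method based on crossrange error to obtain the sign of the bank angle. Simulations of reentry glide are carried out for the U.S. common aero vehicle-hypersonic (CAV-H). The results show that, compared with the traditional numerical predictor-corrector method, the proposed guidance method achieves comparable terminal accuracy with higher computational efficiency; compared with existing DDPG-based reentry guidance methods, it achieves comparable computational efficiency with higher terminal accuracy and robustness.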
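The lateral logic described in the abstract can be illustrated with a small sketch. The abstract does not specify the reversal corridor or the sign convention, so everything here is assumed: a corridor half-width that narrows linearly as speed decays, and a convention in which a positive crossrange error is corrected by a negative bank sign. All function and parameter names are hypothetical.

```python
def reversal_threshold(velocity, v_start, v_end, wide, narrow):
    """Hypothetical corridor half-width for crossrange error.

    Shrinks linearly from `wide` at speed v_start to `narrow` at speed
    v_end, and is clamped outside that speed range.
    """
    frac = (velocity - v_end) / (v_start - v_end)
    frac = min(1.0, max(0.0, frac))
    return narrow + frac * (wide - narrow)


def bank_sign(crossrange_error, prev_sign, threshold):
    """Return the bank-angle sign (+1 or -1) with reversal hysteresis.

    Assumed convention: an error beyond +threshold commands sign -1,
    an error beyond -threshold commands sign +1, and inside the
    corridor the previous sign is kept to avoid chattering.
    """
    if crossrange_error > threshold:
        return -1
    if crossrange_error < -threshold:
        return 1
    return prev_sign
```

In a guidance loop, the threshold would be recomputed each cycle and the returned sign multiplied by the bank-angle magnitude produced by the longitudinal channel.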
Funding: Supported by the Science and Technology Project of Guangxi (Guike AD23023002).
Abstract: In this paper, we propose a three-term conjugate gradient method for solving unconstrained optimization problems, based on the Hestenes-Stiefel (HS) and Polak-Ribiere-Polyak (PRP) conjugate gradient methods. Under the standard Wolfe line search, the proposed search direction is a descent direction. For general nonlinear functions, the method is globally convergent. Finally, numerical results show that the proposed method is efficient.
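The abstract does not state the paper's search-direction formula, so the sketch below shows a generic three-term direction of HS type rather than the proposed HS/PRP hybrid: d = -g + beta * d_prev - theta * y, with y = g - g_prev, beta = (g·y)/(d_prev·y) and theta = (g·d_prev)/(d_prev·y). The two extra terms cancel in the inner product with g, so g·d = -||g||^2 and the direction is always a descent direction. The function names and the fallback to steepest descent are assumptions for illustration.

```python
def dot(a, b):
    """Inner product of two equal-length sequences of floats."""
    return sum(x * y for x, y in zip(a, b))


def three_term_hs_direction(g, g_prev, d_prev, eps=1e-12):
    """Generic three-term HS-type search direction (not the paper's formula).

    g, g_prev: current and previous gradients.
    d_prev:    previous search direction.
    Falls back to steepest descent when d_prev . y is numerically zero.
    """
    y = [gi - gpi for gi, gpi in zip(g, g_prev)]
    denom = dot(d_prev, y)
    if abs(denom) < eps:
        return [-gi for gi in g]
    beta = dot(g, y) / denom        # Hestenes-Stiefel coefficient
    theta = dot(g, d_prev) / denom  # weight of the third term
    return [-gi + beta * di - theta * yi
            for gi, di, yi in zip(g, d_prev, y)]
```

A full solver would pair this direction with a step size from a Wolfe line search, as the abstract assumes; the line search is omitted here to keep the example short.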
Funding: Project (52274130) supported by the National Natural Science Foundation of China; Project (ZR2024ZD22) supported by the Major Basic Research Project of the Shandong Provincial Natural Science Foundation, China; Project (2023375) supported by the Guizhou University Research and Innovation Team, China; Project (Leading Fund (2023) 09) supported by the Natural Science Research Fund of Guizhou University, China; Project (JYBSYS2021101) supported by the Open Fund of Key Laboratory of Safe and Effective Coal Mining, Ministry of Education, China.
Abstract: The stress gradient of the surrounding rock and a reasonable support prestress are the keys to ensuring the stability of roadways. The elastic-plastic analytical solution for the surrounding rock was derived based on unified strength theory. A model for solving the stress gradient of the surrounding rock with the intermediate principal stress parameter b was established. The correctness and applicability of the solution for the stress gradient in the roadway surrounding rock were verified via multiple methods. Furthermore, the laws of stress, displacement, and the plastic zone of the surrounding rock with different b values and prestresses were revealed. As b increases, the stress gradient in the plastic zone increases, and the displacement and plastic zone radius decrease. As the prestress increases, the peak stress shifts toward the sidewalls, and the stress and stress gradient increments decrease. In addition, the displacement increment and plastic zone increment were proposed to characterize the support effect. The balance point of the plastic zone area appears before that of the displacement zone. The relationship between the stress gradient compensation coefficient and the prestress is obtained. This study provides a research method and idea for determining the reasonable prestress of support in roadways.