Abstract: The missile interception problem can be regarded as a two-person zero-sum differential game, whose solution depends on the Hamilton-Jacobi-Isaacs (HJI) equation. Because the HJI equation is nonlinear, a closed-form solution has been proved impossible to obtain, and many iterative algorithms have been proposed to solve it. The simultaneous policy updating algorithm (SPUA) is an effective algorithm for solving the HJI equation, but it is an on-policy integral reinforcement learning (IRL) method. Implementing SPUA online requires the disturbance signals to be adjustable, which is unrealistic. In this paper, an off-policy IRL algorithm based on SPUA is proposed that requires no knowledge of the system dynamics. A neural-network-based online adaptive critic implementation of the off-policy IRL algorithm is then presented. Based on the online off-policy IRL method, a computational intelligence interception guidance (CIIG) law is developed for intercepting highly maneuvering targets. Because the method is model-free, interception can be achieved by measuring system data online. The effectiveness of the CIIG law is verified in two missile-target engagement scenarios.
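For orientation, the HJI equation mentioned in this abstract can be written out for the textbook input-affine zero-sum game. The abstract gives no notation, so everything below is an assumption for illustration: the dynamics $f, g, k$, the state penalty $Q$, the control weight $R$, and the attenuation level $\gamma$ are the standard symbols, not necessarily those of the paper.

```latex
% Assumed dynamics: control u (interceptor) and disturbance w (target maneuver)
\dot{x} = f(x) + g(x)\,u + k(x)\,w

% Assumed zero-sum cost: u minimizes, w maximizes
V(x_0) = \int_0^{\infty} \left( Q(x) + u^{\top} R\, u - \gamma^{2} w^{\top} w \right) dt

% Saddle-point policies obtained from the stationarity conditions of the Hamiltonian
u^{*} = -\tfrac{1}{2} R^{-1} g^{\top}(x)\, \nabla V^{*}, \qquad
w^{*} = \tfrac{1}{2\gamma^{2}} k^{\top}(x)\, \nabla V^{*}

% Resulting HJI equation, nonlinear (quadratic) in \nabla V^{*}
0 = Q(x) + \nabla V^{*\top} f(x)
    - \tfrac{1}{4} \nabla V^{*\top} g(x) R^{-1} g^{\top}(x) \nabla V^{*}
    + \tfrac{1}{4\gamma^{2}} \nabla V^{*\top} k(x) k^{\top}(x) \nabla V^{*}
```

The two quadratic terms in $\nabla V^{*}$ are the source of the nonlinearity that rules out a closed-form solution in general and motivates iterative schemes such as SPUA.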
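Abstract: Object tracking is an important part of image processing and is widely used in intelligent video surveillance, military reconnaissance, and other fields. In complex scenarios involving object deformation and occlusion, however, correlation filter algorithms lack strategies for discriminating the target from the background and for judging the occlusion state, so they may lock onto the wrong target or drift slowly into the background; they also lack a re-detection mechanism for when the target reappears after occlusion. These problems cause tracking performance to degrade sharply in practical engineering applications. To address them, an improved design is proposed. First, during tracking, a network optimizer is used to update the multi-layer deep feature extraction network, and the loss function is optimized to improve discrimination between target and background. Second, a multiple-detection anti-occlusion optimization mechanism is adopted to determine how the tracker state is updated. Finally, detection, tracking, and recognition are integrated in a single deep-learning-based design, which automatically captures typical targets before tracking begins and recaptures and relocates them when they reappear after occlusion. In the experimental analysis, performance is verified in terms of tracking accuracy, visualized quantitative loss, and algorithm speed. Measured data show that the proposed method performs well in these respects and outperforms the original ECO (efficient convolution operators for tracking) algorithm.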
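The abstract does not say how the occlusion state is judged. As a rough illustration of the general idea of gating tracker updates on response confidence, the sketch below uses the peak-to-sidelobe ratio (PSR), a common confidence measure for correlation-filter trackers. This is a swapped-in technique, not the authors' mechanism; the function names, the `exclude` window, and the `threshold` value are all hypothetical.

```python
import statistics


def peak_to_sidelobe_ratio(response, exclude=1):
    """PSR of a 2-D response map given as a list of rows.

    The sidelobe is every cell outside a (2*exclude+1)-wide square
    centred on the peak. The window size is an assumption.
    """
    rows, cols = len(response), len(response[0])
    peak, pr, pc = max(
        (response[r][c], r, c) for r in range(rows) for c in range(cols)
    )
    sidelobe = [
        response[r][c]
        for r in range(rows)
        for c in range(cols)
        if abs(r - pr) > exclude or abs(c - pc) > exclude
    ]
    mean = statistics.fmean(sidelobe)
    std = statistics.pstdev(sidelobe)
    # Small epsilon avoids division by zero on a perfectly flat sidelobe.
    return (peak - mean) / (std + 1e-12)


def should_update_model(response, threshold=5.0):
    """Update the tracker only when the response is confident.

    A low PSR suggests occlusion or drift, so the model is frozen.
    The threshold is a hypothetical value that would need tuning.
    """
    return peak_to_sidelobe_ratio(response) >= threshold


# A sharp, isolated peak: confident response, update allowed.
sharp = [[0.0] * 7 for _ in range(7)]
sharp[3][3] = 1.0

# A weak peak over a noisy background: likely occluded, update skipped.
weak = [[0.4 + 0.1 * ((r + c) % 2) for c in range(7)] for r in range(7)]
weak[3][3] = 0.6

print(should_update_model(sharp), should_update_model(weak))
```

Freezing the model on low-confidence frames is one simple way to keep an occluder from being learned as the target; a re-detection step would still be needed to recover the target once it reappears.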
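Abstract: In recent years, integrated energy systems, in which multiple forms of energy and devices interact, have been widely applied and studied. When faced with dynamic and complex multi-energy systems, however, traditional optimal scheduling methods often cannot meet the requirements for real-time performance and accuracy. This paper therefore designs an optimal scheduling method for integrated energy systems driven by the Soft Deep Deterministic Policy Gradient (Soft-DDPG) algorithm. With the objective of minimizing the total operating cost of the system over the scheduling horizon, a comprehensive energy-efficiency evaluation model of equipment operation is established, and the Soft-DDPG algorithm is then used to optimize the energy-efficiency scheduling action of each energy device. Soft-DDPG introduces the softmax operator into the computation of the action-value function, which effectively reduces Q-value overestimation. The algorithm also adds random noise to the action selection policy, which improves learning efficiency. Experimental results show that the proposed method overcomes the bottleneck of poor real-time performance and low accuracy in energy-efficiency scheduling for integrated energy systems, achieves efficient and flexible scheduling, and reduces the total operating cost of the system.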
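To make the role of the softmax operator concrete, here is a minimal sketch of how a softmax-weighted value can replace the hard maximum when forming a bootstrapped target. It assumes a finite set of candidate next-action Q-values (for example, Q evaluated at several noisy samples around the actor's action); the abstract does not give the exact formulation, so the function names and the `beta` and `gamma` parameters are hypothetical.

```python
import math


def softmax_value(q_values, beta=1.0):
    """Boltzmann softmax operator: a softmax-weighted average of Q-values.

    beta = 0 gives the plain mean; large beta approaches max(q_values).
    """
    m = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp(beta * (q - m)) for q in q_values]
    total = sum(weights)
    return sum(w * q for w, q in zip(weights, q_values)) / total


def soft_td_target(reward, next_q_values, gamma=0.99, beta=1.0, done=False):
    """One-step target that bootstraps from the softmax value, not the max."""
    if done:
        return reward
    return reward + gamma * softmax_value(next_q_values, beta)


# Candidate Q-values for the next state (illustrative numbers only).
next_q = [1.0, 2.0, 3.0]
print(softmax_value(next_q, beta=0.0))  # mean of the candidates
print(softmax_value(next_q, beta=5.0))  # close to, but below, the max
print(soft_td_target(0.5, next_q, gamma=0.9, beta=5.0))
```

Because the softmax value is a weighted average, it can never exceed the largest candidate Q-value, which is why bootstrapping from it tends to damp the upward bias that a hard maximum introduces.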