Funding: Supported by the National Science Foundation (U.S.A.) under Grant ECS-0355364
Abstract: This paper introduces a self-learning control approach based on approximate dynamic programming. Dynamic programming was introduced by Bellman in the 1950s for solving optimal control problems of nonlinear dynamical systems. Because of its high computational complexity, applications of dynamic programming have been limited to small, simple problems. The key step in finding approximate solutions to dynamic programming is to estimate the performance index. The optimal control signal can then be determined by minimizing (or maximizing) that performance index. Artificial neural networks are efficient tools for representing the performance index in dynamic programming. This paper assumes the use of neural networks both for estimating the performance index and for generating optimal control signals, thereby achieving optimal control through self-learning.
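The two steps named in the abstract, estimating the performance index and then choosing the control that minimizes it, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the scalar system x_next = A*x + B*u, the quadratic stage cost, the constants, and the function names. A one-parameter approximator J(x) ≈ w*x^2 fitted by least squares stands in for the neural network.

```python
# Fitted value iteration on a hypothetical scalar system x_next = A*x + B*u
# with stage cost Q*x^2 + R*u^2. All constants are illustrative assumptions.
A, B, Q, R = 1.0, 1.0, 1.0, 1.0


def greedy_control(w, x):
    """Control minimizing R*u^2 + w*(A*x + B*u)^2 (closed form for this quadratic case)."""
    return -(A * B * w) / (R + B * B * w) * x


def fit_performance_index(states, sweeps=200):
    """Estimate w in J(x) ~= w*x^2 by repeated Bellman backups and least squares."""
    w = 0.0
    for _ in range(sweeps):
        targets = []
        for x in states:
            u = greedy_control(w, x)       # minimize the current estimate
            x_next = A * x + B * u
            targets.append(Q * x * x + R * u * u + w * x_next * x_next)
        # One-parameter least squares: minimize sum over samples of (w*x^2 - target)^2.
        w = sum(t * x * x for t, x in zip(targets, states)) / sum(x ** 4 for x in states)
    return w


# Sample states on a grid, skipping zero so every sample carries information.
states = [i / 10.0 for i in range(-20, 21) if i != 0]
w_hat = fit_performance_index(states)
```

For this particular linear-quadratic setup the fitted weight can be checked against the scalar Riccati fixed point, which makes the sketch easy to validate; a nonlinear system would need a richer approximator and a numerical minimization over the control.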
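Abstract: To address layout planning and route optimization for recycling bins in urban residential areas, the paper first constructs a linear function relating the number of recycling bins in a residential area to population, recycling frequency, and recycling threshold, and builds a bilevel optimization model in which maximizing total recycling profit is the upper-level objective and minimizing transportation cost is the lower-level objective. Second, to solve the new NP-hard model, it designs a human learning optimization algorithm that incorporates a group learning operator and an adaptive selection strategy, and nests it with a tabu search algorithm to form a hybrid human learning optimization algorithm (HHLO). Third, using instances of different scales, the new algorithm is compared with the basic human learning optimization algorithm, a genetic algorithm, an adaptive particle swarm optimization algorithm, and the red-billed blue magpie algorithm, which verifies the feasibility of the model and the effectiveness of the algorithm. Finally, a sensitivity analysis on a case from Yangpu District, Shanghai, examines how recycling bin capacity, a time-of-use pricing strategy, and a zone-based pricing strategy affect the recycling center's total profit and resident satisfaction.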