Abstract
A task offloading scheme for multi-UAV-assisted Mobile Edge Computing (MEC) is studied, in which system energy consumption is minimized by jointly optimizing task partitioning, offloading association, UAV trajectories, and resource allocation. Because computing tasks are generated randomly and user mobility is unpredictable, the problem is not only a non-convex integer programming problem but also a long-term optimization problem that requires real-time decisions, which makes it difficult to solve with conventional offline algorithms. A task offloading method based on Multi-Agent Reinforcement Learning (MARL) is proposed. It adopts a centralized training and distributed execution framework, in which each agent makes real-time decisions from its own limited observation of the network state. Specifically, the problem is modeled as a Markov decision process, and the agents are trained with the multi-agent proximal policy optimization algorithm so that their policies improve through continued learning. In the Actor networks, a Beta distribution is used for policy sampling to fit the bounded, mixed action space. In the Critic networks, an attention mechanism is introduced at the input to improve the fitting of the state value function and accelerate convergence. Simulation results show that, compared with the benchmark schemes, the proposed method improves convergence speed by 10% to 30% and reduces the weighted energy consumption of users and UAVs by 22.5% to 31.6%.
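The use of a Beta distribution for bounded actions can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the softplus-plus-one parameterization, and the example action range (a task-splitting ratio capped at 0.8) are all assumptions made for illustration. The idea shown is that a Beta sample lies in (0, 1) and can be rescaled to any bounded interval without the clipping a Gaussian policy would need.

```python
import math
import random


def softplus(x):
    # Numerically stable log(1 + exp(x)); maps any real number to a positive one.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def beta_params(raw_alpha, raw_beta):
    # Assumed parameterization: adding 1 keeps alpha, beta > 1,
    # so the density is unimodal on (0, 1).
    return 1.0 + softplus(raw_alpha), 1.0 + softplus(raw_beta)


def beta_log_prob(x, alpha, beta):
    # Log-density of Beta(alpha, beta) at x in (0, 1).
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return log_norm + (alpha - 1.0) * math.log(x) + (beta - 1.0) * math.log(1.0 - x)


def sample_bounded_action(raw_alpha, raw_beta, low, high, rng):
    # raw_alpha and raw_beta stand in for two unconstrained Actor outputs (hypothetical).
    alpha, beta = beta_params(raw_alpha, raw_beta)
    x = rng.betavariate(alpha, beta)
    # Guard against the endpoints so the log-density stays finite.
    x = min(max(x, 1e-6), 1.0 - 1e-6)
    action = low + (high - low) * x
    # Change of variables for the affine rescaling to [low, high].
    log_prob = beta_log_prob(x, alpha, beta) - math.log(high - low)
    return action, log_prob


rng = random.Random(0)
action, log_prob = sample_bounded_action(0.3, -1.2, low=0.0, high=0.8, rng=rng)
print(action, log_prob)
```

The returned log-probability is the quantity a proximal policy optimization update would use in its probability ratio. The discrete parts of a mixed action space, such as the offloading association, would need a separate categorical head that this sketch does not cover.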
Author
LI Bin (李斌)
School of Computer Science, Nanjing University of Information Science & Technology, Nanjing 210044, China; Jiangsu Collaborative Innovation Center of Atmospheric Environment and Equipment Technology (CICAEET), Nanjing University of Information Science & Technology, Nanjing 210044, China
Source
Radio Engineering (《无线电工程》)
Peking University Core Journal (北大核心)
2023, No. 12, pp. 2731-2740 (10 pages)
Funding
National Natural Science Foundation of China (62101277)
Natural Science Foundation of Jiangsu Province (BK20200822)
About the author
LI Bin, male, born 1987, Ph.D., professor. Main research interests: UAV communications and edge computing.