Funding: supported by the National Natural Science Foundation of China (62301598).
Abstract: To tackle the challenges of intractable parameter tuning, significant computational expenditure, and imprecise model-driven sparse-based direction of arrival (DOA) estimation with array error (AE), this paper proposes a deep unfolded amplitude-phase error self-calibration network. First, a sparse-based DOA model with a convex array error restriction is established and solved via an alternating iterative minimization (AIM) algorithm. The algorithm is then unrolled into a deep network, the AE-AIM Network (AE-AIM-Net), whose parameters are all optimized through multi-task learning on a constructed complete dataset. Theoretical analysis and simulation results suggest that the proposed unfolded network achieves lower computational costs than typical sparse recovery methods, while maintaining excellent estimation performance even in the presence of array amplitude-phase errors.
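The abstract only names the AIM algorithm, so the following is a minimal sketch of one plausible alternating loop for the model y = diag(g) A s: an ISTA step on the sparse spectrum s, then a projected least-squares step on the sensor gain/phase error vector g, with the projection playing the role of the convex error restriction. All update rules, step sizes, and the bound `eps` are assumptions for illustration, not the paper's exact method.

```python
import numpy as np

def soft_threshold(x, t):
    # Complex soft-thresholding (proximal operator of the l1 norm)
    mag = np.abs(x)
    scale = np.where(mag > t, 1.0 - t / np.maximum(mag, 1e-12), 0.0)
    return scale * x

def project_error(g, eps):
    # Project the gain/phase error onto the convex set |g_i - 1| <= eps
    d = g - 1.0
    mag = np.abs(d)
    d = np.where(mag > eps, eps * d / np.maximum(mag, 1e-12), d)
    return 1.0 + d

def aim_doa(y, A, n_iter=50, lam=0.01, eps=0.2):
    """Hypothetical AIM loop for y = diag(g) @ A @ s with sparse s."""
    m, n = A.shape
    s = np.zeros(n, dtype=complex)
    g = np.ones(m, dtype=complex)              # start from an error-free array
    step = 1.0 / (np.linalg.norm(A, 2) ** 2)   # ISTA step from the spectral norm
    for _ in range(n_iter):
        # s-step: one proximal gradient step with the current calibrated manifold
        Ag = g[:, None] * A
        s = soft_threshold(s + step * Ag.conj().T @ (y - Ag @ s), lam * step)
        # g-step: element-wise least squares g_i = y_i / (A s)_i,
        # then projection back onto the convex error restriction
        As = A @ s
        safe = np.abs(As) > 1e-8
        g_ls = np.where(safe, y / np.where(safe, As, 1.0), g)
        g = project_error(g_ls, eps)
    return s, g
```

Unrolling this loop into a network, as the paper does, would amount to fixing `n_iter` as the layer count and learning per-layer values of the step size and threshold instead of hand-tuning them.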
Abstract: In radar cooperative anti-jamming decision-making, rewards are sparse, which makes reinforcement learning algorithms difficult to converge and cooperative training difficult. To address this problem, a hierarchical multi-agent deep deterministic policy gradient (H-MADDPG) algorithm is proposed. It improves the convergence of the training process by accumulating sparse rewards, and borrows the idea of the Harvard architecture to store each agent's training experience separately, eliminating confusion in experience replay. In simulations of two-radar and four-radar networks under a strong jamming condition, the radar detection success rate is improved by 15% and 30%, respectively, over the multi-agent deep deterministic policy gradient (MADDPG) algorithm.
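The "Harvard architecture" idea in the abstract (separate storage per data stream) can be sketched as one replay buffer per agent, so that sampled transitions from different radars never interleave. The class name and interface below are hypothetical; the paper's actual replay structure may differ.

```python
import random
from collections import deque

class PerAgentReplay:
    """Per-agent experience storage: each radar agent writes to and
    samples from its own buffer, avoiding cross-agent replay confusion."""

    def __init__(self, n_agents, capacity=10000):
        # One bounded FIFO buffer per agent (oldest transitions evicted first)
        self.buffers = [deque(maxlen=capacity) for _ in range(n_agents)]

    def store(self, agent_id, transition):
        self.buffers[agent_id].append(transition)

    def sample(self, agent_id, batch_size):
        # Uniform sample from this agent's buffer only, clipped to its size
        buf = self.buffers[agent_id]
        return random.sample(list(buf), min(batch_size, len(buf)))
```

A shared buffer would mix transitions generated under different agents' policies; keeping the streams physically separate is the replay-side analogue of the Harvard architecture's split instruction/data memories.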
Abstract: Point-of-interest (POI) recommendation alleviates users' difficulty of choice and increases the revenue of location service providers and merchants, making it a research hotspot in location-based social networks. Existing surveys lack a review of countermeasures to data problems, coverage of state-of-the-art algorithms, and comparative experiments on algorithm performance. This paper therefore presents a systematic survey of the field from three aspects: data problems, algorithmic techniques, and comparative experiments. From the data perspective, three major problems, namely data sparsity, data dependency, and data privacy, are identified along with their corresponding solutions. From the technique perspective, important existing work is classified into five categories: matrix factorization, encoders, graph neural networks, attention mechanisms, and generative models, whose strengths and weaknesses are compared. From the performance perspective, recall and precision, the two most frequently used evaluation metrics, are adopted to experimentally evaluate five representative algorithms. Finally, the challenges and future research directions of the field are pointed out.
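The survey's two evaluation metrics have a standard top-k form, shown below as a small sketch for a single user; the paper's experiments presumably average these over all test users.

```python
def recall_precision_at_k(recommended, visited, k):
    """Recall@k and Precision@k for one user.

    recommended: ranked list of POI ids produced by the recommender
    visited:     set of POI ids the user actually visited (ground truth)
    """
    top_k = recommended[:k]
    hits = len(set(top_k) & set(visited))          # relevant items retrieved
    recall = hits / len(visited) if visited else 0.0
    precision = hits / k
    return recall, precision
```

Recall@k asks how much of the user's true behavior the top-k list covers, while Precision@k asks how much of the list is correct; reporting both is why the survey can compare the five representative algorithms fairly across datasets of different density.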
Funding: supported by the Aeronautical Science Foundation (2017ZC53033).
Abstract: Unmanned aerial vehicle (UAV) swarm technology is one of the research hotspots of recent years. With the continuous improvement of UAV autonomous intelligence, swarm technology will become one of the main trends of future UAV development. This paper studies the behavior decision-making process of a UAV swarm rendezvous task based on the double deep Q-network (DDQN) algorithm. A guided reward function is designed to effectively solve the convergence problem caused by sparse returns in deep reinforcement learning (DRL) for long-duration tasks. The concept of a temporary storage area is also proposed, which optimizes the memory replay unit of the traditional DDQN algorithm, improves its convergence speed, and accelerates training. Unlike traditional task environments, this paper establishes a continuous state-space task environment model to improve the verification process of the UAV task environment. Based on the DDQN algorithm, the collaborative tasks of the UAV swarm are trained in different task scenarios. The experimental results validate that the DDQN algorithm efficiently trains the UAV swarm to complete the given collaborative tasks while meeting the swarm's requirements for centralization and autonomy and improving the intelligence of collaborative task execution. Simulation results show that, after training, the proposed UAV swarm carries out the rendezvous task well, with a mission success rate of 90%.
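The abstract's two DDQN modifications can be sketched as follows: a guided (shaped) reward that pays out per-step progress toward the rendezvous point instead of only a sparse terminal reward, and a temporary storage area that stages an episode's transitions before committing them to the main replay memory. The shaping terms, constants, and the success-based commit rule below are illustrative assumptions, not the paper's exact design.

```python
import random
from collections import deque

def guided_reward(prev_dist, new_dist, reached, step_cost=0.01, goal_bonus=1.0):
    """Shaped reward: reward each step by the distance closed toward the
    rendezvous point, minus a small time penalty, plus a terminal bonus."""
    r = (prev_dist - new_dist) - step_cost
    if reached:
        r += goal_bonus
    return r

class TemporaryStorageReplay:
    """Replay memory with a temporary storage area: the current episode's
    transitions are staged and only committed to the main memory at
    episode end (here, only if the episode succeeded)."""

    def __init__(self, capacity=50000):
        self.main = deque(maxlen=capacity)   # long-term replay memory
        self.temp = []                       # scratch area for current episode

    def stage(self, transition):
        self.temp.append(transition)

    def end_episode(self, success):
        if success:
            self.main.extend(self.temp)      # commit useful experience
        self.temp.clear()                    # discard the rest

    def sample(self, batch_size):
        return random.sample(list(self.main), min(batch_size, len(self.main)))
```

The shaped reward densifies the learning signal for the long-duration rendezvous task, while the staging buffer keeps the main replay memory from being flooded with low-value transitions, which is one way the convergence speed-up described in the abstract could arise.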