Journal Articles
70,094 articles found.
1. Knowledge transfer in multi-agent reinforcement learning with incremental number of agents (Cited: 4)
Authors: LIU Wenzhang, DONG Lu, LIU Jian, SUN Changyin. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2022, Issue 2, pp. 447-460 (14 pages).
In this paper, the reinforcement learning method for cooperative multi-agent systems (MAS) with an incremental number of agents is studied. Existing multi-agent reinforcement learning approaches deal with a MAS with a specific number of agents and can learn well-performing policies. However, if the number of agents increases, the previously learned policies may not perform well in the current scenario. The new agents need to learn from scratch to find optimal policies together with the others, which may slow down the learning speed of the whole team. To solve this problem, we propose a new algorithm that takes full advantage of the historical knowledge learned before and transfers it from the previous agents to the new agents. Since the previous agents have been trained well in the source environment, they are treated as teacher agents in the target environment. Correspondingly, the new agents are called student agents. To enable the student agents to learn from the teacher agents, we first modify the input nodes of the teacher agents' networks to adapt to the current environment. Then, the teacher agents take the observations of the student agents as input and output advised actions and values as supervising information. Finally, the student agents combine the reward from the environment and the supervising information from the teacher agents, and learn the optimal policies with modified loss functions. By taking full advantage of the knowledge of the teacher agents, the search space for the student agents is reduced significantly, which accelerates the learning speed of the holistic system. The proposed algorithm is verified in several multi-agent simulation environments, and its efficiency is demonstrated by the experimental results.
Keywords: knowledge transfer; multi-agent reinforcement learning (MARL); new agents
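As a rough illustration of the teacher-student transfer described above, the sketch below combines a standard TD loss with a distillation term toward the (input-adapted) teacher's advised values; the weighting `beta` and all function names are illustrative assumptions, not the paper's notation.

```python
import torch
import torch.nn.functional as F

def student_loss(q_student, q_teacher, actions, rewards, q_next, gamma=0.99, beta=0.5):
    """Sketch of a student-agent loss mixing TD learning with teacher supervision.

    q_student: (B, A) Q-values from the student network
    q_teacher: (B, A) Q-values from the input-adapted teacher network
    actions:   (B,)   actions actually taken (long tensor)
    rewards:   (B,)   environment rewards
    q_next:    (B, A) target-network Q-values for the next observations
    beta:      weight of the supervising (distillation) term -- assumed, not from the paper
    """
    q_taken = q_student.gather(1, actions.unsqueeze(1)).squeeze(1)
    td_target = rewards + gamma * q_next.max(dim=1).values.detach()
    td_loss = F.mse_loss(q_taken, td_target)
    # Soft imitation of the teacher's advised action distribution.
    distill_loss = F.kl_div(F.log_softmax(q_student, dim=1),
                            F.softmax(q_teacher.detach(), dim=1),
                            reduction="batchmean")
    return td_loss + beta * distill_loss
```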
2. Cooperative multi-target hunting by unmanned surface vehicles based on multi-agent reinforcement learning (Cited: 2)
Authors: Jiawei Xia, Yasong Luo, Zhikun Liu, Yalun Zhang, Haoran Shi, Zhong Liu. Defence Technology (防务技术) (SCIE, EI, CAS, CSCD), 2023, Issue 11, pp. 80-94 (15 pages).
To solve the problem of multi-target hunting by an unmanned surface vehicle (USV) fleet, a hunting algorithm based on multi-agent reinforcement learning is proposed. Firstly, the hunting environment and kinematic model without boundary constraints are built, and the criteria for successful target capture are given. Then, the cooperative hunting problem of a USV fleet is modeled as a decentralized partially observable Markov decision process (Dec-POMDP), and a distributed partially observable multi-target hunting proximal policy optimization (DPOMH-PPO) algorithm applicable to USVs is proposed. In addition, an observation model, a reward function and an action space applicable to multi-target hunting tasks are designed. To deal with the dynamic change of the observational feature dimension input by partially observable systems, a feature embedding block is proposed: by combining the two feature compression methods of column-wise max pooling (CMP) and column-wise average pooling (CAP), the observational feature encoding is established. Finally, the centralized training and decentralized execution framework is adopted to complete the training of the hunting strategy. Each USV in the fleet shares the same policy and performs actions independently. Simulation experiments have verified the effectiveness of the DPOMH-PPO algorithm in test scenarios with different numbers of USVs. Moreover, the advantages of the proposed model are comprehensively analyzed from the aspects of algorithm performance, migration effect across task scenarios and self-organization capability after being damaged, and the potential deployment and application of DPOMH-PPO in real environments is verified.
Keywords: unmanned surface vehicles; multi-agent deep reinforcement learning; cooperative hunting; feature embedding; proximal policy optimization
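The feature embedding block built from column-wise max pooling (CMP) and column-wise average pooling (CAP) can be pictured in a few lines; the function below is a minimal, assumed implementation that maps a variable number of observed targets to a fixed-length encoding.

```python
import torch

def cmp_cap_embedding(entity_feats):
    """entity_feats: (N, d) features of a variable number N of observed targets.
    Column-wise max pooling (CMP) and column-wise average pooling (CAP) compress the
    variable-length input into a fixed 2d-dimensional observation encoding."""
    cmp = entity_feats.max(dim=0).values   # (d,) column-wise maximum
    cap = entity_feats.mean(dim=0)         # (d,) column-wise average
    return torch.cat([cmp, cap], dim=-1)   # (2d,) fixed-size encoding
```

Because both pooling operations are permutation-invariant and independent of N, the encoding dimension stays constant as targets appear or disappear.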
3. Collaborative multi-agent reinforcement learning based on experience propagation (Cited: 5)
Authors: Min Fang, Frans C.A. Groen. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2013, Issue 4, pp. 683-689 (7 pages).
For multi-agent reinforcement learning in Markov games, knowledge extraction and sharing are key research problems. State list extracting means calculating the optimal shared state path from state trajectories with cycles. A state list extracting algorithm checks the cyclic state lists of the current state in the state trajectory, condensing the optimal action set of the current state. By reinforcing the selected optimal action, the action policy of cyclic states is optimized gradually. The extracted state lists are repeatedly learned and used as experience knowledge shared by the team. Agents speed up the rate of convergence through experience sharing. Predator-prey competition games are used for the experiments. The experimental results prove that the proposed algorithms overcome the lack of experience in the initial stage, speed up learning and improve performance.
Keywords: multi-agent Q-learning; state list extracting; experience sharing
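A minimal sketch of the cycle-collapsing idea behind state list extracting is given below; it only removes loops from a state trajectory and does not reproduce the paper's condensation of optimal action sets.

```python
def extract_state_list(trajectory):
    """Collapse cycles in a state trajectory: whenever a state reappears, the loop
    between the two occurrences is dropped, leaving a cycle-free shared state path."""
    path, seen = [], {}
    for s in trajectory:
        if s in seen:                 # cyclic state list detected
            del path[seen[s] + 1:]    # discard the loop, keep the first occurrence
            seen = {st: i for i, st in enumerate(path)}
        else:
            seen[s] = len(path)
            path.append(s)
    return path

# Example: the B -> C -> B loop is removed.
print(extract_state_list(["A", "B", "C", "B", "D"]))  # ['A', 'B', 'D']
```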
4. A single-task and multi-decision evolutionary game model based on multi-agent reinforcement learning (Cited: 4)
Authors: MA Ye, CHANG Tianqing, FAN Wenhui. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2021, Issue 3, pp. 642-657 (16 pages).
In the evolutionary game of the same task for groups, changes in game rules, personal interests, crowd size, and external supervision cause uncertain effects on individual decision-making and game results. Within the Markov decision framework, a single-task multi-decision evolutionary game model based on multi-agent reinforcement learning is proposed to explore the evolutionary rules in the process of a game. The model can improve the result of an evolutionary game and facilitate the completion of the task. First, based on multi-agent theory and to solve the existing problems in the original model, a negative feedback tax penalty mechanism is proposed to guide the strategy selection of individuals in the group. In addition, to evaluate the evolutionary game results of the group in the model, a calculation method for the group intelligence level is defined. Second, the Q-learning algorithm is used to improve the guiding effect of the negative feedback tax penalty mechanism. In the model, the selection strategy of the Q-learning algorithm is improved, and a bounded-rationality evolutionary game strategy is proposed based on the rules of evolutionary games and the bounded rationality of individuals. Finally, simulation results show that the proposed model can effectively guide individuals to choose cooperation strategies that are beneficial to task completion and stability under different negative feedback factor values and different group sizes, thereby improving the group intelligence level.
Keywords: multi-agent reinforcement learning; evolutionary game; Q-learning
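The negative feedback tax penalty can be pictured as a payoff adjustment that grows with the group's level of defection; the functional form and the `tax_factor` below are purely illustrative assumptions, not the paper's mechanism.

```python
def taxed_payoff(payoff, defection_rate, tax_factor=0.5):
    """Illustrative negative-feedback tax: the larger the fraction of defectors in the
    group, the heavier the tax levied on an individual's payoff, which nudges strategy
    selection toward cooperation and a higher group intelligence level."""
    return payoff - tax_factor * defection_rate * payoff
```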
5. Open-loop and closed-loop D^(α)-type iterative learning control for fractional-order linear multi-agent systems with state-delays (Cited: 1)
Authors: LI Bingqiang, LAN Tianyi, ZHAO Yiyun, LYU Shuaishuai. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2021, Issue 1, pp. 197-208 (12 pages).
This study focuses on implementing consensus tracking using both open-loop and closed-loop Dα-type iterative learning control (ILC) schemes for fractional-order multi-agent systems (FOMASs) with state delays. The desired trajectory is constructed by introducing a virtual leader, a fixed communication topology is considered, and only a subset of followers can access the desired trajectory. For each control scheme, one controller is designed for each agent individually. According to the tracking error between the agent and the virtual leader, and the tracking errors between the agent and neighboring agents during the last iteration (for the open-loop scheme) or the current run (for the closed-loop scheme), each controller continuously corrects the last control law through a combination of the communication weights in the topology to obtain the ideal control law. Through rigorous analysis, sufficient conditions for both control schemes are established to ensure that all agents achieve asymptotically consistent output along the iteration axis within a finite time interval. Numerical simulation results demonstrate the effectiveness of the control schemes and provide some meaningful comparisons.
Keywords: multi-agent system; fractional-order; consensus control; iterative learning control; virtual leader; state-delay
6. UAV cooperative air combat maneuver decision based on multi-agent reinforcement learning (Cited: 25)
Authors: ZHANG Jiandong, YANG Qiming, SHI Guoqing, LU Yi, WU Yong. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2021, Issue 6, pp. 1421-1438 (18 pages).
In order to improve the autonomous ability of unmanned aerial vehicles (UAVs) to implement air combat missions, many artificial intelligence-based autonomous air combat maneuver decision-making studies have been carried out, but these studies are often aimed at individual decision-making in 1v1 scenarios, which rarely occur in actual air combat. Based on research on 1v1 autonomous air combat maneuver decisions, this paper builds a multi-UAV cooperative air combat maneuver decision model based on multi-agent reinforcement learning. Firstly, a bidirectional recurrent neural network (BRNN) is used to achieve communication between UAV individuals, and the multi-UAV cooperative air combat maneuver decision model under the actor-critic architecture is established. Secondly, by combining target allocation and air combat situation assessment, the tactical goal of the formation is merged with the reinforcement learning goal of every UAV, and a cooperative tactical maneuver policy is generated. The simulation results prove that the multi-UAV cooperative air combat maneuver decision model established in this paper can obtain the cooperative maneuver policy through reinforcement learning, and the cooperative maneuver policy can guide the UAVs to obtain the overall situational advantage and defeat the opponents under tactical cooperation.
Keywords: decision-making; air combat maneuver; cooperative air combat; reinforcement learning; recurrent neural network
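The BRNN-based communication between UAVs can be sketched as a bidirectional GRU run along the formation dimension; the module below is an assumed minimal version of the actor side (the critic is omitted), with layer sizes chosen arbitrarily.

```python
import torch
import torch.nn as nn

class BRNNComm(nn.Module):
    """Bidirectional recurrent 'communication channel' over a UAV formation: each UAV's
    local observation is encoded, passed through a bidirectional GRU along the formation
    dimension, and the fused hidden states feed per-UAV actor heads."""
    def __init__(self, obs_dim, hidden_dim, action_dim):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, hidden_dim)
        self.brnn = nn.GRU(hidden_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.actor = nn.Linear(2 * hidden_dim, action_dim)

    def forward(self, obs):                     # obs: (batch, n_uav, obs_dim)
        h = torch.relu(self.encoder(obs))
        fused, _ = self.brnn(h)                 # (batch, n_uav, 2*hidden_dim)
        return self.actor(fused)                # one action logit vector per UAV
```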
7. UAV Frequency-based Crowdsensing Using Grouping Multi-agent Deep Reinforcement Learning
Authors: Cui ZHANG, En WANG, Funing YANG, Yongjian YANG, Nan JIANG. 计算机科学 (Computer Science) (CSCD, PKU Core), 2023, Issue 2, pp. 57-68 (12 pages).
Mobile CrowdSensing (MCS) is a promising sensing paradigm that recruits users to cooperatively perform sensing tasks. Recently, unmanned aerial vehicles (UAVs), as powerful sensing devices, have been used to replace user participation and carry out special tasks such as epidemic monitoring and earthquake rescue. In this paper, we focus on scheduling UAVs to sense task Points-of-Interest (PoIs) with different frequency coverage requirements. To accomplish the sensing task, the scheduling strategy needs to consider the coverage requirement, geographic fairness and energy charging simultaneously. We consider the complex interaction among UAVs and propose a grouping multi-agent deep reinforcement learning approach (G-MADDPG) to schedule UAVs distributively. G-MADDPG groups all UAVs into teams by a distance-based clustering algorithm (DCA) and then regards each team as an agent. In this way, G-MADDPG solves the problem that traditional MADDPG takes too long to converge when the number of UAVs is large, and the trade-off between training time and result accuracy can be controlled flexibly by adjusting the number of teams. Extensive simulation results show that our scheduling strategy performs better than three baselines and is flexible in balancing training time and result accuracy.
Keywords: UAV; crowdsensing; frequency coverage; grouping multi-agent deep reinforcement learning
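The distance-based clustering step that forms UAV teams can be approximated by a plain k-means over UAV positions; the sketch below is a stand-in for the paper's DCA, with the iteration count and seeding as assumptions.

```python
import numpy as np

def distance_based_clustering(positions, n_teams):
    """Group UAVs into teams by spatial proximity (a simple k-means stand-in for DCA).
    Each resulting team is then treated as one MADDPG agent, so training cost grows
    with the number of teams rather than the number of UAVs."""
    rng = np.random.default_rng(0)
    centers = positions[rng.choice(len(positions), n_teams, replace=False)]
    for _ in range(50):  # fixed iteration budget, illustrative
        dists = np.linalg.norm(positions[:, None] - centers[None], axis=2)
        labels = np.argmin(dists, axis=1)
        centers = np.array([positions[labels == k].mean(axis=0) if np.any(labels == k)
                            else centers[k] for k in range(n_teams)])
    return labels  # team index per UAV
```

Increasing `n_teams` moves the scheme back toward per-UAV agents (slower training, finer control); decreasing it does the opposite, which mirrors the training-time/accuracy trade-off described in the abstract.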
8. Tactical reward shaping for large-scale combat by multi-agent reinforcement learning
Authors: DUO Nanxun, WANG Qinzhao, LYU Qiang, WANG Wei. Journal of Systems Engineering and Electronics (CSCD), 2024, Issue 6, pp. 1516-1529 (14 pages).
Future unmanned battles urgently require intelligent combat policies, and multi-agent reinforcement learning offers a promising solution. However, due to the complexity of combat operations and the large size of the combat group, this task suffers from the credit assignment problem more than other reinforcement learning tasks. This study uses reward shaping to relieve the credit assignment problem and improve policy training for the new generation of large-scale unmanned combat operations. We first prove that multiple reward shaping functions do not change the Nash equilibrium in stochastic games, providing theoretical support for their use. According to the characteristics of combat operations, we propose tactical reward shaping (TRS), which comprises maneuver shaping advice and threat assessment-based attack shaping advice. Then, we investigate the effects of different types and combinations of shaping advice on combat policies through experiments. The results show that TRS improves both the efficiency and attack accuracy of combat policies, with the combination of maneuver reward shaping advice and ally-focused attack shaping advice achieving the best performance compared with the baseline strategy.
Keywords: deep reinforcement learning; multi-agent reinforcement learning; multi-agent combat; unmanned battle; reward shaping
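Potential-based shaping of the kind TRS builds on can be illustrated as follows; the potentials (negative distance for maneuver advice, target threat for attack advice) and the weights are assumptions for illustration, not the paper's exact definitions.

```python
def shaped_reward(base_reward, dist_prev, dist_curr, threat_prev, threat_curr,
                  gamma=0.99, w_maneuver=0.1, w_attack=0.1):
    """Illustrative tactical reward shaping with potential-based terms F = gamma*Phi(s') - Phi(s),
    which keeps the underlying game's equilibrium unchanged.
    Maneuver advice uses Phi = -distance_to_target; attack advice uses Phi = threat level."""
    maneuver_advice = gamma * (-dist_curr) - (-dist_prev)    # reward closing the distance
    attack_advice = gamma * threat_curr - threat_prev        # reward engaging higher-threat targets
    return base_reward + w_maneuver * maneuver_advice + w_attack * attack_advice
```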
9. Multi-agent reinforcement learning based on policies of global objective
Authors: 张化祥, 黄上腾. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2005, Issue 3, pp. 676-681 (6 pages).
In general-sum games, taking all agents' collective rationality into account, we define the agents' global objective and propose a novel multi-agent reinforcement learning (RL) algorithm based on a global policy. In each learning step, all agents commit to selecting the global policy to achieve the global goal. We prove that this learning algorithm converges under certain restrictions on the stage games of the learned Q values, and show that it has considerably lower computational time complexity than previously developed multi-agent learning algorithms for general-sum games. An example is analyzed to show the algorithm's merits.
Keywords: Markov games; reinforcement learning; collective rationality; policy
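Selecting the global policy amounts to picking the joint action that maximizes the sum of all agents' Q-values; a brute-force sketch (exponential in the number of agents, so only workable for small action sets) is shown below under the assumption that each Q-table is indexed as `q[state][action]`.

```python
import itertools

def global_policy(q_tables, state, action_sets):
    """Return the joint action maximizing the *sum* of all agents' Q-values
    (collective rationality), rather than each agent maximizing its own Q alone."""
    best_joint, best_val = None, float("-inf")
    for joint in itertools.product(*action_sets):
        val = sum(q[state][a] for q, a in zip(q_tables, joint))
        if val > best_val:
            best_joint, best_val = joint, val
    return best_joint
```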
10. Cooperative Iterative Learning Control of Linear Multi-agent Systems with a Dynamic Leader under Directed Topologies (Cited: 1)
Authors: PENG Zhou-Hua, WANG Dan, WANG Hao, WANG Wei. 自动化学报 (Acta Automatica Sinica) (EI, CSCD, PKU Core), 2014, Issue 11, pp. 2595-2601 (7 pages).
Keywords: iterative learning controller; Lyapunov-Krasovskii functional; multi-agent systems; leader; linear; multi-agent systems; output information; unknown input
11. A Boltzmann-optimized Q-learning handover control algorithm for high-speed railways (Cited: 3)
Authors: 陈永, 康婕. 控制理论与应用 (Control Theory & Applications) (PKU Core), 2025, Issue 4, pp. 688-694 (7 pages).
To address the low handover success rate in 5G-R high-speed railway handover caused by fixed handover thresholds and the neglect of co-channel interference, ping-pong handover and other effects, a handover control algorithm based on Boltzmann-optimized Q-learning is proposed. First, a Q-table indexed by train position and action is designed, and the reward function of the Q-learning algorithm is constructed by comprehensively considering ping-pong handover, bit error rate and other factors. Then, a Boltzmann search strategy is proposed to optimize action selection and improve the convergence performance of the handover algorithm. Finally, the Q-table is updated with the influence of base-station co-channel interference taken into account, yielding the handover decision parameters that control handover execution. Simulation results show that, at different running speeds and in different operating scenarios, the improved algorithm effectively increases the handover success rate compared with traditional algorithms while satisfying the quality of service (QoS) requirements of wireless communication.
Keywords: handover; 5G-R; Q-learning algorithm; Boltzmann optimization strategy
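The Boltzmann (softmax) action selection at the core of the improved search strategy can be sketched as below; the temperature value is illustrative.

```python
import numpy as np

def boltzmann_action(q_row, temperature=1.0):
    """Boltzmann (softmax) exploration over the Q-values of the current train position:
    higher-valued handover actions are sampled more often, but lower-valued ones keep a
    nonzero probability, which smooths early exploration compared with epsilon-greedy."""
    prefs = np.asarray(q_row, dtype=float) / temperature
    prefs -= prefs.max()                      # numerical stability
    probs = np.exp(prefs) / np.exp(prefs).sum()
    return np.random.choice(len(q_row), p=probs)
```

In practice the temperature is often decayed over training so the policy gradually shifts from exploration to exploitation.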
12. Green mobile edge computing task offloading strategy based on MDP and Q-learning
Authors: 赵宏伟, 吕盛凱, 庞芷茜, 马子涵, 李雨. 河南理工大学学报(自然科学版) (Journal of Henan Polytechnic University (Natural Science)) (PKU Core), 2025, Issue 5, pp. 9-16 (8 pages).
Objective: To achieve carbon neutrality in manufacturing-oriented industrial Internet enterprises such as automobile and air-conditioner makers, edge computing task offloading is used to handle the offloading of production equipment tasks, reducing the central load on servers as well as the energy consumption and carbon emissions of data centers. Methods: A green edge computing task offloading strategy based on the Markov decision process (MDP) and Q-learning is proposed. Considering constraints such as computing frequency, transmission power and carbon emissions, and based on a cloud-edge-end collaborative computing model, the carbon-emission optimization problem is formulated as a mixed-integer linear programming model, which is solved with MDP and Q-learning. The convergence, carbon emissions and total latency are compared against a random allocation algorithm, the Q-learning algorithm and the SARSA (state action reward state action) algorithm. Results: Compared with existing offloading strategies, the task scheduling algorithm of the new strategy converges 5% and 2% faster than the SARSA and Q-learning algorithms, respectively, showing better convergence; the system carbon-emission cost is 8% and 22% lower than that of the Q-learning and SARSA algorithms, respectively; with respect to the number of terminals, the new strategy achieves reductions of 6% and 7% compared with the Q-learning and SARSA algorithms, respectively; and the total system computing latency is clearly lower than that of the other algorithms, 27%, 14% and 22% lower than the random allocation, Q-learning and SARSA algorithms, respectively. Conclusion: The strategy can reasonably optimize task offloading and resource allocation, balance latency against energy consumption, and reduce the system's carbon emissions.
Keywords: carbon emissions; edge computing; reinforcement learning; Markov decision process; task offloading
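A minimal tabular Q-learning step with a carbon-aware reward conveys the idea; the cost weights below are illustrative and not taken from the paper.

```python
import numpy as np

def q_update(Q, state, action, cost_delay, cost_energy, cost_carbon, next_state,
             alpha=0.1, gamma=0.9, w=(0.4, 0.3, 0.3)):
    """One tabular Q-learning step for an offloading decision.
    Q is a (n_states, n_actions) array; the reward is the negative weighted sum of
    delay, energy and carbon-emission costs (weights w are assumed, not the paper's)."""
    reward = -(w[0] * cost_delay + w[1] * cost_energy + w[2] * cost_carbon)
    td_target = reward + gamma * np.max(Q[next_state])
    Q[state, action] += alpha * (td_target - Q[state, action])
    return Q
```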
13. Fault-observer-based iterative learning model predictive controller for trajectory tracking of hypersonic vehicles (Cited: 2)
Authors: CUI Peng, GAO Changsheng, AN Ruoming. Journal of Systems Engineering and Electronics, 2025, Issue 3, pp. 803-813 (11 pages).
This work proposes an iterative learning model predictive control (ILMPC) approach based on an adaptive fault observer (FOBILMPC) for fault-tolerant control and trajectory tracking of air-breathing hypersonic vehicles. To obtain the control input, this online control law makes use of model predictive control (MPC) built on the concept of iterative learning control (ILC). By using offline data to reduce the errors of the linearized model, the strategy can effectively increase the robustness of the control system and ensure that disturbances are suppressed. An adaptive fault observer is designed on top of the proposed ILMPC approach to enhance overall fault tolerance by estimating and compensating for the actuator disturbance and fault degree. During the derivation, a linearized model of the longitudinal dynamics is established. Numerical simulations demonstrate that the proposed ILMPC approach decreases the tracking error and speeds up convergence compared with the offline controller, making it a promising candidate for the design of hypersonic vehicle control systems.
Keywords: hypersonic vehicle; actuator fault; tracking control; iterative learning control (ILC); model predictive control (MPC); fault observer
14. Graded density impactor design via machine learning and numerical simulation: achieving controllable stress and strain rate (Cited: 1)
Authors: Yahui Huang, Ruizhi Zhang, Shuaixiong Liu, Jian Peng, Yong Liu, Han Chen, Jian Zhang, Guoqiang Luo, Qiang Shen. Defence Technology (防务技术), 2025, Issue 9, pp. 262-273 (12 pages).
The graded density impactor (GDI) dynamic loading technique is crucial for acquiring the dynamic physical property parameters of materials used in weapons. The accuracy and timeliness of GDI structural design are key to achieving controllable stress and strain-rate loading. In this study, we have, for the first time, combined one-dimensional fluid simulation software with machine learning methods. We first elucidated the mechanisms by which GDI structures control stress and strain rates. Subsequently, we constructed a machine learning model to create a structure-property response surface. The results show that altering the loading velocity and interlayer thickness has a pronounced regulatory effect on stress and strain rates. In contrast, the impedance distribution index and target thickness have less significant effects on stress regulation, although there is a matching relationship between target thickness and interlayer thickness. Compared with traditional design methods, the machine learning approach offers a 10^4-10^5-fold increase in efficiency and the potential to reach a global optimum, holding promise for guiding the design of GDIs.
Keywords: machine learning; numerical simulation; graded density impactor; controllable stress-strain rate loading; response surface methodology
15. Real-Time Smart Meter Abnormality Detection Framework via End-to-End Self-Supervised Time-Series Contrastive Learning with Anomaly Synthesis
Authors: WANG Yixin, LIANG Gaoqi, BI Jichao, ZHAO Junhua. 南方电网技术 (Southern Power System Technology) (PKU Core), 2025, Issue 7, pp. 62-71, 89 (11 pages).
The rapid integration of Internet of Things (IoT) technologies is reshaping the global energy landscape by deploying smart meters that enable high-resolution consumption monitoring, two-way communication, and advanced metering infrastructure services. However, this digital transformation also exposes power systems to evolving threats, ranging from cyber intrusions and electricity theft to device malfunctions, and the unpredictable nature of these anomalies, coupled with the scarcity of labeled fault data, makes real-time detection exceptionally challenging. To address these difficulties, a real-time decision support framework for smart meter abnormality detection is presented that leverages rolling time windows and two self-supervised contrastive learning modules. The first module synthesizes diverse negative samples to overcome the lack of labeled anomalies, while the second captures intrinsic temporal patterns for enhanced contextual discrimination. The end-to-end framework continuously updates its model with rolling meter data to deliver timely identification of emerging abnormal behaviors in evolving grids. Extensive evaluations on eight publicly available smart meter datasets covering seven diverse abnormal patterns demonstrate the effectiveness of the proposed framework, which achieves an average recall and F1 score above 0.85.
Keywords: abnormality detection; cyber-physical security; anomaly synthesis; contrastive learning; time series
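The contrastive module with synthesized anomalies can be sketched as an InfoNCE-style loss in which synthesized anomalous windows serve as negatives; the tensor shapes, names and temperature below are assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def contrastive_anomaly_loss(z_anchor, z_positive, z_synth_neg, temperature=0.1):
    """InfoNCE-style loss: embeddings of two views of the same normal load window are
    pulled together, while embeddings of synthesized anomalous windows are pushed away.
    All inputs are (B, d) embeddings and are L2-normalized here."""
    za = F.normalize(z_anchor, dim=1)
    zp = F.normalize(z_positive, dim=1)
    zn = F.normalize(z_synth_neg, dim=1)
    pos = (za * zp).sum(dim=1, keepdim=True) / temperature   # (B, 1) positive similarity
    neg = za @ zn.t() / temperature                           # (B, B) similarities to negatives
    logits = torch.cat([pos, neg], dim=1)
    labels = torch.zeros(len(za), dtype=torch.long)           # the positive sits at index 0
    return F.cross_entropy(logits, labels)
```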
16. Multi-QoS routing algorithm based on reinforcement learning for LEO satellite networks (Cited: 1)
Authors: ZHANG Yifan, DONG Tao, LIU Zhihui, JIN Shichao. Journal of Systems Engineering and Electronics, 2025, Issue 1, pp. 37-47 (11 pages).
Low Earth orbit (LEO) satellite networks exhibit distinct characteristics, e.g., limited resources at individual satellite nodes and dynamic network topology, which have brought many challenges for routing algorithms. To satisfy the quality of service (QoS) requirements of various users, it is critical to research efficient routing strategies that fully utilize satellite resources. This paper proposes a multi-QoS information optimized routing algorithm based on reinforcement learning for LEO satellite networks, which guarantees that services with high assurance demands are prioritized under limited satellite resources, while considering the load balancing performance of the satellite network for services with low assurance demands to ensure full and effective utilization of satellite resources. An auxiliary path search algorithm is proposed to accelerate the convergence of the satellite routing algorithm. Simulation results show that the generated routing strategy can promptly process and fully meet the QoS demands of high-assurance services while effectively improving the load balancing performance of the links.
Keywords: low Earth orbit (LEO) satellite network; reinforcement learning; multi-quality of service (QoS); routing algorithm
17. FedCLCC: A personalized federated learning algorithm for edge cloud collaboration based on contrastive learning and conditional computing
Authors: Kangning Yin, Xinhui Ji, Yan Wang, Zhiguo Wang. Defence Technology (防务技术), 2025, Issue 1, pp. 80-93 (14 pages).
Federated learning (FL) is a distributed machine learning paradigm for edge cloud computing. FL can facilitate data-driven decision-making in tactical scenarios, effectively addressing both data volume and infrastructure challenges in edge environments. However, the diversity of clients in edge cloud computing presents significant challenges for FL. Personalized federated learning (pFL) has received considerable attention in recent years. One line of pFL work exploits both the global and the local information in the local model. Current pFL algorithms suffer from limitations such as slow convergence, catastrophic forgetting, and poor performance on complex tasks, and still fall significantly short of centralized learning. To achieve high pFL performance, we propose FedCLCC: Federated Contrastive Learning and Conditional Computing. The core of FedCLCC is the use of contrastive learning and conditional computing: contrastive learning measures feature representation similarity to adjust the local model, while conditional computing separates the global and local information and feeds each to its corresponding head for global and local handling. Our comprehensive experiments demonstrate that FedCLCC outperforms other state-of-the-art FL algorithms.
Keywords: federated learning; statistical heterogeneity; personalized model; conditional computing; contrastive learning
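The split between a shared global head and a personalized local head can be sketched as follows; the gating mechanism and layer sizes are assumptions for illustration rather than FedCLCC's actual architecture.

```python
import torch
import torch.nn as nn

class CLCCClient(nn.Module):
    """Sketch of the conditional-computing idea: a shared feature extractor feeds a
    'global' head (aggregated by the server) and a 'local' head (kept on the client);
    a learned gate decides, per sample, how much each head contributes."""
    def __init__(self, in_dim, feat_dim, n_classes):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU())
        self.global_head = nn.Linear(feat_dim, n_classes)   # shared with the server
        self.local_head = nn.Linear(feat_dim, n_classes)    # personalized, never uploaded
        self.gate = nn.Sequential(nn.Linear(feat_dim, 1), nn.Sigmoid())

    def forward(self, x):
        f = self.backbone(x)
        g = self.gate(f)                                     # per-sample mixing weight
        return g * self.global_head(f) + (1 - g) * self.local_head(f)
```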
18. Controlling update distance and enhancing fair trainable prototypes in federated learning under data and model heterogeneity
Authors: Kangning Yin, Zhen Ding, Xinhui Ji, Zhiguo Wang. Defence Technology (防务技术), 2025, Issue 5, pp. 15-31 (17 pages).
Heterogeneous federated learning (HtFL) has gained significant attention due to its ability to accommodate diverse models and data from distributed combat units. Prototype-based HtFL methods were proposed to reduce the high communication cost of transmitting model parameters; these methods share only class representatives between heterogeneous clients while maintaining privacy. However, existing prototype learning approaches fail to take the data distribution of clients into consideration, which results in suboptimal global prototype learning and insufficient client model personalization. To address these issues, we propose a fair trainable prototype federated learning (FedFTP) algorithm, which employs a fair sampling training prototype (FSTP) mechanism and a hyperbolic space constraint (HSC) mechanism to enhance the fairness and effectiveness of prototype learning on the server in heterogeneous environments. Furthermore, a local prototype stable update (LPSU) mechanism based on contrastive learning is proposed to maintain personalization while promoting global consistency. Comprehensive experimental results demonstrate that FedFTP achieves state-of-the-art performance in HtFL scenarios.
Keywords: heterogeneous federated learning; model heterogeneity; data heterogeneity; contrastive learning
19. Formation-containment control for nonholonomic multi-agent systems with a desired trajectory constraint
Authors: GU Xueqiang, LU Lina, XIANG Fengtao, ZHANG Wanpeng. Journal of Systems Engineering and Electronics, 2025, Issue 1, pp. 256-268 (13 pages).
This paper addresses the time-varying formation-containment (FC) problem for nonholonomic multi-agent systems with a desired trajectory constraint, where only the leaders can acquire information about the desired trajectory. The leaders execute a fixed time-varying formation template while tracking the desired trajectory, and the followers must converge into the convex hull spanned by the leaders. Firstly, the dynamic models of the nonholonomic systems are linearized into second-order dynamics. Then, based on the desired trajectory and the formation template, FC control protocols are proposed. Sufficient conditions to achieve FC are introduced, and an algorithm is proposed to resolve the control parameters by solving an algebraic Riccati equation. The system is shown to achieve FC, with the average position and velocity of the leaders converging asymptotically to the desired trajectory. Finally, the theoretical results are verified in simulations on a multi-agent system composed of virtual human individuals.
Keywords: multi-agent systems; nonholonomic dynamics; formation-containment (FC) control; desired trajectory constraint
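Resolving control parameters through an algebraic Riccati equation can be reproduced with standard tools; the sketch below uses double-integrator (second-order) dynamics and illustrative weight matrices, not the paper's actual values.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Second-order agent dynamics after linearization: x_dot = A x + B u (assumed form).
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)   # state weight (illustrative)
R = np.eye(1)   # input weight (illustrative)

P = solve_continuous_are(A, B, Q, R)   # algebraic Riccati equation
K = np.linalg.inv(R) @ B.T @ P         # feedback gain that would enter the FC protocol
print(K)
```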
20. A multi-target intention recognition model for drones based on transfer learning
Authors: WAN Shichang, LI Hao, HU Yahui, WANG Xuhua, CUI Siyuan. Journal of Systems Engineering and Electronics, 2025, Issue 5, pp. 1247-1258 (12 pages).
To address the neglect of joint operations and collaborative drone swarm operations in air combat target intent recognition, this paper proposes a transfer learning-based intention prediction model for drone formation targets in air combat. The model recognizes the intentions of multiple aerial targets by extracting the spatial features among the targets at each moment. Simulation results demonstrate that, compared with classical intention recognition models, the proposed model achieves higher accuracy in identifying the intentions of drone swarm targets in air combat scenarios.
Keywords: drone; intention recognition; deep learning