101,970 articles found
Expert weight optimization and multi-level consensus feedback decision-making based on Q-learning
1
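Authors: 杜秀丽, 程伟龙, 高星, 潘成胜, 吕亚娜. 《计算机应用研究》, PKU Core, 2026, Issue 2, pp. 420-426 (7 pages)
To address the low efficiency of consensus reaching and the imprecise weight allocation among large-scale heterogeneous expert groups in dynamic, complex multi-attribute decision-making environments, this paper proposes a Q-learning-based weight optimization and multi-level consensus feedback method that aims to raise the consensus level and decision quality. The method models the dynamic adjustment of expert weights as a Markov decision process, uses Q-learning to optimize the weights adaptively, and designs a multi-level consensus feedback mechanism covering four levels (attribute, alternative, expert, and group) so that disagreements from different sources can be identified and reconciled precisely. Experimental results show that the method significantly reduces the number of iterations needed to reach consensus, improves how closely the assigned weights match expert expertise, and yields more reliable alternative rankings, which verifies its robustness and computational efficiency for large-scale heterogeneous expert groups. The study indicates that the proposed method provides an effective consensus modeling and decision-support tool for complex multi-attribute group decision-making problems.
Keywords: group decision-making; Q-learning; multi-level consensus feedback; dynamic weight adjustment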
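As an illustration of the Q-learning step that the abstract above relies on, here is a minimal tabular sketch. It is not the authors' implementation: the state labels (coarse consensus levels), the action names (nudging one expert's weight), the reward value, and the hyperparameters `alpha` and `gamma` are all hypothetical choices made for the example.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(state, action)."""
    # Greedy estimate of the value of the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target: immediate reward plus discounted future value.
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical setup: states are consensus levels, actions adjust an expert's weight.
actions = ("raise_weight", "lower_weight", "keep_weight")
q = defaultdict(float)  # unseen (state, action) pairs start at 0.0

first = q_update(q, "low_consensus", "raise_weight", reward=1.0,
                 next_state="high_consensus", actions=actions)
second = q_update(q, "low_consensus", "raise_weight", reward=1.0,
                  next_state="high_consensus", actions=actions)
```

Repeating the same rewarded transition moves the estimate toward the target in steps of size `alpha`, which is the behaviour an adaptive weight-adjustment loop would build on.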
Machine learning-based investigation of uplift resistance in special-shaped shield tunnels using numerical finite element modeling (cited by: 1)
2
Authors: ZHANG Wengang, YE Wenyu, SUN Weixin, LIU Zhicheng, LI Zhengchuan. 《土木与环境工程学报(中英文)》, PKU Core, 2026, Issue 1, pp. 1-13 (13 pages)
The uplift resistance of the soil overlying shield tunnels significantly affects their anti-floating stability. However, research on uplift resistance for special-shaped shield tunnels is limited. This study combines numerical simulation with machine learning techniques to explore the issue. It summarizes special-shaped tunnel geometries and introduces a shape coefficient. Using the finite element software Plaxis3D, the study simulates six key parameters (shape coefficient, burial depth ratio, the tunnel's longest horizontal length, internal friction angle, cohesion, and soil submerged bulk density) that affect uplift resistance under different conditions. XGBoost and ANN methods were then used to analyze the feature importance of each parameter based on the numerical simulation results. The findings show that a tunnel shape closer to a circle leads to lower uplift resistance in the overlying soil, whereas the other parameters have the opposite effect. Furthermore, feature importance for uplift resistance decreases in the following order: burial depth ratio, internal friction angle, the tunnel's longest horizontal length, cohesion, soil submerged bulk density, and shape coefficient.
Keywords: special-shaped tunnel; shield tunnel; uplift resistance; numerical simulation; machine learning
PowerVLM: a power-domain vision-language large model based on Federated Learning and model pruning
3
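Authors: 欧阳旭东, 雒鹏鑫, 何绍洋, 崔艺林, 张中超, 闫云凤. 《全球能源互联网》, PKU Core, 2026, Issue 1, pp. 101-111 (11 pages)
The rapid development of smart grids has produced massive multimodal, multi-source heterogeneous power data, which challenges the ability of artificial intelligence models to perceive complex power scenarios; at the same time, the sensitivity of industry data and the need for privacy protection further limit the cross-scenario transferability of general-purpose models in the power domain. To address this, a power-domain vision-language large model based on Federated Learning (FL) and model pruning is proposed. First, a category-guided power vision-language large model, PowerVLM, is proposed, with a category-guided enhancement module designed to strengthen the model's understanding of, and question answering over, power image-text data. Second, an FL-based reinforcement learning training strategy is adopted to reduce the impact of inter-domain differences on model performance while preserving data privacy. Finally, a model pruning algorithm based on information resolution (信息决议) is proposed, which enables efficient fine-tuning with few trainable parameters. Experiments were conducted in three typical power scenarios (substation inspection, power transmission tasks, and work-safety supervision), and the results show that the method performs well on METEOR, BLEU, CIDEr, and other metrics for multimodal question answering in power scenarios, providing a new technical approach and methodological support for intelligent perception in power scenarios.
Keywords: smart grid; artificial intelligence; vision-language large model; Federated learning; model pruning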
Insights and analysis of machine learning for benzene hydrogenation to cyclohexene
4
Authors: SUN Chao, ZHANG Bin. 《燃料化学学报(中英文)》, PKU Core, 2026, Issue 2, pp. 133-139 (7 pages)
Cyclohexene is an important raw material in nylon production, and selective hydrogenation of benzene is a key method for preparing it. However, the Ru catalysts used in current industrial processes still face challenges, including high metal usage, high process costs, and low cyclohexene yield. This study combines existing literature data with machine learning methods to analyze the factors that influence benzene conversion, cyclohexene selectivity, and yield in the hydrogenation of benzene to cyclohexene, and it constructs predictive models based on the XGBoost and Random Forest algorithms. The analysis found that reaction time, Ru content, and space velocity are the key factors influencing cyclohexene yield, selectivity, and benzene conversion. Shapley Additive Explanations (SHAP) analysis and feature importance analysis further revealed the contribution of each variable to the reaction outcomes. Additionally, one million variable combinations were randomly generated using the Dirichlet distribution in an attempt to predict high-yield catalyst formulations. This paper provides new insights into the application of machine learning in heterogeneous catalysis and offers a reference for further research.
Keywords: machine learning; heterogeneous catalysis; hydrogenation of benzene; XGBoost
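The abstract above mentions generating candidate variable combinations from a Dirichlet distribution. One common way to draw such samples, sketched below with only the standard library, is to normalize independent Gamma draws so that the components are non-negative and sum to one. The number of components, the flat concentration parameters, and the sample count are assumptions for illustration; the paper's actual variables and settings are not given in the abstract.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one composition from a Dirichlet distribution via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Hypothetical: three components with a flat prior (all concentration parameters 1).
candidates = [sample_dirichlet([1.0, 1.0, 1.0], rng) for _ in range(5)]
```

Each entry of `candidates` can be read as a set of fractions that add up to 1, which is why this distribution suits composition-like variables.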
Design of catalysts for electrochemical nitric oxide reduction to ammonia based on stacked ensemble learning
5
Authors: DUAN Wenhao, ZHAO Yan, WANG Huanran, ZHU Yaming, LI Xianchun. 《燃料化学学报(中英文)》, PKU Core, 2026, Issue 4, pp. 128-139 (12 pages)
The electrocatalytic reduction of nitric oxide for ammonia synthesis (NORR) is a key green energy conversion technology. Its efficiency relies on high-performance electrocatalysts to enhance both ammonia yield (Y_NH3) and Faradaic efficiency (F_NH3). However, conventional experimental methods for screening high-activity NORR catalysts often entail high resource consumption and time costs. Machine learning combined with SHAP feature analysis was employed to establish a stacked ensemble model that integrates multiple algorithms, allowing a systematic investigation of the key descriptors governing NORR performance based on an experimental dataset. Evaluation of eight model algorithms revealed that the Stacked-SVR model achieved an R² of 0.9223 and an RMSE of 0.0608 on the test set, whereas the Stacked-RF model achieved an R² of 0.9042 and an RMSE of 0.0900. The stacked ensemble model integrates the strengths of the individual algorithms and demonstrates strong NORR prediction performance while avoiding overfitting. SHAP feature analysis revealed that the Cu content in the catalyst composition has the most significant impact on catalytic performance. Moreover, the combination of wet chemical reduction synthesis, a carbon fiber (CF) conductive substrate, and HCl electrolyte is more favorable for enhancing catalytic activity. Additionally, moderately lowering the working potential, keeping the electrolyte volume at low to medium levels, reducing catalyst loading, and increasing electrolyte concentration were found to synergistically enhance both Y_NH3 and F_NH3.
Keywords: NORR; machine learning; stacked model; ammonia yield; ammonia Faraday efficiency
Learning Performance of Nonlinear Classification Models Based on Markov Sampling
6
Authors: HU Shulan, WANG Yusheng, QIAN Zhiyong, WANG Renhe. 《应用概率统计》, PKU Core, 2026, Issue 1, pp. 61-74 (14 pages)
Nonlinear classification models are widely used in various fields because of their excellent performance on complex problems. This paper investigates the learning performance of nonlinear classification models based on Markov sampling, which builds upon the traditional framework that uses i.i.d. samples. We then introduce the ueMC-NL algorithm, tailored specifically to nonlinear classification models, which produces ueMC (uniformly ergodic Markov chain) samples from a finite dataset. Numerical investigations on the random forest and the MLP model show that nonlinear classification models trained on ueMC samples yield lower misclassification rates than those trained on i.i.d. samples.
Keywords: learning performance; Markov sampling; nonlinear classification models; uniformly ergodic Markov chain
Study on Machine Learning-based Prediction of Compressive Strength of Concrete with Different Waste Glass Powder Contents
7
Authors: YU Daidong, MA Yuwei, LI Gang, WANG Aiqin, HUANG Wei, WANG Jingchao. 《材料导报》, PKU Core, 2026, Issue 6, pp. 111-125 (15 pages)
The application and promotion of waste glass powder concrete (WGPC) can significantly alleviate the pressure of concrete material scarcity and environmental pollution. Compressive strength (CS) is a critical parameter for evaluating the efficacy of WGPC. Unlike conventional testing methods, machine learning techniques offer precise and reliable predictions of concrete's compressive strength, especially for its long-term mechanical properties. In this work, four models were employed: Multiple Linear Regression (MLR), Back Propagation Neural Network (BPNN), Support Vector Regression (SVR), and Random Forest Regression (RFR). Furthermore, the particle swarm optimization (PSO) algorithm and cross-validation techniques were applied to fine-tune the model parameters for the best prediction performance. The results indicate that the optimized models generally predict more accurately than their basic counterparts. The PSO-RFR model performs best among all evaluated models on the testing dataset, achieving a coefficient of determination (R²) of 0.9231, a mean absolute error (MAE) of 2.1073, and a root mean square error (RMSE) of 3.6903. When compared with experimental results, the PSO-RFR and PSO-BPNN models both demonstrate high predictive accuracy. The PSO-BPNN model exhibits the closest R² values between its training and test sets, which reflects its superior generalization ability for unseen data. The findings present an efficient method for predicting concrete's compressive strength, contribute to the sustainable development of concrete materials, and provide theoretical support for their research and application.
Keywords: waste glass powder concrete; compressive strength; machine learning; particle swarm optimization algorithm; visualization
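For readers unfamiliar with the three metrics reported in the abstract above (R², MAE, RMSE), the sketch below computes them from their standard definitions. The strength values are made-up numbers chosen so the arithmetic is easy to follow; they are not data from the paper.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R^2, MAE, RMSE) for paired observations and predictions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return r2, mae, rmse


# Made-up compressive strengths (MPa): observed vs. predicted.
r2, mae, rmse = regression_metrics([30.0, 40.0, 50.0], [32.0, 39.0, 49.0])
```

MAE and RMSE carry the units of the target, while R² is dimensionless, so the three figures answer slightly different questions about the same fit.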
Improved expert system of rockburst intensity level prediction based on machine learning and data-driven:Supported by 1114 rockburst cases in 197 rock underground projects
8
Authors: PANG Hong-li, GONG Feng-qiang, GAO Ming-zhong, DAI Jin-hao. 《Journal of Central South University》, 2026, Issue 1, pp. 335-356 (22 pages)
Accurate prediction of rockburst intensity levels is crucial for ensuring the safety of deep hard rock engineering construction. This paper introduces an expert system for rockburst intensity level prediction that employs machine learning algorithms as the basis for its inference rules. The system comprises four modules: a database, a repository, an inference engine, and an interpreter. A database containing 1114 rockburst cases was used to construct 357 datasets that serve as the repository for the expert system. Additionally, 19 types of machine learning algorithms were used to establish 6783 micro-models that form the cognitive rules within the inference engine. By integrating probability theory and marginal analysis, a fuzzy scoring method based on the SoftMax function was developed and applied in the interpreter for rockburst intensity level prediction, effectively restoring the continuity of rockburst characteristics. The results indicate that ensemble algorithms based on decision trees are more effective at capturing the characteristics of rockburst. Key factors for accurate prediction of rockburst intensity include uniaxial compressive strength, elastic energy index, maximum principal stress, tangential stress, and their composite indicators. The accuracy of the proposed expert system was verified using 20 engineering rockburst cases, with predictions aligning closely with the actual rockburst intensity levels.
Keywords: rock mechanics; rockburst; rockburst intensity level prediction; expert system; machine learning; supervised learning
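The abstract above does not spell out its SoftMax-based fuzzy scoring, so the following is only a guess at the general idea: convert per-level scores into probabilities with a softmax, then take the probability-weighted level to obtain a continuous score instead of a hard class. The four integer levels and the example score vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def continuous_level(scores, levels=(1, 2, 3, 4)):
    """Probability-weighted intensity level (an assumed scoring rule, not the paper's)."""
    probs = softmax(scores)
    return sum(p * lv for p, lv in zip(probs, levels))


uniform_probs = softmax([0.0, 0.0, 0.0, 0.0])           # no level is favoured
undecided = continuous_level([0.0, 0.0, 0.0, 0.0])      # midpoint of levels 1..4
mostly_level_3 = continuous_level([0.0, 0.0, 10.0, 0.0])  # strong support for level 3
```

With equal scores the result sits midway between the levels, and it moves smoothly toward a single level as the support for that level grows.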
A lightweight pure visual BEV perception method based on dual distillation of spatial-temporal knowledge
9
Authors: LIU Bingdong, YU Ruihang, XIONG Zhiming, WU Meiping. 《Journal of Systems Engineering and Electronics》, 2026, Issue 1, pp. 36-44 (9 pages)
Bird's-eye-view (BEV) perception is a core technology for autonomous driving systems. However, existing solutions face a dilemma between the high costs of multimodal methods and the limited performance of vision-only approaches. To address this issue, this paper proposes a framework named “a lightweight pure visual BEV perception method based on dual distillation of spatial-temporal knowledge”. The framework designs a lightweight vision-only student model based on ResNet, which uses a dual distillation mechanism to learn from a powerful teacher model that integrates temporal information from both image and light detection and ranging (LiDAR) modalities. Specifically, we distill efficient multi-modal feature extraction and spatial fusion capabilities from the BEVFusion model, and distill advanced temporal information fusion and spatiotemporal attention mechanisms from the BEVFormer model. This dual distillation strategy enables the student model to achieve perception performance close to that of multi-modal models without relying on LiDAR. Experimental results on the nuScenes dataset demonstrate that the proposed model significantly outperforms classical vision-only algorithms, achieves performance comparable to current state-of-the-art vision-only methods on the nuScenes detection leaderboard in terms of both mean average precision (mAP) and the nuScenes detection score (NDS), and shows notable advantages in inference efficiency. Although the proposed dual-teacher paradigm incurs higher offline training costs than single-model approaches, it yields a streamlined and highly efficient student model suitable for resource-constrained real-time deployment. This provides an effective pathway toward low-cost, high-performance autonomous driving perception systems.
Keywords: 3D object detection; bird's-eye-view (BEV); knowledge distillation; multimodal fusion; lightweight model
Research on an SSD-SMR write cache strategy with Q-Learning-based long-tail latency optimization
10
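Authors: 刘健, 章步镐, 方匡弛, 刘宣锋, 孙国道, 梁荣华, 梁浩然. 《计算机工程》, PKU Core, 2026, Issue 3, pp. 287-298 (12 pages)
As the global volume of data keeps growing, improving data access performance at low cost is a major challenge for storage systems. Building a cache system from low-latency, high-bandwidth solid-state drives (SSDs) and low-cost, high-density shingled magnetic recording (SMR) disks has become an effective solution. However, the inherent mechanical movement and overlapping-track layout of SMR result in poor write performance, and the large number of read-merge-write (RMW) operations caused by frequently writing dirty data from the SSD back to the SMR disk can lead to severe long-tail latency. To address this, a cache replacement optimization strategy that incorporates the reinforcement learning algorithm Q-Learning is proposed for the SSD-SMR hybrid storage architecture. The strategy learns empirical knowledge about the relationship between the I/O load of the SMR device and latency in order to control writes to the SMR disk: when the SMR load is high, it controls the eviction of dirty data from the cache to reduce the RMW operations that write-back would trigger, thereby optimizing tail latency under different loads. The Q-Learning algorithm is combined with LRU, a cache algorithm based on data popularity, and with SAC, an SMR-aware cache algorithm, and is tested with real enterprise traces and synthetic traces generated by YCSB. The experimental results show that the proposed method effectively improves the performance of existing cache algorithms and can reduce average latency by 57.06% and tail latency by 87.49%.
Keywords: Q-learning algorithm; I/O load; long-tail latency; cache replacement algorithm; hybrid storage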
Deep reinforcement learning-based adaptive collision avoidance method for UAV in joint operational airspace
11
Authors: Yan Shen, Xuejun Zhang, Yan Li, Weidong Zhang. 《Defence Technology(防务技术)》, 2026, Issue 2, pp. 142-159 (18 pages)
As joint operations have become a key trend in modern military development, unmanned aerial vehicles (UAVs) play an increasingly important role in enhancing the intelligence and responsiveness of combat systems. However, the heterogeneity of aircraft, partial observability, and dynamic uncertainty in operational airspace pose significant challenges to autonomous collision avoidance using traditional methods. To address these issues, this paper proposes an adaptive collision avoidance approach for UAVs based on deep reinforcement learning. First, a unified uncertainty model incorporating dynamic wind fields is constructed to capture the complexity of joint operational environments. Then, to handle the heterogeneity between manned and unmanned aircraft and the limitations of dynamic observations, a sector-based partial observation mechanism is designed. A Dynamic Threat Prioritization Assessment algorithm is also proposed to evaluate potential collision threats along multiple dimensions, including time to closest approach, minimum separation distance, and aircraft type. Furthermore, a Hierarchical Prioritized Experience Replay (HPER) mechanism is introduced, which classifies experience samples into high, medium, and low priority levels and preferentially samples critical experiences, thereby improving learning efficiency and accelerating policy convergence. Simulation results show that the proposed HPER-D3QN algorithm outperforms existing methods in learning speed, environmental adaptability, and robustness, significantly enhancing collision avoidance performance and convergence rate. Finally, transfer experiments on a high-fidelity battlefield airspace simulation platform validate the proposed method's deployment potential and practical applicability in complex, real-world joint operational scenarios.
Keywords: Unmanned aerial vehicle; Collision avoidance; Deep reinforcement learning; Joint operational airspace; Hierarchical prioritized experience replay
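To make the idea of sorting experiences into high, medium, and low priority tiers more concrete, here is a small sketch of a tiered replay buffer. It is an assumption-laden illustration rather than the HPER mechanism itself: the abstract does not state the classification criterion, so the sketch uses the magnitude of a TD error with hypothetical thresholds, and the tier sampling weights (0.6 / 0.3 / 0.1) are likewise invented.

```python
import random


def tier_of(td_error, high=1.0, medium=0.3):
    """Assign a priority tier from |TD error| (thresholds are hypothetical)."""
    magnitude = abs(td_error)
    if magnitude >= high:
        return "high"
    if magnitude >= medium:
        return "medium"
    return "low"


class TieredReplay:
    """Replay buffer that picks a tier by weight, then a transition uniformly within it."""

    def __init__(self, tier_weights=None, seed=0):
        self.tiers = {"high": [], "medium": [], "low": []}
        self.tier_weights = tier_weights or {"high": 0.6, "medium": 0.3, "low": 0.1}
        self.rng = random.Random(seed)

    def add(self, transition, td_error):
        self.tiers[tier_of(td_error)].append(transition)

    def sample(self, batch_size):
        # Only non-empty tiers are eligible; an empty buffer yields an empty batch.
        names = [name for name in self.tiers if self.tiers[name]]
        if not names:
            return []
        weights = [self.tier_weights[name] for name in names]
        batch = []
        for _ in range(batch_size):
            name = self.rng.choices(names, weights=weights, k=1)[0]
            batch.append(self.rng.choice(self.tiers[name]))
        return batch


buffer = TieredReplay(seed=0)
buffer.add("near_miss", td_error=1.5)
buffer.add("routine_turn", td_error=-0.5)
buffer.add("straight_flight", td_error=0.1)
batch = buffer.sample(4)
```

Sampling with replacement keeps the sketch short; a production buffer would also bound its size and refresh tiers as TD errors change.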
Detonation reaction zone width of CL-20-based aluminized explosive: machine learning prediction, theoretical calculation, and experimental characterization
12
Authors: Ruipeng Liu, Wen Pan, Linjing Tang, Xianzhen Jia, Weiqiang Pang, Xiaojun Feng. 《Defence Technology(防务技术)》, 2026, Issue 3, pp. 395-404 (10 pages)
Investigating the detonation reaction zone structures of high explosives is significant for understanding the detonation reaction mechanism. This study employed an integrated approach combining machine learning prediction, theoretical calculation, and experimental characterization to determine the detonation reaction zone width of a CL-20-based aluminized explosive. Here, the detonation reaction zone refers to the zone between the von Neumann (VN) peak and the sonic point, usually called the detonation driving zone (DDZ). For the machine learning prediction, an ensemble model integrating Random Forest and Support Vector Regression was developed to predict the reaction zone width using a dataset of 19 publicly available samples. For the theoretical calculation, the Wood-Kirkwood (W-K) detonation theory model was used to compute the reaction zone structures numerically, incorporating chemical reaction kinetics to describe the progress of the detonation reaction. For the experimental characterization, Photon Doppler Velocimetry (PDV) was applied with LiF as the optical window to measure the particle velocity profile of the detonation products and derive the reaction zone width. The reaction zone widths obtained from machine learning prediction, theoretical calculation, and experimental characterization are 0.25 mm, 0.28 mm, and 0.26 mm, respectively. The corresponding velocities at the Chapman-Jouguet (CJ) point are 1,938 m/s, 2,047 m/s, and 1,982 m/s, respectively. The maximum relative deviation in reaction zone width among the three methods is approximately 7.7%, while that for CJ particle velocity is approximately 3.3%. The results from all three methods agree well within engineering error. This validates the effectiveness of integrating machine learning prediction, theoretical calculation, and advanced experimental techniques for studying the detonation reaction zone structures of high explosives. This research provides insights into the detonation reaction mechanism and reaction zone characteristics of CL-20-based aluminized explosive.
Keywords: Detonation reaction zone width; CL-20-Based aluminized explosive; Machine learning; Photon Doppler velocimetry (PDV); Theoretical calculation
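The deviation figures quoted in the abstract above can be checked with a few lines of arithmetic. The abstract does not say which value serves as the reference; the sketch below assumes the experimental measurement is the reference, and under that assumption the quoted 7.7% and 3.3% are reproduced. The dictionary keys are labels invented for the example.

```python
def max_relative_deviation(values, reference):
    """Largest |value - reference| / reference over the given values."""
    return max(abs(v - reference) / reference for v in values)


# Values as reported in the abstract; the keys are illustrative labels.
widths_mm = {"ml": 0.25, "theory": 0.28, "experiment": 0.26}
cj_velocity_m_s = {"ml": 1938.0, "theory": 2047.0, "experiment": 1982.0}

# Assumption: deviations are measured against the experimental value.
width_dev = max_relative_deviation(widths_mm.values(), widths_mm["experiment"])
velocity_dev = max_relative_deviation(cj_velocity_m_s.values(),
                                      cj_velocity_m_s["experiment"])

width_dev_percent = round(width_dev * 100, 1)
velocity_dev_percent = round(velocity_dev * 100, 1)
```

In both cases the theoretical calculation is the value furthest from the experiment (0.02 mm out of 0.26 mm, and 65 m/s out of 1,982 m/s).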
Q-Learning-based multi-mode adaptive optimal combination forecasting of photovoltaic power
13
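Authors: 隗知初, 杨苹, 周钱雨凡, 陈文皓, 万思洋, 崔嘉雁. 《电力工程技术》, PKU Core, 2026, Issue 1, pp. 115-124, 163 (11 pages)
To address the strong volatility and high randomness of photovoltaic (PV) power series, this paper proposes a Q-Learning-based multi-mode adaptive optimal combination forecasting model for PV power. First, variational mode decomposition combined with the whale optimization algorithm is used to decompose the original PV power series into different sub-modes, and an ensemble feature selection model identifies the meteorological factors to which each sub-mode series is most sensitive. Then, four base forecasting models are built: a back-propagation neural network, a bidirectional long short-term memory network, a gated recurrent unit network, and a temporal convolutional network. Because different models differ in their ability to forecast subseries with different frequency characteristics, the Q-Learning algorithm is used to adaptively select the optimal combination of base models for each mode. Finally, the forecasts of the sub-modes are superimposed and reconstructed to obtain the final forecast, which is validated on a high-resolution PV meteorological and power dataset. The results show that, compared with single models, the proposed model reduces the mean absolute error by 16.18% and the mean squared error by 17.00%.
Keywords: whale optimization algorithm; variational mode decomposition; Q-learning; power forecasting; combined model; photovoltaic power generation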
A multivariate power data storage optimization decision method based on the fusion of random forest and Q-learning
14
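Authors: 叶学顺, 贾东梨, 周俊, 唐英, 贾梓豪. 《科学技术与工程》, PKU Core, 2026, Issue 3, pp. 1065-1074 (10 pages)
Storing large-scale, diverse power data faces bottlenecks of low efficiency and insufficient memory capacity. Traditional storage optimization methods such as data indexing and data compression each have strengths and weaknesses, and applying them effectively to power data storage remains a research challenge. To solve this problem, a multivariate power data storage optimization decision method that fuses random forest and Q-learning is proposed. Its key techniques are as follows. First, a storage optimization strategy decision model based on an improved random forest algorithm is proposed; it introduces the information gain method to comprehensively evaluate factors such as database access frequency, query time, storage speed, and data redundancy rate, and decides among three strategies: direct storage, indexed storage, and compressed storage. Second, a data storage algorithm decision model based on an improved Q-learning algorithm is proposed; it introduces a multi-scale learning mechanism, a prioritized experience replay mechanism, and a positive and negative reward mechanism to decide which indexing algorithm suits indexed storage and which compression algorithm suits compressed storage. The method effectively combines the technical advantages of data indexing and data compression, substantially improves storage efficiency, and saves storage space, offering a new solution for managing large-scale multivariate power data.
Keywords: random forest algorithm; Q-learning algorithm; data storage optimization method; data indexing algorithm; data compression algorithm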
Knowledge transfer in multi-agent reinforcement learning with incremental number of agents (cited by: 4)
15
Authors: LIU Wenzhang, DONG Lu, LIU Jian, SUN Changyin. 《Journal of Systems Engineering and Electronics》, SCIE, EI, CSCD, 2022, Issue 2, pp. 447-460 (14 pages)
This paper studies reinforcement learning for cooperative multi-agent systems (MAS) with an incremental number of agents. Existing multi-agent reinforcement learning approaches deal with MAS that have a fixed number of agents and can learn well-performing policies. However, when the number of agents increases, the previously learned policies may not perform well in the new scenario. The new agents need to learn from scratch to find optimal policies together with the others, which may slow down the learning of the whole team. To solve this problem, we propose a new algorithm that takes full advantage of previously learned knowledge and transfers it from the previous agents to the new agents. Since the previous agents have been trained well in the source environment, they are treated as teacher agents in the target environment. Correspondingly, the new agents are called student agents. To enable the student agents to learn from the teacher agents, we first modify the input nodes of the teacher agents' networks to adapt them to the current environment. Then, the teacher agents take the observations of the student agents as input and output advised actions and values as supervising information. Finally, the student agents combine the reward from the environment with the supervising information from the teacher agents and learn the optimal policies with modified loss functions. By taking full advantage of the teacher agents' knowledge, the search space for the student agents is reduced significantly, which accelerates the learning of the whole system. The proposed algorithm is verified in several multi-agent simulation environments, and the experimental results demonstrate its efficiency.
Keywords: knowledge transfer; multi-agent reinforcement learning (MARL); new agents
A Boltzmann-optimized Q-learning handover control algorithm for high-speed railways (cited by: 4)
16
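Authors: 陈永, 康婕. 《控制理论与应用》, PKU Core, 2025, Issue 4, pp. 688-694 (7 pages)
To address the low handover success rate in 5G-R high-speed railway handover, which results from using a fixed handover threshold and ignoring the effects of co-channel interference, ping-pong handover, and similar factors, a Boltzmann-optimized Q-learning handover control algorithm is proposed. First, a Q-table indexed by train position and action is designed, and the reward function of the Q-learning algorithm is constructed by jointly considering ping-pong handover, bit error rate, and other factors. Then, a Boltzmann search strategy is proposed to optimize action selection and improve the convergence of the handover algorithm. Finally, the Q-table is updated with the co-channel interference of base stations taken into account, yielding the handover decision parameters that control handover execution. Simulation results show that, at different operating speeds and in different operating scenarios, the improved algorithm effectively raises the handover success rate compared with traditional algorithms and meets the quality of service (QoS) requirements of wireless communication.
Keywords: handover; 5G-R; Q-learning algorithm; Boltzmann optimization strategy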
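Boltzmann action selection, which the abstract above uses in place of a fixed exploration rule, can be sketched as follows. This shows the generic technique, not the paper's algorithm: the two Q-values, the action meanings (index 0 for "stay", index 1 for "hand over"), and the temperatures are hypothetical.

```python
import math
import random


def boltzmann_probs(q_values, temperature):
    """Action probabilities proportional to exp(Q / temperature)."""
    scaled = [q / temperature for q in q_values]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(q_values, temperature, rng):
    """Sample an action index according to the Boltzmann probabilities."""
    probs = boltzmann_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


# Hypothetical Q-values for one train position: index 0 = stay, index 1 = hand over.
q_row = [1.0, 2.0]
probs_hot = boltzmann_probs(q_row, temperature=100.0)   # high temperature: near-uniform
probs_cold = boltzmann_probs(q_row, temperature=0.01)   # low temperature: near-greedy
chosen = select_action(q_row, temperature=0.01, rng=random.Random(0))
```

Lowering the temperature over the course of training shifts the policy from exploration toward exploitation, which is one plausible reason such a strategy could help convergence.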
A green mobile edge computing task offloading strategy based on MDP and Q-learning (cited by: 1)
17
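Authors: 赵宏伟, 吕盛凱, 庞芷茜, 马子涵, 李雨. 《河南理工大学学报(自然科学版)》, PKU Core, 2025, Issue 5, pp. 9-16 (8 pages)
Objective: To achieve carbon neutrality in manufacturing-oriented industrial Internet enterprises such as automobile and air-conditioner makers, edge computing task offloading is used to handle the task offloading problem of production equipment, reducing the central load on servers as well as the energy consumption and carbon emissions of data centers. Methods: A green edge computing task offloading strategy based on the Markov decision process (MDP) and Q-learning is proposed. The strategy accounts for constraints such as computing frequency, transmission power, and carbon emissions; based on a cloud-edge-end collaborative computing model, it transforms the carbon emission optimization problem into a mixed-integer linear programming model and solves the model with MDP and Q-learning. The convergence performance, carbon emissions, and total delay of the random allocation algorithm, the Q-learning algorithm, and the SARSA (state action reward state action) algorithm are compared. Results: Compared with existing computation offloading strategies, the task scheduling algorithm of the new strategy improves convergence by 5% and 2% over the SARSA and Q-learning algorithms, respectively; the system carbon emission cost is 8% and 22% lower than with the Q-learning and SARSA algorithms, respectively; with respect to the number of terminals, the new strategy achieves reductions of 6% and 7% relative to the Q-learning and SARSA algorithms, respectively; and the total system computing delay is clearly lower than with the other algorithms, by 27%, 14%, and 22% relative to the random allocation, Q-learning, and SARSA algorithms, respectively. Conclusion: The strategy can reasonably optimize the offloading of computing tasks and the allocation of resources, balance delay and energy consumption, and reduce the system's carbon emissions.
Keywords: carbon emissions; edge computing; reinforcement learning; Markov decision process; task offloading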
Knowledge graph construction and complementation for research projects (cited by: 1)
18
Authors: LI Tongxin, LIN Mu, WANG Weiping, LI Xiaobo, WANG Tao. 《Journal of Systems Engineering and Electronics》, 2025, Issue 3, pp. 725-735 (11 pages)
Tracking and analyzing data from research projects is critical for understanding research trends and supporting the development of science and technology strategies. However, the data from these projects is often complex and inadequate, making it challenging for researchers to conduct in-depth data mining to improve policies or management. To address this problem, this paper adopts a top-down approach to construct a knowledge graph (KG) for research projects. First, we construct an integrated ontology by referring to the metamodels of various architectures; this ontology is called the meta-model integration conceptual reference model. Subsequently, we use dependency parsing to extract knowledge from unstructured textual data and use an entity alignment method based on weakly supervised learning to classify the extracted entities, completing the construction of the KG for research projects. In addition, a knowledge inference model based on representation learning is employed to achieve knowledge completion and improve the KG. Finally, experiments are conducted on the KG for research projects, and the results demonstrate the effectiveness of the proposed method in enriching incomplete data within the KG.
Keywords: research projects; knowledge graph (KG); KG completion
Trajectory prediction algorithm of ballistic missile driven by data and knowledge (cited by: 1)
19
Authors: Hongyan Zang, Changsheng Gao, Yudong Hu, Wuxing Jing. 《Defence Technology(防务技术)》, 2025, Issue 6, pp. 187-203 (17 pages)
Recently, high-precision trajectory prediction of ballistic missiles in the boost phase has become a research hotspot. This paper proposes a trajectory prediction algorithm driven by data and knowledge (DKTP) to solve this problem. First, the complex dynamic characteristics of a ballistic missile in the boost phase are analyzed in detail. Second, by combining the missile dynamics model with the target gravity turning model, a knowledge-driven target three-dimensional turning (T3) model is derived. Then, a BP neural network is trained on a boost-phase trajectory database of typical scenarios to obtain a data-driven state parameter mapping (SPM) model. On this basis, an online trajectory prediction framework driven by data and knowledge is established. Using the SPM model, the three-dimensional turning coefficients of the target are predicted from its current state, and the state of the target at the next moment is obtained by combining them with the T3 model. Finally, simulation verification is carried out under various conditions. The simulation results show that the DKTP algorithm combines the advantages of data-driven and knowledge-driven approaches, improves the interpretability of the algorithm, and reduces uncertainty, achieving high-precision trajectory prediction of ballistic missiles in the boost phase.
Keywords: Ballistic missile; Trajectory prediction; The boost phase; Data and knowledge driven; The BP neural network
Fault-observer-based iterative learning model predictive controller for trajectory tracking of hypersonic vehicles (cited by: 3)
20
Authors: CUI Peng, GAO Changsheng, AN Ruoming. 《Journal of Systems Engineering and Electronics》, 2025, Issue 3, pp. 803-813 (11 pages)
This work proposes an iterative learning model predictive control (ILMPC) approach based on an adaptive fault observer (FOBILMPC) for fault-tolerant control and trajectory tracking of air-breathing hypersonic vehicles. To increase the control amount, the online control law uses model predictive control (MPC) built on the concept of iterative learning control (ILC). By using offline data to decrease the linearized model's faults, the strategy can effectively increase the robustness of the control system and guarantee that disturbances are suppressed. An adaptive fault observer is designed on top of the proposed ILMPC approach to enhance overall fault tolerance by estimating and compensating for actuator disturbances and the degree of fault. During the derivation, a linearized model of the longitudinal dynamics is established. Numerical simulations show that, compared with the offline controller, the proposed ILMPC approach reduces tracking error and speeds up convergence, which makes it a promising candidate for the design of hypersonic vehicle control systems.
Keywords: hypersonic vehicle; actuator fault; tracking control; iterative learning control (ILC); model predictive control (MPC); fault observer