Reinforcement learning multi-hop QA model with integrated knowledge embedding scoring

Abstract: To reduce the blindness of agent search in reinforcement-learning-based multi-hop question answering over knowledge graphs, and to mitigate sparse and delayed rewards during training, a reinforcement learning multi-hop QA model that integrates a knowledge embedding scoring mechanism is constructed. A scoring module constrains the agent's search direction, and a reward shaping strategy that incorporates this module alleviates sparse and delayed rewards. In comparative experiments on the PathQuestion and PathQuestion-Large datasets, the proposed model achieves higher accuracy than the other baseline models. Ablation experiments verify the effectiveness of the scoring module and the reward shaping strategy, and convergence-time experiments verify that the scoring module reduces the blindness of the agent's search.
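The two mechanisms the abstract names, a scoring module that steers the search and a reward shaping rule built on it, can be pictured with a small sketch. The abstract does not give the scoring function or the reward formula, so everything below is an assumption made for illustration: a TransE-style distance score, a shaping rule that falls back to that score when the agent misses the answer, and the hypothetical names `transe_score`, `shaped_reward`, and `top_k_actions`. It is not the authors' implementation.

```python
import math


def transe_score(head, relation, tail):
    """Assumed TransE-style plausibility in (0, 1]; 1.0 means head + relation == tail."""
    dist = math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))
    return 1.0 / (1.0 + dist)


def shaped_reward(reached_answer, head, relation, tail):
    """Assumed shaping rule: full reward on a hit, embedding score on a miss.

    A miss therefore still yields a graded signal instead of a flat zero,
    which is one common way to soften sparse terminal rewards.
    """
    hit = 1.0 if reached_answer else 0.0
    return hit + (1.0 - hit) * transe_score(head, relation, tail)


def top_k_actions(current, query_relation, candidates, k):
    """Keep the k candidate entities that the scorer rates as most plausible next hops.

    `candidates` maps an entity name to its embedding vector (hypothetical layout).
    """
    ranked = sorted(
        candidates.items(),
        key=lambda item: transe_score(current, query_relation, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]
```

In this sketch, pruning the action set with `top_k_actions` stands in for "constraining the search direction", and `shaped_reward` stands in for the reward shaping strategy; the paper's actual embedding model and formula may differ.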

Authors: ZHAO Xiao-kang; LI Shu-qin (College of Information Engineering, Northwest A&F University, Yangling 712100, China)

Affiliation: College of Information Engineering, Northwest A&F University

Source: Computer Engineering and Design (Peking University Core Journal), 2025, No. 9, pp. 2450-2456 (7 pages)

Keywords: knowledge graph; knowledge question answering; reinforcement learning; knowledge graph embedding; reward shaping; weak supervision; multi-hop question answering

Classification: TP182 [Automation and Computer Technology: Control Theory and Control Engineering]

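About the authors: ZHAO Xiao-kang (b. 1999), male, from Linfen, Shanxi, master's student; research interest: intelligent information systems. Corresponding author: LI Shu-qin (b. 1965), female, from Weinan, Shaanxi, professor and doctoral supervisor; research interests: agricultural informatization and intelligent information systems. E-mail: xk.zhao@nwafu.edu.cn.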

