
Reinforcement learning multi-hop QA model with integrated knowledge embedding scoring
Abstract To reduce the blindness of agent search in reinforcement-learning-based multi-hop question answering over knowledge graphs, and to mitigate sparse and delayed rewards during training, a reinforcement learning multi-hop QA model that integrates a knowledge embedding scoring mechanism is constructed. A scoring module constrains the agent's search direction, and a reward shaping strategy that incorporates this module alleviates sparse and delayed rewards. In comparative experiments on the PathQuestion and PathQuestion-Large datasets, the proposed model achieves higher accuracy than the baseline models. Ablation experiments validate the effectiveness of the scoring module and the reward shaping strategy. Convergence-time experiments confirm that the scoring module reduces the blindness of the agent's search.
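The abstract does not give the scoring function or the reward formula, so the following is only a minimal sketch of the general idea, under stated assumptions: a TransE-style distance stands in for the knowledge embedding score, the shaped reward falls back to that score when the agent misses the answer, and the same score prunes candidate actions. All function names, the `top_k` parameter, and the toy embeddings are hypothetical and are not taken from the paper.

```python
import math


def transe_score(head, relation, tail):
    """Assumed TransE-style plausibility: higher when head + relation is near tail."""
    dist = math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))
    return 1.0 / (1.0 + dist)  # maps a distance in [0, inf) to a score in (0, 1]


def shaped_reward(reached_answer, score):
    """Assumed shaping rule: full reward for a correct answer, else the soft score."""
    return 1.0 if reached_answer else score


def rank_actions(current, question_rel, candidates, entity_emb, top_k=2):
    """Keep the top_k outgoing edges whose target entity scores best (hypothetical pruning)."""
    scored = [
        (transe_score(entity_emb[current], question_rel, entity_emb[tail]), rel, tail)
        for rel, tail in candidates
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Toy 2-dimensional embeddings, invented for illustration only.
    entity_emb = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 3.0]}
    question_rel = [1.0, 0.0]
    candidates = [("r1", "b"), ("r2", "c")]
    best = rank_actions("a", question_rel, candidates, entity_emb, top_k=1)
    print(best)
    print(shaped_reward(False, best[0][0]))
```

In this sketch a failed episode still returns a nonzero signal proportional to how plausible the reached entity is, which is one common way to soften sparse terminal rewards; the paper's actual strategy may differ.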
Authors ZHAO Xiao-kang; LI Shu-qin (College of Information Engineering, Northwest A&F University, Yangling 712100, China)
Source Computer Engineering and Design (计算机工程与设计), Peking University Core Journal, 2025, No. 9, pp. 2450-2456 (7 pages)
Keywords knowledge graph; knowledge question answering; reinforcement learning; knowledge graph embedding; reward shaping; weak supervision; multi-hop question answering
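About the authors ZHAO Xiao-kang (b. 1999), male, from Linfen, Shanxi, master's student; research interest: intelligent information systems. Corresponding author: LI Shu-qin (b. 1965), female, from Weinan, Shaanxi, professor and doctoral supervisor; research interests: agricultural informatization and intelligent information systems. E-mail: xk.zhao@nwafu.edu.cn.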