Deep Reinforcement Learning Recommendation Model Based on Multi-Interest Contrast (基于多兴趣对比的深度强化学习推荐模型)
Abstract Deep reinforcement learning (DRL) is widely applied in recommender systems to model user interests dynamically and to maximize cumulative user benefit. However, the sparsity of user feedback is a significant challenge for DRL-based recommendation algorithms. Contrastive learning, a self-supervised learning method, enhances the representation of user interests by constructing multiple views of them, thereby alleviating the sparse-feedback problem. Existing contrastive learning methods typically rely on heuristic augmentation strategies, which lose key information and fail to fully exploit heterogeneous interaction data. To address these issues, this paper proposes a multi-interest oriented contrastive deep reinforcement learning recommendation (MOCIR) model. The model consists of two modules: a contrastive representation module and a policy network module. The contrastive representation module uses a heterogeneous information network (HIN) to model a user's local interests from different aspects, while modeling the user's global interest from the raw interaction data. It then treats the global and local interests of the same user as positive pairs, and the global and local interests of different users as negative pairs, for contrastive learning, so as to capture user interests effectively. The policy network module aggregates the user state representation and then generates recommendations. The two modules are trained with an alternating update mechanism. Experimental results on three datasets show that the proposed model outperforms several DRL-based models in recommendation performance and effectively addresses the problem of sparse user feedback in recommendation.
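The positive and negative pairing described in the abstract can be illustrated with a small sketch. The abstract does not give the loss function, so the InfoNCE-style objective, the cosine similarity, the `temperature` value, and the function names below are all assumptions chosen for illustration, not the authors' implementation. Each user has one global-interest vector and several local-interest vectors; a user's own local interests serve as positives and other users' local interests as negatives.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def multi_interest_info_nce(global_vecs, local_vecs, temperature=0.5):
    """Hypothetical InfoNCE-style loss over global/local interest pairs.

    global_vecs[i] -- global-interest vector of user i
    local_vecs[i]  -- list of local-interest vectors of user i (one per aspect)
    Assumption: similarity is cosine, scaled by a temperature.
    """
    n_users = len(global_vecs)
    total, count = 0.0, 0
    for i in range(n_users):
        # Negatives: local interests of every other user.
        neg_logits = [
            cosine(global_vecs[i], loc) / temperature
            for j in range(n_users) if j != i
            for loc in local_vecs[j]
        ]
        # Positives: each local interest of the same user.
        for pos in local_vecs[i]:
            pos_logit = cosine(global_vecs[i], pos) / temperature
            logits = [pos_logit] + neg_logits
            # Numerically stable log-sum-exp for the denominator.
            m = max(logits)
            log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
            total += log_denom - pos_logit
            count += 1
    return total / count


if __name__ == "__main__":
    # Toy data: two users, two local interests each (values are made up).
    global_vecs = [[1.0, 0.0], [0.0, 1.0]]
    local_vecs = [
        [[1.0, 0.0], [0.9, 0.1]],
        [[0.0, 1.0], [0.1, 0.9]],
    ]
    print(multi_interest_info_nce(global_vecs, local_vecs))
```

The loss is lower when each user's global interest lies closer to that user's own local interests than to those of other users, which is the behavior the pairing scheme is meant to encourage. In a full model the vectors would come from trainable encoders rather than fixed lists.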
Authors LIU Huiting (刘慧婷); LIU Shaoxiong (刘绍雄); WANG Jiale (王佳乐); ZHAO Peng (赵鹏) (School of Computer Science and Technology, Anhui University, Hefei 230601, Anhui, China; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei 230088, Anhui, China; Stony Brook Institute, Anhui University, Hefei 230039, Anhui, China)
Source Journal of South China University of Technology (Natural Science Edition) (《华南理工大学学报(自然科学版)》), PKU Core Journal, 2025, No. 9, pp. 11-21 (11 pages)
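Funding National Natural Science Foundation of China (62576003); Anhui Provincial University Collaborative Innovation Project (GXXT-2022-040); Natural Science Foundation of Anhui Province (2008085MF219, 2108085MF212); Natural Science Research Project of Anhui Provincial Universities (KJ2021-A0040, KJ2021-A0043).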
Keywords multi-interest; reinforcement learning; contrastive learning; heterogeneous information network
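About the author LIU Huiting (born 1978), female, Ph.D., associate professor; her research focuses on natural language processing and personalized recommendation. E-mail: htliu@ahu.edu.cn.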