Abstract
In an era of explosive growth in network data, using user behavior data to recommend items accurately to each target user is a valuable research direction. Collaborative filtering is one of the most common recommendation algorithms, and this paper studies how to optimize its traditional form. Traditional collaborative filtering suffers from data sparsity, cold start, and poor timeliness. To address the first two problems, the paper clusters the item set with the CFDP algorithm and fills in missing ratings with the Slope-One algorithm, which effectively alleviates data sparsity and cold start. To address timeliness, it introduces a time factor: each predicted rating is multiplied by a time weight, which makes the predicted ratings more accurate and addresses the timeliness problem of the recommendation system. Comparative experiments on the MovieLens 1M dataset show that the improved algorithm achieves a lower mean absolute error (MAE) than the traditional collaborative filtering recommendation algorithm, indicating that it effectively improves on the traditional algorithm.
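The abstract names three building blocks: Slope-One filling of missing ratings, a time weight applied to each predicted rating, and MAE as the evaluation metric. The following is a minimal sketch of those pieces, not the paper's implementation. The abstract does not give the form of the time weight, so the exponential decay in `time_weight`, its `half_life_days` parameter, and the toy `ratings` data are all assumptions made for illustration. The CFDP clustering step is omitted.

```python
from collections import defaultdict


def slope_one_deviations(ratings):
    """Average rating difference for every ordered item pair (i, j).

    ratings: {user: {item: rating}}. Returns (dev, count), where
    dev[(i, j)] is the mean of r_i - r_j over users who rated both items.
    """
    diff_sum = defaultdict(float)
    count = defaultdict(int)
    for user_ratings in ratings.values():
        for i, r_i in user_ratings.items():
            for j, r_j in user_ratings.items():
                if i != j:
                    diff_sum[(i, j)] += r_i - r_j
                    count[(i, j)] += 1
    dev = {pair: diff_sum[pair] / count[pair] for pair in diff_sum}
    return dev, count


def slope_one_predict(ratings, dev, count, user, item):
    """Weighted Slope-One prediction; None if no co-rated item exists."""
    num, den = 0.0, 0
    for j, r_j in ratings[user].items():
        c = count.get((item, j), 0)
        if c:
            num += (dev[(item, j)] + r_j) * c
            den += c
    return num / den if den else None


def time_weight(age_days, half_life_days=180.0):
    """Hypothetical time factor (assumption, not from the paper):
    exponential decay that halves every half_life_days."""
    return 0.5 ** (age_days / half_life_days)


def mae(predicted, actual):
    """Mean absolute error between two equal-length rating sequences."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Toy data (illustrative only, not taken from MovieLens).
ratings = {
    "alice": {"A": 5, "B": 3, "C": 2},
    "bob": {"A": 3, "B": 4},
    "carol": {"B": 2, "C": 5},
}
dev, count = slope_one_deviations(ratings)

# Fill in carol's missing rating for item A, then apply the time weight.
filled = slope_one_predict(ratings, dev, count, "carol", "A")
weighted = filled * time_weight(age_days=90)
print(filled, weighted)
```

On the toy data, the filled value for carol and item A is 13/3, about 4.33. Multiplying by a weight below 1 lowers predictions that rest on older ratings. How the paper normalizes or combines these weights is not stated in the abstract.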
Authors
ZHANG Kaihui; ZHOU Zhiping; ZHAO Weidong (School of Electronic Information and Engineering, Tongji University, Shanghai 200093, China)
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
Peking University Core Journals
2020, Issue 15, pp. 80-85 (6 pages)
Funding
National Key Research and Development Program of China (No. 2017YFB1401600).
Keywords
collaborative filtering recommendation algorithm
CFDP algorithm
time factor
mean absolute error (MAE)
About the authors
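ZHANG Kaihui (born 1995), male, master's degree; research interests: artificial intelligence and data mining. Corresponding author: ZHOU Zhiping (born 1961), male, lecturer; research interests: computer applications; E-mail: tjzhou168@sina.com. ZHAO Weidong (born 1965), male, researcher and doctoral supervisor; research interests: pattern recognition, machine learning, and computer vision.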