Abstract
Traditional user influence measurement algorithms slow down when processing massive data. To address this problem, this paper proposes a comprehensive user influence measurement algorithm based on recessive (implicit) interest. The Latent Dirichlet Allocation (LDA) model is used to obtain users' recessive interest preferences, and the optimal number of interest topics is determined jointly from the perplexity and the average topic similarity. The transition rate of user interest propagation in the PageRank algorithm is then improved to obtain the propagation influence of users' recessive interests, the User Interest Factor (UIF). Finally, on top of the Spark computing framework, the Analytic Hierarchy Process (AHP) is used to combine each user's own influence with the UIF to compute the final user influence. Experimental results show that the algorithm, by jointly considering user interests and the user's own influence factors, evaluates users' real influence more objectively and efficiently.
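The UIF step described in the abstract can be illustrated with a small sketch. The abstract does not give the paper's actual transition formula, so the version below rests on an assumption: rank flows from a follower to each user they follow in proportion to the cosine similarity of their LDA topic distributions. The names `cosine`, `interest_pagerank`, `follows`, and `topics` are hypothetical, and the code is plain Python rather than a Spark job.

```python
import math


def cosine(p, q):
    """Cosine similarity between two topic distributions (lists of floats)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def interest_pagerank(follows, topics, damping=0.85, iters=50):
    """PageRank with interest-weighted transitions (illustrative assumption).

    follows[v] lists the users that v follows; topics[u] is u's topic
    distribution, for example taken from an LDA model.
    """
    users = list(topics)
    n = len(users)
    rank = {u: 1.0 / n for u in users}

    # Transition probability v -> u is proportional to topic similarity.
    trans = {}
    for v in users:
        sims = {u: cosine(topics[v], topics[u]) for u in follows.get(v, [])}
        total = sum(sims.values())
        trans[v] = {u: s / total for u, s in sims.items()} if total > 0 else {}

    for _ in range(iters):
        # Users with no outgoing transitions spread their rank uniformly.
        dangling = sum(rank[v] for v in users if not trans[v])
        new = {u: (1 - damping) / n + damping * dangling / n for u in users}
        for v in users:
            for u, w in trans[v].items():
                new[u] += damping * rank[v] * w
        rank = new
    return rank


# Toy data: "b" follows "a"; "c" follows both "a" and "b".
topics = {"a": [0.9, 0.1], "b": [0.8, 0.2], "c": [0.1, 0.9]}
follows = {"b": ["a"], "c": ["a", "b"]}
scores = interest_pagerank(follows, topics)
```

In this toy graph the scores sum to 1 and user `a`, who is followed by both other users, receives the highest score. Combining such a score with a user's own influence through AHP-derived weights, as the abstract describes, would be a separate step not shown here.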
Authors
TONG Manqi; HUANG Jiangsheng; GUO Kun
(Fujian Provincial Key Laboratory of Spatial Data Mining and Information Sharing, Ministry of Education, Fuzhou University, Fuzhou 350002, China; Fujian Provincial Key Laboratory of Network Computing and Intelligent Information Processing, Fuzhou University, Fuzhou 350002, China; State Grid Info-Telecom Great Power Science and Technology Co., Ltd., Fuzhou 350003, China)
Source
Computer Engineering (《计算机工程》)
Indexed in: CAS, CSCD, Peking University Core Journals
2020, No. 11, pp. 61-69 (9 pages)
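Funding
National Natural Science Foundation of China (61300104)
Program for New Century Excellent Talents in Fujian Province Universities (JA13021)
Fujian Province Science Fund for Distinguished Young Scholars (2015J06014)
Fujian Province University-Industry Cooperation Project (2017H6008)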
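About the authors
TONG Manqi (born 1993), female, master's student, main research interest: data mining; HUANG Jiangsheng, engineer; corresponding author: GUO Kun, associate professor, Ph.D., E-mail: gukn123@163.com.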