Abstract
Building a simulation model of large-scale online public opinion evolution can guide differentiated emergency management and public opinion guidance for Wuhan, the area hardest hit by COVID-19, and for the rest of China. To simulate the evolution of public sentiment at topic-level granularity, the LDA (Latent Dirichlet Allocation) topic model is deeply integrated with BERT (Bidirectional Encoder Representations from Transformers) word vectors, optimizing the topic vectors used for text topic clustering. In addition, on top of an improved BERT pre-training task, a further deep pre-training task is added to improve the model's accuracy in sentiment classification. The results show that, in topic vector training, the improved BERT-LDA model raises the NPMI (Normalized Pointwise Mutual Information) value by 0.357 over the original LDA model; on the sentiment classification task for epidemic events, the AUC (Area Under the Curve) value exceeds 99.6%, indicating that the model can be applied effectively to large-scale online public opinion evolution simulation.
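For readers unfamiliar with the NPMI figure quoted in the abstract, the following is a minimal sketch of how an NPMI topic-coherence score can be computed from document co-occurrence counts. The abstract does not state how the authors computed NPMI, so everything here is an assumption for illustration: the function names `npmi` and `topic_npmi` are hypothetical, probabilities are estimated at document level, and the boundary conventions (-1 for words that never co-occur, 1 for words that co-occur in every document) are one common choice.

```python
import math
from itertools import combinations


def npmi(p_i, p_j, p_ij):
    """NPMI of two words from marginal (p_i, p_j) and joint (p_ij) probabilities.

    NPMI = log(p_ij / (p_i * p_j)) / -log(p_ij), which lies in [-1, 1].
    """
    if p_ij == 0:
        return -1.0  # assumed convention: the words never co-occur
    if p_ij == 1:
        return 1.0   # assumed convention: avoids 0/0 when both words are in every document
    pmi = math.log(p_ij / (p_i * p_j))
    return pmi / -math.log(p_ij)


def topic_npmi(top_words, docs):
    """Average pairwise NPMI of a topic's top words over tokenized documents."""
    doc_sets = [set(d) for d in docs]
    n = len(doc_sets)

    def prob(*words):
        # Fraction of documents that contain all of the given words.
        return sum(all(w in d for w in words) for d in doc_sets) / n

    scores = [
        npmi(prob(a), prob(b), prob(a, b))
        for a, b in combinations(top_words, 2)
    ]
    return sum(scores) / len(scores)


# Hypothetical toy corpus, not data from the paper.
docs = [
    ["virus", "mask"],
    ["virus", "mask"],
    ["price", "stock"],
    ["price", "stock"],
]
coherent = topic_npmi(["virus", "mask"], docs)
incoherent = topic_npmi(["virus", "price"], docs)
```

In this toy corpus, a topic whose top words always appear together scores close to 1, while a topic whose top words never share a document scores -1. A higher average NPMI is usually read as a more coherent topic, which is the sense in which the abstract reports an improvement over the original LDA model.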
Authors
庄穆妮
李勇
谭旭
毛太田
蓝凯城
邢立宁
Zhuang Muni; Li Yong; Tan Xu; Mao Taitian; Lan Kaicheng; Xing Lining (School of Public Management, Xiangtan University, Xiangtan 411105, China; School of Economics and Management, Changsha University, Changsha 410022, China; School of Software Engineering, Shenzhen Institute of Information Technology, Shenzhen 518172, China; College of Systems Engineering, National University of Defense Technology, Changsha 410022, China)
Source
《系统仿真学报》
CAS
CSCD
PKU Core (Peking University Core Journals)
2021, No. 1, pp. 24-36 (13 pages)
Journal of System Simulation
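Funding
National Natural Science Foundation of China (72074033)
Humanities and Social Sciences Foundation of the Ministry of Education of China (17YJCZH157)
Guangdong Province Innovation Team Project on Public Security Applications of Video and Image Big Data
Shenzhen Science and Technology Program, Key Basic Research Project (JCYJ20200109141218676).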
Keywords
coronavirus disease 2019 (COVID-19)
BERT-LDA model
evolution simulation of public opinion
difference comparison
Author biography
Zhuang Muni (庄穆妮, born 1996), female, master's student. Research interest: online public opinion analysis. E-mail: 997737694@qq.com.