Abstract
This paper analyzes the sentiment polarity of Weibo comments on vaccination. A neural-network-based Doc2vec model is first used to train text vectors. Three machine learning algorithms, Support Vector Machine (SVM), Random Forest (RF), and Logistic Regression (LR), are then used for the sentiment classification task, and their classification performance is compared under four different Doc2vec model configurations (schemes 1 to 4). On text vectors trained with the Distributed Memory version of Paragraph Vector (PV-DM) algorithm, RF performs best, achieving the highest F1 scores under schemes 1 and 2: 87.24% and 87.50%, respectively. On text vectors trained with the Distributed Bag of Words version of Paragraph Vector (PV-DBOW) algorithm, SVM performs best, achieving the highest F1 scores under schemes 3 and 4: 84.11% and 83.91%, respectively.
Source
Advances in Applied Mathematics (《应用数学进展》)
2022, No. 1, pp. 269-277 (9 pages)