Abstract
[Purpose/significance] Traditional methods for constructing sentiment lexicons recognize new words with unsatisfactory accuracy and extend poorly to new domains. To address these problems, this paper proposes a sentiment word recognition model based on syntactic dependency rules and part-of-speech (POS) features. [Method/process] Using product reviews of the iPhone 6s from Jingdong Mall as the corpus, the paper builds the model with the Stanford Parser syntactic analysis tool, a sentiment seed dictionary, a manually annotated sentiment dictionary derived from the review corpus, a dictionary of mobile phone object terms, and other external data. [Result/conclusion] Experiments show that the model effectively identifies sentiment words in the mobile phone domain, reaching an accuracy of 84.89% without human intervention. [Limitations] The matching model for sentiment word recognition is relatively small in scale, and the recall of the model still has room for improvement. In addition, the experiments cover only the mobile phone domain and do not extend to other domains.
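To make the idea of combining dependency rules with POS features concrete, here is a minimal sketch. It is not the paper's model: the abstract does not list the actual rules, so the two rules below, the `Dep` record, the `find_sentiment_words` function, the toy English dictionaries, and the choice of `JJ` as the only sentiment-bearing POS tag are all assumptions made for illustration. The sketch takes dependency triples that a parser such as Stanford Parser would have produced upstream and does not call the parser itself.

```python
from typing import Iterable, NamedTuple, Set


class Dep(NamedTuple):
    """One dependency edge plus POS tags (hypothetical record layout)."""
    rel: str       # dependency relation label, e.g. "nsubj" or "conj"
    head: str      # governor word
    head_pos: str  # POS tag of the governor
    dep: str       # dependent word
    dep_pos: str   # POS tag of the dependent


# Toy dictionaries, assumed for illustration only.
SEED_WORDS = {"good"}                 # stands in for the sentiment seed dictionary
OBJECT_WORDS = {"screen", "battery"}  # stands in for the phone object dictionary
SENTIMENT_POS = {"JJ"}                # assumption: only adjectives are candidates


def find_sentiment_words(deps: Iterable[Dep],
                         seeds: Set[str] = SEED_WORDS,
                         objects: Set[str] = OBJECT_WORDS) -> Set[str]:
    """Return new candidate sentiment words found by two assumed rules."""
    found = set()
    for d in deps:
        # Assumed rule 1: an adjective whose subject is a known object word.
        if d.rel == "nsubj" and d.dep in objects and d.head_pos in SENTIMENT_POS:
            found.add(d.head)
        # Assumed rule 2: an adjective coordinated with a known seed word.
        elif d.rel == "conj":
            if d.head in seeds and d.dep_pos in SENTIMENT_POS:
                found.add(d.dep)
            elif d.dep in seeds and d.head_pos in SENTIMENT_POS:
                found.add(d.head)
    # Report only words that are not already in the seed dictionary.
    return found - set(seeds)


example = [
    Dep("nsubj", "clear", "JJ", "screen", "NN"),     # "the screen is clear"
    Dep("conj", "good", "JJ", "smooth", "JJ"),       # "good and smooth"
    Dep("nsubj", "arrived", "VBD", "package", "NN"), # no object word, no match
]
print(find_sentiment_words(example))
```

On the toy input, the first edge matches the object-word rule and the second matches the seed-coordination rule, so the candidates are "clear" and "smooth". A real rule set would be larger and tuned to the parser's Chinese tag set and relation labels.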
Source
Information Studies: Theory & Application (《情报理论与实践》)
CSSCI
PKU Core Journal
2018, Issue 5, pp. 137-142 (6 pages)
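Funding
One of the outcomes of the National Social Science Fund of China project "Research on Sentiment Analysis of User Reviews and Its Application in Competitive Intelligence Services"
Project No.: 11CTQ022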
Keywords
syntactic dependency rules
syntactic analysis
POS tagging
sentiment word recognition
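About the authors
Deng Shuqing (邓淑卿; ORCID: 0000-0003-2820-6898), female, born 1994, master's student. Research interests: sentiment analysis of web users, among others. Li Wanwei (李玩伟), male, born 1994. Research interests: e-commerce search engine optimization and search intent recognition, among others. Xu Jian (徐健; ORCID: 0000-0003-4886-4708, corresponding author), male, born 1977, Ph.D., associate professor. Research interests: sentiment analysis of web users and data-driven knowledge discovery, among others.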