Research on Semi-supervised Sentiment Classification Based on Cluster Kernel (基于聚类核的半监督情感分类算法研究) Cited by: 4
Abstract With the rapid development of the Internet, society has entered the era of big data. Text data, as a carrier of human knowledge, is of great significance to human progress and development, so using large numbers of unlabeled samples to improve the accuracy of text sentiment classification has become increasingly important. This paper applies the cluster kernel method from semi-supervised learning to sentiment classification and presents a cluster-kernel-based semi-supervised sentiment classification algorithm. A weighted undirected graph is built over the labeled and unlabeled samples, the cluster kernel is computed from it, and that kernel function is then used to train an SVM sentiment classifier. The method folds the information carried by the unlabeled samples directly into the kernel and does not require training multiple classifiers, so it makes effective use of the unlabeled data. Experimental results show that the CKSVM algorithm clearly outperforms semi-supervised sentiment classification algorithms based on Self-learning SVM and Co-training SVM in classification accuracy, and that it adapts well across different data sets.
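The pipeline described in the abstract (graph over all samples, cluster kernel, SVM) can be illustrated with a minimal sketch. The abstract does not state the edge weights, the transfer function, or any parameter values, so everything below is an assumption made for illustration: RBF edge weights, a polynomial transfer function on the spectrum of the normalized affinity matrix, and the hypothetical names `cluster_kernel`, `sigma`, and `degree`. It is one common way to build a cluster kernel, not necessarily the paper's exact construction.

```python
import numpy as np


def cluster_kernel(X, sigma=1.0, degree=2):
    """Build a cluster kernel over ALL samples (labeled and unlabeled).

    Assumed choices (not taken from the paper): RBF edge weights and a
    polynomial transfer function phi(lam) = lam ** degree.
    """
    X = np.asarray(X, dtype=float)

    # 1. Weighted undirected graph: RBF affinity between every pair of samples.
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))

    # 2. Symmetric normalization: L = D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # 3. Eigendecompose and reshape the spectrum with the transfer function.
    #    Raising eigenvalues to a power shrinks the small ones, which
    #    emphasizes the cluster structure of the graph.
    eigvals, eigvecs = np.linalg.eigh(L)
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negative round-off
    L_tilde = (eigvecs * eigvals ** degree) @ eigvecs.T

    # 4. Rescale so that every sample has unit self-similarity.
    scale = 1.0 / np.sqrt(np.diag(L_tilde))
    return L_tilde * scale[:, None] * scale[None, :]


# Toy data: two well-separated groups of two points each.
X_all = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
K = cluster_kernel(X_all, sigma=1.0, degree=2)
# Entries of K are larger within a group than across groups.
```

Because the kernel is computed over labeled and unlabeled samples together, the unlabeled data shapes the similarity between labeled points. The rows and columns of `K` that belong to the labeled samples could then be handed to any SVM implementation that accepts a precomputed kernel matrix, for example scikit-learn's `SVC(kernel="precomputed")`.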
Authors 郑文静 (Zheng Wenjing), 李雷 (Li Lei)
Source 《计算机技术与发展》 (Computer Technology and Development), 2016, No. 12, pp. 87-91, 95 (6 pages)
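Funding National Natural Science Foundation of China (61070234, 61071167, 61501251); Nanjing University of Posts and Telecommunications Scientific Research Start-up Fund for Introduced Talents (NY214191)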
Keywords semi-supervised learning; cluster kernel; graph; sentiment classification
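About the authors 郑文静 (Zheng Wenjing, born 1990), female; research interests: machine learning and sentiment classification. 李雷 (Li Lei), Ph.D., professor; research interests: intelligent signal processing, nonlinear analysis and computational intelligence, and machine learning.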