Topic-Feature Lattice Analysis: A Quality Assessment Method for User-Generated Text (cited by: 9)

TFLA: A Quality Analysis Framework for User Generated Contents
Abstract: This paper designs a quality analysis framework for user-generated text. First, a set of topic features is built for each product category by topic analysis. Second, the strong association between topic features and product categories is used to construct the formal context for formal concept analysis; the category-topic concept lattice is then simplified into a topic-feature lattice, from which five quality features are defined and a quality evaluation model is built. Finally, experiments on real review data show that the new method achieves higher prediction accuracy.

English abstract: In this paper, we design a topic-features lattices analysis (TFLA) framework based on objective quality dimensions. First, we apply latent Dirichlet allocation (LDA) to obtain latent topics as topic-features for each product category. Second, we construct a formal context based on the strong relationship between product categories and topic-features, which yields the generalization and instantiation relationships among topic-features through formal concept analysis (FCA). We use domain knowledge and these relationships to define five objective quality features, and we use machine learning methods to build quality evaluation models on these features. Experimental results on real comment data sets show that the new quality models' predictions agree with the manually assigned quality labels in most cases. The best model achieves a mean absolute error (MAE) of 0.7 and an F-measure of 0.5, which is significantly better than the conventional quality prediction model based on support vector machine (SVM) classification.
Authors: ZHONG Jiang (钟将); ZHANG Shu-fang (张淑芳); GUO Wei-li (郭卫丽); LI Xue (李雪) (Key Laboratory of Dependable Service Computing in Cyber Physical Society, Ministry of Education, Chongqing University, Chongqing 400030, China; College of Computer Science, Chongqing University, Chongqing 400030, China; Chongqing College of Electronic Engineering, Chongqing 401331, China; School of Information Technology and Electrical Engineering, University of Queensland, Brisbane 4072, Australia)
Source: Acta Electronica Sinica (《电子学报》), 2018, No. 9, pp. 2201-2206 (6 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
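Funding: National High Technology Research and Development Program of China (863 Program) (No. 2015AA015308); National Key Research and Development Program of China (No. 2017YFB1402401); Chongqing Science and Technology Innovation Special Project for Social Undertakings and Livelihood Security (No. cstc2017shmsA20013)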
Keywords: user comments; quality assessment; data quality; topic features; lattices of topic-features
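About the authors: ZHONG Jiang (钟将), male, born in 1974 in Jiangjin, Chongqing. Ph.D., professor. Research interests: data mining and its applications, network information security. E-mail: zhongjiang@cqu.edu.cn; ZHANG Shu-fang (张淑芳), female, born in 1972 in Chengcheng, Shaanxi. Doctoral candidate, associate professor. Research interests: big data mining and simulation computing. E-mail: roseymcn2000@foxmail.com; GUO Wei-li (郭卫丽), female, born in 1990 in Xingtang, Hebei. Master's degree. Research interests: data mining, high-performance computing. E-mail: 870188993@qq.com; LI Xue (李雪), male, born in 1955 in Shapingba, Chongqing. Ph.D., professor. Research interests: data mining, big data. E-mail: xueli@itee.uq.edu.au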
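
The abstract describes building a formal context from product categories and topic features and then deriving a concept lattice with formal concept analysis (FCA). The sketch below illustrates only that generic FCA step, under stated assumptions: the categories and topic features in `context` are invented for illustration, the function names are hypothetical, and the brute-force enumeration is not the paper's lattice simplification procedure, which this record does not describe.

```python
from itertools import combinations

# Hypothetical formal context: product categories (objects) mapped to the
# topic features (attributes) they are strongly associated with.
context = {
    "phones":  {"battery", "screen", "price"},
    "laptops": {"battery", "screen", "keyboard", "price"},
    "books":   {"price", "plot"},
}

def common_attributes(objects, ctx):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    shared = set().union(*ctx.values())
    for obj in objects:
        shared &= ctx[obj]
    return frozenset(shared)

def common_objects(attributes, ctx):
    """Objects that carry every attribute in `attributes`."""
    return frozenset(obj for obj, attrs in ctx.items() if attributes <= attrs)

def formal_concepts(ctx):
    """Enumerate all (extent, intent) pairs by closing every subset of objects.

    Exponential in the number of objects, so suitable only for tiny examples.
    """
    concepts = set()
    objects = list(ctx)
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(subset, ctx)
            extent = common_objects(intent, ctx)
            concepts.add((extent, intent))
    return concepts

def is_more_general(c1, c2):
    """True if concept c1 lies above c2 in the lattice (its extent contains c2's)."""
    return c2[0] <= c1[0]

concepts = formal_concepts(context)
for extent, intent in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(extent), "<->", sorted(intent))
```

In this toy context the concept whose intent is only `price` covers all three categories, so it sits above the concept shared by phones and laptops. That ordering is the kind of generalization and instantiation relationship among topic features that the abstract refers to.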
