Information Adoption Recognition in Online Question and Answer Health Communities Based on Stacking Ensemble Strategy

Cited by: 7
Abstract 【Purpose/significance】To support accurate recommendation of Q&A content in online health communities and the high-quality development of digital medical services, this paper proposes a strategy for recognizing Q&A information adoption based on Stacking ensemble learning.【Method/process】The Stacking model uses both ensemble and non-ensemble learning methods as base learners and logistic regression (LR) as the meta-learner. A dataset of chronic-disease Q&A is built from the 'xywy.com' (寻医问药) platform, with predictive indicators covering text structure, online social communication records, and professional authority. The prediction accuracy of single models is compared with that of Stacking models built from combinations of same-type models and combinations of different-type models. Data from the '120ask.com' (快速问医生有问必答120) platform are then used to verify the model's portability.【Result/conclusion】Compared with single prediction models, the Stacking ensemble model identifies adopted Q&A information more accurately and generalizes well, so it can be applied to different online health communities.【Innovation/limitation】Based on the Stacking idea, the paper builds a two-stage prediction model and uses machine learning to find the best combination of prediction models, which significantly improves the accuracy of adoption recognition in online health communities. However, as Q&A information accumulates, the Q&A patterns in these communities keep changing; dynamic prediction methods that combine historical data with daily updated data are the focus of future work.
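The two-stage structure described in the abstract can be illustrated with a minimal sketch, assuming scikit-learn is available. The specific base learners (random forest, gradient boosting, k-nearest neighbors, decision tree), their hyperparameters, and the synthetic data are assumptions chosen for illustration; the abstract states only that ensemble and non-ensemble methods serve as base learners and that logistic regression is the meta-learner.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a Q&A feature table (NOT the paper's data):
# each row is an answer, y = 1 means "adopted", y = 0 means "not adopted".
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Stage 1: base learners mixing ensemble and non-ensemble methods
# (this particular selection is an assumption, not the paper's).
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("gb", GradientBoostingClassifier(random_state=0)),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
    ("tree", DecisionTreeClassifier(max_depth=5, random_state=0)),
]

# Stage 2: logistic regression as the meta-learner, trained on
# out-of-fold predictions of the base learners (5-fold CV).
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)

# Compare the stacked model with each single model on held-out data.
scores = {"stacking": accuracy_score(y_test, stack.predict(X_test))}
for name, model in base_learners:
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```

On synthetic data the ranking of models is not meaningful; the sketch only shows how single models and a stacked combination can be compared on the same held-out split.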
Authors: LIN Ping (林萍); LYU Jian-chao (吕健超) (School of Management, Nanjing University of Posts and Telecommunications, Nanjing 210003, China; Center of Innovation and Emergency Management in Information Industry, Research Base of Philosophy and Social Sciences in Jiangsu, Nanjing 210003, China; Jiangsu Joint Postgraduate Training Base Construction by Nanjing University of Posts and Telecommunication and Socool-Tech Co., Ltd, Nanjing 210000, China)
Source: Information Science (《情报科学》), CSSCI, PKU Core Journal, 2023, No. 2, pp. 135-142 (8 pages)
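Funding: National Natural Science Foundation of China project "Hierarchical Network-Structured DEA Models and Their Application in Performance Management of the Healthcare System" (72171124); Major Project of Philosophy and Social Science Research in Jiangsu Universities "Research on Q&A Information Adoption in Online Health Communities in the Era of Digital Healthcare" (2022SJZD095); Jiangsu Province Academic Degree Postgraduate Innovation Program project "Research on the Health Information Adoption Mechanism of the Elderly: An Interaction Perspective of Social Cognition and Information Quality" (KYCX20_0837).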
Keywords: online health community; stacking ensemble strategy; machine learning; information adoption; information recognition
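About the authors: LIN Ping (b. 1977), female, from Hui'an, Fujian; doctoral candidate, associate professor, and master's supervisor; research interests: data mining and online public opinion. LYU Jian-chao (b. 1996), male, from Nanjing, Jiangsu; master's student; research interest: web data mining.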