Abstract

【Purpose/significance】This paper proposes a strategy, based on Stacking ensemble learning, for identifying which question-and-answer (Q&A) information is adopted. The aim is to promote precise Q&A recommendation in online health communities and to support the high-quality development of digital medical services.【Method/process】A Stacking ensemble learning model is built with both ensemble and non-ensemble learning methods as first-layer base learners and logistic regression (LR) as the meta-learner. The prediction indicators cover text structure, online social communication records, and professional authority. The prediction accuracy of single prediction models is compared with that of Stacking models built from combinations of same-type and different-type prediction models. Chronic-disease Q&A from the xywy.com platform (寻医问药) is used to build the dataset and verify the model's superiority, and data from the 120ask.com platform (快速问医生有问必答120) is used to verify its portability.【Result/conclusion】Compared with single prediction models, the Stacking ensemble model identifies adopted Q&A information more accurately, and it generalizes well enough to be applied to different online health communities.【Innovation/limitation】Building on the Stacking idea, the paper constructs a two-stage prediction model and uses machine learning to find the best combination of prediction models, which significantly improves the accuracy of identifying adopted Q&A information in online health communities. However, as Q&A information accumulates, the Q&A patterns of these communities keep changing, so dynamic prediction methods that combine historical data with daily updated data are the focus of future research.
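The two-stage idea described in the abstract (first-layer base learners whose predictions feed a logistic regression meta-learner) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the synthetic two-feature data, the toy base learners `CentroidScorer` and `KNNScorer`, the 5-fold split, and all hyperparameters are hypothetical and are not the paper's features, learners, or settings.

```python
import math
import random

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to keep math.exp well-behaved
    return 1.0 / (1.0 + math.exp(-z))

def make_data(n, seed):
    """Synthetic stand-in for Q&A feature vectors; label 1 means 'adopted'."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([rng.gauss(2.0 * label, 1.0), rng.gauss(1.5 * label, 1.0)])
        y.append(label)
    return X, y

class CentroidScorer:
    """Toy base learner: scores one feature against the class-mean midpoint."""
    def __init__(self, feature):
        self.feature = feature
    def fit(self, X, y):
        pos = [x[self.feature] for x, t in zip(X, y) if t == 1]
        neg = [x[self.feature] for x, t in zip(X, y) if t == 0]
        m1, m0 = sum(pos) / len(pos), sum(neg) / len(neg)
        self.mid, self.slope = (m1 + m0) / 2.0, m1 - m0
        return self
    def predict_proba(self, X):
        return [sigmoid(self.slope * (x[self.feature] - self.mid)) for x in X]

class KNNScorer:
    """Toy base learner: fraction of the k nearest training points labelled 1."""
    def __init__(self, k=7):
        self.k = k
    def fit(self, X, y):
        self.X, self.y = X, y
        return self
    def predict_proba(self, X):
        out = []
        for x in X:
            near = sorted(zip(self.X, self.y), key=lambda p: math.dist(p[0], x))
            out.append(sum(t for _, t in near[:self.k]) / self.k)
        return out

class LogisticRegression:
    """Meta-learner: logistic regression trained by batch gradient descent."""
    def __init__(self, lr=1.0, epochs=500):
        self.lr, self.epochs = lr, epochs
    def fit(self, X, y):
        d, n = len(X[0]), len(X)
        self.w, self.b = [0.0] * d, 0.0
        for _ in range(self.epochs):
            errs = [self.predict_one(x) - t for x, t in zip(X, y)]
            gw = [sum(e * x[j] for e, x in zip(errs, X)) / n for j in range(d)]
            self.b -= self.lr * sum(errs) / n
            self.w = [w - self.lr * g for w, g in zip(self.w, gw)]
        return self
    def predict_one(self, x):
        return sigmoid(self.b + sum(w * v for w, v in zip(self.w, x)))
    def predict(self, X):
        return [1 if self.predict_one(x) >= 0.5 else 0 for x in X]

def fit_stacking(X, y, make_bases, n_folds=5):
    """Stage 1: out-of-fold base predictions. Stage 2: meta-learner on them."""
    n = len(X)
    meta_X = [[0.0] * len(make_bases()) for _ in range(n)]
    for f in range(n_folds):
        train_idx = [i for i in range(n) if i % n_folds != f]
        held_idx = [i for i in range(n) if i % n_folds == f]
        Xtr, ytr = [X[i] for i in train_idx], [y[i] for i in train_idx]
        for j, base in enumerate(make_bases()):
            probs = base.fit(Xtr, ytr).predict_proba([X[i] for i in held_idx])
            for i, p in zip(held_idx, probs):
                meta_X[i][j] = p
    meta = LogisticRegression().fit(meta_X, y)
    bases = [b.fit(X, y) for b in make_bases()]  # refit on all training data
    return bases, meta

def predict_stacking(bases, meta, X):
    cols = [b.predict_proba(X) for b in bases]
    return meta.predict([list(row) for row in zip(*cols)])

def make_bases():
    return [CentroidScorer(0), CentroidScorer(1), KNNScorer(k=7)]

X_train, y_train = make_data(300, seed=0)
X_test, y_test = make_data(200, seed=1)
bases, meta = fit_stacking(X_train, y_train, make_bases)
pred = predict_stacking(bases, meta, X_test)
accuracy = sum(p == t for p, t in zip(pred, y_test)) / len(y_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

The meta-learner is trained only on out-of-fold predictions, so it never sees a base learner's output on data that learner was fitted on. In practice a library implementation such as scikit-learn's `StackingClassifier` would likely replace these hand-written classes.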
    
    
Authors
LIN Ping (林萍); LYU Jian-chao (吕健超)
Affiliations: School of Management, Nanjing University of Posts and Telecommunications, Nanjing 210003, China; Center of Innovation and Emergency Management in Information Industry, Research Base of Philosophy and Social Sciences in Jiangsu, Nanjing 210003, China; Jiangsu Joint Postgraduate Training Base Construction by Nanjing University of Posts and Telecommunication and Socool-Tech Co., Ltd, Nanjing 210000, China
     
    
    
Source
Information Science (《情报科学》)
Indexed in: CSSCI; Peking University Core Journals (北大核心)
2023, Issue 2, pp. 135-142 (8 pages)
     
            
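Funding
National Natural Science Foundation of China project "Hierarchical Network-Structured DEA Models and Their Application to Performance Management in Healthcare Systems" (72171124)
Major Project of Philosophy and Social Science Research in Jiangsu Universities "Research on the Adoption of Q&A Information in Online Health Communities in the Era of Digital Healthcare" (2022SJZD095)
Jiangsu Province Innovation Program for Academic-Degree Graduate Students "Research on the Health Information Adoption Mechanism of Older Adults: An Interaction Perspective of Social Cognition and Information Quality" (KYCX20_0837)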
            
    
Keywords
online health community
stacking ensemble strategy
machine learning
information adoption
information recognition
                
     
    
    
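About the authors
LIN Ping (林萍, born 1977), female, from Hui'an, Fujian; holds a doctoral degree; associate professor and master's supervisor; her research focuses on data mining and online public opinion. LYU Jian-chao (吕健超, born 1996), male, from Nanjing, Jiangsu; master's student; his research focuses on web data mining.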