Abstract
Objective: To explore how various risk factors influence stroke risk classification by building a decision tree model, and to improve the efficiency of statistical analysis of stroke data and of stroke prevention and control. Methods: The model was built with the C4.5 decision tree algorithm, using historical national stroke screening data as the data set. Selected personal record information, common disease information, and medical examination information served as influencing variables. Results: The resulting decision tree contained 37 nodes and achieved an overall prediction accuracy of 85.03% on the risk classification data. Precision for the low-, intermediate-, and high-risk levels was 87.19%, 75.65%, and 77.07%, and the corresponding F1 scores were 0.93, 0.64, and 0.57. Conclusion: The model predicts low-, intermediate-, and high-risk records with reasonable accuracy and can be used to impute missing risk classification fields in the national stroke screening data, supporting stroke data analysis and disease prevention.
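The abstract reports per-class precision and F1 but not recall. Because F1 is the harmonic mean of precision and recall, recall can be back-calculated from the two reported figures. The following is a minimal sketch of that arithmetic, not part of the authors' method; the function name `implied_recall` is hypothetical, and since the reported F1 values are rounded to two decimals the results are only approximate.

```python
def implied_recall(precision: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for recall R, given precision P and F1."""
    denom = 2 * precision - f1
    if denom <= 0:
        # F1 can never reach 2 * precision for a recall in (0, 1]
        raise ValueError("inconsistent inputs: F1 must be below 2 * precision")
    return f1 * precision / denom


# Precision and F1 per risk level, as reported in the abstract.
reported = {
    "low": (0.8719, 0.93),
    "intermediate": (0.7565, 0.64),
    "high": (0.7707, 0.57),
}

for level, (precision, f1) in reported.items():
    recall = implied_recall(precision, f1)
    print(f"{level}: implied recall is roughly {recall:.2f}")
```

Under these assumptions the implied recall is close to 1 for the low-risk level and noticeably lower than precision for the intermediate- and high-risk levels, which is consistent with their lower F1 scores.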
Authors
汪仁
边迪
王树奇
李雪萌
赵东升
Wang Ren;Bian Di;Wang Shuqi;Li Xuemeng;Zhao Dongsheng
Source
《中国疗养医学》
2019, Issue 3, pp. 233-236 (4 pages)
Chinese Journal of Convalescent Medicine
Funding
Key Program of the National Natural Science Foundation of China (Grant No. 71532014)
Keywords
Decision tree
Stroke screening
Risk classification
Data imputation
About the authors
Corresponding author: Bian Di (边迪)