Abstract
Objective: To evaluate the feasibility and value of combining machine learning (ML) based on the FP-Growth and Apriori algorithms with medical statistics based on Logistic regression analysis for predicting the risk of death in patients with rectal cancer. Methods: A total of 1704 young and middle-aged patients with rectal cancer who were diagnosed and treated in our hospital between January 2008 and January 2018 were enrolled, including 324 patients (19.01%) who died within 5 years, and their data were collected for a prospective study. Programs implementing the FP-Growth and Apriori algorithms were written, and ML was used to compute the effective strong association rules among the baseline characteristics of the patients who died. Logistic regression analysis was performed on all patients to identify independent risk factors for death. The feasibility of ML was verified against the predictions of the medical statistics, and the value of combining ML with medical statistics was evaluated. Results: ML yielded 9 effective strong association rules among the baseline characteristics of the patients who died. Their antecedents included age (50~59 years), sex (male), carcinoembryonic antigen (CEA) (≥5 μg/L), tumor size (>5 cm³), histological differentiation (poorly differentiated), N stage (N2), number of distal metastatic lesions (≥3), and surgery (surgery for distal metastatic lesions). Apart from missing "Stage" and "number of metastatic lymph nodes", the ML results were highly consistent with the predictions of the medical statistics. Given that ML proved reasonably feasible and practicable, combining it with medical statistics made the prediction results more logically coherent. Conclusion: ML combined with medical statistics can be used to predict the risk of death in young and middle-aged patients with rectal cancer and has value for application and wider adoption.
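To make "strong association rules" concrete, the following is a minimal sketch of an Apriori-style search followed by rule filtering. It is not the authors' program: the toy records, the item labels, the thresholds (`min_support`, `min_confidence`), and the reading of "effective" as lift > 1 are all assumptions made for illustration. FP-Growth finds the same frequent itemsets by a different search strategy, so only the Apriori-style search is shown.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search for itemsets meeting min_support."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items]
    frequent = {}
    while current:
        # Support = fraction of records that contain every item of the candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: k / n for c, k in counts.items() if k / n >= min_support}
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, then prune candidates
        # that have an infrequent k-subset (the Apriori property).
        keys = list(level)
        size = len(keys[0]) + 1 if keys else 0
        candidates = {a | b for a, b in combinations(keys, 2) if len(a | b) == size}
        current = [
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        ]
    return frequent


def strong_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence, lift) tuples."""
    rules = []
    for itemset, support in frequent.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                a = frozenset(antecedent)
                c = itemset - a
                confidence = support / frequent[a]
                lift = confidence / frequent[c]
                # Assumption: a rule counts as "effective" when lift > 1.
                if confidence >= min_confidence and lift > 1:
                    rules.append((a, c, support, confidence, lift))
    return rules


# Hypothetical baseline records, one set of categorical items per patient.
records = [
    {"sex=male", "CEA>=5", "N2"},
    {"sex=male", "CEA>=5", "N2"},
    {"sex=male", "CEA>=5"},
    {"sex=female", "CEA>=5", "N2"},
    {"sex=male", "N2"},
    {"sex=female"},
]

itemsets = frequent_itemsets(records, min_support=0.5)
rules = strong_rules(itemsets, min_confidence=0.7)
for a, c, support, confidence, lift in rules:
    print(sorted(a), "->", sorted(c), round(support, 3), round(confidence, 3), round(lift, 3))
```

In this sketch the antecedent is the left-hand side of each rule. Raising `min_support` or `min_confidence` shrinks the rule set, which is one way a study could narrow many candidate rules down to a handful.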
Authors
曹宇星
田龙
王晨宇
CAO Yuxing; TIAN Long; WANG Chenyu (Department of Radiotherapy, the First Affiliated Hospital of Hebei Northern University, Zhangjiakou, Hebei 075000, China)
Source
《现代肿瘤医学》
CAS
Peking University Core Journals (北大核心)
2023, Issue 19, pp. 3621-3625 (5 pages)
Journal of Modern Oncology
Funding
Key Research and Development Program of Zhangjiakou City, Hebei Province (No. 2121182D).
Keywords
machine learning
medical statistics
rectal cancer
death
forecast
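About the authors
CAO Yuxing (born 1994), female, from Zhangjiakou, Hebei; master's degree; nurse-in-charge; mainly engaged in research on radiation oncology and medical statistics. E-mail: 1277473912@qq.com. Corresponding author: TIAN Long (born 1988), male, from Zhangjiakou, Hebei; associate chief physician; mainly engaged in research on radiation oncology. E-mail: 1277473912@qq.com.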