Abstract
Objective: To explore the application of automated machine learning (AutoML) for predicting mortality in patients with infection in the intensive care unit (ICU). Methods: Patients with infection recorded in an open-access ICU database from Zigong, Sichuan Province, covering 2019 to 2020, were studied, and AutoML mortality prediction models were built on the H2O platform. The algorithms included gradient boosting machine (GBM), extreme gradient boosting (XGBoost), generalized linear model (GLM), deep learning (DL), and random forest (RF). The data set was randomly divided at a ratio of 3:1 into a training set, used to build the models, and a validation set, used to evaluate them. Model performance was measured by the area under the receiver operating characteristic (ROC) curve (AUC). Variable importance ranking, Shapley additive explanations (SHAP), partial dependence plots, and local interpretable model-agnostic explanations (LIME) were used to interpret the models. Results: A total of 1151 and 380 patients were included in the training and validation sets, respectively. In the validation set, the AutoML model based on XGBoost performed best, with the highest AUC (0.753) and the highest accuracy (0.713), ahead of the second-ranked GBM model (AUC 0.748) and the third-ranked GLM model (AUC 0.745). Important variables in the XGBoost model included disease diagnosis, activated partial thromboplastin time (APTT), cystatin C (CysC), age, creatine kinase-MB isoenzyme (CK-MB), brain natriuretic peptide (BNP), international normalized ratio (INR), potassium ion (K+), albumin (ALB), and lactate (Lac). Conclusions: AutoML models performed reasonably well in predicting death among ICU patients with infection. AutoML has good prospects for application in clinical research, but this model still requires extensive external validation.
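As a rough illustration of the evaluation described in the abstract, the sketch below shows one way a 3:1 random split and a rank-based AUC could be computed in plain Python. It does not use H2O or the study data, and it is not the authors' code. The function names `split_3_to_1` and `roc_auc`, the seed, and the toy labels and scores are all assumptions made for this example. The exact cut used here also differs from the reported 1151/380 partition.

```python
import random


def split_3_to_1(items, seed=0):
    """Shuffle a copy of `items` and return (train, valid) at a 3:1 ratio.

    Hypothetical helper: the cut point is floor(3n/4), so the split is exact
    up to rounding.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = (3 * len(shuffled)) // 4
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count 0.5).

    This pairwise definition is equivalent to the area under the ROC curve.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (made up for illustration): 1 = died, 0 = survived.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.3, 0.6, 0.7, 0.2, 0.6]
toy_auc = roc_auc(labels, scores)  # 7 of 9 positive/negative pairs are ordered correctly

# Splitting 1531 record indices (1151 + 380 in the abstract) at an exact 3:1 cut.
train, valid = split_3_to_1(range(1531))
```

The pairwise loop is quadratic in the number of cases, which is fine for a validation set of a few hundred patients. For larger data a sort-based implementation would be the usual choice.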
Authors
周亦佳
何宇
薛雨涵
林嘉希
殷民月
韦瑶
朱锦舟
于倩倩
Zhou Yijia; He Yu; Xue Yuhan; Lin Jiaxi; Yin Minyue; Wei Yao; Zhu Jinzhou; Yu Qianqian (Suzhou Medical College, Soochow University, Suzhou 215000, China)
Source
《中国急救医学》
CAS
CSCD
2023, Issue 10, pp. 768-775 (8 pages)
Chinese Journal of Critical Care Medicine
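Funding
National Natural Science Foundation of China (82000540)
Suzhou "Science and Education for Health" Project (KJXW2019001)
Extracurricular Research Project for Students of Suzhou Medical College, Soochow University (2021YXBKWKY050)
Jiangsu University 2021 Clinical Medicine Science and Technology Development Fund (JLY2021095).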
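Author information
Zhou Yijia (born 2001), female, undergraduate student, E-mail: 2030502023@stu.suda.edu.cn; corresponding author: Yu Qianqian (born 1985), female, associate chief physician, E-mail: yuqian850625@126.com.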