
Machine Learning Model Based on Clinical, Spectral CT, and CT Radiomics Features for Preoperative Prediction of KRAS Gene Status in Patients with Colorectal Cancer
Abstract
Objective To evaluate how well different machine learning models built from clinical, spectral CT, and CT radiomics features predict Kirsten rat sarcoma viral oncogene homolog (KRAS) gene status before surgery in patients with colorectal cancer (CRC).
Methods This retrospective study included 204 patients with colorectal adenocarcinoma confirmed by pathology at North China University of Science and Technology Affiliated Hospital between June 2020 and December 2023. Based on KRAS gene test results, the patients were divided into a KRAS wild-type group (n=87) and a KRAS mutant group (n=117). Regions of interest (ROIs) covering the lesions were delineated on thin-slice venous-phase contrast-enhanced CT images, and radiomics features were extracted. The patients were randomly divided into a training set and a test set at a ratio of 7:3, and the least absolute shrinkage and selection operator (LASSO) algorithm was used to select radiomics features. Support vector machine (SVM), eXtreme Gradient Boosting (XGBoost), and logistic regression (LR) algorithms were used to build six models for preoperative prediction of the KRAS gene subtype in CRC patients: SVM, XGBoost, and LR models built from radiomics data alone, and SVM, XGBoost, and LR models built from combined clinical, spectral CT, and CT radiomics data. Receiver operating characteristic (ROC) curves were drawn and the area under the curve (AUC) was calculated to evaluate the performance of each model. The DeLong test was used to compare performance among the six models. Decision curve analysis (DCA) was used to evaluate the clinical value of the three models built from the combined clinical, spectral CT, and CT radiomics data.
Results The venous-phase spectral parameters iodine concentration (IC), normalized iodine concentration (NIC), and effective atomic number (Eff-Z) differed significantly between the KRAS wild-type and KRAS mutant groups (P<0.05). Clinical indicators, including age, sex, and serum tumor markers, showed no significant difference between the two groups (P>0.05). Compared with CT radiomics data alone, adding the venous-phase spectral parameters further improved predictive performance. For the SVM models built from CT radiomics data alone and from combined spectral CT and CT radiomics data, the AUCs were 0.810 and 0.866 and the accuracies were 0.758 and 0.790, respectively. For the corresponding XGBoost models, the AUCs were 0.804 and 0.918 and the accuracies were 0.790 and 0.855, respectively. For the corresponding LR models, the AUCs were 0.827 and 0.910 and the accuracies were 0.774 and 0.806, respectively. The XGBoost model built from combined spectral CT and CT radiomics data achieved the best AUC, accuracy, sensitivity, and specificity, and the differences in AUC were statistically significant by the DeLong test (all P<0.05). DCA showed that, for risk thresholds between 25% and 98%, the XGBoost model gave the highest net benefit over the widest range of risk thresholds.
Conclusion Multiple machine learning models built from venous-phase spectral CT parameters combined with CT radiomics features can effectively assess KRAS gene status in CRC patients before surgery. The combined model built with the XGBoost algorithm performed best.
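The abstract reports AUC and DCA net benefit without defining them. The following is a minimal sketch of both quantities in plain Python, assuming the standard definitions: AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, and net benefit at risk threshold p_t as TP/n - (FP/n) * p_t / (1 - p_t). The function names and the eight toy cases are hypothetical and are not data or code from the study, whose software the abstract does not state.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(labels: Sequence[int], scores: Sequence[float],
                threshold: float) -> float:
    """Net benefit of calling a case positive when score >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    # False positives are weighted by the odds of the risk threshold.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def treat_all_net_benefit(labels: Sequence[int], threshold: float) -> float:
    """Reference strategy in DCA: call every case positive."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data: 1 = KRAS mutant, 0 = KRAS wild type.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

print(roc_auc(labels, scores))                # 0.8125
print(net_benefit(labels, scores, 0.5))       # 0.25
print(treat_all_net_benefit(labels, 0.5))     # 0.0
```

A decision curve plots `net_benefit` against a grid of thresholds for each model, alongside the treat-all and treat-none (net benefit 0) reference lines. A model is useful over the threshold range where its curve lies above both references.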
Authors LI Zemao (李泽茂); WANG Yajing (王雅静); ZHOU Wei (周伟); WANG Xingwen (王星稳); CHEN Weibin (陈伟彬). Affiliation: Medical Imaging Centre, North China University of Science and Technology, Tangshan, Hebei Province 063000, P.R. China
Source Journal of Clinical Radiology (《临床放射学杂志》; Peking University Core Journal), 2024, No. 10, pp. 1737-1743 (7 pages)
Funding 2023 Tangshan Talent Funding Project (No. C202303027).
Keywords Colorectal cancer; Radiomics; CT spectral imaging; KRAS gene; Machine learning
Author information Corresponding author: CHEN Weibin (陈伟彬).