Abstract
Objective: To examine how different methods of handling multivariate missing data affect analytical results. Methods: Case deletion (removing observations that contain missing values), simple imputation, and multiple imputation were each applied to the clinical data of 925 liver cancer patients with a moderate degree of multivariate missingness, and the results were compared. Results: The three methods produced markedly different results. At α = 0.05, analysis of the multiply imputed data sets identified clinical staging, history of hepatic cirrhosis, portal vein tumor thrombus, γ-GT, and WBC as risk factors for patient survival time, whereas case deletion identified TNM staging, lipiodol dose, AST, and ALP. Simple imputation yielded three more risk factors than multiple imputation: TNM staging, ALP, and AFP. Conclusion: For this data set, case deletion gave the poorest results. Simple imputation performed relatively better but tends to shrink standard errors and lower P values. Multiple imputation was the more reasonable and scientifically sound approach.
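The conclusion that simple imputation tends to shrink standard errors can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic (not the study's 925 patients), the target is a plain sample mean rather than a survival model, simple imputation is taken to mean mean substitution, and multiple imputation is done with a univariate normal model whose results are pooled by Rubin's rules. The abstract does not state which software or imputation model the authors used.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic laboratory-style variable (NOT the study's data), 30% missing at random.
n = 200
y = rng.normal(loc=50.0, scale=10.0, size=n)
missing = np.zeros(n, dtype=bool)
missing[rng.choice(n, size=60, replace=False)] = True
y_obs = y[~missing]
n_obs = y_obs.size


def se_of_mean(x):
    """Standard error of the sample mean."""
    return x.std(ddof=1) / np.sqrt(x.size)


# 1) Case deletion: analyse only the observed values.
se_deletion = se_of_mean(y_obs)

# 2) Simple imputation (assumed here to be mean substitution).
y_simple = y.copy()
y_simple[missing] = y_obs.mean()
se_simple = se_of_mean(y_simple)

# 3) Multiple imputation under a normal model, pooled with Rubin's rules.
M = 20  # number of imputed data sets (illustrative choice)
estimates, variances = [], []
for _ in range(M):
    # Draw plausible parameters first, so imputations reflect parameter uncertainty.
    sigma2 = (n_obs - 1) * y_obs.var(ddof=1) / rng.chisquare(n_obs - 1)
    mu = rng.normal(y_obs.mean(), np.sqrt(sigma2 / n_obs))
    y_mi = y.copy()
    y_mi[missing] = rng.normal(mu, np.sqrt(sigma2), size=missing.sum())
    estimates.append(y_mi.mean())
    variances.append(se_of_mean(y_mi) ** 2)

within = np.mean(variances)          # average within-imputation variance
between = np.var(estimates, ddof=1)  # between-imputation variance
se_mi = np.sqrt(within + (1 + 1 / M) * between)

print(f"case deletion SE:       {se_deletion:.3f}")
print(f"simple imputation SE:   {se_simple:.3f}")
print(f"multiple imputation SE: {se_mi:.3f}")
```

In this sketch, mean substitution adds values with zero deviation from the mean while inflating the sample size, so its standard error is necessarily smaller than the case-deletion one. Multiple imputation adds the between-imputation variance back in, which is why it avoids that artificial precision.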
Source
Academic Journal of Second Military Medical University (《第二军医大学学报》)
Indexed in: CAS, CSCD, Peking University Core Journals (北大核心)
2004, No. 9, pp. 1013-1016 (4 pages)
Keywords
multivariate
missing data
multiple imputation
liver neoplasms