Abstract
With the spread of information technology, healthcare big data is growing exponentially, and clinical real-world research based on these data is receiving increasing attention. Hospital electronic medical records (EMRs) document the entire course of a patient's diagnosis and treatment under real-world conditions, making them one of the data sources best suited to supporting clinical decision-making. However, the large volume of unstructured text in EMR data makes processing more difficult and restricts research based on EMR data. Advanced methods from information technology and artificial intelligence are urgently needed for processing unstructured EMR data in order to accelerate the conversion of data into value. This paper summarizes the methods commonly used to process unstructured medical data, including dictionary- and rule-based methods, methods based on traditional machine learning and deep learning, and cognitive-model-based methods represented by ontologies. It also discusses standardization and transparent reporting in the processing of unstructured EMR data and looks ahead to future developments.
Authors
Si-Yu YAN (阎思宇)
Xu-Hui LI (李绪辉)
Mu-Kun CHEN (陈沐坤)
Hai-Feng ZHU (朱海锋)
Jie-Jun TAN (谭杰骏)
Kuang GAO (高旷)
Yong-Bo WANG (王永博)
Qiao HUANG (黄桥)
Xiang-Ying REN (任相颖)
Ying-Hui JIN (靳英辉)
Xing-Huan WANG (王行环)
Affiliations: Center for Evidence-Based and Translational Medicine, Zhongnan Hospital of Wuhan University, Wuhan 430071, China; School of Computer Science, Wuhan University, Wuhan 430072, China
Source
New Medicine (《医学新知》)
CAS
2023, Issue 5, pp. 358-365 (8 pages)
Funding
National Natural Science Foundation of China, General Program (82174230).
Keywords
Unstructured data
Electronic medical record
Information extraction
Text mining
Natural language processing
Ontology
Real-world data
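About the authors
Corresponding author: Ying-Hui JIN, PhD, associate professor, master's supervisor, Email: jinyinghui0301@163.com; corresponding author: Xing-Huan WANG, PhD, professor, doctoral supervisor, Email: wangxinghuan1965@163.com.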