Abstract
[Objective] This study addresses polysemy and incomplete word recognition in entity recognition for Chinese electronic medical records (EMR). [Methods] We adopted the deep learning model RoBERTa-WWM-BiLSTM-CRF to improve named entity recognition on Chinese EMR, and ran four groups of comparative experiments to analyze how different models affect recognition performance. [Results] The proposed model reached an F1 value of 0.8908. [Limitations] The dataset is small, and recognition performance is weaker for some departments; for example, the F1 value for the respiratory department was only 0.8111. [Conclusions] The experiments indicate that the RoBERTa-WWM-BiLSTM-CRF model is better suited to named entity recognition for Chinese EMR and effectively addresses polysemy and incomplete word recognition in this task.
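The abstract reports F1 values but does not say how they were computed. As a rough illustration only, the sketch below computes entity-level precision, recall, and F1 under two assumptions that do not come from the paper: tags follow the BIO scheme, and a predicted entity counts as correct only when its span and type both match exactly. The function names and the `DIS`/`SYM` labels are hypothetical.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, entity_type)


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence (assumed scheme)."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            spans.add((start, i, label))  # close the open span
        if tag.startswith("B-"):
            start, label = i, tag[2:]     # open a new span
        else:
            start, label = None, None     # "O" or a stray "I-" tag
    return spans


def prf1(gold: List[List[str]], pred: List[List[str]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, F1 over exact-match entity spans."""
    tp = n_gold = n_pred = 0
    for g_tags, p_tags in zip(gold, pred):
        g, p = extract_spans(g_tags), extract_spans(p_tags)
        tp += len(g & p)
        n_gold += len(g)
        n_pred += len(p)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical example: one sentence, two gold entities, one predicted.
gold = [["B-DIS", "I-DIS", "O", "B-SYM"]]
pred = [["B-DIS", "I-DIS", "O", "O"]]
precision, recall, f1 = prf1(gold, pred)
print(precision, recall, round(f1, 4))
```

Here the single predicted entity is correct (precision 1.0) but one of the two gold entities is missed (recall 0.5), which gives an F1 of about 0.6667.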
Authors
张芳丛
秦秋莉
姜勇
庄润涛
Zhang Fangcong; Qin Qiuli; Jiang Yong; Zhuang Runtao (School of Economics and Management, Beijing Jiaotong University, Beijing 100044, China; National Clinical Medical Research Center for Nervous System Diseases, Beijing Tiantan Hospital Affiliated to Capital Medical University, Beijing 100050, China; Community Health Service Center, Beijing Jiaotong University, Beijing 100044, China)
Source
《数据分析与知识发现》
CSSCI
CSCD
PKU Core Journal
2022, Issue 2, pp. 251-262 (12 pages)
Data Analysis and Knowledge Discovery
Keywords
Named Entity Recognition
Deep Learning
Electronic Medical Records
About the Authors
Corresponding author: Qin Qiuli, ORCID: 0000-0002-3787-8488, E-mail: qlqin@bjtu.edu.cn.