Named Entity Recognition for Chinese Electronic Medical Record by Fusing Semantic and Boundary Information
Cited by: 8
Abstract Chinese electronic medical record (EMR) text is highly specialized and grammatically complex, which makes named entity recognition (NER), a natural language processing (NLP) task, difficult. To identify medical entities in EMR data accurately, a named entity recognition algorithm that fuses semantic and boundary information is proposed. First, a convolutional neural network (CNN) extracts the graphic information of Chinese characters, which is concatenated with Wubi features to enrich the characters' semantic information. Next, the Lattice of the FLAT model uses a medical dictionary to match the text against the potential words that contain each character. Finally, the Lattice model enriched with this semantic information is applied to named entity recognition in Chinese electronic medical records. Experimental results show that the method outperforms several existing algorithms on the Yidu-S4K dataset and reaches an F1 score of 96.06% on the Resume dataset.
Authors CUI Shaoguo (崔少国); CHEN Junhua (陈俊桦); LI Xiaohong (李晓虹) (College of Computer and Information Science, Chongqing Normal University, Shapingba, Chongqing 401331)
Source Journal of University of Electronic Science and Technology of China (《电子科技大学学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2022, Issue 4, pp. 565-571 (7 pages)
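Funding Chongqing Science and Technology Bureau projects (cstc2018jcyjAX0324, cstc2019jscx-mbdxX0061); Chongqing Municipal Education Commission science and technology projects (KJQN201800539, KJQN202000510); Chongqing Graduate Research and Innovation Project (CYS22565); Ministry of Education Humanities and Social Sciences Project (18XJC880002).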
Keywords Chinese electronic medical record; FLAT; medical dictionary; named entity recognition (NER); natural language processing (NLP)
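About the author CUI Shaoguo (born 1974), male, PhD, professor; his research focuses on big data and artificial intelligence. Corresponding author: CUI Shaoguo, E-mail: csg@cgnu.edu.cn.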
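
Example The dictionary-matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes only the general idea behind a flat lattice: every character, and every dictionary word found in the sentence, becomes a token tagged with the positions of its first and last character. The names `LatticeToken` and `build_flat_lattice`, the `max_word_len` parameter, and the toy lexicon are hypothetical and chosen for illustration.

```python
from typing import List, NamedTuple, Set


class LatticeToken(NamedTuple):
    text: str  # a single character or a matched dictionary word
    head: int  # index of the first character covered by the token
    tail: int  # index of the last character covered (inclusive)


def build_flat_lattice(
    sentence: str, lexicon: Set[str], max_word_len: int = 6
) -> List[LatticeToken]:
    """Return character tokens followed by every lexicon word found in the sentence."""
    # Each character is a token whose head and tail coincide.
    tokens = [LatticeToken(ch, i, i) for i, ch in enumerate(sentence)]
    # Append every multi-character substring that appears in the lexicon.
    for start in range(len(sentence)):
        longest_end = min(len(sentence), start + max_word_len)
        for end in range(start + 2, longest_end + 1):
            word = sentence[start:end]
            if word in lexicon:
                tokens.append(LatticeToken(word, start, end - 1))
    return tokens


if __name__ == "__main__":
    # Hypothetical toy lexicon; a real system would load a medical dictionary.
    toy_lexicon = {"胃癌", "胃癌根治术", "根治术"}
    for token in build_flat_lattice("行胃癌根治术", toy_lexicon):
        print(token)
```

In this sketch the matched words share head and tail positions with the characters they span, which is how boundary information from the dictionary becomes available to a downstream model. The glyph CNN and Wubi features mentioned in the abstract are not covered here.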