Abstract
The era of artificial intelligence has given the vast body of traditional Chinese medicine (TCM) medical records greater academic value. However, non-standardized record texts and the wide variety of named entity types seriously hinder in-depth research on these records. After reviewing the evolution of TCM medical record formats, analyzing the structural elements of medical records, and constructing a medical record information model, this study developed prompts for extracting medical record entities with large language models, explored an automated process for extracting named entities from medical records based on large language models, and ultimately developed a tool for structuring medical record texts. The study explores a feasible path toward structuring TCM medical records and extracting scientific data from large-scale collections of them, and provides a data foundation for artificial intelligence research based on TCM medical records.
Authors
LI Panfei (李盼飞)
YANG Xiaokang (杨小康)
BAI Yichen (白逸晨)
LI Haiyan (李海燕)
Affiliations: Institute of Information on Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China; School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing 100029, China
Source
Chinese Journal of Library and Information Science for Traditional Chinese Medicine (《中国中医药图书情报杂志》)
2024, No. 2, pp. 108-113 (6 pages)
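Funding
China Postdoctoral Science Foundation, General Program (2023M743920)
China Academy of Chinese Medical Sciences Science and Technology Innovation Project, TCM Informatics Innovation Team (CI2021B002)
China Academy of Chinese Medical Sciences Fundamental Research Funds, Self-Selected Topic Project (ZZ160315)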
Keywords
TCM medical records
large language models
named entity extraction
medical record information model
artificial intelligence
Author information
Corresponding author: LI Haiyan, E-mail: lihy@mail.cintcm.ac.cn