Abstract
Objective: To develop a TCM data mining system (TCM Miner) tailored to the characteristics of traditional Chinese medicine (TCM) data and the needs of data mining. Methods: Built on a TCM terminology vocabulary and addressing the data cleaning, integration, transformation, and selection requirements of TCM data mining, TCM Miner provides three groups of functional modules: data cleaning modules (data splitting and merging, replacement of variant names with standard names, text content extraction, matrix conversion, and TCM text ETL); data mining modules (association mining, cluster mining, and Bayesian processing); and a professional article translation module for TCM translation. Results: TCM Miner effectively addressed the non-standardized and individualized nature of the data encountered in TCM data mining, and can assist researchers with data cleaning, data mining, and the translation of TCM articles, saving them time and effort. Conclusion: TCM Miner provides an effective tool for cleaning and analyzing TCM data and an effective route for the inheritance and innovation of TCM.
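The data cleaning steps named in the abstract (data splitting, replacement of variant names with standard names, and matrix conversion) can be sketched in Python as follows. This is only an illustration under stated assumptions, not TCM Miner's implementation: the alias table, the function names, and the "、" separator are all hypothetical and chosen for the example.

```python
# Hypothetical alias table (illustrative only): variant herb name -> standard name.
ALIAS_TO_STANDARD = {
    "国老": "甘草",
    "北芪": "黄芪",
}


def split_record(record, sep="、"):
    """Split one delimited prescription string into trimmed, non-empty names."""
    return [item.strip() for item in record.split(sep) if item.strip()]


def normalize(names, alias_table):
    """Replace each known alias with its standard name; other names pass through."""
    return [alias_table.get(name, name) for name in names]


def to_matrix(records):
    """Convert lists of names into a 0/1 matrix (one row per record, one column per term)."""
    columns = sorted({name for rec in records for name in rec})
    rows = [[1 if col in rec else 0 for col in columns] for rec in records]
    return columns, rows


# Two made-up records that use variant names.
raw = ["国老、黄芪", "北芪、当归"]
cleaned = [normalize(split_record(r), ALIAS_TO_STANDARD) for r in raw]
columns, rows = to_matrix(cleaned)

for rec, row in zip(cleaned, rows):
    print(rec, row)
```

A 0/1 matrix of this kind is a common input format for association and cluster mining, which is presumably why matrix conversion sits among the cleaning modules; the sketch makes no claim about the matrix layout TCM Miner actually uses.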
Authors
WANG Xi (王晰); LI Hai-yan (李海燕); KANG Li (亢力); LIU Jing (刘静); XING Yan-hui (邢雁辉); YANG Ce (杨策); YANG Le (杨乐); LI Xiao-yang (李小阳); LEI Lei (雷蕾)
Institute of Information on Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China
Source
Chinese Journal of Library and Information Science for Traditional Chinese Medicine (《中国中医药图书情报杂志》)
2021, Issue 4, pp. 1-6 (6 pages)
Funding
Fundamental Research Funds of the China Academy of Chinese Medical Sciences, independently selected topics (ZZ140304, ZZ140309, ZZ11-106).
About the authors
First author: WANG Xi, E-mail: 3317669472@qq.com; corresponding author: LEI Lei, E-mail: leilei@mail.cintcm.ac.cn.