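World Chinese Medicine (世界中医药)
Article Abstract
Cite this article: LI Can1, ZHEN Kehan1, TANG Dongxin2, XIE Dan1. Knowledge Mapping of TCM Formulas Based on Data Set [J]. World Chinese Medicine, 2024,(09):.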
Knowledge Mapping of TCM Formulas Based on Data Set (基于方剂数据集的知识图谱构建研究)
Submitted: 2023-04-06
DOI:10.3969/j.issn.1673-7202.2024.09.019
Keywords: TCM formula  Data processing  Knowledge graph  Standardization  Named entity recognition  Neo4j graph database  BERT-BiLSTM-CRF model  TCM
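Funding: National Key R&D Program of China, "Modernization of Traditional Chinese Medicine" Key Special Project (2019YFC1712504): Construction of a smart healthcare platform for the characteristic diagnostic and therapeutic techniques, formulas, and medicines of ethnic medicine in the prevention and treatment of common diseases; Open Fund Project of the Guangdong Provincial Key Laboratory of TCM Informatization (2021502): Research on information extraction from TCM electronic medical records based on knowledge attributes
Author affiliations
LI Can (李灿)1, ZHEN Kehan (镇可涵)1, TANG Dongxin (唐东昕)2, XIE Dan (解丹)1  1 School of Information Engineering, Hubei University of Chinese Medicine, Wuhan 430065  2 The First Affiliated Hospital of Guizhou University of Traditional Chinese Medicine, Guiyang 550001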
Abstract:
      Objective: To construct a knowledge graph from a dataset of traditional Chinese medicine (TCM) formulas, so as to systematically display formula entities and the relationships among them. Methods: A standardized workflow for formula data processing and knowledge graph construction was first established and used to obtain the formula dataset. The best-performing of four commonly used named entity recognition models was then selected for entity extraction, and the knowledge graph was finally built in the Neo4j graph database. Results: The bidirectional encoder representations from transformers-bidirectional long short-term memory-conditional random field (BERT-BiLSTM-CRF) model was selected. It extracted medical entities such as symptoms, disease names in Chinese and Western medicine, and TCM syndromes from the dataset with an average F1 value of 90.55%, yielding a standardized formula dataset and a formula knowledge graph. Conclusion: The medical entities extracted with this method provide a reliable data basis for TCM clinical practice and scientific research to systematically display formula entities and their relationships. The resulting knowledge graph supports knowledge retrieval for TCM formulas. It helps uncover latent knowledge and internal relationships in formula data, and it lays a solid foundation for information integration and knowledge discovery in the TCM field, thereby promoting the modernization of TCM.
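The abstract reports an average F1 value for entity extraction but does not say how it was computed. As a rough illustration only, the sketch below scores a predicted BIO tag sequence against a gold sequence at the entity level, counting a prediction as correct only when its type and boundaries match exactly. The BIO scheme, the exact-match criterion, and the label names `SYM` (symptom) and `SYN` (syndrome) are assumptions made for this example, not details taken from the paper.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive)
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence.

    An I- tag that does not continue a span of the same type is dropped
    (an assumption; some scorers treat it as the start of a new span).
    """
    spans: Set[Span] = set()
    start, current = None, None
    for i, tag in enumerate(tags + ["O"]):  # trailing "O" closes the last span
        inside = tag.startswith("I-") and current == tag[2:]
        if not inside:
            if current is not None:
                spans.add((current, start, i))
            if tag.startswith("B-"):
                start, current = i, tag[2:]
            else:
                start, current = None, None
    return spans


def entity_f1(gold: List[str], pred: List[str]) -> float:
    """Exact-match entity-level F1 between gold and predicted tags."""
    gold_spans, pred_spans = bio_to_spans(gold), bio_to_spans(pred)
    true_pos = len(gold_spans & pred_spans)
    precision = true_pos / len(pred_spans) if pred_spans else 0.0
    recall = true_pos / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical six-token example: the second entity is predicted too short.
gold_tags = ["B-SYM", "I-SYM", "O", "B-SYN", "I-SYN", "I-SYN"]
pred_tags = ["B-SYM", "I-SYM", "O", "B-SYN", "I-SYN", "O"]
score = entity_f1(gold_tags, pred_tags)
```

In this example one of the two predicted entities matches a gold entity exactly, so precision and recall are both 0.5. An average across entity types, as the abstract's wording suggests, would apply the same calculation per type and then average the per-type scores.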