
Named Entity Recognition of Technology Service Resources in Industrial Fields Based on Graph Convolutional Networks
Abstract Technology service resources in industrial fields involve complex terminology and hard-to-identify entity boundaries, and existing methods cannot effectively extract long-distance semantic features from text. To address these problems, a named entity recognition method for technology service resources in industrial fields is proposed based on a graph convolutional network (GCN), improving on the existing BERT-BiLSTM-CRF model. First, part-of-speech features are added as auxiliary features to extend the character vectors obtained from the BERT layer, and a multi-head attention mechanism assigns weights to capture semantic information between characters. Then, a GCN is incorporated on top of the bidirectional long short-term memory network (BiLSTM) to mine structural information about characters and the relations between them: the feature representations extracted by the BiLSTM are combined with the dependency matrix between characters to capture the global features of the text. Finally, the feature vectors obtained from the GCN layer are fed into a conditional random field (CRF) model for sequence decoding, and the globally optimal sequence is selected as the entity recognition result. Experimental results show that the method outperforms traditional named entity recognition methods and improves the accuracy of named entity recognition for technology service resources in industrial fields.
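The GCN step described in the abstract, which combines per-character features with a dependency matrix between characters, can be illustrated with a minimal sketch. It assumes the widely used symmetric normalization ReLU(D^{-1/2}(A + I)D^{-1/2} H W); the abstract does not state the exact propagation rule, so this choice, the function names, and the toy numbers (standing in for BiLSTM outputs and a learned weight matrix) are all assumptions made for illustration.

```python
import math


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square adjacency matrix.

    Assumption: symmetric normalization with self-loops; the paper's
    exact normalization is not given in the abstract.
    """
    n = len(adj)
    # Self-loops let each character keep its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]  # always >= 1 because of self-loops
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(adj, features, weight):
    """One graph-convolution step: ReLU(A_norm @ H @ W)."""
    a_norm = normalize_adjacency(adj)
    out = matmul(matmul(a_norm, features), weight)
    return [[max(0.0, v) for v in row] for row in out]


# Toy example: three characters with dependency arcs 0-1 and 1-2.
# The arcs are treated as undirected here (an assumption).
dependency = [[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]]
# Hypothetical per-character features (placeholders for BiLSTM outputs).
bilstm_features = [[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]]
# Hypothetical weight matrix; identity keeps the arithmetic easy to follow.
weight = [[1.0, 0.0],
          [0.0, 1.0]]

hidden = gcn_layer(dependency, bilstm_features, weight)
```

Each row of `hidden` mixes a character's own features with those of its dependency neighbours, which is how a GCN layer can carry information between characters that are far apart in the sequence but directly linked in the dependency structure. In the pipeline described above, such rows would then be passed to the CRF layer for decoding.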
Authors ZHANG Shuo (张硕); ZHAO Zhuofeng (赵卓峰); LIU Chen (刘晨) (School of Information, North China University of Technology, Beijing 100144; Beijing Key Laboratory of Large-scale Stream Data Integration and Analysis Technology, Beijing 100144)
Source Computer & Digital Engineering (《计算机与数字工程》), 2023, No. 1, pp. 20-27 (8 pages)
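Funding Supported by the National Key R&D Program of China (No. 2019YFB1405103) and the Key Program of the National Natural Science Foundation of China, "Research on Big Service Theory and Methods in the Big Data Environment" (No. 61832004)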
Keywords industrial field; technology service resources; named entity recognition; graph convolutional network
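About the authors ZHANG Shuo, female, M.S.; research interest: domain knowledge graphs. ZHAO Zhuofeng, male, Ph.D., research professor; research interests: service computing, big data processing and analysis. LIU Chen, male, Ph.D., associate research professor; research interests: service computing, big data.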