Abstract
Current entity relation extraction models commonly handle overlapping relations by stacking labeling layers. In this approach, the computation of the labeling layers for many relations is redundant, which makes the labeling matrix sparse and degrades extraction performance. To address these problems, a chain entity relation extraction model based on a filtering mechanism is proposed. An encoding layer first produces vector features of the text, and a five-stage chain decoding structure then extracts the subject, object, and relation of each relation triple in sequence. The chain decoding structure avoids a sparse labeling matrix and aligns entities with relations automatically through the filtering mechanism. During decoding, conditional layer normalization is used to improve feature fusion between stages and reduce the impact of error accumulation, a gated unit is used to optimize the fitting performance of the model, and the head-tail separation and relation correction modules perform multiple checks on the relation set. Comparative experiments on public datasets show that the proposed model achieves comparatively strong performance.
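The abstract names conditional layer normalization as the way features are fused between decoding stages, but does not give its form. The sketch below shows one widely used formulation, in which the gain and bias of layer normalization are shifted by linear projections of a condition vector (for example, features of an entity extracted in an earlier stage). This formulation, the function names, and the parameters `w_gamma`, `w_beta`, `gamma0`, and `beta0` are assumptions for illustration and are not taken from the paper.

```python
import math


def layer_norm(x, eps=1e-12):
    """Normalize a feature vector to zero mean and unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def matvec(w, c):
    """Multiply matrix w (a list of rows) by vector c."""
    return [sum(wij * cj for wij, cj in zip(row, c)) for row in w]


def conditional_layer_norm(x, cond, w_gamma, w_beta, gamma0=None, beta0=None):
    """Layer normalization whose gain and bias depend on a condition vector.

    Assumed formulation (not confirmed by the paper):
        gamma = gamma0 + w_gamma @ cond
        beta  = beta0  + w_beta  @ cond
        out   = gamma * layer_norm(x) + beta
    """
    d = len(x)
    gamma0 = gamma0 if gamma0 is not None else [1.0] * d
    beta0 = beta0 if beta0 is not None else [0.0] * d
    gamma = [g + dg for g, dg in zip(gamma0, matvec(w_gamma, cond))]
    beta = [b + db for b, db in zip(beta0, matvec(w_beta, cond))]
    return [g * n + b for g, n, b in zip(gamma, layer_norm(x), beta)]


# Hypothetical inputs: a 4-dimensional token feature and a 2-dimensional condition.
token = [1.0, 2.0, 3.0, 4.0]
condition = [0.5, -0.5]
zero_proj = [[0.0, 0.0] for _ in token]

# With zero projections the condition has no effect and plain layer norm is recovered.
plain = conditional_layer_norm(token, condition, zero_proj, zero_proj)
```

Initializing both projections to zero, as in the usage lines, makes the layer start out as ordinary layer normalization, so the condition only influences the output once the projections are trained.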
Authors
夏鸿斌
沈健
刘渊
XIA Hongbin; SHEN Jian; LIU Yuan (School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122; Jiangsu Key Laboratory of Media Design and Software Technology, Jiangnan University, Wuxi 214122)
Source
《模式识别与人工智能》
EI
CSCD
PKU Core (Peking University Core Journals)
2023, No. 7, pp. 590-601 (12 pages)
Pattern Recognition and Artificial Intelligence
Funding
Supported by the National Natural Science Foundation of China (No. 61972182).
Keywords
Entity Relation Extraction
Relation Triples
Chain Decoding Structure
Filtering Mechanism
Gated Unit
Conditional Layer Normalization
About the Authors
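Corresponding author: XIA Hongbin, Ph.D., associate professor. Main research interests: personalized recommendation systems and natural language processing. E-mail: hbxia@jiangnan.edu.cn; SHEN Jian, master's student. Main research interest: natural language processing. E-mail: 1452112297@qq.com; LIU Yuan, master's degree, professor. Main research interests: network security and social networks. E-mail: lyuan1800@sina.com.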