Chinese Relation Extraction Based on Ensemble Kernel (original title: 基于组合核的中文实体关系抽取研究). Cited by: 3
Abstract: This paper combines a flat kernel based on feature vectors with a structural kernel based on syntactic parse trees to extract relations between Chinese entities. Feature selection experiments are first carried out to choose the best feature set for the flat kernel's feature vectors; the candidate features include entity information (entity type, entity subtype, and entity class) and the words before and after the entity pair in the sentence. For the structural kernel, the shortest path-enclosed tree (shortest path tree, SPT) is extracted from the parse of a sentence containing the two entities, and a convolution tree kernel computes the similarity between two SPTs. In experiments on extracting the major relation types of the ACE RDC 2005 Chinese corpus, the ensemble kernel reaches an F-score of 68.50%, which is 4.36% and 17.37% higher, respectively, than the two individual kernel methods. Feature selection is also performed within the ensemble kernel, where the best F-score is 70.58%, showing that the feature set that is best for the flat kernel alone is not necessarily best for the ensemble kernel. The results indicate that a flat kernel built from entity semantic information and combined with a structural kernel performs well for Chinese entity relation extraction.
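As a rough illustration of the idea in the abstract, the sketch below combines a flat kernel over entity features with a Collins-Duffy style convolution tree kernel. The linear weighting `alpha`, the decay factor `lam`, the cosine-style flat kernel, the feature strings, and the toy trees are all assumptions made for illustration; the abstract does not state how the paper combines the two kernels or which parameters it uses.

```python
import math

# A tree is a nested tuple (label, child, child, ...); a leaf word is a plain string.

def production(node):
    """Grammar production at a node; leaf words are tagged so they never
    collide with a child subtree that happens to share the same label."""
    return (node[0],) + tuple(
        ("word", c) if isinstance(c, str) else c[0] for c in node[1:]
    )

def internal_nodes(tree):
    """Yield every non-leaf node of the tree."""
    if isinstance(tree, str):
        return
    yield tree
    for child in tree[1:]:
        yield from internal_nodes(child)

def delta(n1, n2, lam):
    """Weighted count of common subtrees rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = lam
    for c1, c2 in zip(n1[1:], n2[1:]):
        if not isinstance(c1, str):  # leaf words add no further factor
            score *= 1.0 + delta(c1, c2, lam)
    return score

def tree_kernel(t1, t2, lam=1.0):
    """Convolution tree kernel: sum delta over all pairs of internal nodes."""
    return sum(delta(a, b, lam)
               for a in internal_nodes(t1) for b in internal_nodes(t2))

def normalized_tree_kernel(t1, t2, lam=1.0):
    """Scale the tree kernel into [0, 1] so tree size does not dominate."""
    denom = math.sqrt(tree_kernel(t1, t1, lam) * tree_kernel(t2, t2, lam))
    return tree_kernel(t1, t2, lam) / denom if denom else 0.0

def flat_kernel(f1, f2):
    """Cosine similarity between two sets of binary features (assumed form)."""
    if not f1 or not f2:
        return 0.0
    return len(f1 & f2) / math.sqrt(len(f1) * len(f2))

def composite_kernel(x1, x2, alpha=0.5, lam=1.0):
    """Assumed linear combination of the flat and tree kernels.
    Each instance x is a pair (feature_set, tree)."""
    return (alpha * flat_kernel(x1[0], x2[0])
            + (1.0 - alpha) * normalized_tree_kernel(x1[1], x2[1], lam))

# Hypothetical instances: entity features plus a tiny stand-in for an SPT.
inst_a = ({"type1=PER", "type2=ORG", "before=the"},
          ("NP", ("NN", "manager"), ("NN", "company")))
inst_b = ({"type1=PER", "type2=GPE", "before=the"},
          ("NP", ("NN", "mayor"), ("NN", "city")))

print(composite_kernel(inst_a, inst_b))
```

A matrix of such kernel values over all training pairs could then be passed to any classifier that accepts a precomputed kernel; the choice of classifier is likewise not specified in the abstract.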
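Affiliation: Dalian University of Technology
Source: Journal of the China Society for Scientific and Technical Information (《情报学报》; indexed in CSSCI and the Peking University Core Journals list), 2012, No. 7, pp. 702-708 (7 pages)
Funding: Supported by the National Natural Science Foundation of China (71031002, 61173101).
Keywords: relation extraction, ensemble kernel, feature-based kernel (flat kernel), convolution tree kernel
About the author: Li Lishuang (李丽双), female, born 1967, holds a master's degree in Computer Applications from Dalian University of Technology and is an associate professor. Main research interests: natural language processing and text mining, knowledge management. E-mail: lilishuang314@163.com.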