Automatic Generation of Chinese Entity Relations Based on Bootstrapping (cited by: 3)

Boot Strapping-based Automatic Chinese Entities Relation Extraction
Abstract: Building event extraction templates is a difficult problem in Chinese information extraction (IE) systems. To address it, this paper proposes a simple and feasible method, based on the bootstrapping idea, for automatically generating entity relations. A learner is built from a knowledge base of seed words and seed patterns. Using scalar clustering, the seed patterns are applied to extract more feature words whose semantic relations resemble those of the seed words. On this basis, further extraction patterns are generated according to the nearest-neighbor principle and added to the knowledge base. The enriched knowledge base lays a foundation for analyzing binary entity relations and makes it possible to generate complex event templates automatically. The method also greatly reduces the effort of constructing patterns by hand and makes IE systems easier to port.
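Source: Microelectronics & Computer (《微电子学与计算机》), CSCD, Peking University Core Journal, 2006, No. 12, pp. 15-18 (4 pages)
Funding: National 863 Program Major Project (2001AA114210)
Keywords: bootstrapping; seed word; seed pattern; scalar cluster
About the author: Zhang Suxiang (张素香, born 1973), PhD candidate and lecturer. Research interests: natural language understanding and machine learning.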
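
Illustrative sketch (not from the paper): the abstract outlines a loop in which seed patterns propose new feature words, similar words are kept, and new patterns are then accepted by a nearest-neighbor rule. The Python below is one possible reading of that loop under stated assumptions. The abstract gives no formulas, so cosine similarity over co-occurrence counts stands in for the scalar-clustering step. A pattern is assumed to be a (left token, right token) context around a one-word slot. The function names, thresholds, toy English corpus, and the `PER`/`ORG`/`LOC` entity tags are all hypothetical; real Chinese text would first need word segmentation and entity tagging.

```python
from collections import Counter, defaultdict
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def index_corpus(sentences):
    """Count, for every inner token, its (left, right) context, and the reverse map."""
    word_ctx, ctx_word = defaultdict(Counter), defaultdict(Counter)
    for tokens in sentences:
        for i in range(1, len(tokens) - 1):
            ctx = (tokens[i - 1], tokens[i + 1])
            word_ctx[tokens[i]][ctx] += 1
            ctx_word[ctx][tokens[i]] += 1
    return word_ctx, ctx_word


def bootstrap(sentences, seed_words, seed_patterns, word_sim=0.5, pat_sim=0.5, rounds=3):
    """Grow the word and pattern sets from non-empty seeds (assumed reading of the abstract)."""
    word_ctx, ctx_word = index_corpus(sentences)
    words, patterns = set(seed_words), set(seed_patterns)
    for _ in range(rounds):
        # Step 1: known patterns propose candidate words that fill their slot.
        candidates = {w for p in patterns for w in ctx_word[p]} - words
        # Step 2: keep candidates whose context vector is close to a known word
        # (cosine similarity used here as a stand-in for scalar clustering).
        new_words = {c for c in candidates
                     if max(cosine(word_ctx[c], word_ctx[w]) for w in words) >= word_sim}
        # Step 3: contexts of known words become candidate patterns; accept a
        # candidate if its nearest known pattern is similar enough.
        cand_pats = {p for w in words | new_words for p in word_ctx[w]} - patterns
        new_patterns = {p for p in cand_pats
                        if max(cosine(ctx_word[p], ctx_word[q]) for q in patterns) >= pat_sim}
        if not new_words and not new_patterns:
            break  # nothing learned in this round, so stop early
        words |= new_words
        patterns |= new_patterns
    return words, patterns


# Toy corpus with entity mentions already replaced by hypothetical tags.
corpus = [
    ["PER", "joined", "ORG"],
    ["PER", "entered", "ORG"],
    ["PER", "entered", "the", "ORG"],
    ["PER", "joined", "the", "ORG"],
    ["PER", "visited", "LOC"],
]
words, patterns = bootstrap(corpus, {"joined"}, {("PER", "ORG")})
print(sorted(words), sorted(patterns))
```

On this toy corpus the seed pattern picks up "entered" as a word similar to "joined", and the shared context `("PER", "the")` is then accepted as a new pattern, while "visited" is never proposed because it does not occur in any accepted pattern. The thresholds control how quickly the knowledge base drifts away from the seeds, and they would need tuning on real data.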