
HowNet Based Chinese Question Automatic Classification (基于知网的中文问题自动分类)

Cited by: 41
Abstract: A question answering system should answer questions posed in natural language with accurate and concise answers. Question classification is the first step a question answering system performs, and its precision directly affects all subsequent processing. This paper proposes a new method that uses HowNet as the semantic resource for selecting classification features and a maximum entropy model for classification. The features are the interrogative word, the syntactic structure, the question focus word, and the first sememe of the question focus word in HowNet. Experimental results show that the first sememe selected from HowNet expresses the semantic information of the question focus word well and can serve as a main feature for question classification. The method significantly improves classification precision: 92.18% on coarse classes and 83.86% on fine classes.
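The feature design described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy lexicon `TOY_FIRST_SEMEME` stands in for a real HowNet lookup and its entries are invented, the class labels and training questions are made up, the syntactic-structure feature is omitted, and the maximum entropy model is fitted by plain gradient ascent rather than by whatever training procedure the paper used.

```python
import math
from collections import defaultdict

# Hypothetical toy lexicon standing in for HowNet: focus word -> first sememe.
# These entries are invented for illustration, not copied from HowNet.
TOY_FIRST_SEMEME = {
    "city": "place", "country": "place",
    "author": "human", "president": "human",
    "year": "time", "month": "time", "date": "time",
}

def extract_features(interrogative, focus_word):
    """Binary indicator features: interrogative word, focus word, first sememe."""
    feats = ["wh=" + interrogative, "focus=" + focus_word]
    sememe = TOY_FIRST_SEMEME.get(focus_word)
    if sememe is not None:
        feats.append("sememe=" + sememe)
    return feats

def predict_proba(w, classes, feats):
    """Maximum entropy (softmax) distribution over classes for one question."""
    scores = {c: sum(w[(f, c)] for f in feats) for c in classes}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exps.values())
    return {c: e / z for c, e in exps.items()}

def train_maxent(samples, labels, epochs=200, lr=0.5):
    """Fit weights w[(feature, class)] by gradient ascent on the log-likelihood."""
    classes = sorted(set(labels))
    w = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for feats, gold in zip(samples, labels):
            probs = predict_proba(w, classes, feats)
            for f in feats:
                for c in classes:
                    # observed feature count minus expected count under the model
                    grad[(f, c)] += (1.0 if c == gold else 0.0) - probs[c]
        for key, g in grad.items():
            w[key] += lr * g / len(samples)
    return w, classes

def classify(w, classes, feats):
    probs = predict_proba(w, classes, feats)
    return max(classes, key=lambda c: probs[c])

# Made-up training questions: (interrogative word, focus word) -> coarse class.
TRAIN = [
    (("which", "city"), "LOCATION"), (("which", "country"), "LOCATION"),
    (("which", "author"), "PERSON"), (("which", "president"), "PERSON"),
    (("which", "year"), "TIME"), (("which", "month"), "TIME"),
]
samples = [extract_features(*x) for x, _ in TRAIN]
labels = [y for _, y in TRAIN]
weights, classes = train_maxent(samples, labels)

# "date" never occurs as a focus word in TRAIN; only its first sememe is shared.
print(classify(weights, classes, extract_features("which", "date")))
```

In this toy setup the unseen focus word "date" is assigned to the TIME class only because it shares the first sememe "time" with "year" and "month", which is the kind of generalization the abstract attributes to the first-sememe feature.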
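Source: Journal of Chinese Information Processing (《中文信息学报》), CSCD, Peking University Core Journal, 2007, No. 1, pp. 90-95 (6 pages)
Funding: National Aviation Fund (05J54011); Natural Science Foundation of Liaoning Province (20042004)
Keywords: computer application; Chinese information processing; question answering system; question classification; HowNet; maximum entropy model; classification feature
About the author: Sun Jingguang (孙景广, born 1981), male, master's student; main research interest: natural language processing.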


