A Query Expansion Method Based on Learning to Rank (cited by 7)
Abstract: Query expansion is an important information retrieval technique. Starting from the user's query, it adds relevant expansion terms to the original query according to some strategy, so that the query describes the user's information need more accurately. Learning to rank uses machine learning to construct ranking models that order data, and it is an active research topic at the intersection of machine learning and information retrieval. This paper introduces learning-to-rank algorithms into query expansion through pseudo-relevance feedback: features related to the expansion terms are extracted from the document collection, a ranking model for expansion terms is trained, and the model is used to re-rank the candidate expansion terms of a new query. The re-ranked terms are weighted according to their ranking scores and added to the original query for a second retrieval pass, with the aim of improving retrieval accuracy. Experimental results on the TREC dataset show that introducing learning-to-rank algorithms helps improve the retrieval performance of pseudo-relevance feedback.
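The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not name the features or the ranking algorithm, so the three features used here (term frequency, document frequency, and co-occurrence with query terms in the feedback documents), the pairwise perceptron standing in for the ranking model, the hand-made training pairs, the `beta` scaling factor, and all function names are hypothetical. The first-pass retrieval is also assumed to have already produced `feedback_docs`.

```python
from collections import Counter


def candidate_features(query, feedback_docs):
    """Map each non-query term to [tf, df, cooc] over the feedback documents.

    The three features are illustrative assumptions, not the paper's feature set.
    """
    q = set(query)
    n_docs = len(feedback_docs)
    n_tokens = sum(len(doc) for doc in feedback_docs)
    tf, df, cooc = Counter(), Counter(), Counter()
    for doc in feedback_docs:
        terms = set(doc)
        tf.update(doc)        # raw occurrences
        df.update(terms)      # number of feedback docs containing the term
        if q & terms:         # doc also contains at least one query term
            cooc.update(terms)
    return {t: [tf[t] / n_tokens, df[t] / n_docs, cooc[t] / n_docs]
            for t in tf if t not in q}


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def train_pairwise(pairs, dim, epochs=20, lr=1.0):
    """Pairwise perceptron (a stand-in ranker): for each (better, worse)
    pair of feature vectors, update w until w . (better - worse) > 0."""
    w = [0.0] * dim
    for _ in range(epochs):
        for better, worse in pairs:
            diff = [b - c for b, c in zip(better, worse)]
            if dot(w, diff) <= 0:
                w = [wi + lr * di for wi, di in zip(w, diff)]
    return w


def expand_query(query, feedback_docs, w, k=2, beta=0.5):
    """Re-rank candidate terms with w and return a weighted expanded query."""
    feats = candidate_features(query, feedback_docs)
    scores = {t: dot(w, x) for t, x in feats.items()}
    ranked = sorted((t for t in scores if scores[t] > 0),
                    key=lambda t: (-scores[t], t))[:k]
    expanded = {t: 1.0 for t in query}   # original terms keep full weight
    if ranked:
        best = scores[ranked[0]]
        for t in ranked:
            # weight follows the ranking score, scaled by the assumed beta
            expanded[t] = beta * scores[t] / best
    return expanded


# Toy training pairs: (features of a helpful term, features of an unhelpful term).
train_pairs = [([0.2, 1.0, 1.0], [0.1, 0.5, 0.0]),
               ([0.05, 0.5, 0.5], [0.15, 0.5, 0.0])]
w = train_pairwise(train_pairs, dim=3)

# Toy "top-ranked" documents from a first retrieval pass (already tokenized).
feedback_docs = [["query", "expansion", "feedback", "terms"],
                 ["query", "feedback", "ranking"],
                 ["noise", "noise", "noise", "ads"]]
expanded = expand_query(["query", "expansion"], feedback_docs, w)
print(expanded)
```

The weighted dictionary returned by `expand_query` would then be submitted as the query for the second retrieval pass. In a real setting the training pairs would come from labeled expansion terms rather than hand-written vectors.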
Affiliation: Dalian University of Technology
Source: Journal of Chinese Information Processing (《中文信息学报》), CSCD, Peking University Core Journal, 2015, No. 3, pp. 155-161 (7 pages)
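Funding: National Natural Science Foundation of China (61277370, 61402075); National High-Tech R&D Program of China (863 Program) (2006AA01Z151); Natural Science Foundation of Liaoning Province (201202031, 2014020003); Scientific Research Foundation for Returned Overseas Chinese Scholars, Ministry of Education; Specialized Research Fund for the Doctoral Program of Higher Education (20090041110002); Fundamental Research Funds for the Central Universities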
Keywords: information retrieval; query expansion; pseudo-relevance feedback; learning to rank
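About the authors: Xu Bo (1988-), PhD candidate; research interests include search engines, machine learning, and learning to rank. E-mail: xub02011@mail.dlut.edu.cn. Lin Hongfei (1962-), PhD, professor, doctoral supervisor; research interests include search engines, text mining, affective computing, and natural language understanding. E-mail: hflin@dlut.edu.cn. Lin Yuan (1983-), PhD, lecturer; research interests include search engines, machine learning, and learning to rank. E-mail: zhlin@dlut.edu.cn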

