Abstract
Applying relevance feedback to information retrieval makes it possible to learn and clarify a user's information need and to filter the retrieved results, and it is one of the effective ways to improve retrieval performance. Besides precision and recall, the adaptability and speed of the filtering algorithm also directly affect the user's experience of a retrieval system. Representing document content with the vector space model requires little preprocessing and is computationally simple, which makes it suitable for real-time retrieval. Based on the principle of minimum deviation, an improved centroid of the feedback document vectors is applied to re-ranking the retrieved results. With re-ranking as the application scenario, simulations were run on the TREC Filtering Task dataset, and the method was compared experimentally with keyword-based retrieval and class-centroid-based retrieval.
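The re-ranking idea summarized above can be illustrated with a minimal sketch. It assumes whitespace tokenization, normalized term-frequency vectors, and a plain (unimproved) centroid of the feedback documents. The abstract does not specify the paper's improved centroid, so this is not the authors' method, and all function names here are hypothetical.

```python
import math
from collections import Counter


def tf_vector(text):
    """Represent a document as an L2-normalized term-frequency vector."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()} if norm else {}


def centroid(vectors):
    """Average sparse vectors. Assumption: a plain centroid, not the paper's improved one."""
    if not vectors:
        return {}
    total = {}
    for vec in vectors:
        for term, weight in vec.items():
            total[term] = total.get(term, 0.0) + weight
    return {term: weight / len(vectors) for term, weight in total.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank(results, feedback_docs):
    """Sort retrieved documents by similarity to the feedback centroid, most similar first."""
    center = centroid([tf_vector(d) for d in feedback_docs])
    return sorted(results, key=lambda d: cosine(tf_vector(d), center), reverse=True)


# Toy data, invented for illustration only.
results = [
    "stock market prices fall",
    "vector space model for document retrieval",
    "cooking pasta recipes",
]
feedback = ["document retrieval with vector space", "retrieval model feedback"]
ranked = rerank(results, feedback)
```

In this toy run the document that shares terms with the feedback set moves to the top, while the two unrelated documents keep their original relative order because Python's sort is stable.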
Source
Computer Applications and Software (《计算机应用与软件》)
Indexed in: CSCD
2011, Issue 10, pp. 62-64, 76 (4 pages)
Funding
National High Technology Research and Development Program of China (863 Program), Grant No. 2009AA044601
Keywords
Information retrieval
Information filtering
Relevance feedback
Centroid classifier
Vector space model
About the author
Liu Shaohan (刘绍翰), Ph.D. Main research areas: algorithms, pattern recognition, information retrieval.