Abstract
[Objective] To overcome the shortcomings of the traditional Bag-of-Visual-Words (BoVW) approach, which ignores the spatial relations and semantic information between visual words. [Methods] This paper proposes an LDA-based topic model that is linearly combined with a visual language model for each word image, and uses the basic query likelihood model for retrieval. [Results] Experimental results on our dataset show that the proposed LDA-based representation can efficiently and accurately perform keyword spotting on a collection of historical Mongolian documents. [Conclusions] The proposed approach also performs significantly better than the original BoVW approach.
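The retrieval step described in the abstract can be illustrated with a small sketch. The abstract gives neither the mixing weight nor the exact form of the visual language model, so everything specific below is an assumption: a plain unigram model over visual words stands in for the visual language model, the weight `lam` is arbitrary, and the topic distributions are hand-written toy values rather than the output of a trained LDA model. All function names and identifiers (`mixed_word_prob`, `query_log_likelihood`, `rank`, `img_a`, `v1`, and so on) are hypothetical.

```python
import math

def mixed_word_prob(word, doc_counts, topic_word, doc_topic, lam=0.5):
    """Linear combination of a unigram estimate and an LDA-style estimate.

    P(w|d) = lam * P_ml(w|d) + (1 - lam) * sum_z P(w|z) * P(z|d)
    The unigram term is an assumed stand-in for the visual language model.
    """
    total = sum(doc_counts.values())
    p_ml = doc_counts.get(word, 0) / total if total else 0.0
    p_lda = sum(p_z * topic_word[z].get(word, 0.0)
                for z, p_z in enumerate(doc_topic))
    return lam * p_ml + (1.0 - lam) * p_lda

def query_log_likelihood(query, doc_counts, topic_word, doc_topic, lam=0.5):
    """Query likelihood score: sum of log P(q|d) over the query's visual words."""
    score = 0.0
    for word in query:
        p = mixed_word_prob(word, doc_counts, topic_word, doc_topic, lam)
        if p <= 0.0:
            return float("-inf")  # query word impossible under this document
        score += math.log(p)
    return score

def rank(query, docs, topic_word, lam=0.5):
    """Return (name, score) pairs sorted from best to worst match."""
    scored = [
        (name, query_log_likelihood(query, d["counts"], topic_word, d["topics"], lam))
        for name, d in docs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy data (hypothetical): two topics over four visual words, two word images.
topic_word = [{"v1": 0.7, "v2": 0.3}, {"v3": 0.6, "v4": 0.4}]
docs = {
    "img_a": {"counts": {"v1": 2, "v2": 1}, "topics": [0.9, 0.1]},
    "img_b": {"counts": {"v3": 3}, "topics": [0.1, 0.9]},
}

if __name__ == "__main__":
    for name, score in rank(["v1", "v2"], docs, topic_word):
        print(name, round(score, 4))
```

In this toy setup the topic term gives `img_b` a small nonzero probability for query words it never contains, which is the smoothing effect a topic model adds on top of raw visual-word counts.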
Authors
Bai Shuxia (白淑霞)
Bao Yulai (鲍玉来)
Bai Shuxia; Bao Yulai (Library, Inner Mongolia University, Hohhot 010021, China)
Source
《现代情报》 (Journal of Modern Information)
CSSCI
PKU Core Journals (北大核心)
2017, No. 7, pp. 51-54, 88 (5 pages in total)
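Funding
National Natural Science Foundation of China project "Research on the Integration Mechanism of Mongolian Digital Resources Based on Domain Ontology" (Grant No. 71163029)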
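About the authors
Bai Shuxia (b. 1982), female, librarian; research interests: information retrieval and Mongolian information retrieval. Corresponding author: Bao Yulai (b. 1975), male, associate research librarian; research interest: Mongolian information retrieval.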