Discriminative Cross-Modal Hashing with Coupled Semantic Correlation (语义耦合相关的判别式跨模态哈希学习算法) Cited by: 16
Abstract Multimedia data on the network, including multi-modal data such as video, pictures, audio, and text, have grown exponentially in recent years. Data of different modalities are often interrelated: for example, posts in WeChat Moments often combine pictures with voice clips and short videos. When searching for a topic, users expect rich and comprehensive results that span different media, so cross-modal retrieval between different modalities has become a research hotspot in the multimedia field. Hashing-based cross-modal retrieval methods have attracted much attention for their low storage cost and fast query speed. The core problem of cross-modal hashing is how to learn an effective shared semantic embedding space for data of different modalities. Existing approaches fall into two categories. Unsupervised methods learn the hash functions from the underlying structure, distribution, and topology of the data so as to preserve the structure of the original data space. Supervised methods incorporate semantic label information into hash learning. However, most algorithms neglect the semantic discriminability of the feature representation when embedding multi-modal data into the shared space, which weakens the class discriminability of the hash codes and reduces the accuracy and robustness of nearest-neighbor search.

This paper proposes a discriminative cross-modal hashing learning algorithm with coupled semantic correlation, whose objective function combines a linear discriminative classifier with maximization of the correlation between modalities. First, a linear classifier is introduced into supervised hash learning so that each modality learns its own discriminative binary hash codes with high classification performance. Second, data from each modality are projected into that modality's own embedding space to obtain its hash codes, and the correlation between modalities is then maximized in the embedding spaces through a joint coupled-hashing representation. This avoids the drawbacks of projecting all data into a single common semantic embedding space while still capturing the semantic correlation between modalities.

The experiments use three performance measures: mean average precision (MAP) over ten runs, the precision-recall (PR) curve, which shows retrieval accuracy at different recall rates, and top-N precision, which shows how accuracy changes with the number of retrieved instances. The algorithm is compared with six recent related algorithms on three benchmark datasets (Wiki, LabelMe, and NUS-WIDE) on two cross-modal retrieval tasks: 1) retrieving pictures with text; 2) retrieving text with pictures. The results show that the proposed method has clear advantages in retrieval accuracy and computational efficiency. The influence of the algorithm's parameters was also investigated by varying one parameter while fixing the others; the method is insensitive to parameter changes over a wide range and still obtains good results.
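The evaluation measures named in the abstract (MAP and top-N precision over a Hamming-distance ranking) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy 4-bit codes, and the rule that a database item is relevant when it shares the query's single class label are all assumptions made for illustration.

```python
from typing import List, Sequence


def hamming(a: int, b: int) -> int:
    """Hamming distance between two binary hash codes stored as ints."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, database: Sequence[int]) -> List[int]:
    """Database indices sorted by increasing Hamming distance to the query."""
    # sorted() is stable, so ties keep their original database order.
    return sorted(range(len(database)), key=lambda i: hamming(query, database[i]))


def average_precision(relevant: Sequence[bool]) -> float:
    """Average precision of one ranked list of relevance flags."""
    hits, total = 0, 0.0
    for rank, is_relevant in enumerate(relevant, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank  # precision at this relevant position
    return total / hits if hits else 0.0


def top_n_precision(relevant: Sequence[bool], n: int) -> float:
    """Fraction of relevant items among the first n retrieved."""
    return sum(relevant[:n]) / n


def mean_average_precision(query_codes, query_labels, db_codes, db_labels) -> float:
    """MAP over all queries; relevance = same class label (an assumption)."""
    aps = []
    for code, label in zip(query_codes, query_labels):
        order = rank_by_hamming(code, db_codes)
        aps.append(average_precision([db_labels[i] == label for i in order]))
    return sum(aps) / len(aps)


# Toy data: hypothetical text-query codes searched against image codes.
db_codes = [0b0000, 0b0111, 0b1111, 0b0001]
db_labels = [0, 0, 1, 1]
query_codes = [0b0000, 0b1111]
query_labels = [0, 1]

map_score = mean_average_precision(query_codes, query_labels, db_codes, db_labels)
first_order = rank_by_hamming(query_codes[0], db_codes)
first_relevance = [db_labels[i] == query_labels[0] for i in first_order]
print(map_score, top_n_precision(first_relevance, 2))
```

In this toy setup each query ranks one relevant item first and the other third, so both average precisions equal (1 + 2/3) / 2. Swapping the roles of the two code lists gives the opposite retrieval direction (retrieving text with pictures).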
Authors YAN Shuang-Yong (严双咏); LIU Chang-Hong (刘长红); JIANG Ai-Wen (江爱文); YE Ji-Hua (叶继华); WANG Ming-Wen (王明文) (School of Computer and Information Engineering, Jiangxi Normal University, Nanchang 330022)
Source Chinese Journal of Computers (《计算机学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2019, No. 1, pp. 164-175 (12 pages)
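Funding Supported by the National Natural Science Foundation of China (61662030, 61365002, 61462042, 61462045), the Natural Science Foundation of Jiangxi Province (20171BAB202016), and the Science and Technology Project of the Jiangxi Provincial Department of Education (GJJ150350).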
Keywords cross-modal retrieval; cross-modal hashing; linear classifier; semantic correlation; shared subspace; multi-modal
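Author biographies YAN Shuang-Yong, male, born in 1990, master's student; main research interests: information retrieval and computer vision. E-mail: 13170884058@163.com. Corresponding author: LIU Chang-Hong, female, born in 1977, Ph.D., associate professor, member of the China Computer Federation (CCF); main research interests: computer vision, machine learning, and hyperspectral image processing. E-mail: liuch@jxnu.edu.cn. JIANG Ai-Wen, male, born in 1984, Ph.D., associate professor, CCF member; main research interests: pattern recognition, image analysis and retrieval, and machine learning. YE Ji-Hua, male, born in 1966, M.S., professor, CCF member; main research areas: data fusion, pattern recognition, and Internet of Things technology. WANG Ming-Wen, male, born in 1965, Ph.D., professor, CCF member; main research areas: natural language processing and information retrieval.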