Discriminative Cross-Modal Hashing with Coupled Semantic Correlation (语义耦合相关的判别式跨模态哈希学习算法)

Cited by: 16


Abstract: Hashing-based cross-modal retrieval has attracted wide attention for its low storage cost and fast query speed. The core problem of cross-modal hashing is how to embed data from different modalities into a shared semantic space effectively. Most algorithms neglect the semantic discriminability of the feature representation during this embedding, so the resulting hash codes separate classes poorly, which lowers the accuracy and robustness of nearest-neighbor search. This paper proposes a discriminative cross-modal hashing algorithm based on coupled semantic correlation. Its objective function combines a linear discriminative classifier with cross-modal correlation maximization: the linear classifier lets each modality learn its own discriminative binary hash codes, while a coupled hash representation maximizes the correlation between modalities in the embedded semantic space. This avoids the drawbacks of projecting all modalities into a single common embedding space while still capturing the semantic correlation between modalities. The algorithm is compared with recent related algorithms on three benchmark datasets: Wiki, LabelMe, and NUS-WIDE. The experimental results show that the proposed method has clear advantages in retrieval accuracy and computational efficiency.

Multimedia data on the network, including multi-modal data such as video, pictures, audio, and text, have grown exponentially in recent years. Data of different modalities are often interrelated. For example, in WeChat Moments, voice clips and short videos often accompany published pictures. When searching for a topic, users expect rich and comprehensive results that span different media, so cross-modal retrieval between data of different modalities has become a research hotspot in the multimedia field. Hashing-based cross-modal retrieval methods have attracted much attention for their low storage cost and fast query speed. The core problem of cross-modal hashing is how to efficiently learn a shared embedding semantic space for data of different modalities. Two categories of approaches address this problem. Unsupervised methods learn the hash functions from the underlying structure, distribution, and topology of the data in order to preserve the structure of the original data space. Supervised methods incorporate semantic label information into the hash learning process. However, most algorithms neglect the semantic discriminability of the feature representation when embedding multi-modal data into the shared space, which weakens the class discriminability of the hash codes and reduces the accuracy and robustness of nearest-neighbor search.

In this paper, a linear discriminative cross-modal hashing algorithm with coupled semantic correlation is proposed, whose objective function integrates a linear discriminative classifier with maximization of the correlation between modalities. First, a linear classifier is used to model supervised hash learning, so that each modality learns its own discriminative binary hash codes with high classification performance. Second, data from different modalities are projected into their own embedding spaces to obtain their respective hash codes, and the correlations between modalities are then maximized in these embedding spaces through a joint coupled-hashing representation. This not only overcomes the drawbacks of projecting all data into one common embedding semantic space, but also captures the semantic relevance between data of different modalities.

Three performance measures were used in the experiments: the mean average precision (MAP) over ten runs; the precision-recall (PR) curve, which shows the retrieval accuracy at different recall rates; and the top-N precision, which shows how accuracy changes with the number of retrieved instances. To demonstrate its effectiveness, the algorithm was compared with six recent related algorithms on three benchmark datasets, on two cross-modal retrieval tasks: 1) retrieving pictures with text; 2) retrieving text with pictures. The experimental results show that the proposed method has clear advantages in retrieval accuracy and computational efficiency. In addition, the influence of the algorithm's parameters on its performance was investigated by varying one parameter while fixing the others. The investigation shows that the proposed method is insensitive to parameter variation over a wide range and still obtains good results.
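The evaluation measures named in the abstract (MAP and top-N precision) can be illustrated with a minimal sketch. It assumes that retrieval ranks database items by Hamming distance between binary hash codes and that an item is relevant when it shares the query's class label; the abstract does not spell out either rule, so both are assumptions here. The toy codes, labels, and all function names are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Set


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions where two equal-length binary codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def rank_by_hamming(query: Sequence[int], database: List[Sequence[int]]) -> List[int]:
    """Return database indices sorted by increasing Hamming distance to the query.

    sorted() is stable, so ties keep their original database order.
    """
    return sorted(range(len(database)),
                  key=lambda i: hamming_distance(query, database[i]))


def average_precision(ranking: List[int], relevant: Set[int]) -> float:
    """Average precision of one ranked list over the relevant items it contains."""
    hits = 0
    precision_sum = 0.0
    for position, idx in enumerate(ranking, start=1):
        if idx in relevant:
            hits += 1
            precision_sum += hits / position
    return precision_sum / hits if hits else 0.0


def top_n_precision(ranking: List[int], relevant: Set[int], n: int) -> float:
    """Fraction of the first n retrieved items that are relevant."""
    return sum(idx in relevant for idx in ranking[:n]) / n


# Toy "text query -> image database" example (made-up 4-bit codes and labels).
query_code = [1, 0, 1, 1]
query_label = 0
image_codes = [[1, 0, 1, 1], [0, 1, 0, 0], [1, 0, 0, 1], [0, 1, 1, 0]]
image_labels = [0, 1, 1, 0]

relevant = {i for i, label in enumerate(image_labels) if label == query_label}
ranking = rank_by_hamming(query_code, image_codes)   # [0, 2, 3, 1]
ap = average_precision(ranking, relevant)            # (1/1 + 2/3) / 2 = 0.8333...
p_at_2 = top_n_precision(ranking, relevant, 2)       # 0.5

print(ranking, round(ap, 4), p_at_2)
```

MAP is then the mean of `average_precision` over all queries; evaluating the reverse task (image query, text database) only swaps which modality supplies the query code.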

Authors: YAN Shuang-Yong (严双咏); LIU Chang-Hong (刘长红); JIANG Ai-Wen (江爱文); YE Ji-Hua (叶继华); WANG Ming-Wen (王明文) (School of Computer and Information Engineering, Jiangxi Normal University, Nanchang 330022)

Affiliation: School of Computer and Information Engineering, Jiangxi Normal University

Source: Chinese Journal of Computers (《计算机学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2019, No. 1, pp. 164-175 (12 pages)

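Funding: Supported by the National Natural Science Foundation of China (61662030, 61365002, 61462042, 61462045), the Natural Science Foundation of Jiangxi Province (20171BAB202016), and the Science and Technology Project of the Jiangxi Provincial Department of Education (GJJ150350).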

Keywords: cross-modal retrieval; cross-modal hashing; linear classifier; semantic correlation; shared subspace; multi-modal

Classification code: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

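About the authors: YAN Shuang-Yong, male, born in 1990, master's student; research interests include information retrieval and computer vision. E-mail: 13170884058@163.com. LIU Chang-Hong (corresponding author), female, born in 1977, Ph.D., associate professor, member of the China Computer Federation (CCF); research interests include computer vision, machine learning, and hyperspectral image processing. E-mail: liuch@jxnu.edu.cn. JIANG Ai-Wen, male, born in 1984, Ph.D., associate professor, CCF member; research interests include pattern recognition, image analysis and retrieval, and machine learning. YE Ji-Hua, male, born in 1966, M.S., professor, CCF member; research areas include data fusion, pattern recognition, and Internet of Things technology. WANG Ming-Wen, male, born in 1965, Ph.D., professor, CCF member; research areas include natural language processing and information retrieval.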


