
Infrared and Visible Image Fusion Method Based on the Interactions of Self-Attention Features
Abstract: Current multi-sensor image fusion methods encounter challenges such as inadequate integration of hierarchical features and difficulty in accurately identifying decoupled complementary features. To address these issues, this paper introduces a novel image fusion approach that leverages the interactions between local cross-modal features and global self-attention mechanisms. The proposed method constrains the similarity of deep features across twin branches, facilitating the effective exchange and fusion of multi-level, multi-scale complementary features through interaction modules. Specifically, the interaction module employs a cross-modal attention mechanism to compute local feature dissimilarities between the multi-modal images, which are then used as interaction coefficients to enable the fusion of the upper- and lower-branch features. However, this dissimilarity metric is susceptible to noise, artifacts, and other irrelevant information, which can be misinterpreted as complementary features. Because such information is largely isolated within its own modality, the proposed method identifies it by computing a global self-attention coefficient. The final interaction coefficient, comprising both a cross-modal attention component and a global self-attention component, enables the efficient extraction of complementary features. Furthermore, to ensure the completeness and consistency of the fused features, a feature cyclic-consistency loss is introduced to constrain the fusion process, thereby promoting the preservation of richer source-image information. To accommodate a wide range of fusion scenarios, this study also proposes a fusion loss function based on masking and pooling operations. The effectiveness and superiority of the proposed approach are demonstrated through comprehensive comparisons with state-of-the-art methods using both subjective and objective metrics on benchmark datasets such as TNO and RoadScene.
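The abstract describes the interaction coefficient as the product of two parts: a local cross-modal dissimilarity (large where the other modality carries complementary detail) and a global self-attention term (small for isolated, noise-like responses). A minimal NumPy sketch of that idea follows; the function names, the cosine-based dissimilarity, and the exact way the two terms are combined are illustrative assumptions, not the paper's formulation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def interaction_coefficients(f_a, f_b, eps=1e-8):
    """Toy interaction coefficients between two modalities' features.

    f_a, f_b: (N, C) arrays of N flattened spatial positions, C channels.
    Returns per-position weights in [0, 1] that are large where the other
    modality is locally dissimilar (complementary) AND the response is
    globally supported within its own modality (i.e. not isolated noise).
    """
    # Local cross-modal dissimilarity: 1 - cosine similarity, scaled to [0, 1].
    num = (f_a * f_b).sum(axis=1)
    den = np.linalg.norm(f_a, axis=1) * np.linalg.norm(f_b, axis=1) + eps
    dissim = 0.5 * (1.0 - num / den)                              # (N,)

    # Global self-attention: how much attention each position receives from
    # all others in its own modality; isolated noise receives little.
    attn = softmax(f_a @ f_a.T / np.sqrt(f_a.shape[1]), axis=1)   # (N, N)
    support = attn.mean(axis=0)
    support = support / (support.max() + eps)                     # -> [0, 1]

    return dissim * support

def interact(f_a, f_b):
    """Exchange complementary features between the two branches."""
    w_ab = interaction_coefficients(f_a, f_b)[:, None]
    w_ba = interaction_coefficients(f_b, f_a)[:, None]
    return f_a + w_ab * f_b, f_b + w_ba * f_a
```

In this sketch each branch keeps its own features and only receives the portion of the other branch's features that the combined coefficient marks as complementary, mirroring the exchange-and-fuse behaviour described above.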
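The two loss terms mentioned in the abstract can likewise be sketched: a cyclic-consistency term that requires both source features to be recoverable from the fused feature, and a mask-and-pooling intensity loss that pulls the fused image toward the per-region stronger source. The linear decoders, the block-average pooling, and the binary mask below are all simplifying assumptions for illustration:

```python
import numpy as np

def cycle_consistency_loss(fused, f_ir, f_vis, dec_ir, dec_vis):
    """Penalize the fused feature if either source feature cannot be
    recovered from it (dec_ir, dec_vis: hypothetical (C, C) linear decoders)."""
    rec_ir = fused @ dec_ir
    rec_vis = fused @ dec_vis
    return np.abs(rec_ir - f_ir).mean() + np.abs(rec_vis - f_vis).mean()

def mask_pool_fusion_loss(i_fused, i_ir, i_vis, k=4):
    """Mask-and-pooling intensity loss: compare block-averaged (pooled)
    intensities of the two sources, build a binary mask from the comparison,
    and pull the fused image toward the per-pixel winner.

    i_fused, i_ir, i_vis: (H, W) images with H and W divisible by k.
    """
    h, w = i_ir.shape
    pool = lambda x: x.reshape(h // k, k, w // k, k).mean(axis=(1, 3))
    mask = (pool(i_ir) > pool(i_vis)).repeat(k, axis=0).repeat(k, axis=1)
    target = np.where(mask, i_ir, i_vis)
    return np.abs(i_fused - target).mean()
```

Pooling before thresholding makes the mask respond to regional salience rather than single-pixel spikes, which is one plausible reading of "based on masking and pooling" for varied fusion scenes.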
Authors: GUAN Fangjing; JIANG Qiwei; LUO Xiaoqing; JIN Qichun (Engineering Technology Research Center of Big Data Intelligent Application, Wuxi City College of Vocational Technology, Wuxi 214153, China; Research Centre of Environment Science and Engineering, Wuxi 214000, China; School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China)
Source: Infrared Technology (《红外技术》), a Peking University Core journal, 2025, No. 11, pp. 1406-1414 (9 pages)
Funding: National Natural Science Foundation of China (61772237); General Program of Basic Science (Natural Science) Research in Jiangsu Higher Education Institutions (22KJD210005).
Keywords: feature similarity; feature interaction; self-attention; Siamese networks; infrared and visible image fusion
About the authors: GUAN Fangjing (b. 1980), M.S., associate professor; research interests: pattern recognition and image processing. Corresponding author: LUO Xiaoqing (b. 1980), Ph.D., professor; research interests: multimodal information fusion and computer vision. E-mail: xqluo@jiangnan.edu.cn.