Fine-grained Image Classification Based on Bi-level Routing Attention and Feature Fusion

Cited by: 2


Abstract: In recent years, the Vision Transformer (ViT) has achieved breakthroughs in image recognition. Its self-attention mechanism extracts discriminative token information from the different patches of an image, which improves classification accuracy. Fine-grained image classification is difficult because feature differences between classes are small while feature differences within a class are large. To address the small, non-uniform data distributions and the hard-to-detect inter-class differences that characterize fine-grained image classification, a fine-grained image classification model based on Bi-level Routing Attention (BRA) is proposed. The baseline backbone is a new vision Transformer with a multi-stage hierarchical architecture, used as the visual feature extractor to obtain local information, global information, and multi-scale features. Feature enhancement (boosting) and fusion modules are also introduced to improve the network's ability to learn key features. Experimental results show that the model reaches classification accuracies of 91.7% on CUB-200-2011 and 92.2% on Stanford Dogs, two fine-grained image datasets, and outperforms multiple mainstream fine-grained image classification models.
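The abstract names Bi-level Routing Attention but does not describe its mechanics. The following is a minimal NumPy sketch of the general idea, not the authors' implementation: the feature map is split into regions, each region is routed to its top-k most relevant regions using region-averaged queries and keys, and token-level attention is then computed only over tokens from the routed regions. The function name, the `num_regions` and `topk` parameters, the single-head design, and the random projection matrices (stand-ins for learned weights) are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def bi_level_routing_attention(x, num_regions=2, topk=2, rng=None):
    """Single-head sketch. x: (H, W, C) feature map -> (output (H, W, C), routed indices)."""
    rng = np.random.default_rng(0) if rng is None else rng
    H, W, C = x.shape
    S = num_regions
    assert H % S == 0 and W % S == 0, "H and W must be divisible by num_regions"
    h, w = H // S, W // S  # tokens per region along each axis

    # Hypothetical random projections standing in for learned weights.
    Wq, Wk, Wv = (rng.standard_normal((C, C)) / np.sqrt(C) for _ in range(3))

    # Partition the map into S*S regions of h*w tokens each: (S*S, h*w, C).
    regions = x.reshape(S, h, S, w, C).transpose(0, 2, 1, 3, 4).reshape(S * S, h * w, C)
    q, k, v = regions @ Wq, regions @ Wk, regions @ Wv

    # Level 1: region-to-region routing on region-averaged queries and keys.
    affinity = q.mean(axis=1) @ k.mean(axis=1).T          # (S*S, S*S)
    routed = np.argsort(-affinity, axis=1)[:, :topk]      # top-k region indices per region

    # Level 2: token-to-token attention restricted to the routed regions.
    k_g = k[routed].reshape(S * S, topk * h * w, C)       # gathered keys
    v_g = v[routed].reshape(S * S, topk * h * w, C)       # gathered values
    attn = softmax(q @ k_g.transpose(0, 2, 1) / np.sqrt(C), axis=-1)
    out = attn @ v_g                                      # (S*S, h*w, C)

    # Undo the region partition to recover the (H, W, C) layout.
    out = out.reshape(S, S, h, w, C).transpose(0, 2, 1, 3, 4).reshape(H, W, C)
    return out, routed
```

With `topk` equal to the total number of regions, this sketch reduces to ordinary dense self-attention over all tokens; smaller values of `topk` restrict each query to a subset of regions, which is where the savings in computation would come from.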

Authors: SHEN Yu-qi (沈宇麒); CUI Yan (崔衍) (School of Internet of Things, Nanjing University of Posts and Telecommunications, Nanjing 210003, China)

Affiliation: School of Internet of Things, Nanjing University of Posts and Telecommunications

Source: Computer Technology and Development (《计算机技术与发展》), 2024, No. 6, pp. 23-28 (6 pages)

Funding: China Postdoctoral Science Foundation (2020M671554).

Keywords: fine-grained image classification; neural network; vision Transformer; attention mechanism; feature fusion

Classification number: TP391.41 [Automation and Computer Technology: Computer Application Technology]

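About the authors: SHEN Yu-qi (born 1999), male, master's student; research interests: artificial intelligence and computer vision. Corresponding author: CUI Yan (born 1982), female, Ph.D., associate professor; research interests: pattern recognition, bioinformatics, and related areas.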

