MVContrast: Unsupervised Pretraining for Multi-view 3D Object Recognition (Cited by: 2)

Abstract: 3D shape recognition has drawn much attention in recent years, and view-based approaches perform best. However, current multi-view methods are almost all fully supervised, and their pretrained models are almost all based on ImageNet. Although ImageNet pretraining results are quite impressive, there is still a significant discrepancy between multi-view datasets and ImageNet: multi-view datasets naturally retain rich 3D information. In addition, large-scale datasets such as ImageNet require considerable cleaning and annotation work, so it is difficult to build a second dataset of that kind. In contrast, unsupervised learning methods can learn general feature representations without any extra annotation. To this end, we propose a three-stage unsupervised joint pretraining model. Specifically, we decouple the final representations into three fine-grained representations. Data augmentation is utilized to obtain pixel-level representations within each view. At the view level, we boost spatially invariant features. Finally, we exploit global information at the shape level through a novel extract-and-swap module. Experimental results demonstrate that the proposed method gains significantly in 3D object classification and retrieval tasks, and shows generalization to cross-dataset tasks.
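The abstract lists contrastive learning among its keywords but does not spell out a loss function. As a rough illustration only, the sketch below implements a generic InfoNCE-style contrastive loss between embeddings of two views of the same shapes. It is not the paper's three-stage model or its extract-and-swap module; the function names, the temperature value, and the toy embeddings are all assumptions made for this example.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (assumes it is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    anchors[i] and positives[i] are embeddings of two different views
    of shape i; every other positives[j] acts as a negative for anchors[i].
    The temperature value is an arbitrary choice for this sketch.
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of anchor i to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching view as the correct class.
        total += log_sum - logits[i]
    return total / len(a)


# Toy embeddings (hypothetical): two shapes, two views each.
view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b = [[0.9, 0.1], [0.1, 0.9]]

aligned = info_nce(view_a, view_b)         # matching views paired correctly
shuffled = info_nce(view_a, view_b[::-1])  # views paired with the wrong shape
```

With correctly paired views the loss is close to zero, while pairing each anchor with a view of the other shape yields a much larger loss, which is the signal a contrastive pretraining objective of this kind would minimize.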

Authors: Luequan Wang, Hongbin Xu, Wenxiong Kang

Affiliation: School of Automation Science and Engineering

Source: Machine Intelligence Research (English edition), indexed in EI and CSCD, 2023, Issue 6, pp. 872-883 (12 pages)

Funding: This work was supported in part by the National Natural Science Foundation of China (No. 61976095) and the Science and Technology Planning Project of Guangdong Province, China (No. 2018B030323026).

Keywords: multi-view; unsupervised pretraining; contrastive learning; 3D vision; shape recognition

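Classification: TP391.41 [Automation and Computer Technology: Computer Application Technology]; TP18 [Automation and Computer Technology: Control Theory and Control Engineering]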

About the authors:

Luequan Wang received the B.Sc. degree in automation from South China University of Technology, China in 2020. He is a master's student in automation science and engineering at South China University of Technology, China. His research interests include self-supervised learning, 3D vision and deep learning. E-mail: 875713197@qq.com. ORCID iD: 0000-0001-9320-6873.

Hongbin Xu received the M.Sc. degree from South China University of Technology, China in 2021. He is currently a Ph.D. candidate in automation science and engineering at South China University of Technology (SCUT), China. His research interests include 3D vision, multi-view stereo and self-supervised learning. E-mail: hongbinxu1013@gmail.com. ORCID iD: 0000-0002-3455-1527.

Wenxiong Kang received the M.Sc. degree from Northwestern Polytechnical University, China in 2003, and the Ph.D. degree in automation science and engineering from South China University of Technology, China in 2009. He is currently a professor with the School of Automation Science and Engineering, South China University of Technology, China. His research interests include biometrics identification, image processing, pattern recognition and computer vision. E-mail: auwxkang@scut.edu.cn (corresponding author). ORCID iD: 0000-0001-9023-7252.


