Abstract
To address the limited accuracy of deep neural network models caused by the small number of samples in rice disease image datasets, an improved generative adversarial network model, ViT-WGAN-GP (the Fusion of Vision Transformer and Wasserstein Generative Adversarial Networks with Gradient Penalty), is proposed for augmenting the image dataset. First, a Vision Transformer structure is introduced into the generator to strengthen the learning of global features. Second, the WGAN-GP structure is adopted in the discriminator, where the Wasserstein distance measure and the gradient penalty term stabilize training and improve the quality of the generated images. Finally, the augmented sample set is used to train deep neural network models. Experimental results show that, for rice disease images, ViT-WGAN-GP generates images of markedly better quality than GAN and WGAN-GP. When VGG16, ResNet34, and GoogLeNet are trained on the augmented rice disease sample set, the average recognition accuracy reaches 94.3%, 96.2%, and 97.5%, respectively, improvements of 9.7%, 2.8%, and 4.8%. The proposed ViT-WGAN-GP model can therefore generate fairly realistic rice disease images and can substantially improve the recognition accuracy of deep neural network models on small sample sets.
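The gradient penalty mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the linear critic, the toy 2-D samples, and the helper names are hypothetical, chosen so that the input gradient is available in closed form without a deep learning library. The penalty weight of 10 is the default from the original WGAN-GP formulation, and whether the paper uses the same value is an assumption.

```python
import math
import random

def critic(w, x):
    """Toy linear critic D(x) = w . x (a stand-in, not the paper's network)."""
    return sum(wi * xi for wi, xi in zip(w, x))

def critic_input_grad(w, x):
    """Gradient of D with respect to its input; for a linear critic this is w."""
    return list(w)

def wgan_gp_critic_loss(w, real, fake, lam=10.0, rng=random):
    """Mean over pairs of D(fake) - D(real) + lam * (||grad D(x_hat)||_2 - 1)^2."""
    total = 0.0
    for x_real, x_fake in zip(real, fake):
        eps = rng.random()  # interpolation weight in [0, 1)
        # Point sampled on the line between a real and a generated sample
        x_hat = [eps * r + (1 - eps) * f for r, f in zip(x_real, x_fake)]
        grad = critic_input_grad(w, x_hat)
        grad_norm = math.sqrt(sum(g * g for g in grad))
        penalty = (grad_norm - 1.0) ** 2  # zero when the gradient norm is 1
        total += critic(w, x_fake) - critic(w, x_real) + lam * penalty
    return total / len(real)

# Hypothetical 2-D "images" used only to exercise the function
real = [[1.0, 0.0], [0.5, 0.5]]
fake = [[0.0, 1.0], [0.0, 0.0]]
unit = wgan_gp_critic_loss([1.0, 0.0], real, fake)   # gradient norm 1, no penalty
steep = wgan_gp_critic_loss([3.0, 4.0], real, fake)  # gradient norm 5, penalized
```

In a real critic the input gradient would come from automatic differentiation rather than a closed form. The sketch only shows how the penalty term pushes the gradient norm toward 1, which is the mechanism credited with stabilizing training.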
Authors
LU Yang; XU Siyuan; TAO Xianpeng; LIU Qiwang; GUAN Chuang (School of Information and Electrical Engineering, Heilongjiang Bayi Agricultural Reclamation University, Daqing 163319, China; Intelligent Vehicle Control Department, Shanghai Shangtai Automotive Information System Limited, Shanghai 200020, China; Heilongjiang Key Laboratory of Networking and Intelligent Control, Northeast Petroleum University, Daqing 163318, China)
Source
Journal of Jilin University (Information Science Edition)
2025, Issue 4, pp. 747-754 (8 pages)
Funding
National Natural Science Foundation of China (62476081)
Natural Science Foundation of Heilongjiang Province, Joint Guidance Project (LH2024F048)
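Author biography
LU Yang (born 1976), male, from Shuangcheng, Heilongjiang; professor and doctoral supervisor at Heilongjiang Bayi Agricultural Reclamation University; research interests include pattern recognition and machine learning. Tel: 86-13845989360; E-mail: luyanga@sina.com.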