
Research of Mutual Learning Neural Network Training Method (Cited by: 32)

Abstract  Because the BP neural network is highly expressive yet structurally simple, it has made great progress in both theoretical and applied research over the past 30 years. However, its tendency to become trapped in local optima and its poor generalization ability have limited its development. Meanwhile, the emergence of big data and the proposal and application of deep learning algorithms place new demands on neural networks to develop in a more brain-like direction. To address these problems, this paper constructs a new neural network model from the perspective of simulating biological bidirectional cognitive ability: the mutual learning neural network model. The design originates from human beings' bidirectional cognition, namely forward cognition, which possesses the cause and seeks the result, and backward cognition, which possesses the result and seeks the cause. The model consists of a positive neural network and a negative neural network. The positive neural network is a feedforward network with one hidden layer; it establishes the cognitive relationship from cause (data) to result (label) and simulates forward cognition. The negative neural network is structurally symmetric to the positive one; it establishes the cognitive relationship from result (label) to cause (data) and simulates backward cognition. The two networks are coupled through weight sharing and together simulate the human bidirectional cognitive process. On this basis, the paper proposes a new training method: the mutual learning neural network training method. First, with the data as input and the labels as output, the positive neural network is trained by the BP learning algorithm. After a certain number of iterations, the updated forward weight matrices are transposed and assigned to the negative neural network (the bias terms remain independent). Then, with the labels as input and the data as output, the negative neural network is trained by the BP algorithm; after a certain number of iterations, its updated weight matrices are likewise transposed and assigned back to the positive neural network (the bias terms again remain independent). This alternation continues until the iteration budget is exhausted. The method thereby realizes mutual learning between the input data and the output labels and, through training, endows the model with bidirectional cognitive ability. Experimental results show that the mutual learning neural network training method can train the positive and negative networks simultaneously, and that it is a convergent learning algorithm.
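For concreteness, the alternating procedure described in the abstract can be sketched in code. The following is a minimal, hypothetical NumPy illustration, not the authors' implementation: it assumes single-hidden-layer sigmoid networks trained on squared error with plain gradient descent, and the names (`MLP`, `bp_step`, `mutual_learning_train`), layer sizes, learning rate, and iteration counts are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class MLP:
    """Single-hidden-layer feedforward network with sigmoid activations."""
    def __init__(self, n_in, n_hid, n_out):
        self.W1 = rng.normal(0.0, 0.1, (n_in, n_hid))   # input -> hidden
        self.b1 = np.zeros(n_hid)
        self.W2 = rng.normal(0.0, 0.1, (n_hid, n_out))  # hidden -> output
        self.b2 = np.zeros(n_out)

    def forward(self, X):
        self.h = sigmoid(X @ self.W1 + self.b1)
        return sigmoid(self.h @ self.W2 + self.b2)

    def bp_step(self, X, T, lr=0.5):
        """One standard BP (gradient-descent) step on mean squared error."""
        Y = self.forward(X)
        d2 = (Y - T) * Y * (1.0 - Y)                     # output-layer delta
        d1 = (d2 @ self.W2.T) * self.h * (1.0 - self.h)  # hidden-layer delta
        n = len(X)
        self.W2 -= lr * self.h.T @ d2 / n
        self.b2 -= lr * d2.mean(axis=0)
        self.W1 -= lr * X.T @ d1 / n
        self.b1 -= lr * d1.mean(axis=0)

def mutual_learning_train(X, T, n_hid=8, rounds=100, inner_steps=20):
    """Alternately train the positive (data -> label) and negative
    (label -> data) networks, sharing weights by transposition.
    The bias terms stay independent, as the abstract specifies."""
    pos = MLP(X.shape[1], n_hid, T.shape[1])
    neg = MLP(T.shape[1], n_hid, X.shape[1])
    for _ in range(rounds):
        for _ in range(inner_steps):        # forward phase: data -> label
            pos.bp_step(X, T)
        # transpose-share the positive weights into the negative network
        neg.W1, neg.W2 = pos.W2.T.copy(), pos.W1.T.copy()
        for _ in range(inner_steps):        # backward phase: label -> data
            neg.bp_step(T, X)
        # transpose-share the negative weights back into the positive network
        pos.W1, pos.W2 = neg.W2.T.copy(), neg.W1.T.copy()
    return pos, neg

# Toy usage: XOR as the forward task. (The backward task is ill-posed for
# XOR, so this only demonstrates that the alternating loop runs end to end.)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)
pos, neg = mutual_learning_train(X, T)
print(np.round(pos.forward(X), 2))
```

The key step is that the two weight matrices are transpose-shared in opposite order (the input-side matrix of one network becomes the output-side matrix of the other), which is what makes the two networks structurally symmetric.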
In addition, the paper proposes a two-stage learning strategy, "mutual learning pre-training + standard forward training," together with a corresponding switching learning method. This switching strategy achieves the same effect as the "unsupervised pre-training + supervised fine-tuning" strategy and makes network training more effective; the result is a fast, stable neural network learning method with strong generalization ability.
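The two-stage strategy can likewise be sketched, reusing the hypothetical `mutual_learning_train` and `MLP` helpers (and the toy `X`, `T` data) from the block above; the stage lengths here are arbitrary assumptions, not the authors' settings.

```python
# Stage 1: mutual learning pre-training (bidirectional, transpose-shared).
pos, _ = mutual_learning_train(X, T, rounds=20, inner_steps=10)

# Stage 2: switch to standard forward-only BP training of the positive
# network, analogous to supervised fine-tuning after pre-training.
for _ in range(1000):
    pos.bp_step(X, T)

print(np.round(pos.forward(X), 2))   # forward predictions after fine-tuning
```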
Authors  LIU Wei, LIU Shang, BAI Run-Cai, ZHOU Xuan, ZHOU Ding-Ning (刘威, 刘尚, 白润才, 周璇, 周定宁) (Institute of Mathematics and Systems Science, Liaoning Technical University, Fuxin, Liaoning 123000; Institute of Intelligence Engineering and Mathematics, Liaoning Technical University, Fuxin, Liaoning 123000; College of Mining Engineering, Liaoning Technical University, Fuxin, Liaoning 123000)
Source  Chinese Journal of Computers (《计算机学报》), 2017, No. 6, pp. 1291-1308 (18 pages). Indexed in EI and CSCD; Peking University Core Journal.
Funding  Supported by the National Natural Science Foundation of China (Grants 51304114 and 71371091).
Keywords  neural network; mutual learning; weight sharing; back-propagation algorithm; bidirectional cognition; classification and recognition; artificial intelligence
About the authors  LIU Wei, male, born in 1977, Ph.D., associate professor, member of the China Computer Federation (CCF); his research interests include machine learning, deep neural networks, and mining systems engineering. E-mail: 1v8218218@126.com. LIU Shang (corresponding author), male, born in 1988, M.S.; his research interests include artificial intelligence, pattern recognition, and machine learning. E-mail: whiteinblue@126.com. BAI Run-Cai, male, born in 1961, Ph.D., professor; his main research field is mining systems engineering. ZHOU Xuan, female, born in 1992, M.S.; her research interests include machine learning and deep neural networks. ZHOU Ding-Ning, male, born in 1993, M.S.; his research interests include machine learning and deep neural networks.
