
Research of Mutual Learning Neural Network Training Method (Cited by: 32)

Abstract  Because the BP neural network is highly expressive yet structurally simple, it has made great progress in both theoretical and applied research over the past 30 years. However, its tendency to become trapped in local optima and its poor generalization ability have limited its development. Meanwhile, the emergence of big data and the proposal and application of deep learning algorithms place new demands on neural networks to develop in a more brain-like direction. To address these problems, this paper constructs a new neural network model from the perspective of simulating biological bidirectional cognitive ability: the mutual learning neural network model. The design originates from human beings' bidirectional cognition, namely forward cognition, which possesses the cause and seeks the result, and backward cognition, which possesses the result and seeks the cause. The model consists of a positive neural network and a negative neural network. The positive neural network is a feedforward network with one hidden layer; it establishes the cognitive relationship from cause (data) to result (label) and simulates forward cognition. The negative neural network is structurally symmetric to the positive one; it establishes the cognitive relationship from result (label) to cause (data) and simulates backward cognition. The two networks are coupled through weight sharing and together simulate the human bidirectional cognitive process. On this basis, the paper proposes a new training method: the mutual learning neural network training method. First, with the data as input and the labels as output, the positive neural network is trained by the BP learning algorithm. After a certain number of iterations, the updated forward weight matrices are transposed and assigned to the negative neural network (the bias terms remain independent). Then, with the labels as input and the data as output, the negative neural network is trained by the BP algorithm; after a certain number of iterations, its updated weight matrices are likewise transposed and assigned back to the positive neural network (the bias terms again remain independent). This alternation continues until the iteration budget is exhausted. The method thereby realizes mutual learning between the input data and the output labels and, through training, endows the model with bidirectional cognitive ability. Experimental results show that the mutual learning neural network training method can train the positive and negative networks simultaneously, and that it is a convergent learning algorithm.
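For concreteness, the alternating procedure described in the abstract can be sketched in code. The following is a minimal, hypothetical NumPy illustration, not the authors' implementation: it assumes single-hidden-layer sigmoid networks trained on squared error with plain gradient descent, and the names (`MLP`, `bp_step`, `mutual_learning_train`), layer sizes, learning rate, and iteration counts are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class MLP:
    """Single-hidden-layer feedforward network with sigmoid activations."""
    def __init__(self, n_in, n_hid, n_out):
        self.W1 = rng.normal(0.0, 0.1, (n_in, n_hid))   # input -> hidden
        self.b1 = np.zeros(n_hid)
        self.W2 = rng.normal(0.0, 0.1, (n_hid, n_out))  # hidden -> output
        self.b2 = np.zeros(n_out)

    def forward(self, X):
        self.h = sigmoid(X @ self.W1 + self.b1)
        return sigmoid(self.h @ self.W2 + self.b2)

    def bp_step(self, X, T, lr=0.5):
        """One standard BP (gradient-descent) step on mean squared error."""
        Y = self.forward(X)
        d2 = (Y - T) * Y * (1.0 - Y)                     # output-layer delta
        d1 = (d2 @ self.W2.T) * self.h * (1.0 - self.h)  # hidden-layer delta
        n = len(X)
        self.W2 -= lr * self.h.T @ d2 / n
        self.b2 -= lr * d2.mean(axis=0)
        self.W1 -= lr * X.T @ d1 / n
        self.b1 -= lr * d1.mean(axis=0)

def mutual_learning_train(X, T, n_hid=8, rounds=100, inner_steps=20):
    """Alternately train the positive (data -> label) and negative
    (label -> data) networks, sharing weights by transposition.
    The bias terms stay independent, as the abstract specifies."""
    pos = MLP(X.shape[1], n_hid, T.shape[1])
    neg = MLP(T.shape[1], n_hid, X.shape[1])
    for _ in range(rounds):
        for _ in range(inner_steps):        # forward phase: data -> label
            pos.bp_step(X, T)
        # transpose-share the positive weights into the negative network
        neg.W1, neg.W2 = pos.W2.T.copy(), pos.W1.T.copy()
        for _ in range(inner_steps):        # backward phase: label -> data
            neg.bp_step(T, X)
        # transpose-share the negative weights back into the positive network
        pos.W1, pos.W2 = neg.W2.T.copy(), neg.W1.T.copy()
    return pos, neg

# Toy usage: XOR as the forward task. (The backward task is ill-posed for
# XOR, so this only demonstrates that the alternating loop runs end to end.)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)
pos, neg = mutual_learning_train(X, T)
print(np.round(pos.forward(X), 2))
```

The key step is that the two weight matrices are transpose-shared in opposite order (the input-side matrix of one network becomes the output-side matrix of the other), which is what makes the two networks structurally symmetric.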
In addition, the paper proposes a two-stage learning strategy, "mutual learning pre-training + standard forward training," together with a corresponding switching learning method. This switching strategy achieves the same effect as the "unsupervised pre-training + supervised fine-tuning" strategy and makes network training more effective; the result is a fast, stable neural network learning method with strong generalization ability.
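The two-stage strategy can likewise be sketched, reusing the hypothetical `mutual_learning_train` and `MLP` helpers (and the toy `X`, `T` data) from the block above; the stage lengths here are arbitrary assumptions, not the authors' settings.

```python
# Stage 1: mutual learning pre-training (bidirectional, transpose-shared).
pos, _ = mutual_learning_train(X, T, rounds=20, inner_steps=10)

# Stage 2: switch to standard forward-only BP training of the positive
# network, analogous to supervised fine-tuning after pre-training.
for _ in range(1000):
    pos.bp_step(X, T)

print(np.round(pos.forward(X), 2))   # forward predictions after fine-tuning
```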
Authors  LIU Wei, LIU Shang, BAI Run-Cai, ZHOU Xuan, ZHOU Ding-Ning (刘威, 刘尚, 白润才, 周璇, 周定宁) (Institute of Mathematics and Systems Science, Liaoning Technical University, Fuxin, Liaoning 123000; Institute of Intelligence Engineering and Mathematics, Liaoning Technical University, Fuxin, Liaoning 123000; College of Mining Engineering, Liaoning Technical University, Fuxin, Liaoning 123000)
Source  Chinese Journal of Computers (《计算机学报》), 2017, No. 6, pp. 1291-1308 (18 pages). Indexed in EI and CSCD; Peking University Core Journal.
Funding  Supported by the National Natural Science Foundation of China (Grants 51304114 and 71371091).
Keywords  neural network; mutual learning; weight sharing; back-propagation algorithm; bidirectional cognition; classification and recognition; artificial intelligence
About the authors  LIU Wei, male, born in 1977, Ph.D., associate professor, member of the China Computer Federation (CCF); his research interests include machine learning, deep neural networks, and mining systems engineering. E-mail: 1v8218218@126.com. LIU Shang (corresponding author), male, born in 1988, M.S.; his research interests include artificial intelligence, pattern recognition, and machine learning. E-mail: whiteinblue@126.com. BAI Run-Cai, male, born in 1961, Ph.D., professor; his main research field is mining systems engineering. ZHOU Xuan, female, born in 1992, M.S.; her research interests include machine learning and deep neural networks. ZHOU Ding-Ning, male, born in 1993, M.S.; his research interests include machine learning and deep neural networks.
