Adaptive Levenberg–Marquardt Algorithm: A New Optimization Strategy for Levenberg–Marquardt Neural Networks
Abstract
1. Introduction
- (1). F(w) and f(w) are nonlinear, continuous, and differentiable. Here f(w) is the neural network model, w is the parameter vector of the model, and F(w) is the cost function of the network. In Refs. [12,13,14,15], the global optimization ability of the proposed algorithms was verified under the assumption that F(w) and f(w) are nonlinear, continuous, and differentiable. However, the activation functions in neural networks are not necessarily continuous and differentiable, so those results hold only under that assumption and do not prove that global optimization can be achieved for the neural network model.
- (2). J = ∂f(w)/∂w is the Jacobian matrix. In Refs. [16,17,18], the Jacobian matrix was obtained by directly solving the system of nonlinear equations. The results were relatively accurate, but the calculation process was very complex. In neural networks, the Jacobian matrix used in the proof is obtained by back-propagation rather than by direct differentiation of the objective function. Therefore, the theory of the above references may not be suitable for the field of neural networks.
2. Explanation and Problem of LM Algorithms
2.1. LM Algorithm
- (1). If q < 0, then μ = μ × v and v = v × 2;
- (2). If q > 0, then μ = μ × max{1/3, 1 − (2q − 1)³} (a minimal sketch of this damping update is given below).
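The Python sketch below illustrates this classic damping update (the rule used by Madsen, Nielsen, and Tingleff); the reset of v to 2 on an accepted step follows the standard formulation and is an assumption here, as the rules above do not state it.

```python
def lm_damping_update(mu: float, v: float, q: float) -> tuple[float, float]:
    """Classic LM damping update driven by the gain ratio q.

    mu : current damping parameter
    v  : current growth factor
    q  : gain ratio (actual cost reduction / predicted reduction)
    Returns the updated (mu, v). Illustrative sketch only.
    """
    if q > 0:
        # Step accepted: shrink the damping; resetting v = 2 is the
        # standard convention, not stated explicitly in the text above.
        mu = mu * max(1.0 / 3.0, 1.0 - (2.0 * q - 1.0) ** 3)
        v = 2.0
    else:
        # Step rejected: grow the damping and double the growth factor.
        mu = mu * v
        v = v * 2.0
    return mu, v
```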
2.2. Problem of LM Algorithm
3. Analysis of Falling into “Bad” Local Minima
- (1). When σ is sigmoid:
- (2). When σ is tanh:
- (3). When σ is ReLU:
- (4). When σ is PReLU:
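For reference, the sketch below collects the standard textbook definitions and derivatives of the four activation functions σ listed above; these standard forms (with the PReLU slope a as a free parameter) stand in for the equations that would normally follow each item and are not taken from the paper.

```python
import numpy as np

# Standard activation functions and their derivatives (textbook forms).
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_prime(x):
    s = sigmoid(x)
    return s * (1.0 - s)

def tanh_prime(x):
    return 1.0 - np.tanh(x) ** 2

def relu(x):
    return np.maximum(0.0, x)

def relu_prime(x):
    # Not differentiable at 0; the subgradient 0 is used there by convention.
    return (x > 0).astype(float)

def prelu(x, a=0.25):           # a = 0.25 is an illustrative default
    return np.where(x > 0, x, a * x)

def prelu_prime(x, a=0.25):
    return np.where(x > 0, 1.0, a)
```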
4. The Proposed Algorithm: AdaLM
- (1). If q < 0, then μ = (μ + 1) × v and v = v × 2;
- (2). If q > 0, then μ = 0 and v = 2.
Algorithm 1 AdaLM algorithm.
Begin
    k = 0; v = 2; x = x0; μ = 0;
    A = JᵀJ; g = Jᵀf(x);
    While (‖g‖∞ ≥ ε1) and (k < kmax)
    {
        k = k + 1;
        hk = −(A + μI)⁻¹g/(1 + exp(k − 10)) − g/‖g‖(1 + exp(k − 10));
        if (‖hk‖ ≤ ε2(‖x‖ + ε2))
            break;
        else
        {
            xnew = x + hk;
            q = (F(x) − F(xnew))/(L(0) − L(hk));
            if (q > 0)
            {
                x = xnew;
                A = JᵀJ; g = Jᵀf(x);
                if (‖g‖∞ ≤ ε1)
                    break;
                μ = 0; v = 2;
            }
            else
            {
                μ = (μ + 1) × v; v = v × 2;
            }
        }
    }
End
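A runnable Python sketch of Algorithm 1 is given below. The residual function f and its Jacobian J are passed in as callables; the grouping of the gate 1/(1 + exp(k − 10)) over both step terms and the use of the Gauss–Newton linear model for the predicted reduction L(0) − L(hk) are reading assumptions, so this is an illustrative reconstruction rather than the authors' implementation.

```python
import numpy as np

def adalm(f, jac, x0, eps1=1e-8, eps2=1e-8, k_max=200):
    """Sketch of the AdaLM iteration of Algorithm 1.

    f   : callable returning the residual vector f(x)
    jac : callable returning the Jacobian J(x) of f
    x0  : initial parameter vector
    """
    x = np.asarray(x0, dtype=float)
    k, v, mu = 0, 2.0, 0.0
    J, r = jac(x), f(x)
    A, g = J.T @ J, J.T @ r
    n = x.size

    while np.max(np.abs(g)) >= eps1 and k < k_max:
        k += 1
        # Iteration-dependent weight 1/(1 + exp(k - 10)); exponent clipped
        # to avoid overflow for large k.
        gate = 1.0 / (1.0 + np.exp(min(k - 10, 50)))
        # Step: damped LM direction plus normalized gradient direction,
        # both scaled by the gate (grouping is a reading assumption).
        h = -(np.linalg.solve(A + mu * np.eye(n), g) + g / np.linalg.norm(g)) * gate
        if np.linalg.norm(h) <= eps2 * (np.linalg.norm(x) + eps2):
            break
        x_new = x + h
        # Gain ratio q: actual cost reduction over the reduction predicted
        # by the Gauss-Newton linear model L(h) = 0.5*||f + J h||^2.
        actual = 0.5 * (r @ r - f(x_new) @ f(x_new))
        predicted = -(h @ g) - 0.5 * (h @ (A @ h))
        q = actual / predicted
        if q > 0:                       # step accepted: reset the damping
            x = x_new
            J, r = jac(x), f(x)
            A, g = J.T @ J, J.T @ r
            mu, v = 0.0, 2.0
        else:                           # step rejected: grow the damping
            mu, v = (mu + 1.0) * v, v * 2.0
    return x
```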
5. Case Studies
5.1. Experimental Preparation
{x2, x3, …, xn+1} → {xn+2}
…
{xi, xi+1, …, xi+n−1} → {xi+n}
…
{xN−n, xN−n+1, …, xN−1} → {xN}
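These input–output pairs are produced by sliding a window of length n over the series x1, …, xN. A minimal sketch of this construction follows; the function name and the example series are illustrative.

```python
import numpy as np

def sliding_window(series, n):
    """Build (input, target) pairs {x_i, ..., x_(i+n-1)} -> {x_(i+n)}."""
    series = np.asarray(series, dtype=float)
    X = np.stack([series[i:i + n] for i in range(len(series) - n)])
    y = series[n:]
    return X, y

# Example: windows of length 3 over a short series.
X, y = sliding_window([1, 2, 3, 4, 5, 6], n=3)
# X[0] = [1, 2, 3], y[0] = 4
```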
5.2. The Influence of Each Algorithm on Activation Function
5.3. Evaluation of Prediction Effect
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Luo, G.; Zou, L.; Wang, Z.; Lv, C.; Ou, J.; Huang, Y. A novel kinematic parameters calibration method for industrial robot based on Levenberg-Marquardt and Differential Evolution hybrid algorithm. Robot. Comput. Integr. Manuf. 2021, 71, 102165.
2. Kumar, S.S.; Sowmya, R.; Shankar, B.M.; Lingaraj, N.; Sivakumar, S. Analysis of Connected Word Recognition systems using Levenberg Marquardt Algorithm for cockpit control in unmanned aircrafts. Mater. Today Proc. 2021, 37, 1813–1819.
3. Mahmoudabadi, Z.S.; Rashidi, A.; Yousefi, M. Synthesis of 2D-Porous MoS2 as a Nanocatalyst for Oxidative Desulfurization of Sour Gas Condensate: Process Parameters Optimization Based on the Levenberg–Marquardt Algorithm. J. Environ. Chem. Eng. 2021, 9, 105200.
4. Transtrum, M.K.; Machta, B.B.; Sethna, J.P. Why are nonlinear fits to data so challenging? Phys. Rev. Lett. 2010, 104, 060201.
5. Amini, K.; Rostami, F. A modified two steps Levenberg–Marquardt method for nonlinear equations. J. Comput. Appl. Math. 2015, 288, 341–350.
6. Kim, M.; Cha, J.; Lee, E.; Pham, V.H.; Lee, S.; Theera-Umpon, N. Simplified Neural Network Model Design with Sensitivity Analysis and Electricity Consumption Prediction in a Commercial Building. Energies 2019, 12, 1201.
7. Zhao, L.; Otoo, C.O.A. Stability and Complexity of a Novel Three-Dimensional Environmental Quality Dynamic Evolution System. Complexity 2019, 2019, 3941920.
8. Zhou, W.; Liu, D.; Hong, T. Application of GA-LM-BP Neural Network in Fault Prediction of Drying Furnace Equipment. MATEC Web Conf. 2018, 232, 01041.
9. Jia, P.; Zhang, P. Type Identification of Coal Mining Face Based on Wavelet Packet Decomposition and LM-BP. In Proceedings of the 2018 IEEE 9th International Conference on Software Engineering and Service Science (ICSESS), Beijing, China, 23–25 November 2018.
10. Hua, L.; Bo, L.; Tong, L.; Wang, M.; Fu, H.; Guo, R. Angular Acceleration Sensor Fault Diagnosis Based on LM-BP Neural Network. In Proceedings of the 37th Chinese Control Conference, Wuhan, China, 25–27 July 2018; pp. 6028–6032.
11. Hossein, A.M.; Nazari, M.A.; Madah, R.G.H.; BehshadShafii, M.; Ahmadi, M.A. Thermal conductivity ratio prediction of Al2O3/water nanofluid by applying connectionist methods. Colloids Surf. A Physicochem. Eng. Asp. 2018, 541, 154–164.
12. Yang, X. A higher-order Levenberg–Marquardt method for nonlinear equations. Appl. Math. Comput. 2013, 219, 10682–10694.
13. Chen, L. A high-order modified Levenberg–Marquardt method for systems of nonlinear equations with fourth-order convergence. Appl. Math. Comput. 2016, 285, 79–93.
14. Derakhshandeh, S.Y.; Pourbagher, R.; Kargar, A. A novel fuzzy logic Levenberg-Marquardt method to solve the ill-conditioned power flow problem. Int. J. Electr. Power Energy Syst. 2018, 99, 299–308.
15. Qiao, J.; Wang, L.; Yang, C.; Gu, K. Adaptive Levenberg-Marquardt algorithm based echo state network for chaotic time series prediction. IEEE Access 2018, 6, 10720–10732.
16. Ma, C.; Tang, J. The quadratic convergence of a smoothing Levenberg–Marquardt method for nonlinear complementarity problem. Appl. Math. Comput. 2008, 197, 566–581.
17. Du, S.Q.; Gao, Y. Global convergence property of modified Levenberg-Marquardt methods for nonsmooth equations. Appl. Math. 2011, 56, 481.
18. Zhou, W. On the convergence of the modified Levenberg–Marquardt method with a non-monotone second order Armijo type line search. J. Comput. Appl. Math. 2013, 239, 152–161.
19. Moré, J.J. The Levenberg-Marquardt algorithm: Implementation and theory. In Numerical Analysis; Springer: Berlin/Heidelberg, Germany, 1978; pp. 105–116.
20. Madsen, K.; Nielsen, H.B.; Tingleff, O. Methods for Non-Linear Least Squares Problems, 2nd ed.; Technical University of Denmark: Lyngby, Denmark, 2004.
21. Zhang, Z.; Ma, X.; Yang, Y. Bounds on the number of hidden neurons in three-layer binary neural networks. Neural Netw. 2003, 16, 995–1002.
22. Liang, X.; Chen, R.C. A unified mathematical form for removing neurons based on orthogonal projection and crosswise propagation. Neural Comput. Appl. 2010, 19, 445–457.
23. Chua, C.G.; Goh, A.T.C. A hybrid Bayesian back-propagation neural network approach to multivariate modelling. Int. J. Numer. Anal. Methods Geomech. 2003, 27, 651–667.
24. Sequin, C.H.; Clay, R.D. Fault tolerance in artificial neural networks. In Proceedings of the 1990 IJCNN International Joint Conference on Neural Networks, San Diego, CA, USA, 17–21 June 1990; pp. 703–708.
25. Li, Y.G.; Nilkitsaranont, P. Gas turbine performance prognostic for condition-based maintenance. Appl. Energy 2009, 86, 2152–2161.
| Algorithm | Full Name | Description |
|---|---|---|
| LM | Levenberg–Marquardt | LM(Δxk) = [Jᵀ(xk)J(xk) + μI]⁻¹ Jᵀ(xk)e(xk) |
| HLM | High-order LM | HLM(Δxk) = LM(Δxk) + LM(Δxk+1) + LM(Δxk+2) |
| TSLM | Three-step LM | TSLM(Δxk) = LM(Δxk) + αLM(Δxk+1) + LM(Δxk+2) |
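As a concrete reading of the first row, a minimal sketch of the basic LM step is shown below, where J is the Jacobian and e the error vector at xk; how HLM and TSLM chain three successive LM steps, and the choice of the weight α, are not reproduced here.

```python
import numpy as np

def lm_step(J, e, mu):
    """Basic LM step from the table: delta_x = [J^T J + mu*I]^(-1) J^T e."""
    A = J.T @ J
    return np.linalg.solve(A + mu * np.eye(A.shape[0]), J.T @ e)
```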
| Algorithm | Year | Sigmoid | Tanh | ReLU | PReLU |
|---|---|---|---|---|---|
| LM | 1978 | √ | × | × | × |
| HLM | 2013 | √ | ○ | √ | ○ |
| TSLM | 2018 | √ | √ | √ | √ |
| AdaLM | 2020 | √ | √ | √ | √ |
| Algorithm | LM | HLM | TSLM | AdaLM |
|---|---|---|---|---|
| Time/s | 140 | 181 | 173 | 62 |
| MAE | 41.4847 | 38.2276 | 42.6715 | 32.9175 |