Data-Driven Stability Assessment of Multilayer Long Short-Term Memory Networks
Abstract
1. Introduction
2. System Dynamic Models
2.1. Linear Model
2.2. Nonlinear Model
3. Method and Implementation
3.1. Method
3.2. Software Implementation
- A sequence of diverse input signals is generated and fed to the model, which is simulated in a MATLAB/Simulink framework; the recorded output signal, together with the input data, constitutes the dataset;
- The LSTM network is trained on the previously generated input/output dataset;
- A new dataset is generated and fed to both the Simulink model and the LSTM network;
- The resulting datasets are analysed with built-in MATLAB functions, and the poles are identified (a minimal sketch of the full pipeline is given after this list).
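The sketch below is a minimal MATLAB rendering of these four steps. It is illustrative only: a stand-in second-order transfer function simulated with lsim replaces the Simulink model, and the network size, training options, and signal values are assumptions rather than the settings used in the paper.

```matlab
% Minimal end-to-end sketch of the four-step pipeline above.
Ts   = 0.01;                          % assumed sample time [s]
Tsim = 10;                            % assumed run duration [s]
t    = (0:Ts:Tsim)';

% 1) Generate a diverse input and record the model response
u   = ones(size(t));                  % one training input (a unit step)
sys = tf(1, [1 1 1]);                 % stand-in for the Simulink model
y   = lsim(sys, u, t);                % recorded output signal

% 2) Train the LSTM on the input/output dataset
layers = [sequenceInputLayer(1)
          lstmLayer(32)               % assumed cell dimension
          fullyConnectedLayer(1)
          regressionLayer];
opts = trainingOptions('adam', 'MaxEpochs', 100, 'Verbose', false);
net  = trainNetwork(u', y', layers, opts);

% 3) Feed a new input to both the plant and the trained network
uTest = 0.5*ones(size(t));
yRef  = lsim(sys, uTest, t);
yLSTM = predict(net, uTest')';

% 4) Identify the poles from the data with tfest and assess stability
sysHat   = tfest(iddata(yLSTM, uTest, Ts), 2);  % second-order fit
p        = pole(sysHat);
isStable = all(real(p) < 0);          % continuous-time stability check
```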
3.2.1. Training Dataset Generation
3.2.2. Neural Network Definition and Training
3.2.3. Testing Dataset Generation
3.2.4. Poles’ Identification and Stability Assessment
4. Results and Discussion
- The tuning of the parameters of the network is explained in Section 4.1;
- The physical parameters of the models are introduced, and theoretical poles obtained with that parameter set are reported in Section 4.2;
- The statistics of the identified poles are given and compared to the theoretical ones in Section 4.3;
- Additional analyses are reported in Section 4.4 and Section 4.5 to complement the presented results;
- Lessons learned and additional design considerations are reported in Section 4.6.
4.1. Tuning of the Network Parameters
- the dimension of LSTM cells;
- the number of training epochs;
- the look-back factor (a sketch of how these hyperparameters can be defined follows this list).
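The following is a minimal sketch of how the three tuned hyperparameters could map onto a MATLAB Deep Learning Toolbox definition; the numerical values, and the encoding of the look-back factor as parallel input features, are assumptions for illustration, not the tuned values of the study.

```matlab
% Illustrative hyperparameter set; values are assumptions only.
nCells   = 32;    % dimension of the LSTM cells (hidden units)
nEpochs  = 100;   % number of training epochs
lookBack = 10;    % look-back factor: past samples presented per step

% One possible encoding: stack the lookBack most recent input samples
% as parallel features of the sequence input layer.
layers = [sequenceInputLayer(lookBack)
          lstmLayer(nCells)
          fullyConnectedLayer(1)
          regressionLayer];

opts = trainingOptions('adam', ...
                       'MaxEpochs', nEpochs, ...
                       'Shuffle', 'every-epoch');
```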
4.2. Tuning of the Model Parameters and Theoretical Poles
4.3. Poles’ Statistics in Different Models
4.4. Unstable Model Identification
4.5. Introducing Noise
4.6. Additional Design Considerations
- Ensuring that the training dataset is composed of signals that differ in both magnitude and type is a key factor in obtaining an NN capable of generalising the output. Further experiments confirmed that this diversity improves the network's ability to learn different dynamic behaviours, with corresponding improvements in the identified poles. During the design of the datasets, two main factors emerged that can jeopardise the accuracy of the identified poles:
- the presence of ramps with low slopes;
- the presence of sinusoidal inputs with long periods.
More generally, inputs that excite only the slow dynamics are believed to cause issues when using tfest. The issues associated with the sinusoidal inputs can be avoided by setting a lower bound on the period of those signals: denoting by $T_s$ the period of a chosen sinusoidal input and by $T_{sim}$ the duration of the simulation run, it is possible to impose $T_s \le T_{sim}$. Regarding the slope of the ramp inputs, no analytical bound was identified, and a trial-and-error approach led to a different minimum value for each simulated model (one value for the linear model and one for the nonlinear model); these guards are sketched in the code example below.
- It is also important to recall that the dataset size affects the training capabilities of the DNN. In this application, the dataset, formed by 500 runs (corresponding to 500 different inputs), contains a correspondingly large number of data points. Previous training tests carried out with a smaller dataset did not show satisfactory identification performance. Ensuring that the training set is large enough is therefore an important aspect of the procedure.
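A minimal sketch of the dataset-design guards discussed above follows; the run duration Tsim and the minimum ramp slope minSlope are illustrative assumptions (the paper found the slope bound by trial and error, with model-specific values).

```matlab
% Guards on the training inputs; Tsim and minSlope are assumed values.
Tsim     = 10;                 % duration of one simulation run [s]
minSlope = 0.5;                % assumed minimum admissible ramp slope
t = (0:0.01:Tsim)';

% Sinusoid: draw the period T_s so that T_s <= Tsim always holds
T_s   = Tsim*rand();           % period uniformly drawn in (0, Tsim)
u_sin = sin(2*pi*t/T_s);

% Ramp: redraw the slope until it clears the minimum bound
slope = 0;
while abs(slope) < minSlope
    slope = 2*rand() - 1;      % candidate slope in [-1, 1]
end
u_ramp = slope*t;
```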
- The last factor affecting the performance of the framework is the scaling of the dataset. The input gate of the LSTM cells is sensitive to the magnitude of the input signal: if the latter is excessive, the gate can saturate, degrading the identification performance. The input time series can be bounded within $[0, 1]$ by a normalisation, or rescaled to zero mean and unit standard deviation $\sigma$ by a standardisation. For this work, the dataset was standardised, since this method proved more robust to the presence of pronounced outliers. For instance, in a normalised dataset, a single step signal with a final value one order of magnitude higher than the average of the rest of the dataset could lead the NN to assign higher weights to some neurons in order to minimise the error for that case; that would in turn degrade the performance for the signals of smaller magnitude. On the contrary, standardising the dataset allows the network to detect when the high-value signal lies outside the typical range of the dataset and consequently to assign it a relatively lower weight during the training step.
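A minimal sketch of the two scaling options, assuming the training inputs are collected in a numeric matrix U (one run per column; the placeholder data below is illustrative):

```matlab
U = randn(1001, 500);   % placeholder dataset: 500 runs of 1001 samples

% Standardisation (zero mean, unit standard deviation), as adopted here:
mu    = mean(U, 'all');
sigma = std(U, 0, 'all');
Ustd  = (U - mu)/sigma;

% Min-max normalisation to [0, 1], shown for comparison: a single
% outlier run stretches the range and compresses all other signals.
Unorm = (U - min(U, [], 'all'))/(max(U, [], 'all') - min(U, [], 'all'));
```

The built-in normalize function provides the same two operations via its 'zscore' and 'range' methods.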
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
Abbreviation | Meaning
---|---
RNN | Recurrent Neural Networks
NARX | Nonlinear AutoRegressive models with eXogenous inputs
NARMAX | Nonlinear AutoRegressive Moving-Average with eXogenous inputs
DNN | Deep Neural Networks
ESN | Echo State Networks
GRU | Gated Recurrent Units
MSE | Mean Square Error
RL | Reinforcement Learning
Model | Damping | Pole 1 (rad/s) | Pole 2 (rad/s)
---|---|---|---
linear | underdamped | −0.50 + 0.87j | −0.50 − 0.87j
linear | overdamped | −0.10 | −9.90
nonlinear | underdamped | −1.00 + 2.97j | −1.00 − 2.97j
nonlinear | overdamped | −1.94 | −5.06
Dynamics | Model | Slow Pole | Fast Pole
---|---|---|---
linear overdamped | Simulink | 50.01% | 15.57%
linear overdamped | LSTM | 34.77% | 92.08%
nonlinear overdamped | Simulink | 7.67% | 21.99%
nonlinear overdamped | LSTM | 43.29% | 47.89%
Dynamics | Model |  |  
---|---|---|---
linear underdamped | Simulink | 0.17% | 1.64%
linear underdamped | LSTM | 74.2% | 26.5%
nonlinear underdamped | Simulink | 4.38% | 4.38%
nonlinear underdamped | LSTM | 50.86% | 3.93%