1. Introduction
A time series is a sequence of data points that vary over time; such sequences are ubiquitous in economics, industry, and many other sectors. Time-series prediction estimates future values or trends at a specific time point or over an interval from an established set of past observations, and it therefore serves as a basis for informed decisions about future actions. For example, investors frequently analyze recent stock price fluctuations to anticipate future movements. In recent years, time-series prediction has been studied extensively with various machine-learning techniques, among which neural networks have generally been shown to yield better prediction results than traditional machine-learning methods [1,2].
Because individual time series differ, they may exhibit various patterns of change, such as periodicity, stationarity, and non-stationarity [3]. A general time series is often not only unstable in its overall trend but also fluctuates markedly at the micro level; that is, the change between adjacent sampling points is highly uncertain. Stock prices, sales, visitor flow, and similar series belong to this type. Compared with other series, these nonlinear and non-stationary sequences have more complex features, so traditional machine-learning and deep-learning models incur larger prediction errors on them [4]. Sequence stabilization techniques, such as the Fourier transform, the wavelet transform, and Empirical Mode Decomposition (EMD), are commonly used to handle such sequences. In recent years, EMD has been widely applied to the prediction of this type of sequence: by decomposing the original series, the short-term variations and the long-term trend can be separated, which improves prediction performance.
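As a brief illustration of this decomposition step (a minimal sketch on synthetic data, not the exact configuration used in our experiments), the snippet below assumes the third-party PyEMD package (installed as EMD-signal); its interface may differ slightly between versions.

```python
# Decompose a non-stationary series into IMFs; the residue carries the long-term trend.
import numpy as np
from PyEMD import CEEMDAN  # assumed third-party package: pip install EMD-signal

t = np.linspace(0.0, 1.0, 500)
# Synthetic series: slow trend + seasonal component + high-frequency noise
series = 0.5 * t + np.sin(2 * np.pi * 8 * t) + 0.2 * np.random.randn(t.size)

decomposer = CEEMDAN()
imfs = decomposer.ceemdan(series)        # array of shape (n_imfs, len(series))
residue = series - imfs.sum(axis=0)      # slowly varying remainder (trend component)

print(f"{imfs.shape[0]} IMFs extracted; fast IMFs hold short-term change, the residue holds the trend")
```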
However, in terms of dataset selection, existing EMD-based studies have mostly been conducted on a single sequence with a large number of time points. In practice, many datasets consist of more than one sequence, obtained from relatively independent entities within a single system; we call these “multiple sequences” in this paper. Although such sequences share the same attributes, they differ considerably in value ranges and variation characteristics. Research on applying EMD to multiple sequences is still lacking.
On the other hand, most studies use the same prediction model for all decomposed subsequences. However, since the subsequences have different characteristics and contribute differently to the original sequence, the performance of a single model also varies across them. Some studies do consider multiple candidate models. Consequently, we feed each subsequence into several models, then select the model with the best prediction performance for each subsequence and integrate the selected forecasts, following the idea of “decomposition–prediction–integration” [4].
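A minimal sketch of this selection-and-integration step is given below; the fit_predict and score arguments are hypothetical placeholders for the chosen training routine and validation metric (e.g., R2), and the candidate set would contain the networks compared later.

```python
# "Decomposition-prediction-integration": every candidate model is tried on every
# subsequence, the best model per subsequence is kept, and the chosen forecasts are summed.
import numpy as np

def integrate_forecasts(subsequences, candidates, fit_predict, score):
    """subsequences: list of 1-D arrays from CEEMDAN; candidates: dict name -> model factory."""
    chosen, parts = {}, []
    for i, sub in enumerate(subsequences):
        best = (None, -np.inf, None)                 # (model name, score, forecast)
        for name, make_model in candidates.items():
            pred = fit_predict(make_model(), sub)    # train on this subsequence, forecast its test span
            s = score(sub, pred)                     # e.g., validation R^2 of the forecast
            if s > best[1]:
                best = (name, s, pred)
        chosen[f"IMF{i + 1}"] = best[0]
        parts.append(best[2])
    # Summing the per-subsequence forecasts reconstructs the forecast of the original series
    return np.sum(parts, axis=0), chosen
```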
The attention mechanism, initially developed for natural language processing, is a neural network component that extracts a set of feature vectors for a given problem. The general attention model is therefore versatile across domains, including time-series analysis [5], and it can be employed in conjunction with other neural networks, such as the Transformer [6], to predict time series.
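As a minimal illustration of how such an attention layer can sit on top of a recurrent encoder (assumed tensor shapes, not the exact layers used in our model), the scaled dot-product form of [6] can be applied over the hidden states produced by an LSTM:

```python
# Scaled dot-product attention over LSTM hidden states: the last hidden state queries
# all time steps and the weighted sum becomes the context vector used for prediction.
import torch
import torch.nn.functional as F

def temporal_attention(hidden_states: torch.Tensor) -> torch.Tensor:
    """hidden_states: (batch, T, D) LSTM outputs; returns a (batch, D) context vector."""
    d = hidden_states.size(-1)
    query = hidden_states[:, -1:, :]                               # (batch, 1, D)
    scores = query @ hidden_states.transpose(1, 2) / d ** 0.5      # (batch, 1, T)
    weights = F.softmax(scores, dim=-1)                            # attention weights over time steps
    return (weights @ hidden_states).squeeze(1)                    # weighted sum of hidden states

h = torch.randn(32, 3, 2)            # e.g., batch of 32, T = 3 time steps, D = 2 features
context = temporal_attention(h)      # shape (32, 2)
```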
Most studies that combine attention mechanisms with sequence decomposition are still limited to using the same model for every subsequence. However, the subsequences differ both in the complexity of their local features and in global features such as period. Consequently, in line with the “decomposition–prediction–integration” concept, we focus on selecting the most suitable model for each subsequence, based on its prediction performance, to enhance the overall prediction accuracy.
To summarize, this research presents an integrated time-series prediction model built upon EMD and two attention mechanisms, namely Self-Attention (SA, a form of partial attention) and Temporal Attention (TA, a form of global attention). The primary innovations of our model are as follows:
Following the concept of “decomposition–prediction–integration”, the subsequences decomposed by CEEMDAN are separately fed into three networks, namely LSTM, LSTM with Self-Attention (LSTM-SA), and LSTM with Temporal Attention (LSTM-TA), for training; the best-performing model is then selected for each subsequence, and the selected predictions are integrated.
Experiments were conducted on both single-sequence and multiple-sequence datasets. In addition, considering the characteristics of multiple sequences, two data preprocessing methods, “global normalization” and “separate normalization”, are investigated and compared.
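To make the two preprocessing options concrete, the sketch below contrasts them using min-max scaling; the function and the toy data are illustrative assumptions, not the datasets used in our experiments.

```python
# "Global" normalization shares one min/max across all sequences, whereas "separate"
# normalization scales each sequence with its own min/max.
import numpy as np

def normalize(sequences: np.ndarray, mode: str = "separate") -> np.ndarray:
    """sequences: (n_sequences, n_time_steps); returns values scaled to [0, 1]."""
    if mode == "global":
        lo, hi = sequences.min(), sequences.max()      # one range shared by every sequence
        return (sequences - lo) / (hi - lo)
    lo = sequences.min(axis=1, keepdims=True)          # per-sequence minimum
    hi = sequences.max(axis=1, keepdims=True)          # per-sequence maximum
    return (sequences - lo) / (hi - lo)

data = np.array([[100.0, 150.0, 120.0],   # sequence with large values
                 [1.0, 3.0, 2.0]])        # sequence with small values
print(normalize(data, "global"))          # the small-valued sequence is squashed near zero
print(normalize(data, "separate"))        # each sequence spans the full [0, 1] range
```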
Author Contributions
Data curation, X.W. and R.Z.; investigation, X.W.; methodology, X.W., S.D. and R.Z.; project administration, X.W.; resources, X.W.; software, S.D.; validation, S.D. and R.Z.; writing—original draft, S.D.; writing—review and editing, X.W., S.D. and R.Z. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Data Availability Statement
Conflicts of Interest
The authors declare no conflict of interest.
References
- Lai, G.K.; Chang, W.C.; Yang, Y.M.; Liu, H.X. Modeling Long- and Short-Term Temporal Patterns with Deep Neural Networks. In Proceedings of the 41st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), Ann Arbor, MI, USA, 8–12 July 2018; pp. 95–104. [Google Scholar]
- Zerveas, G.; Jayaraman, S.; Patel, D.; Bhamidipaty, A.; Eickhoff, C. A Transformer-based Framework for Multivariate Time Series Representation Learning. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD ’21), Virtual Event, Singapore, 14–18 August 2021; pp. 2114–2124. [Google Scholar]
- Ren, H.S.; Xu, B.X.; Wang, Y.J.; Yi, C.; Huang, C.R.; Kou, X.Y.; Xing, T.; Yang, M.; Tong, J.; Zhang, Q. Time-Series Anomaly Detection Service at Microsoft. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD ‘19), Anchorage, AK, USA, 4–8 August 2019; pp. 3009–3017. [Google Scholar]
- Zhai, N.N. Prediction of Exchange Rate and Shanghai Composite Index Based on Two Integrated Models. Master's Thesis, Lanzhou University, Lanzhou, China, 2022. [Google Scholar]
- Brauwers, G.; Frasincar, F. A General Survey on Attention Mechanisms in Deep Learning. IEEE Trans. Knowl. Data Eng. 2023, 35, 3279–3298. [Google Scholar] [CrossRef]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS), Long Beach, CA, USA, 4–9 December 2017. [Google Scholar]
- Huang, N.E.; Shen, Z.; Long, S.R. The Empirical Mode Decomposition and the Hilbert Spectrum for Nonlinear and Non-stationary Time Series Analysis. Proc. R. Soc. London. Ser. A Math. Phys. Eng. Sci. 1998, 454, 903–995. [Google Scholar] [CrossRef]
- Wu, Z.; Huang, N.E. Ensemble empirical mode decomposition: A noise-assisted data analysis method. Adv. Adapt. Data Anal. 2009, 1, 1–41. [Google Scholar] [CrossRef]
- Torres, M.E.; Colominas, M.A.; Schlotthauer, G.; Flandrin, P. A complete ensemble empirical mode decomposition with adaptive noise. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, 22–27 May 2011; pp. 4144–4147. [Google Scholar]
- Zheng, J.D.; Su, M.X.; Ying, W.M.; Tong, J.Y.; Pan, Z.W. Improved uniform phase empirical mode decomposition and its application in machinery fault diagnosis. Measurement 2021, 179, 109425. [Google Scholar] [CrossRef]
- Adam, A.M.; Kyei, K.; Moyo, S.; Gill, R.; Gyamfi, E.N. Similarities in Southern African Development Community (SADC) Exchange Rate Markets Structure: Evidence from the Ensemble Empirical Mode Decomposition. J. Afr. Bus. 2022, 23, 516–530. [Google Scholar] [CrossRef]
- Mousavi, A.A.; Zhang, C.W.; Masri, S.F.; Gholipour, G. Structural damage detection method based on the complete ensemble empirical mode decomposition with adaptive noise: A model steel truss bridge case study. Struct. Health Monit.–Int. J. 2022, 21, 887–912. [Google Scholar] [CrossRef]
- Wang, J.Y.; Li, J.G.; Wang, H.T.; Guo, L.X. Composite fault diagnosis of gearbox based on empirical mode decomposition and improved variational mode decomposition. J. Low Freq. Noise Vib. Act. Control 2021, 40, 332–346. [Google Scholar] [CrossRef]
- Ying, W.M.; Zheng, J.D.; Pan, H.Y.; Liu, Q.Y. Permutation entropy-based improved uniform phase empirical mode decomposition for mechanical fault diagnosis. Digit. Signal Process. 2021, 117. [Google Scholar] [CrossRef]
- Seyrek, P.; Sener, B.; Ozbayoglu, A.M.; Unver, H.O. An Evaluation Study of EMD, EEMD, and VMD For Chatter Detection in Milling. In Proceedings of the 3rd International Conference on Industry 4.0 and Smart Manufacturing, Upper Austria Univ Appl Sci, Hagenberg Campus, Linz, Austria, 17–19 November 2021; pp. 160–174. [Google Scholar]
- Nguyen, H.P.; Baraldi, P.; Zio, E. Ensemble empirical mode decomposition and long short-term memory neural network for multi-step predictions of time series signals in nuclear power plants. Appl. Energy 2021, 283, 116346. [Google Scholar] [CrossRef]
- Peng, K.C.; Cao, X.Q.; Liu, B.N.; Guo, Y.N.; Tian, W.L. Ensemble Empirical Mode Decomposition with Adaptive Noise with Convolution Based Gated Recurrent Neural Network: A New Deep Learning Model for South Asian High Intensity Forecasting. Symmetry 2021, 13, 931. [Google Scholar] [CrossRef]
- Jin, Z.B.; Jin, Y.X.; Chen, Z.Y. Empirical mode decomposition using deep learning model for financial market forecasting. PeerJ Comput. Sci. 2022, 8, e1076. [Google Scholar] [CrossRef]
- Guo, X.; Li, W.J.; Qiao, J.F. A modular neural network with empirical mode decomposition and multi-view learning for time series prediction. Soft Comput. 2023, 27, 12609–12624. [Google Scholar] [CrossRef]
- Zhan, C.J.; Jiang, W.; Lin, F.B.; Zhang, S.T.; Li, B. A decomposition-ensemble broad learning system for AQI forecasting. Neural Comput. Appl. 2022, 34, 18461–18472. [Google Scholar] [CrossRef]
- Xie, G.; Qian, Y.T.; Wang, S.-Y. A decomposition-ensemble approach for tourism forecasting. Ann. Tour. Res. 2020, 81, 102891. [Google Scholar] [CrossRef] [PubMed]
- Yu, S.L. Study on Stock Index Prediction Based on Empirical Mode Decomposition and CNN-LSTM Neural Network Hybrid Model. Master's Thesis, Jiangxi University of Finance and Economics, Nanchang, China, 2022. [Google Scholar]
- Bahdanau, D.; Cho, K.; Bengio, Y. Neural machine translation by jointly learning to align and translate. arXiv 2015, arXiv:1409.0473. [Google Scholar]
- Li, Y.R.; Yang, J. Hydrological Time Series Prediction Model Based on Attention-LSTM Neural Network. In Proceedings of the 2019 2nd International Conference on Machine Learning and Machine Intelligence (MLMI ‘19), Jakarta, Indonesia, 18–20 September 2019; pp. 21–25. [Google Scholar]
- Hu, J.; Zheng, W.D. Multistage attention network for multivariate time series prediction. Neurocomputing 2020, 383, 122–137. [Google Scholar] [CrossRef]
- He, Q.; Liu, D.X.; Song, W.; Huang, D.-M.; Du, Y.-L. Typhoon trajectory prediction model based on dual attention mechanism. Mar. Sci. Bull. 2021, 40, 387–395. [Google Scholar]
- Hu, Y.T.; Xiao, F.Y. Network self attention for forecasting time series. Appl. Soft Comput. 2022, 124, 109092. [Google Scholar] [CrossRef]
- Wang, D.Z.; Chen, C.Y. Spatiotemporal Self-Attention-Based LSTNet for Multivariate Time Series Prediction. Int. J. Intell. Syst. 2023, 2023, 9523230. [Google Scholar] [CrossRef]
- Li, S.Y.; Jin, X.Y.; Xuan, Y.; Zhou, X.Y.; Chen, W.H.; Wang, Y.X.; Yan, X.F. Enhancing the locality and breaking the memory bottleneck of transformer on time series forecasting. Adv. Neural Inf. Process. Syst. 2019, 32, 5243–5253. [Google Scholar]
- Choromanski, K.; Likhosherstov, V.; Dohan, D.; Song, X.Y.; Gane, A.; Sarlos, T.; Hawkins, P.; Davis, J.; Mohiuddin, A.; Kaiser, L.; et al. Rethinking Attention with Performers. arXiv 2020, arXiv:2009.14794. [Google Scholar]
- Kitaev, N.; Kaiser, L.; Levskaya, A. Reformer: The efficient transformer. In Proceedings of the International Conference on Learning Representations, Addis Ababa, Ethiopia, 26–30 April 2020. [Google Scholar]
- Zhou, H.Y.; Zhang, S.H.; Peng, J.Q.; Zhang, S.; Li, J.X.; Xiong, H.; Zhang, W.C. Informer: Beyond efficient transformer for long sequence time-series forecasting. In Proceedings of the AAAI, virtually, 2–9 February 2021. [Google Scholar]
- Wu, H.X.; Xu, J.H.; Wang, J.M.; Long, M.S. Autoformer: Decomposition transformers with auto-correlation for long-term series forecasting. arXiv 2022, arXiv:2106.13008. [Google Scholar]
- Zhou, T.; Ma, Z.Q.; Wen, Q.S.; Wang, X.; Sun, L.; Jin, R. FEDformer: Frequency enhanced decomposed transformer for long-term series forecasting. In Proceedings of the 39th International Conference on Machine Learning (ICML 2022), Baltimore, MD, USA, 17–23 July 2022. [Google Scholar]
- Shabani, M.A.; Abdi, A.; Meng, L.L.; Sylvain, T. Scaleformer: Iterative multi-scale refining transformers for time series forecasting. In ICLR 2023. arXiv 2022, arXiv:2206.04038. [Google Scholar]
- Chen, L.; Chi, Y.G.; Guan, Y.Y.; Fan, J.L. A Hybrid Attention-Based EMD-LSTM Model for Financial Time Series Prediction. In Proceedings of the 2019 2nd International Conference on Artificial Intelligence and Big Data (ICAIBD), Chengdu, China, 25–28 May 2019; pp. 113–118. [Google Scholar]
- Hu, Z.D. Crude oil price prediction using CEEMDAN and LSTM-attention with news sentiment index. Oil Gas Sci. Technol.-Rev. IFP Energ. Nouv. 2021, 76, 28–38. [Google Scholar] [CrossRef]
- Neeraj; Mathew, J.; Behera, R.K. EMD-Att-LSTM: A Data-driven Strategy Combined with Deep Learning for Short-term Load Forecasting. J. Mod. Power Syst. Clean Energy 2022, 10, 1229–1240. [Google Scholar] [CrossRef]
- Huang, H.; Mao, J.N.; Lu, W.K.; Hu, G.; Liu, L. DEASeq2Seq: An attention based sequence to sequence model for short-term metro passenger flow prediction within decomposition-ensemble strategy. Transp. Res. Part C-Emerg. Technol. 2023, 146, 103965. [Google Scholar] [CrossRef]
- Yu, M.; Niu, D.X.; Wang, K.K.; Du, R.Y.; Yu, X.Y.; Sun, L.J.; Wang, F.R. Short-term photovoltaic power point-interval forecasting based on double-layer decomposition and WOA-BiLSTM-Attention and considering weather classification. Energy 2023, 275, 127348. [Google Scholar] [CrossRef]
- Liu, Y.J.; Liu, X.H.; Zhang, Y.X.; Li, S.P. CEGH: A Hybrid Model Using CEEMD, Entropy, GRU, and History Attention for Intraday Stock Market Forecasting. Entropy 2023, 75, 71. [Google Scholar] [CrossRef]
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef]
- Chung, J.; Gulcehre, C.; Cho, K.-H.; Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv 2014, arXiv:1412.3555. [Google Scholar]
- Qiu, X.P. Neural Networks and Deep Learning; China Machine Press: Beijing, China, 2020; p. 147. [Google Scholar]
- Li, W.; Chen, J.W.; Liu, R.X.; Hou, Y.G.; Du, S.G. T-Transformer Model for Predicting Tensor Time Series. Comput. Eng. Appl. 2023, 59, 57–62. [Google Scholar]
- Ojagh, S.; Cauteruccio, F.; Terracina, G.; Liang, S.H.L. Enhanced air quality prediction by edge-based spatiotemporal data preprocessing. Comput. Electrical Eng. 2021, 96, 107572. [Google Scholar] [CrossRef]
Figure 1. Basic structure of LSTM.
Figure 2. Basic structure of the attention mechanism.
Figure 3. Framework of our model.
Figure 4. The normalized NYtem dataset and two of its subsequences after CEEMDAN.
Figure 5. Procedure of self-attention (T = 3, D = 2).
Figure 6. Procedure of temporal attention (T = 3, D = 2).
Figure 7. The MRTS dataset and some of its subsequences after CEEMDAN. (a) The dataset is globally normalized; (b) the dataset is separately normalized.
Table 1. Prediction performance (R2) of LSTM on the NYtem subsequences without and with secondary normalization.

Subsequence | R2 without Secondary Normalization | R2 with Secondary Normalization |
---|---|---|
IMF1 | 0.209830 | 0.226047 |
IMF2 | 0.915300 | 0.919399 |
IMF3 | 0.997985 | 0.997993 |
IMF4 | 0.999910 | 0.999810 |
IMF5 | 0.999945 | 0.999879 |
IMF6 | 0.999975 | 0.999956 |
IMF7 | 0.999161 | 0.999540 |
IMF8 | 0.998578 | 0.999868 |
RES | −4.207980 | 0.999857 |
Table 2. Prediction results of LSTM, LSTM-SA, and LSTM-TA for each decomposed subsequence of NYtem.

Subsequences | LSTM MAE | LSTM RMSE | LSTM R2 | LSTM-SA MAE | LSTM-SA RMSE | LSTM-SA R2 | LSTM-TA MAE | LSTM-TA RMSE | LSTM-TA R2 |
---|---|---|---|---|---|---|---|---|---|
IMF1 | 0.027933 | 0.035760 | 0.221853 | 0.027081 | 0.034681 | 0.268124 | 0.023834 | 0.032200 | 0.369078 |
IMF2 | 0.007363 | 0.010486 | 0.918929 | 0.007320 | 0.010011 | 0.926119 | 0.006302 | 0.008748 | 0.943575 |
IMF3 | 0.000999 | 0.001522 | 0.997991 | 0.003152 | 0.005126 | 0.977205 | 0.000953 | 0.001505 | 0.998036 |
IMF4 | 0.000293 | 0.000369 | 0.999808 | 0.001652 | 0.002133 | 0.993582 | 0.000132 | 0.000175 | 0.999957 |
IMF5 | 0.000202 | 0.000259 | 0.999884 | 0.001201 | 0.001677 | 0.995127 | 0.000507 | 0.000540 | 0.999495 |
IMF6 | 0.000895 | 0.001158 | 0.999955 | 0.005709 | 0.007149 | 0.998275 | 0.000443 | 0.000566 | 0.999989 |
IMF7 | 0.000892 | 0.001029 | 0.999538 | 0.002317 | 0.003177 | 0.995594 | 0.002020 | 0.002384 | 0.997518 |
IMF8 | 0.000213 | 0.000344 | 0.999875 | 0.000706 | 0.000937 | 0.999072 | 0.001254 | 0.001464 | 0.997735 |
RES | 0.000035 | 0.000057 | 0.999830 | 0.000127 | 0.000163 | 0.998614 | 0.000057 | 0.000063 | 0.999793 |
Table 3. Results of the experiment on the NYtem dataset.

| Type | Network Model | MAE | RMSE | MAPE | R2 |
|---|---|---|---|---|---|
| Without CEEMDAN | SVR-linear | 3.913410 | 5.143536 | 0.082907 | 0.905441 |
| | SVR-RBF | 3.934860 | 5.164794 | 0.083349 | 0.904657 |
| | XGBoost | 4.073632 | 5.336617 | 0.087877 | 0.898208 |
| | LightGBM | 4.048071 | 5.328266 | 0.087172 | 0.868526 |
| | BP | 3.867268 | 5.132837 | 0.083532 | 0.905834 |
| | CNN | 3.862023 | 5.108569 | 0.082561 | 0.906722 |
| | RNN | 3.846566 | 5.100761 | 0.082990 | 0.907007 |
| | LSTM | 3.850042 | 5.093421 | 0.082190 | 0.907274 |
| | GRU | 3.848090 | 5.091207 | 0.082539 | 0.907354 |
| | LSTM-SA | 3.954155 | 5.211856 | 0.085696 | 0.902911 |
| | LSTM-TA | 3.833786 | 5.067739 | 0.081731 | 0.908325 |
| CEEMDAN + single network | RNN | 2.422320 | 3.136646 | 0.051033 | 0.964835 |
| | LSTM | 2.386664 | 3.102749 | 0.050591 | 0.965591 |
| | GRU | 2.393581 | 3.128216 | 0.050683 | 0.965024 |
| | LSTM-SA | 2.394424 | 3.095457 | 0.050952 | 0.965752 |
| | LSTM-TA | 2.104671 | 2.813940 | 0.044123 | 0.971698 |
| CEEMDAN + multi-network integration | RLG integration | 2.384882 | 3.102448 | 0.050459 | 0.965597 |
| | SA integration | 2.293810 | 2.974200 | 0.048594 | 0.968383 |
| | TA integration | 2.093080 | 2.808107 | 0.044101 | 0.971816 |
Table 4. Pre-experiment results of the two ways of normalization (without decomposition).

| Dataset | Way of Normalization | MAE | RMSE | MAPE | R2 |
|---|---|---|---|---|---|
| MRTS | Global | 567.8171 | 991.9713 | 0.239498 | 0.747510 |
| | Separate | 473.0790 | 805.7689 | 0.176781 | 0.833403 |
| SES | Global | 4.599162 | 6.930198 | 0.178287 | 0.931681 |
| | Separate | 4.414155 | 6.155032 | 0.200523 | 0.946109 |
Table 5. Prediction effect comparison of LSTM on MRTS between the two normalizations.

Subsequences | R2 of Global Normalization | R2 of Separate Normalization |
---|---|---|
IMF1 | −14.070469 | 0.467846 |
IMF2 | −7.345910 | 0.845353 |
IMF3 | 0.885757 | 0.981216 |
IMF4 | 0.980300 | 0.988858 |
IMF5 | 0.976094 | 0.998820 |
IMF6 | 0.965597 | 0.999776 |
IMF7 | 0.981883 | 0.999878 |
IMF8 | 0.993806 | 0.999924 |
IMF9 | 0.996590 | 0.999465 |
IMF10 | 0.988181 | 0.997533 |
IMF11 | 0.996145 | 0.999978 |
IMF12 | 0.962748 | N/A |
RES | 0.999179 | 0.987735 |
Table 6. Prediction results of LSTM and two LSTMs with attention for each decomposed subsequence of MRTS.

Subsequences | LSTM MAE | LSTM RMSE | LSTM R2 | LSTM-SA MAE | LSTM-SA RMSE | LSTM-SA R2 | LSTM-TA MAE | LSTM-TA RMSE | LSTM-TA R2 |
---|---|---|---|---|---|---|---|---|---|
IMF1 | 0.068942 | 0.097606 | 0.467846 | 0.065848 | 0.092235 | 0.524803 | 0.057779 | 0.080788 | 0.635440 |
IMF2 | 0.037138 | 0.050192 | 0.845353 | 0.035867 | 0.048719 | 0.854293 | 0.026625 | 0.038382 | 0.909564 |
IMF3 | 0.005232 | 0.010196 | 0.981216 | 0.006584 | 0.010877 | 0.978625 | 0.005231 | 0.010331 | 0.980717 |
IMF4 | 0.001694 | 0.003324 | 0.988858 | 0.008016 | 0.011720 | 0.861490 | 0.000901 | 0.001767 | 0.996852 |
IMF5 | 0.000490 | 0.000722 | 0.998820 | 0.001642 | 0.002323 | 0.987764 | 0.000186 | 0.000271 | 0.999833 |
IMF6 | 0.000398 | 0.000536 | 0.999776 | 0.002089 | 0.002830 | 0.993748 | 0.000209 | 0.000286 | 0.999936 |
IMF7 | 0.000545 | 0.000696 | 0.999878 | 0.002323 | 0.003289 | 0.997277 | 0.000268 | 0.000344 | 0.999970 |
IMF8 | 0.000947 | 0.001202 | 0.999924 | 0.004963 | 0.006195 | 0.997991 | 0.000263 | 0.000341 | 0.999994 |
IMF9 | 0.000995 | 0.001297 | 0.999465 | 0.001978 | 0.002493 | 0.998023 | 0.000688 | 0.000879 | 0.999754 |
IMF10 | 0.000416 | 0.000526 | 0.997533 | 0.001791 | 0.002043 | 0.962785 | 0.002017 | 0.002209 | 0.956496 |
IMF11 | 0.000159 | 0.000224 | 0.999978 | 0.000552 | 0.000700 | 0.999784 | 0.001272 | 0.001806 | 0.998562 |
RES | 0.000607 | 0.000729 | 0.987735 | 0.003025 | 0.003572 | 0.705902 | 0.002139 | 0.002418 | 0.865267 |
Table 7. Results of the experiment on the MRTS dataset.

| Type | Network Model | MAE | RMSE | MAPE | R2 |
|---|---|---|---|---|---|
| Without CEEMDAN | SVR-linear | 526.0849 | 969.2392 | 0.218535 | 0.758949 |
| | SVR-RBF | 463.5917 | 817.6718 | 0.176396 | 0.828445 |
| | XGBoost | 457.5255 | 819.7784 | 0.166906 | 0.827560 |
| | LightGBM | 454.6330 | 828.9988 | 0.167352 | 0.823658 |
| | BP | 515.1523 | 962.2463 | 0.212070 | 0.762415 |
| | CNN | 516.4068 | 955.7559 | 0.213203 | 0.765609 |
| | RNN | 529.6522 | 931.3066 | 0.217999 | 0.777448 |
| | LSTM | 464.5581 | 766.3151 | 0.173935 | 0.849318 |
| | GRU | 475.4387 | 766.6409 | 0.180633 | 0.849190 |
| | LSTM-SA | 460.4295 | 823.7485 | 0.167469 | 0.825885 |
| | LSTM-TA | 454.5888 | 731.5010 | 0.170132 | 0.862698 |
| CEEMDAN + single network | RNN | 471.7071 | 717.2306 | 0.205216 | 0.868003 |
| | LSTM | 386.4517 | 574.5975 | 0.169103 | 0.915282 |
| | GRU | 360.5774 | 536.4187 | 0.158428 | 0.926167 |
| | LSTM-SA | 367.6106 | 546.2686 | 0.163798 | 0.923430 |
| | LSTM-TA | 319.7796 | 481.9345 | 0.136828 | 0.940403 |
| CEEMDAN + multi-network integration | RLG integration | 360.7650 | 536.8711 | 0.158490 | 0.926042 |
| | SA integration | 358.9588 | 534.3266 | 0.160022 | 0.926741 |
| | TA integration | 319.0218 | 480.0937 | 0.136684 | 0.940858 |
Table 8. Results of the experiment on the SES dataset.

| Type | Network Model | MAE | RMSE | MAPE | R2 |
|---|---|---|---|---|---|
| Without CEEMDAN | SVR-linear | 4.220502 | 6.381952 | 0.182274 | 0.942063 |
| | SVR-RBF | 4.109615 | 6.111284 | 0.178699 | 0.946873 |
| | XGBoost | 4.397779 | 6.321137 | 0.192321 | 0.943161 |
| | LightGBM | 4.365276 | 6.217920 | 0.189436 | 0.945003 |
| | BP | 4.414531 | 6.233717 | 0.188793 | 0.944723 |
| | CNN | 4.442804 | 6.231893 | 0.194008 | 0.944755 |
| | RNN | 4.354441 | 6.101125 | 0.191706 | 0.947049 |
| | LSTM | 4.489087 | 6.286058 | 0.210149 | 0.943791 |
| | GRU | 4.395441 | 6.142638 | 0.198855 | 0.946326 |
| | LSTM-SA | 4.421953 | 6.132696 | 0.196389 | 0.946500 |
| | LSTM-TA | 4.337513 | 6.047821 | 0.197218 | 0.947970 |
| CEEMDAN + single network | RNN | 2.687751 | 3.686859 | 0.119663 | 0.980664 |
| | LSTM | 2.677974 | 3.646671 | 0.117467 | 0.981083 |
| | GRU | 2.680054 | 3.668520 | 0.120776 | 0.980856 |
| | LSTM-SA | 2.665125 | 3.683923 | 0.116307 | 0.980695 |
| | LSTM-TA | 2.398506 | 3.588900 | 0.097303 | 0.981678 |
| CEEMDAN + multi-network integration | RLG integration | 2.671909 | 3.690628 | 0.118131 | 0.980624 |
| | SA integration | 2.601956 | 3.604627 | 0.109592 | 0.981517 |
| | TA integration | 2.400186 | 3.583131 | 0.097391 | 0.981737 |
Table 9. Comparison of training results between two TA-integrated models and LSTM-TA.

| Dataset | Model | MAE | RMSE | MAPE | R2 |
|---|---|---|---|---|---|
| NYtem | LSTM-TA | 2.104671 | 2.813940 | 0.044123 | 0.971698 |
| | TA integration | 2.093080 | 2.808107 | 0.044101 | 0.971816 |
| | TA′ integration | 2.092645 | 2.808982 | 0.044135 | 0.971798 |
| MRTS | LSTM-TA | 319.7796 | 481.9345 | 0.136828 | 0.940403 |
| | TA integration | 319.0218 | 480.0937 | 0.136684 | 0.940858 |
| | TA′ integration | 319.3912 | 481.1079 | 0.136731 | 0.940608 |
| SES | LSTM-TA | 2.398506 | 3.588900 | 0.097303 | 0.981678 |
| | TA integration | 2.400186 | 3.583131 | 0.097391 | 0.981737 |
| | TA′ integration | 2.396544 | 3.585854 | 0.097185 | 0.981709 |
Table 10. The training effect and performance comparison of neural network models (on the MRTS dataset).

Neural Network Model | Relative MAE with Decomposition (LSTM = 1) | Relative RMSE with Decomposition (LSTM = 1) | Total Number of Training Parameters | Training Time Cost (s/100 Epochs) |
---|---|---|---|---|
RNN | 1.220611 | 1.248231 | 1153 | 9 |
LSTM | 1.000000 | 1.000000 | 4480 | 12 |
GRU | 0.933046 | 0.933556 | 3393 | 15 |
LSTM-SA | 0.951246 | 0.950698 | 7585 | 17 |
LSTM-TA | 0.827476 | 0.838734 | 12993 | 20 |
TFencoder | N/A | N/A | 989607 | 173 |
TF-FF | N/A | N/A | 1515777 | 206 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).