Taming the Chaos in Neural Network Time Series Predictions
Abstract
1. Introduction
- An insufficient amount of data, i.e., the data are not fine-grained or long enough;
- Random data, whose archetype is Brownian motion (cf. [2]);
- Bad parameterization of the algorithm.
2. Related Work
3. Methodology
4. Datasets
- Monthly international airline passengers: January 1949 to December 1960, 144 data points, given in units of 1000. Source: Time Series Data Library, [25];
- Monthly car sales in Quebec: January 1960 to December 1968, 108 data points. Source: Time Series Data Library [25];
- Monthly mean air temperature in Nottingham Castle: January 1920 to December 1939, given in degrees Fahrenheit, 240 data points. Source: Time Series Data Library [25];
- Perrin Freres monthly champagne sales: January 1964 to September 1972, 105 data points. Source: Time Series Data Library [25];
- CFE specialty monthly writing paper sales. This dataset spans 12 years and 3 months, 147 data points. Source: Time Series Data Library [25].
5. Applied Interpolation Techniques
5.1. Fractal Interpolation of Time Series Data
5.2. Fractal Interpolation Applied
- Divide the time series into m sub-sets of size l;
- For each sub-set i, calculate the corresponding Hurst exponent H_i;
- For each sub-set i, the following routine is performed k times (a Python sketch of the full routine follows this list):
- (a) Use the fractal interpolation method from Section 5.1 with a random vertical scaling parameter s, where s was set constant for the whole sub-set;
- (b) Calculate the Hurst exponent H_new for the interpolated time series;
- (c) If a previous best interpolation was stored, compare its Hurst exponent H_best to H_new. If H_new is closer to H_i, keep s and the corresponding fractal interpolation, and set H_best to H_new.
- The Hurst exponent was calculated using R/S analysis [20];
- The number of iterations k was set to 500 for each dataset;
- No threshold was set for how closely the Hurst exponent of the interpolated time series has to match that of the original time series since, for some sub-intervals, none of the thresholds that were tried could be reached.
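To make the routine concrete, the following is a minimal Python sketch, not the authors' code: the affine IFS construction follows Mazel and Hayes [26] (cf. also Barnsley et al. [27]), the chaos-game evaluation and the resampling grid are illustrative choices, and nolds.hurst_rs from Schölzel's nolds package [34] stands in for the R/S analysis of [20].

```python
import numpy as np
import nolds  # hurst_rs: R/S-based Hurst estimate, cf. [20,34]

def fif_maps(x, y, s):
    """Coefficients of the IFS maps w_i(x, y) = (a_i x + e_i, c_i x + s y + f_i)
    through the interpolation points (x, y), vertical scaling factor s [26,27]."""
    dx = x[-1] - x[0]
    a = (x[1:] - x[:-1]) / dx
    e = (x[-1] * x[:-1] - x[0] * x[1:]) / dx
    c = (y[1:] - y[:-1] - s * (y[-1] - y[0])) / dx
    f = (x[-1] * y[:-1] - x[0] * y[1:] - s * (x[-1] * y[0] - x[0] * y[-1])) / dx
    return a, c, e, f

def fractal_interpolate(y, n_ip, s, n_iter=20000):
    """Chaos-game evaluation of the fractal interpolation function,
    resampled so that n_ip new points lie between consecutive samples."""
    x = np.arange(len(y), dtype=float)
    a, c, e, f = fif_maps(x, y, s)
    pts, px, py = np.empty((n_iter, 2)), x[0], y[0]
    for j, i in enumerate(np.random.randint(0, len(a), size=n_iter)):
        px, py = a[i] * px + e[i], c[i] * px + s * py + f[i]
        pts[j] = px, py
    grid = np.linspace(x[0], x[-1], len(y) + (len(y) - 1) * n_ip)
    order = np.argsort(pts[:, 0])
    return np.interp(grid, pts[order, 0], pts[order, 1])

def hurst_matched_interpolation(subset, n_ip, k=500):
    """Steps (a)-(c) above: draw s in (0, 1) k times and keep the
    interpolation whose Hurst exponent is closest to the sub-set's."""
    subset = np.asarray(subset, dtype=float)
    h_target = nolds.hurst_rs(subset)   # sub-set must be long enough for R/S
    best, h_best = None, None
    for _ in range(k):
        s = np.random.uniform(0.0, 1.0)  # assumed range for s
        candidate = fractal_interpolate(subset, n_ip, s)
        h_new = nolds.hurst_rs(candidate)
        if h_best is None or abs(h_new - h_target) < abs(h_best - h_target):
            best, h_best = candidate, h_new
    return best
```

Note that with s = 0 the construction degenerates to piecewise linear interpolation, which makes the comparison with Section 5.3 direct.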
5.3. Linear Interpolation
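The counterpart is straightforward: a minimal NumPy sketch of equidistant linear interpolation on the same grid as in the fractal case above.

```python
import numpy as np

def linear_interpolate(y, n_ip):
    """Insert n_ip equidistant points between consecutive samples."""
    x = np.arange(len(y), dtype=float)
    grid = np.linspace(x[0], x[-1], len(y) + (len(y) - 1) * n_ip)
    return np.interp(grid, x, y)
```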
6. Measuring the Complexity of the Data
6.1. The Hurst Exponent (R/S Analysis)
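A compact sketch of classical R/S analysis [20,29], assuming dyadic window sizes and a log-log regression for the slope; production estimators (e.g., nolds.hurst_rs) apply further corrections.

```python
import numpy as np

def hurst_rs(series, min_chunk=8):
    """Estimate H as the slope of log(R/S) against log(n)."""
    y = np.asarray(series, dtype=float)
    ns, rs = [], []
    n = min_chunk
    while n <= len(y) // 2:
        ratios = []
        for i in range(0, len(y) - n + 1, n):
            chunk = y[i:i + n]
            z = np.cumsum(chunk - chunk.mean())    # cumulative deviations
            r, s = z.max() - z.min(), chunk.std()  # range and std deviation
            if s > 0:
                ratios.append(r / s)
        if ratios:
            ns.append(n)
            rs.append(np.mean(ratios))
        n *= 2                                     # dyadic window sizes
    slope, _ = np.polyfit(np.log(ns), np.log(rs), 1)
    return slope
```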
6.2. The Lyapunov Exponents’ Spectrum
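The spectrum is delicate to estimate from short series; a hedged sketch using nolds.lyap_e [34], which implements the method of Eckmann et al. [35]. The embedding parameters below are placeholders, not the paper's; this implementation requires emb_dim − 1 to be a multiple of matrix_dim − 1.

```python
import numpy as np
import nolds

series = np.random.rand(500)  # placeholder; use a dataset from Section 4
spectrum = nolds.lyap_e(series, emb_dim=10, matrix_dim=4)  # four exponents
largest = spectrum[0]  # > 0 indicates chaotic dynamics, limiting predictability
```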
6.3. Fisher’s Information
6.4. SVD Entropy
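Fisher's information (Section 6.3) and SVD entropy both derive from the normalized singular values of a time-delay embedding matrix [39], so one NumPy sketch covers both. The embedding parameters tau and m are illustrative assumptions; the paper determines them per dataset (Appendix F).

```python
import numpy as np

def embedding_matrix(y, tau=1, m=3):
    """Time-delay embedding: rows [y_i, y_{i+tau}, ..., y_{i+(m-1)tau}]."""
    n = len(y) - (m - 1) * tau
    return np.array([y[i:i + (m - 1) * tau + 1:tau] for i in range(n)])

def sv_spectrum(y, tau=1, m=3):
    sv = np.linalg.svd(embedding_matrix(np.asarray(y, float), tau, m),
                       compute_uv=False)
    return sv / sv.sum()                  # normalized singular values

def svd_entropy(y, tau=1, m=3):
    p = sv_spectrum(y, tau, m)
    p = p[p > 0]                          # guard against log2(0)
    return -np.sum(p * np.log2(p))

def fisher_information(y, tau=1, m=3):
    p = sv_spectrum(y, tau, m)
    return np.sum(np.diff(p) ** 2 / p[:-1])
```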
6.5. Shannon’s Entropy
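A direct sketch of the frequency-of-occurrence definition [47]. As noted in Section 6.6, for real-valued series almost every value is unique, so the measure saturates unless values repeat (as monthly temperatures do).

```python
import numpy as np
from collections import Counter

def shannon_entropy(series):
    """Shannon entropy of the empirical value distribution, in bits."""
    counts = np.array(list(Counter(series).values()), dtype=float)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))
```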
6.6. Initial Complexity
- The Hurst exponent: The most persistent dataset, with a Hurst exponent of 0.7988, is the monthly car sales in Quebec dataset. Following [33], we expected that time series data with a very high Hurst exponent can be predicted with higher accuracy than data with a value close to 0.5, which is considered more random. Three of the datasets under study are persistent, i.e., have a Hurst exponent larger than 0.5; the other two are anti-persistent, with a Hurst exponent below 0.5;
- The largest Lyapunov exponent: The largest Lyapunov exponents of all time series data under study are positive, just as we would expect from chaotic or complex real-life data. The dataset with the highest value is the monthly car sales in Quebec dataset. As the Lyapunov exponents of experimental time series data serve as a measure of predictability, we therefore expect this dataset to be forecast with the least accuracy;
- Fisher’s information: The dataset with the highest value of Fisher’s information is the monthly international airline passengers dataset; the lowest value is found for the Perrin Freres monthly champagne sales dataset. Following [38], Fisher’s information is expected to behave contrary to entropy measures, since it is a measure of order/quality; we observed this only for SVD entropy. That is, the Fisher’s information value for the monthly international airline passengers dataset is the highest, and the corresponding SVD entropy value is the lowest among all datasets. The reason may be that, just like Fisher’s information, SVD entropy is based on a singular value decomposition [39]. Shannon’s entropy, an entropy measure not based on the singular value decomposition, behaves differently;
- SVD entropy: The largest SVD entropy value belongs to the Perrin Freres monthly champagne sales dataset, which, just as expected, has the lowest value of Fisher’s information. Given its very high SVD entropy, we expect the Perrin Freres champagne sales dataset not to be predicted with high accuracy;
- Shannon’s entropy: Shannon’s entropy is based on the frequency of occurrence of a specific value. As we deal with non-integer-valued complex datasets, we expect Shannon’s entropy not to be of much use. Its highest value is found for the monthly mean temperature in Nottingham Castle dataset; since recurring temperature values possess a higher regularity than, e.g., airline passenger numbers, this explains the corresponding value.
6.7. Complexity Plots
- The Hurst exponents of the fractal- and linear-interpolated datasets behave very similarly for the monthly international airline passengers dataset; see Figure 3. We observe similar behavior for the other datasets as well; see Appendix A. Though the Hurst exponent is initially lower for the fractal-interpolated data for some datasets, it does not differ significantly between fractal- and linear-interpolated time series data. In addition, adding more interpolation points increases the Hurst exponent and makes the datasets more persistent;
- The largest Lyapunov exponents of the fractal-interpolated data are much closer to those of the original data than the ones for the linear-interpolated data; see Figure 4. We observe the same behavior for all datasets; see Appendix A;
- Fisher’s information for the fractal-interpolated dataset is closer to that of the original dataset; see Figure 3. We observe the same behavior for all datasets, as can be seen in Appendix A;
- Just as expected, SVD entropy behaves contrary to Fisher’s information. In addition, the SVD entropy of the fractal-interpolated time series is closer to that of the non-interpolated time series; see Figure 5. The same behavior, and specifically the behavior contrary to that of Fisher’s information, can be observed for all datasets under study; see Appendix A;
- Shannon’s entropy increases. This can be explained as follows: as more data points are added, the probability of hitting the same value increases, and this frequency of occurrence is exactly what Shannon’s entropy measures. For small numbers of interpolation points, Shannon’s entropy of the fractal-interpolated time series data is closer to the original complexity than that of the linear-interpolated data. For large numbers of interpolation points, Shannon’s entropy of the fractal- and linear-interpolated time series data behaves very similarly, if not identically. This behavior can be observed for all datasets; see Figure 4 and Appendix A.
- The fractal interpolation captures the complexity of the original data better than the linear interpolation. We observe a significant difference in behavior when studying SVD entropy, Fisher’s information, and the largest Lyapunov exponent. This is especially true for the largest Lyapunov exponent, where the behavior differs completely: that of the fractal-interpolated time series data stays mostly constant or behaves linearly, whereas that of the linear-interpolated data behaves approximately like a sigmoid function and, for some datasets, even decreases again for large numbers of interpolation points.
- Neither Shannon’s entropy nor the Hurst exponent seems suitable for differentiating between fractal- and linear-interpolated time series data.
7. LSTM Ensemble Predictions
7.1. Data Preprocessing
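A hedged sketch of typical preprocessing for LSTM forecasting, min-max scaling to [0, 1] and window-to-one framing; these are common choices shown for illustration, not necessarily the authors' exact pipeline.

```python
import numpy as np

def make_windows(series, window):
    """Scale a series to [0, 1] and frame it into (samples, window, 1)
    inputs with next-step targets for window-to-one prediction."""
    y = np.asarray(series, dtype=float)
    y = (y - y.min()) / (y.max() - y.min())
    X = np.array([y[i:i + window] for i in range(len(y) - window)])
    t = y[window:]
    return X[..., None], t
```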
7.2. Random Ensemble Architecture
- LSTM layers: min 1, max 5;
- Size of the input layer: min 1, max depending on the size of the data, i.e., the length of the training data minus 1;
- Epochs: min 1, max 30;
- Neurons: For both LSTM and Dense layers, min 1, max 30;
- Batch size: chosen randomly from a fixed set of values for each ensemble member (a sketch of one randomly parameterized member follows this list).
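The following sketch draws one ensemble member within the ranges above, assuming the Keras API and the window-to-one framing from Section 7.1; the batch-size candidate set shown is a placeholder, as the original set is not reproduced here.

```python
import random
from tensorflow import keras
from tensorflow.keras import layers

def random_member(train_len):
    """Draw one random architecture within the ranges listed above."""
    n_lstm = random.randint(1, 5)                  # LSTM layers: 1..5
    input_size = random.randint(1, train_len - 1)  # input window length
    epochs = random.randint(1, 30)
    batch_size = random.choice([1, 2, 4, 8, 16])   # placeholder set
    inputs = keras.Input(shape=(input_size, 1))
    x = inputs
    for i in range(n_lstm):                        # stacked LSTMs, 1..30 units
        x = layers.LSTM(random.randint(1, 30),
                        return_sequences=(i < n_lstm - 1))(x)
    x = layers.Dense(random.randint(1, 30), activation="relu")(x)
    outputs = layers.Dense(1)(x)                   # single-step output
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mse")
    return model, input_size, epochs, batch_size
```

A member is then trained with model.fit(X, t, epochs=epochs, batch_size=batch_size) on windows of length input_size.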
8. Error Analysis
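The error tables report RMSE (cf. Section 11.3); for reference, a minimal sketch:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between target and prediction."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2))
```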
9. Complexity Filters
- Hurst exponent filter: The Hurst exponent of each prediction of the ensemble was calculated and compared to the complexity of the corresponding training dataset, i.e., the complexity of the fractal-interpolated training data was compared with that of the fractal-interpolated predictions, and likewise for the linear-interpolated and non-interpolated cases. (Note that this is crucial for real-time predictions, as there is no validation dataset.)
- Lyapunov exponents filter: The first four Lyapunov exponents of each prediction of the ensemble were calculated and compared with those of the training dataset.
- Fisher’s information filter: Fisher’s information of each prediction of the ensemble was calculated and compared to Fisher’s information of the training dataset.
- SVD entropy filter: The SVD entropy of each prediction of the ensemble was calculated and compared to the SVD entropy of the training dataset.
- Shannon’s entropy filter: Shannon’s entropy of each prediction of the ensemble was calculated and compared to Shannon’s entropy of the training dataset (a generic sketch of such a filter follows this list).
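A generic sketch of the filters listed above: compute the chosen measure for each ensemble prediction, keep the n_keep predictions closest to the training data's value, and average them. n_keep and the averaging step are illustrative assumptions; any measure from Section 6 (e.g., hurst_rs above) can be plugged in.

```python
import numpy as np

def complexity_filter(predictions, train_series, measure, n_keep=5):
    """Keep the n_keep predictions whose complexity measure lies closest
    to that of the training data, then average them."""
    target = measure(train_series)
    dist = [abs(measure(p) - target) for p in predictions]
    keep = np.argsort(dist)[:n_keep]
    return np.mean([predictions[i] for i in keep], axis=0)

# The two-word filters in the error tables (e.g., "shannon fisher")
# suggest two such measures applied in sequence.
```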
10. Baseline Predictions
11. Results and Discussion
11.1. Interpolation Techniques
11.2. Complexity Filters
11.3. Remarks and Summary
- Random ensemble predictions can be improved significantly using fractal and linear interpolation techniques. The authors recommend the fractal interpolation approach, as the results shown feature more stable behavior than those for the linear interpolation;
- Random ensemble predictions can be improved significantly by using complexity filters to reduce the number of predictions in an ensemble. Comparing the unfiltered and non-interpolated results shown in Table A5 and Table A6 with the best results, shown in Table 5 and Table A1, Table A2, Table A3 and Table A4, we see that the RMSE was reduced substantially on average;
- The best results of the random ensemble, i.e., the single step-by-step predictions, always outperformed the baseline predictions; see Table 2, Table 3 and Table 4 and Appendix D. We note that the given baseline predictions are probably not the best results achievable with an optimized LSTM neural network, but they are still reasonable and serve as a baseline to show the quality of the ensemble predictions;
- Though the unfiltered results (Table A5 and Table A6) suggest a trend and a minimum of the errors depending on the number of interpolation points, this trend vanishes when complexity filters are applied. We could therefore not find a trend in the number of interpolation points for any interpolation technique or complexity filter.
12. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. Complexity Plots for All Datasets
Appendix B. Error Tables
Interpolation Technique | # of Interpolation Points | Filter | Error |
---|---|---|---|
non-interpolated | - | shannon fisher | 0.08814 ± 0.01496 |
non-interpolated | - | shannon svd | 0.08814 ± 0.01496 |
non-interpolated | - | fisher hurst | 0.09098 ± 0.01511 |
non-interpolated | - | svd hurst | 0.09098 ± 0.01511 |
non-interpolated | - | fisher | 0.09932 ± 0.00602 |
fractal-interpolated | 13 | shannon fisher | 0.08099 ± 0.00961 |
fractal-interpolated | 13 | shannon svd | 0.08099 ± 0.00961 |
fractal-interpolated | 9 | lyap hurst | 0.08585 ± 0.01546 |
fractal-interpolated | 17 | fisher lyap | 0.08659 ± 0.01869 |
fractal-interpolated | 17 | svd lyap | 0.08659 ± 0.01869 |
linear-interpolated | 11 | lyap hurst | 0.07567 ± 0.03563 |
linear-interpolated | 7 | fisher | 0.08500 ± 0.02254 |
linear-interpolated | 7 | fisher svd | 0.08500 ± 0.02254 |
linear-interpolated | 7 | fisher shannon | 0.08500 ± 0.02254 |
linear-interpolated | 5 | svd | 0.08500 ± 0.02254 |
Interpolation Technique | # of Interpolation Points | Filter | Error |
---|---|---|---|
non-interpolated | - | shannon fisher | 0.05728 ± 0.00418 |
non-interpolated | - | fisher svd | 0.05877 ± 0.01496 |
non-interpolated | - | svd | 0.05877 ± 0.01496 |
non-interpolated | - | svd shannon | 0.05877 ± 0.01496 |
non-interpolated | - | shannon fisher | 0.05901 ± 0.00263 |
fractal-interpolated | 1 | shannon svd | 0.05724 ± 0.00495 |
fractal-interpolated | 7 | shannon hurst | 0.05873 ± 0.00684 |
fractal-interpolated | 5 | shannon lyap | 0.05943 ± 0.01648 |
fractal-interpolated | 7 | fisher hurst | 0.05946 ± 0.00519 |
fractal-interpolated | 3 | hurst | 0.05998 ± 0.00544 |
linear-interpolated | 3 | lyap hurst | 0.05625 ± 0.00632 |
linear-interpolated | 7 | lyap fisher | 0.05635 ± 0.00481 |
linear-interpolated | 3 | hurst | 0.05742 ± 0.00623 |
linear-interpolated | 7 | lyap svd | 0.05786 ± 0.00511 |
linear-interpolated | 3 | svd lyap | 0.05862 ± 0.00416 |
Interpolation Technique | # of Interpolation Points | Filter | Error |
---|---|---|---|
non-interpolated | - | shannon fisher | 0.06383 ± 0.02706 |
non-interpolated | - | shannon svd | 0.06383 ± 0.02706 |
non-interpolated | - | hurst fisher | 0.07245 ± 0.01571 |
non-interpolated | - | hurst svd | 0.07387 ± 0.01695 |
non-interpolated | - | hurst lyap | 0.07403 ± 0.01740 |
fractal-interpolated | 13 | fisher hurst | 0.04968 ± 0.02155 |
fractal-interpolated | 13 | svd hurst | 0.04968 ± 0.02155 |
fractal-interpolated | 11 | shannon hurst | 0.05001 ± 0.01416 |
fractal-interpolated | 17 | hurst lyap | 0.05166 ± 0.01066 |
fractal-interpolated | 13 | hurst | 0.05386 ± 0.0154 |
linear-interpolated | 17 | hurst | 0.05449 ± 0.0280 |
linear-interpolated | 17 | hurst shannon | 0.05449 ± 0.0280 |
linear-interpolated | 9 | fisher | 0.05730 ± 0.03250 |
linear-interpolated | 9 | fisher shannon | 0.05730 ± 0.03250 |
linear-interpolated | 9 | svd fisher | 0.05730 ± 0.03250 |
Interpolation Technique | # of Interpolation Points | Filter | Error |
---|---|---|---|
non-interpolated | - | shannon fisher | 0.18996 ± 0.00957 |
non-interpolated | - | shannon svd | 0.19200 ± 0.01041 |
non-interpolated | - | hurst fisher | 0.19314 ± 0.01057 |
non-interpolated | - | hurst svd | 0.19314 ± 0.01057 |
non-interpolated | - | fisher hurst | 0.19328 ± 0.01021 |
fractal-interpolated | 5 | fisher lyap | 0.17685 ± 0.00601 |
fractal-interpolated | 5 | svd lyap | 0.17685 ± 0.00601 |
fractal-interpolated | 5 | lyap hurst | 0.18138 ± 0.00939 |
fractal-interpolated | 1 | hurst fisher | 0.18332 ± 0.00751 |
fractal-interpolated | 1 | hurst svd | 0.18332 ± 0.00751 |
linear-interpolated | 7 | hurst fisher | 0.17651 ± 0.01096 |
linear-interpolated | 7 | hurst svd | 0.17651 ± 0.01096 |
linear-interpolated | 15 | shannon lyap | 0.18026 ± 0.00973 |
linear-interpolated | 3 | shannon hurst | 0.18149 ± 0.01623 |
linear-interpolated | 7 | fisher lyap | 0.18201 ± 0.00619 |
Appendix C. Prediction Plots
Appendix D. Baseline Predictions
Appendix E. Unfiltered Ensemble Prediction Errors
# of Interpolation Points | Monthly International Airline Passengers | Monthly Car Sales in Quebec | Monthly Mean Air Temperature in Nottingham Castle | Perrin Freres Monthly Champagne Sales | CFE Specialty Monthly Writing Paper Sales |
---|---|---|---|---|---|
0 | 0.16771 ± 0.01537 | 0.20779 ± 0.03961 | 0.28503 ± 0.04816 | 0.20641 ± 0.05741 | 0.38041 ± 0.05203 |
1 | 0.16076 ± 0.01917 | 0.20239 ± 0.04121 | 0.31272 ± 0.04877 | 0.19946 ± 0.06391 | 0.38070 ± 0.05338 |
3 | 0.16487 ± 0.01758 | 0.18964 ± 0.04448 | 0.30624 ± 0.05029 | 0.18770 ± 0.07113 | 0.38474 ± 0.05309 |
5 | 0.15347 ± 0.01988 | 0.18814 ± 0.04654 | 0.30236 ± 0.05090 | 0.18858 ± 0.07011 | 0.37807 ± 0.05443 |
7 | 0.15710 ± 0.02002 | 0.18252 ± 0.04657 | 0.30020 ± 0.05112 | 0.18439 ± 0.07311 | 0.37935 ± 0.05479 |
9 | 0.14610 ± 0.02088 | 0.17944 ± 0.04784 | 0.29381 ± 0.05185 | 0.17920 ± 0.07327 | 0.39043 ± 0.05280 |
11 | 0.15410 ± 0.02082 | 0.18092 ± 0.04787 | 0.30588 ± 0.05123 | 0.18487 ± 0.07284 | 0.36708 ± 0.05583 |
13 | 0.15361 ± 0.02014 | 0.17582 ± 0.04781 | 0.30105 ± 0.05151 | 0.18408 ± 0.07380 | 0.39382 ± 0.05228 |
15 | 0.15359 ± 0.02091 | 0.17476 ± 0.04773 | 0.31103 ± 0.05033 | 0.17973 ± 0.07522 | 0.39385 ± 0.05276 |
17 | 0.16245 ± 0.02004 | 0.17571 ± 0.04754 | 0.30171 ± 0.05125 | 0.18219 ± 0.07404 | 0.37625 ± 0.05515 |
# of Interpolation Points | Monthly International Airline Passengers | Monthly Car Sales in Quebec | Monthly Mean Air Temperature in Nottingham Castle | Perrin Freres Monthly Champagne Sales | CFE Specialty Monthly Writing Paper Sales |
---|---|---|---|---|---|
0 | 0.16771 ± 0.01537 | 0.20779 ± 0.03961 | 0.28503 ± 0.04816 | 0.20641 ± 0.05741 | 0.38041 ± 0.05203 |
1 | 0.16294 ± 0.01917 | 0.20283 ± 0.04332 | 0.26516 ± 0.05052 | 0.21457 ± 0.06303 | 0.36967 ± 0.05309 |
3 | 0.15199 ± 0.01758 | 0.19681 ± 0.04584 | 0.29448 ± 0.05088 | 0.19397 ± 0.0691 | 0.37922 ± 0.05435 |
5 | 0.15088 ± 0.01988 | 0.17882 ± 0.04761 | 0.27367 ± 0.05107 | 0.19438 ± 0.07132 | 0.35778 ± 0.05520 |
7 | 0.14553 ± 0.02002 | 0.17105 ± 0.04771 | 0.28405 ± 0.05130 | 0.18642 ± 0.07327 | 0.37685 ± 0.05533 |
9 | 0.15033 ± 0.02088 | 0.18831 ± 0.04813 | 0.28135 ± 0.05186 | 0.20273 ± 0.07183 | 0.35956 ± 0.05501 |
11 | 0.15664 ± 0.02082 | 0.18738 ± 0.04832 | 0.28566 ± 0.05130 | 0.18151 ± 0.07370 | 0.38573 ± 0.05557 |
13 | 0.15459 ± 0.02014 | 0.17700 ± 0.04855 | 0.30069 ± 0.05168 | 0.19560 ± 0.07281 | 0.38573 ± 0.05504 |
15 | 0.15090 ± 0.02091 | 0.18368 ± 0.04825 | 0.30349 ± 0.05114 | 0.19760 ± 0.07235 | 0.38506 ± 0.05515 |
17 | 0.15428 ± 0.02004 | 0.18468 ± 0.04903 | 0.28451 ± 0.05208 | 0.18838 ± 0.07347 | 0.36044 ± 0.05591 |
Appendix F. Phase Space Embeddings
- Monthly international airline passengers: time delay, embedding dimension;
- Monthly car sales in Quebec: time delay, embedding dimension;
- Monthly mean temperature in Nottingham Castle: time delay, embedding dimension;
- Perrin Freres monthly champagne sales: time delay, embedding dimension;
- CFE specialty monthly writing paper sales: time delay, embedding dimension.
References
- Rasheed, K.; Qayyum, A.; Qadir, J.; Sivathamboo, S.; Kwan, P.; Kuhlmann, L.; O’Brien, T.; Razi, A. Machine Learning for Predicting Epileptic Seizures Using EEG Signals: A Review. IEEE Rev. Biomed. Eng. 2021, 14, 139–155. [Google Scholar] [CrossRef]
- Wang, M.C.; Uhlenbeck, G.E. On the Theory of the Brownian Motion II. Rev. Mod. Phys. 1945, 17, 323–342. [Google Scholar] [CrossRef]
- Karaca, Y.; Zhang, Y.D.; Muhammad, K. A Novel Framework of Rescaled Range Fractal Analysis and Entropy-Based Indicators: Forecasting Modelling for Stock Market Indices. Expert Syst. Appl. 2019. [Google Scholar] [CrossRef]
- Manousopoulos, P.; Drakopoulos, V.; Theoharis, T. Curve Fitting by Fractal Interpolation. In Transactions on Computational Science I; Gavrilova, M.L., Tan, C.J.K., Eds.; Springer: Berlin/Heidelberg, Germany, 2008; pp. 85–103. [Google Scholar] [CrossRef]
- Hochreiter, S.; Schmidhuber, J. Long Short-term Memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef]
- Cho, K.; van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; Association for Computational Linguistics: Doha, Qatar, 2014; pp. 1724–1734. [Google Scholar] [CrossRef]
- Zhang, J.; Man, K.F. Time series prediction using RNN in multi-dimension embedding phase space. In Proceedings of the 1998 IEEE International Conference on Systems, Man, and Cybernetics (Cat. No. 98CH36218), San Diego, CA, USA, 14 October 1998; Volume 2, pp. 1868–1873. [Google Scholar]
- Sapankevych, N.I.; Sankar, R. Time Series Prediction Using Support Vector Machines: A Survey. IEEE Comput. Intell. Mag. 2009, 4, 24–38. [Google Scholar] [CrossRef]
- Trafalis, T.B.; Ince, H. Support vector machine for regression and applications to financial forecasting. In Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium, Como, Italy, 27 July 2000; Volume 6, pp. 348–353. [Google Scholar] [CrossRef] [Green Version]
- Cao, L.J.; Tay, F.E.H. Support vector machine with adaptive parameters in financial time series forecasting. IEEE Trans. Neural Netw. 2003, 14, 1506–1518. [Google Scholar] [CrossRef] [Green Version]
- Chang, M.; Chen, B.; Lin, C. EUNITE Network Competition: Electricity Load Forecasting; Technical Report; National Taiwan University: Taipei, Taiwan, 2001. [Google Scholar]
- Voyant, C.; Notton, G.; Kalogirou, S.; Nivet, M.L.; Paoli, C.; Motte, F.; Fouilloy, A. Machine learning methods for solar radiation forecasting: A review. Renew. Energy 2017, 105, 569–582. [Google Scholar] [CrossRef]
- Qi, M.; Zhang, G.P. Trend Time—Series Modeling and Forecasting With Neural Networks. IEEE Trans. Neural Netw. 2008, 19, 808–816. [Google Scholar] [CrossRef] [PubMed]
- Zhang, J.S.; Xiao, X.C. Predicting Chaotic Time Series Using Recurrent Neural Network. Chin. Phys. Lett. 2000, 17, 88–90. [Google Scholar] [CrossRef]
- Connor, J.T.; Martin, R.D.; Atlas, L.E. Recurrent neural networks and robust time series prediction. IEEE Trans. Neural Netw. 1994, 5, 240–254. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Sherstinsky, A. Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) network. Phys. D Nonlinear Phenom. 2020, 404, 132306. [Google Scholar] [CrossRef] [Green Version]
- Zhang, G.; Berardi, V. Time series forecasting with neural network ensembles: An application for exchange rate prediction. J. Oper. Res. Soc. 2001, 52, 652–664. [Google Scholar] [CrossRef]
- Chen, J.; Zeng, G.Q.; Zhou, W.; Du, W.; Lu, K.D. Wind speed forecasting using nonlinear-learning ensemble of deep learning time series prediction and extremal optimization. Energy Convers. Manag. 2018, 165, 681–695. [Google Scholar] [CrossRef]
- Zhang, G.P. A neural network ensemble method with jittered training data for time series forecasting. Inf. Sci. 2007, 177, 5329–5346. [Google Scholar] [CrossRef]
- Hurst, H.; Black, R.; Sinaika, Y. Long-Term Storage in Reservoirs: An Experimental Study; Constable: London, UK, 1965. [Google Scholar]
- Cincotta, P.M.; Helmi, A.; Méndez, M.; Núñez, J.A.; Vucetich, H. Astronomical time-series analysis—II. A search for periodicity using the Shannon entropy. Mon. Not. R. Astron. Soc. 1999, 302, 582–586. [Google Scholar] [CrossRef] [Green Version]
- Vörös, Z.; Jankovičová, D. Neural network prediction of geomagnetic activity: A method using local Hölder exponents. Nonlinear Process. Geophys. 2002, 9, 425–433. [Google Scholar] [CrossRef]
- Castillo, O.; Melin, P. Hybrid Intelligent Systems for Time Series Prediction Using Neural Networks, Fuzzy Logic, and Fractal Theory. IEEE Trans. Neural Netw. 2002, 13, 1395–1408. [Google Scholar] [CrossRef]
- Raubitzek, S.; Neubauer, T. A fractal interpolation approach to improve neural network predictions for difficult time series data. Expert Syst. Appl. 2021, 169, 114474. [Google Scholar] [CrossRef]
- Hyndman, R.; Yang, Y. Time Series Data Library v0.1.0. 2018. Available online: https://pkg.yangzhuoranyang.com/tsdl/ (accessed on 28 October 2021).
- Mazel, D.; Hayes, M. Using iterated function systems to model discrete sequences. IEEE Trans. Signal Process. 1992, 40, 1724–1734. [Google Scholar] [CrossRef]
- Barnsley, M.F.; Demko, S.; Powell, M.J.D. Iterated function systems and the global construction of fractals. Proc. R. Soc. Lond. A Math. Phys. Sci. 1985, 399, 243–275. [Google Scholar] [CrossRef]
- Rasekhi, S.; Shahrazi, M. Modified R/S and DFA Analyses of Foreign Exchange Market Efficiency under Two Exchange Rate Regimes: A Case Study of Iran. Iran. J. Econ. Res. 2014, 18, 1–26. [Google Scholar]
- Feder, J. Fractals; Physics of Solids and Liquids; Springer: New York, NY, USA, 1988. [Google Scholar]
- Julián, M.; Alcaraz, R.; Rieta, J. Study on the Optimal Use of Generalized Hurst Exponents for Noninvasive Estimation of Atrial Fibrillation Organization. In Proceedings of the Computing in Cardiology 2013, Zaragoza, Spain, 22–25 September 2013. [Google Scholar]
- Di Matteo, T. Multi-scaling in finance. Quant. Financ. 2007, 7, 21–36. [Google Scholar] [CrossRef]
- Feller, W. An Introduction to Probability Theory and Its Applications; Wiley-Blackwell: Hoboken, NJ, USA, 1971. [Google Scholar]
- Selvaratnam, S.; Kirley, M. Predicting Stock Market Time Series Using Evolutionary Artificial Neural Networks with Hurst Exponent Input Windows. Lect. Notes Comput. Sci. 2006, 4304. [Google Scholar] [CrossRef]
- Schölzel, C. Nonlinear Measures for Dynamical Systems, version 0.5.2; Zenodo, 2019. [Google Scholar] [CrossRef]
- Eckmann, J.P.; Kamphorst, S.; Ciliberto, S. Liapunov exponents from time series. Phys. Rev. A 1987, 34, 4971–4979. [Google Scholar] [CrossRef] [PubMed]
- Henriques, T.; Ribeiro, M.; Teixeira, A.; Castro, L.; Antunes, L.; Costa Santos, C. Nonlinear Methods Most Applied to Heart-Rate Time Series: A Review. Entropy 2020, 22, 309. [Google Scholar] [CrossRef] [Green Version]
- Zengrong, L.; Liqun, C.; Ling, Y. On properties of hyperchaos: Case study. Acta Mech. Sin. 1999, 15, 366–370. [Google Scholar] [CrossRef]
- Mayer, A.L.; Pawlowski, C.W.; Cabezas, H. Fisher Information and dynamic regime changes in ecological systems. Ecol. Model. 2006, 195, 72–82. [Google Scholar] [CrossRef]
- Klema, V.; Laub, A. The singular value decomposition: Its computation and some applications. IEEE Trans. Autom. Control 1980, 25, 164–176. [Google Scholar] [CrossRef] [Green Version]
- Makowski, D.; Pham, T.; Lau, Z.J.; Brammer, J.C.; Lespinasse, F.; Pham, H.; Schölzel, C.; Chen, S.A. NeuroKit2: A Python Toolbox for Neurophysiological Signal Processing. Behav. Res. Methods 2020, 1–8. [Google Scholar] [CrossRef]
- Fraser, A.M.; Swinney, H.L. Independent coordinates for strange attractors from mutual information. Phys. Rev. A 1986, 33, 1134–1140. [Google Scholar] [CrossRef]
- Krakovská, A.; Mezeiová, K.; Budáčová, H. Use of False Nearest Neighbours for Selecting Variables and Embedding Parameters for State Space Reconstruction. J. Complex Syst. 2015, 2015, 932750. [Google Scholar] [CrossRef] [Green Version]
- Stark, J.; Broomhead, D.S.; Davies, M.E.; Huke, J. Takens embedding theorems for forced and stochastic systems. Nonlinear Anal. Theory Methods Appl. 1997, 30, 5303–5314. [Google Scholar] [CrossRef]
- Caraiani, P. The predictive power of singular value decomposition entropy for stock market dynamics. Phys. Stat. Mech. Its Appl. 2014, 393, 571–578. [Google Scholar] [CrossRef]
- Gu, R.; Shao, Y. How long the singular value decomposed entropy predicts the stock market? —Evidence from the Dow Jones Industrial Average Index. Phys. A Stat. Mech. Its Appl. 2016, 453. [Google Scholar] [CrossRef]
- Roberts, S.J.; Penny, W.; Rezek, I. Temporal and spatial complexity measures for electroencephalogram based brain-computer interfacing. Med Biol. Eng. Comput. 1999, 37, 93–98. [Google Scholar] [CrossRef] [PubMed]
- Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423. [Google Scholar] [CrossRef] [Green Version]
- Zhou, R.; Cai, R.; Tong, G. Applications of Entropy in Finance: A Review. Entropy 2013, 15, 4909–4931. [Google Scholar] [CrossRef]
- Kim, S.; Ku, S.; Chang, W.; Song, J.W. Predicting the Direction of US Stock Prices Using Effective Transfer Entropy and Machine Learning Techniques. IEEE Access 2020, 8, 111660–111682. [Google Scholar] [CrossRef]
- Kim, S.S. Time-delay recurrent neural network for temporal correlations and prediction. Neurocomputing 1998, 20, 253–263. [Google Scholar] [CrossRef]
- Waibel, A. Modular Construction of Time-Delay Neural Networks for Speech Recognition. Neural Comput. 1989, 1, 39–46. [Google Scholar] [CrossRef]
- Sun, W.; Wang, Y. Short-term wind speed forecasting based on fast ensemble empirical mode decomposition, phase space reconstruction, sample entropy and improved back-propagation neural network. Energy Convers. Manag. 2018, 157, 1–12. [Google Scholar] [CrossRef]
- Rhodes, C.; Morari, M. The false nearest neighbors algorithm: An overview. Comput. Chem. Eng. 1997, 21, S1149–S1154. [Google Scholar] [CrossRef]
Dataset | Hurst Exponent | Largest Lyapunov Exponent | Fisher’s Information | SVD Entropy | Shannon’s Entropy |
---|---|---|---|---|---|
Monthly international airline passengers | 0.4233 | 0.0213 | 0.7854 | 0.3788 | 6.8036 |
Monthly car sales in Quebec | 0.7988 | 0.0329 | 0.5965 | 0.5904 | 6.7549 |
Monthly mean air temperature in Nottingham Castle | 0.4676 | 0.0069 | 0.6617 | 0.5235 | 7.0606 |
Perrin Freres monthly champagne sales | 0.7063 | 0.0125 | 0.3377 | 0.8082 | 6.6762 |
CFE specialty monthly writing paper sales | 0.6830 | 0.0111 | 0.5723 | 0.6138 | 7.0721 |
Dataset | Train Error | Test Error | Single Step Error |
---|---|---|---|
Monthly international airline passengers | 0.04987 | 0.08960 | 0.11902 |
Monthly car sales in Quebec | 0.09735 | 0.11494 | 0.12461 |
Monthly mean air temperature in Nottingham Castle | 0.06874 | 0.06193 | 0.05931 |
Perrin Freres monthly champagne sales | 0.07971 | 0.07008 | 0.08556 |
CFE specialty monthly writing paper sales | 0.07084 | 0.22353 | 0.21495 |
Dataset | Train Error | Test Error | Single Step Error |
---|---|---|---|
Monthly international airline passengers | 0.04534 | 0.07946 | 0.10356 |
Monthly car sales in Quebec | 0.09930 | 0.11275 | 0.11607 |
Monthly mean air temperature in Nottingham Castle | 0.07048 | 0.06572 | 0.06852 |
Perrin Freres monthly champagne sales | 0.06704 | 0.05916 | 0.07136 |
CFE specialty monthly writing paper sales | 0.09083 | 0.22973 | 0.23296 |
Dataset | Train Error | Test Error | Single Step Error |
---|---|---|---|
Monthly international airline passengers | 0.05606 | 0.08672 | 0.10566 |
Monthly car sales in Quebec | 0.10161 | 0.12748 | 0.12075 |
Monthly mean air temperature in Nottingham Castle | 0.07467 | 0.07008 | 0.06588 |
Perrin Freres monthly champagne sales | 0.08581 | 0.07362 | 0.07812 |
CFE specialty monthly writing paper sales | 0.07195 | 0.22121 | 0.21316 |
Interpolation Technique | # of Interpolation Points | Filter | Error |
---|---|---|---|
non-interpolated | - | fisher svd | 0.04122 ± 0.00349 |
non-interpolated | - | svd | 0.04122 ± 0.00349 |
non-interpolated | - | svd shannon | 0.04122 ± 0.00349 |
non-interpolated | - | fisher | 0.04166 ± 0.00271 |
non-interpolated | - | fisher shannon | 0.04166 ± 0.00271 |
fractal-interpolated | 1 | fisher hurst | 0.03597 ± 0.00429 |
fractal-interpolated | 1 | svd hurst | 0.03597 ± 0.00429 |
fractal-interpolated | 5 | hurst fisher | 0.03980 ± 0.00465 |
fractal-interpolated | 5 | hurst svd | 0.03980 ± 0.00465 |
fractal-interpolated | 5 | shannon | 0.04050 ± 0.00633 |
linear-interpolated | 3 | shannon svd | 0.03542 ± 0.00625 |
linear-interpolated | 3 | shannon fisher | 0.03804 ± 0.00672 |
linear-interpolated | 5 | fisher | 0.04002 ± 0.00357 |
linear-interpolated | 5 | fisher shannon | 0.04002 ± 0.00357 |
linear-interpolated | 5 | svd fisher | 0.04002 ± 0.00357 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).