Introducing Technical Indicators to Electricity Price Forecasting: A Feature Engineering Study for Linear, Ensemble, and Deep Machine Learning Models
Abstract
1. Introduction
2. Materials and Methods
2.1. Technical Indicators
- TI calculation requires only close prices.
- TI inclusion is likely to improve predictive performance by highlighting oscillations or trends in DAM prices.
- $t$: time,
- $p_t$: price at time $t$,
- $n$: lag factor (window length) for the window-based indicators (e.g., SMA),
- $s$: span for the exponentially weighted indicators (e.g., EMA).
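To make this notation concrete, the sketch below computes the two basic moving averages of Sections 2.1.1 and 2.1.2 from a close-price series with pandas; the example prices and the values chosen for $n$ and $s$ are illustrative assumptions, not the configurations evaluated in the case study.

```python
import pandas as pd

def sma(prices: pd.Series, n: int) -> pd.Series:
    """Simple Moving Average: mean of the last n prices (lag factor n)."""
    return prices.rolling(window=n).mean()

def ema(prices: pd.Series, s: int) -> pd.Series:
    """Exponential Moving Average with span s (smoothing factor 2 / (s + 1))."""
    return prices.ewm(span=s, adjust=False).mean()

# Illustrative hourly DAM close prices (values are made up).
prices = pd.Series([42.1, 45.3, 44.8, 47.0, 46.2, 48.5, 50.1, 49.3])
print(sma(prices, n=3))
print(ema(prices, s=3))
```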
2.1.1. Simple Moving Average (SMA)
2.1.2. Exponential Moving Average (EMA)
2.1.3. Moving Average Convergence Divergence (MACD)
- ‘Series’: Calculated from two EMAs, the ‘Series’ [17] gives insight into price convergence, divergence, and crossover. The ‘Series’ reflects the difference between a fast (shorter-span) and a slow (longer-span) EMA, capturing the second derivative of a price series. Using Equation (2), the ‘Series’ is calculated according to Equation (3).
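As a rough illustration of Equation (3), the ‘Series’ can be obtained by subtracting a slow EMA from a fast EMA. The spans below are placeholder assumptions; the study tunes its own hyperparameters.

```python
import pandas as pd

def macd_series(prices: pd.Series, fast_span: int, slow_span: int) -> pd.Series:
    """MACD 'Series': fast EMA minus slow EMA of the price series."""
    fast_ema = prices.ewm(span=fast_span, adjust=False).mean()
    slow_ema = prices.ewm(span=slow_span, adjust=False).mean()
    return fast_ema - slow_ema

# Positive values indicate the fast EMA sits above the slow EMA
# (upward pressure); negative values indicate the opposite.
prices = pd.Series([42.1, 45.3, 44.8, 47.0, 46.2, 48.5, 50.1, 49.3])
print(macd_series(prices, fast_span=3, slow_span=6))
```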
2.1.4. Moving Standard Deviation (MSD)
2.1.5. Bollinger Bands (BBANDs)
- %B: The %B [22] scales the price series by the BBAND width. When the underlying security price equals the SMA, the %B equals 0.5. When the price is equal to the upper or lower BBAND, the %B equals 1 or 0, respectively. Similarly to BBANDs, the %B can be used to identify when prices are overbought or oversold, to predict future volatility, and to generate trading ideas. Using Equations (7) and (8), the %B is calculated according to Equation (9).
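The following sketch implements Equations (7)–(9) under the conventional BBAND definition (SMA plus/minus a multiple of the MSD); the window length $n$ and the band multiplier are assumptions chosen only for illustration.

```python
import pandas as pd

def percent_b(prices: pd.Series, n: int, k: float = 2.0) -> pd.Series:
    """%B: position of the price inside the Bollinger Bands.

    Equals 0.5 when the price matches the SMA, 1 at the upper band,
    and 0 at the lower band.
    """
    sma = prices.rolling(window=n).mean()
    msd = prices.rolling(window=n).std()
    upper = sma + k * msd
    lower = sma - k * msd
    return (prices - lower) / (upper - lower)

prices = pd.Series([42.1, 45.3, 44.8, 47.0, 46.2, 48.5, 50.1, 49.3])
print(percent_b(prices, n=4))
```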
2.1.6. Momentum (MOM)
2.1.7. Rate of Change (ROC)
2.1.8. Coppock Curve (COPP)
2.1.9. True Strength Index (TSI)
2.2. Models
2.2.1. Linear Models
- Linear Regression (LR): The most fundamental of linear models, LR [25] fits a straight line through a series of points by minimizing the sum of squared errors between its targets and predictions. LRs are sensitive to outliers and correlated features. Nevertheless, as they are one of the primary ML models used for DAM forecasting, we include them in our examination.
- Huber Regression (HR): Extending LR, HR [26] is a linear model robust to response-variable outliers. Unlike LR, HR optimises a loss that is quadratic for small residuals and absolute for large ones, reducing the impact of outliers; an epsilon hyperparameter controls the switch between the two regimes. Despite this more robust optimisation procedure, HR remains sensitive to explanatory-variable outliers and correlations.
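Both linear models are available in scikit-learn [132]. The snippet below is a minimal sketch on synthetic data; the feature matrix is purely illustrative, and only the epsilon value of 1.35 is taken from Appendix B.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, HuberRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))          # illustrative lagged-price features
y = X @ np.array([1.0, -0.5, 0.2, 0.0, 0.3]) + rng.normal(scale=0.1, size=200)
y[:5] += 30                            # a few response-variable outliers

lr = LinearRegression().fit(X, y)            # squared loss: pulled towards outliers
hr = HuberRegressor(epsilon=1.35).fit(X, y)  # Huber loss: robust to outliers

print("LR coefficients:", lr.coef_)
print("HR coefficients:", hr.coef_)
```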
2.2.2. Ensemble Models
- Random Forest (RF): RF [27] fits several decision trees on random samples of the data and averages them to obtain a final result. Individual trees are fitted by recursively splitting the data so as to maximise the information gain. The hyperparameters used by RF are the number of fitted estimators, the maximum number of features, and the minimum sample leaf size.
- AdaBoost (AB): AB [28] is an adaptive boosting algorithm used to sequentially train an ensemble of weak learners. The algorithm begins by fitting a weak learner, and continues by training copies of this learner, placing a greater instance weight on incorrectly predicted values. The algorithm proceeds until the final model, a weighted sum of all trained weak learners, becomes a strong learner. We use the algorithm to train an ensemble of decision trees.
- Gradient Boosting (GB): Another boosting algorithm, GB [29] focuses on sequentially improving model predictions by fitting copies of learners to residuals. Residual predictions are repeatedly added to model predictions until the sum of residuals stops decreasing. Similarly to AB, we choose to apply the GB algorithm to train an ensemble of decision trees.
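All three ensembles can likewise be instantiated from scikit-learn [132]. The sketch below reuses a synthetic data set and adopts a few Appendix B settings (n_estimators, loss), leaving every other option at the library default as an assumption.

```python
import numpy as np
from sklearn.ensemble import (RandomForestRegressor, AdaBoostRegressor,
                              GradientBoostingRegressor)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                       # illustrative features
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

rf = RandomForestRegressor(n_estimators=100).fit(X, y)
ab = AdaBoostRegressor(n_estimators=100, loss="square",
                       learning_rate=0.1).fit(X, y)   # boosted decision trees
gb = GradientBoostingRegressor(loss="huber").fit(X, y)

for name, model in [("RF", rf), ("AB", ab), ("GB", gb)]:
    print(name, model.predict(X[:3]))
```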
2.2.3. Deep Models
- Fully Connected Layer (FCL): Fully connected neurons, each comprising a linear regression with an added non-linearity [30], are stacked to build an FCL. FCLs can be used to approximate any continuous function [31], explaining why, with increasing computational power, they are frequently used in state-of-the-art DAM predictors.
- Convolutional Layer (CONV): Locally connected neural networks, or convolutional neural networks (CNNs) [32], are used for feature mapping/extraction. The primary module used by these networks, CONV, works by sliding equally sized filters with trainable parameters across the input data, producing 2D activation maps. While FCLs tune the parameters of every neuron, CONVs implement parameter sharing to remain computationally feasible. For a single CONV filter of size $N \times M$ applied to an input of depth $D$, only $N \times M \times D$ parameters are trained. The inclusion of the depth parameter is a consequence of CONV’s fully connected architecture across the final depth dimension. Overall, CONVs have proved adept at identifying features in images [33] and time series [34], making them potentially very powerful modules for technical analysis.
- Residual Module: A residual module, ResNet [35], adds the inputs from one module to the outputs of another module. It thus creates a direct identity mapping in a network between module inputs and outputs, combating both the vanishing gradient problem and the degradation problem, which otherwise impede the training of deep networks.
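A minimal Keras [133] sketch combining the three building blocks, a CONV, a residual (identity) connection, and an FCL, is given below; the input shape, filter count, kernel size, and neuron count are placeholders rather than the tuned values reported in Appendix B.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Placeholder input: 24 hourly lags with 2 channels (e.g., price and one TI).
inputs = keras.Input(shape=(24, 2, 1))

# CONV: parameter sharing across the time axis, fully connected across depth.
x = layers.Conv2D(filters=16, kernel_size=(3, 1), padding="same",
                  activation="relu")(inputs)

# Residual module: add the block input back onto its output (identity mapping).
y = layers.Conv2D(filters=16, kernel_size=(3, 1), padding="same",
                  activation="relu")(x)
x = layers.Add()([x, y])

# FCL: fully connected neurons on top of the extracted features.
x = layers.Flatten()(x)
x = layers.Dense(128, activation="relu")(x)
outputs = layers.Dense(24)(x)          # one price per delivery hour

model = keras.Model(inputs, outputs)
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001), loss="mae")
model.summary()
```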
2.3. Case Study
2.3.1. Data
2.3.2. Data Processing
2.3.3. TI Calculation
2.3.4. Model Training and Prediction
2.3.5. Evaluation
3. Results
3.1. Best Performing TIs
3.2. Distribution of Errors
3.3. Monthly Performance Improvements
4. Conclusions and Discussion
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
Abbreviations
Abbreviation | Meaning |
---|---|
AB | AdaBoost |
BBAND(s) | Bollinger Band(s) |
CNN | Neural Network with CONV |
CONV | Convolutional Layer |
COPP | Coppock Curve |
DAM | Day-ahead Market |
DM | Diebold-Mariano (test statistic) |
EMA | Exponential Moving Average |
FCL | Fully Connected Layer |
GB | Gradient Boosting |
HR | Huber Regression |
LR | Linear Regression |
MACD | Moving Average Convergence Divergence |
MAE | Mean Absolute Error |
ML | Machine Learning |
MOM | Momentum |
MSD | Moving Standard Deviation |
NN | Neural Network with FCL |
PCC | Pearson Correlation Coefficient |
RF | Random Forest |
RMSE | Root Mean Squared Error |
ROC | Rate of Change |
SMA | Simple Moving Average |
TI(s) | Technical Indicator(s) |
TSI | True Strength Index |
Appendix A. Technical Analysis
Appendix B. Hyperparameters
Model | Hyperparameters |
---|---|
LR | - |
HR | epsilon:1.35 |
RF | n_estimators: 100 |
AB | n_estimators: 100, loss: square, learning rate: 0.1 |
GB | loss: huber |
2NN | neuron: 500, neuron: 250, learning rate: 0.001 |
CNN | kernel: (1, 3), filter: 16, CONV layer: 1, learning rate: 0.001, dropout: 0.25 |
2CNN | kernel: (2, 3), filter: 13, CONV layer: 2, learning rate: 0.001, dropout: 0.25 |
2CNN_NN | kernel: (1, 3), filter: 32, CONV layer: 2, neuron: 123, learning rate: 0.001 |
ResNet | kernel: (3, 3), filter: 23, CONV layer: 7, learning rate: 0.001 |
References
- Directive 2009/28/EC (the Renewable Energy Directive). European Parliament, Council of the European Union. 2019. Available online: https://eur-lex.europa.eu/legal-content/EN/ALL/?uri=CELEX:32009L0028 (accessed on 1 November 2019).
- Proposal for a Regulation of the European Parliament and of the Council on the Internal Market for Electricity. European Commission. 2017. Available online: https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX:52016PC0861R(01)&from=EN (accessed on 11 September 2019).
- Internal Market for Electricity. European Parliamentary Research Service. 2019. Available online: http://www.europarl.europa.eu/RegData/etudes/BRIE/2017/595925/EPRS_BRI(2017)595925_EN.pdf (accessed on 1 November 2019).
- Murphy, J. Technical Analysis of the Financial Markets: A Comprehensive Guide to Trading Methods and Applications; New York Institute of Finance Series; New York Institute of Finance: New York, NY, USA, 1999.
- Weron, R. Electricity Price Forecasting: A Review of the State-of-the-Art with a Look into the Future; HSC Research Reports HSC/14/07; Hugo Steinhaus Center, Wroclaw University of Technology: Wrocław, Poland, 2014.
- Lago, J.; De Ridder, F.; De Schutter, B. Forecasting spot electricity prices: Deep learning approaches and empirical comparison of traditional algorithms. Appl. Energy 2018, 221, 386–405.
- Sadeghi-Mobarakeh, A.; Kohansal, M.; Papalexakis, E.E.; Rad, H.M. Data mining based on random forest model to predict the California ISO day-ahead market prices. In Proceedings of the 2017 IEEE Power & Energy Society Innovative Smart Grid Technologies Conference (ISGT), Washington, DC, USA, 23–26 April 2017; pp. 1–5.
- Azzopardi, P. Behavioural Technical Analysis: An Introduction to Behavioural Finance and Its Role in Technical Analysis; Harriman House Series; Harriman House: Petersfield, UK, 2010.
- Rafiei, M.; Niknam, T.; Khooban, M.H. Probabilistic electricity price forecasting by improved clonal selection algorithm and wavelet preprocessing. Neural Comput. Appl. 2017, 28, 3889–3901.
- Wang, K.; Xu, C.; Zhang, Y.; Guo, S.; Zomaya, A. Robust Big Data Analytics for Electricity Price Forecasting in the Smart Grid. IEEE Trans. Big Data 2019, 5, 34–45.
- Kirkpatrick, C. Technical Analysis: The Complete Resource for Financial Market Technicians; FT Press: Upper Saddle River, NJ, USA, 2011.
- Dash, R.; Dash, P.K. A hybrid stock trading framework integrating technical analysis with machine learning techniques. J. Financ. Data Sci. 2016, 2, 42–57.
- Larsen, J.I. Predicting Stock Prices Using Technical Analysis and Machine Learning. Master’s Thesis, Norwegian University of Science and Technology, Trondheim, Norway, 2010.
- Chun-Teh Lee, J.S.T. Trend-Oriented Training for Neural Networks to Forecast Stock Markets. Asia Pac. Manag. Rev. 2013, 18, 181–195.
- Kulp, A.; Djupsjöbacka, D.; Estlander, M. Managed Futures and Long Volatility. AIMA J. 2005, 27–28.
- Nison, S. Japanese Candlestick Charting Techniques: A Contemporary Guide to the Ancient Investment Techniques of the Far East; New York Institute of Finance: New York, NY, USA, 1991.
- Pring, M. Technical Analysis Explained: The Successful Investor’s Guide to Spotting Investment Trends and Turning Points; McGraw-Hill Education: New York, NY, USA, 2014.
- Neftci, S.N. Naive Trading Rules in Financial Markets and Wiener-Kolmogorov Prediction Theory: A Study of “Technical Analysis”. J. Bus. 1991, 64, 549–571.
- Gurrib, I. Optimization of the Double Crossover Strategy for the S&P500 Market Index. Glob. Rev. Account. Financ. 2016, 7, 92–107.
- Poon, S.H.; Granger, C.W. Forecasting Volatility in Financial Markets: A Review. J. Econ. Lit. 2003, 41, 478–539.
- Merrill, A. M and W Patterns. Mark. Tech. Assoc. J. 1980, 43–54.
- Bollinger, J. Bollinger on Bollinger Bands; Professional Finance & Investment; McGraw-Hill Education: New York, NY, USA, 2002.
- Coppock, E.S.C. Practical Relative Strength Charting; Trendex Corp.: San Antonio, TX, USA, 1960.
- Blau, W. Momentum, Direction, and Divergence; Wiley Trader’s Exchange; Wiley: Hoboken, NJ, USA, 1995.
- Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning; Springer Series in Statistics; Springer: New York, NY, USA, 2001.
- Huber, P.J. Robust Statistics; Wiley: New York, NY, USA, 1981.
- Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
- Freund, Y.; Schapire, R.E. A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. J. Comput. Syst. Sci. 1997, 55, 119–139.
- Friedman, J.H. Greedy Function Approximation: A Gradient Boosting Machine. Ann. Stat. 2000, 29, 1189–1232.
- Fausett, L. (Ed.) Fundamentals of Neural Networks: Architectures, Algorithms, and Applications; Prentice-Hall, Inc.: Upper Saddle River, NJ, USA, 1994.
- Cybenko, G. Approximation by superpositions of a sigmoidal function. Math. Control Signals Syst. 1989, 2, 303–314.
- Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016. Available online: http://www.deeplearningbook.org (accessed on 29 May 2019).
- Alom, M.Z.; Taha, T.M.; Yakopcic, C.; Westberg, S.; Hasan, M.; Esesn, B.C.V.; Awwal, A.A.S.; Asari, V.K. The History Began from AlexNet: A Comprehensive Survey on Deep Learning Approaches. arXiv 2018, arXiv:1803.01164.
- LeCun, Y.; Bengio, Y. Convolutional Networks for Images, Speech, and Time Series. In The Handbook of Brain Theory and Neural Networks; MIT Press: Cambridge, MA, USA, 1998; pp. 255–258.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. arXiv 2015, arXiv:1512.03385.
- Epex Spot, Belgium. Available online: https://www.belpex.be (accessed on 29 May 2019).
- Box, G.E.P.; Jenkins, G. Time Series Analysis, Forecasting and Control; Holden-Day, Inc.: San Francisco, CA, USA, 1990.
- Diebold, F.X.; Mariano, R.S. Comparing Predictive Accuracy. J. Bus. Econ. Stat. 2002, 20, 134–144.
- Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
- Chollet, F. Keras. 2015. Available online: https://keras.io (accessed on 29 May 2019).
 | | LR | HR | RF | AB | GB | 2NN | CNN | 2CNN | 2CNN_NN | ResNet |
---|---|---|---|---|---|---|---|---|---|---|---|
Best TI | TI | %B | EMA | EMA | MOM | MOM | EMA | ROC | ROC | ROC | ROC |
 | HP | n = 58 | s = 2 | s = 6 | n = 58 | n = 58 | s = 22 | n = 49 | n = 49 | n = 9 | n = 27 |
 | RMSE | 11.47 | 11.57 | 12.21 | 12.66 | 11.64 | 10.96 | 11.14 | 11.03 | 11.10 | 11.09 |
 | %RMSE | 4.49 | 4.50 | 1.66 | 5.42 | 3.74 | 4.09 | 2.73 | 2.39 | 2.36 | 1.38 |
 | MAE | 7.87 | 7.67 | 8.07 | 8.58 | 7.84 | 7.59 | 7.65 | 7.58 | 7.66 | 7.56 |
 | %MAE | 3.26 | 5.59 | 2.10 | 6.20 | 2.22 | 1.56 | 4.21 | 2.91 | 3.41 | 1.75 |
 | PCC | 0.79 | 0.79 | 0.76 | 0.74 | 0.79 | 0.81 | 0.80 | 0.81 | 0.81 | 0.80 |
 | %PCC | 2.92 | 2.75 | 1.08 | 3.79 | 2.80 | 1.85 | 1.40 | 1.19 | 0.95 | 0.82 |
 | DM | 10.39 | 14.86 | 1.74 | 4.18 | 3.68 | 3.23 | 7.15 | 4.51 | 5.15 | 4.28 |
Second-Best TI | TI | EMA | %B | MOM | ROC | %B | SMA | - | EMA | SMA | COPP |
 | HP | s = 2 | n = 58 | n = 58 | n = 57 | n = 54 | n = 22 | - | s = 18 | n = 18 | * |
 | RMSE | 11.57 | 11.59 | 12.22 | 12.71 | 11.78 | 11.08 | - | 11.26 | 11.27 | 11.16 |
 | %RMSE | 3.73 | 4.35 | 1.57 | 5.01 | 2.66 | 3.00 | - | 0.37 | 0.91 | 0.77 |
 | MAE | 7.70 | 7.78 | 8.15 | 8.68 | 7.81 | 7.68 | - | 7.77 | 7.78 | 7.60 |
 | %MAE | 5.36 | 4.17 | 1.10 | 5.14 | 2.60 | 1.56 | - | 0.44 | 1.91 | 1.16 |
 | PCC | 0.79 | 0.79 | 0.77 | 0.73 | 0.78 | 0.81 | - | 0.80 | 0.80 | 0.80 |
 | %PCC | 2.55 | 2.89 | 1.57 | 3.23 | 1.71 | 1.38 | - | 0.06 | 0.01 | 0.40 |
 | DM | 13.67 | 12.30 | 1.23 | 4.36 | 1.61 | 2.08 | - | 0.16 | 1.36 | 2.40 |
Third-Best TI | TI | ‘Histogram’ | SMA | ROC | EMA | EMA | ‘Histogram’ | - | - | COPP | - |
 | HP | ** | n = 18 | n = 57 | s = 6 | s = 6 | *** | - | - | **** | - |
 | RMSE | 11.62 | 11.66 | 12.23 | 12.82 | 11.79 | 11.12 | - | - | 11.28 | - |
 | %RMSE | 3.25 | 3.76 | 1.50 | 4.19 | 2.53 | 2.66 | - | - | 0.75 | - |
 | MAE | 7.88 | 7.79 | 8.18 | 8.78 | 7.84 | 7.68 | - | - | 7.78 | - |
 | %MAE | 3.11 | 4.02 | 0.77 | 4.06 | 2.20 | 1.51 | - | - | 1.89 | - |
 | PCC | 0.79 | 0.79 | 0.76 | 0.73 | 0.78 | 0.80 | - | - | 0.80 | - |
 | %PCC | 2.22 | 2.43 | 0.97 | 3.01 | 1.67 | 1.13 | - | - | 0.01 | - |
 | DM | 11.92 | 16.02 | 0.56 | 3.92 | 2.43 | 2.49 | - | - | 1.32 | - |
 | | LR | HR | RF | AB | GB | 2NN | CNN | 2CNN | 2CNN_NN | ResNet |
---|---|---|---|---|---|---|---|---|---|---|---|
Best TI | TI | %B | EMA | EMA | MOM | MOM | EMA | ROC | ROC | ROC | ROC |
 | HP | n = 58 | s = 2 | s = 6 | n = 58 | n = 58 | s = 22 | n = 49 | n = 49 | n = 9 | n = 27 |
IQR | | 47.18 | 53.35 | 52.53 | 51.29 | 49.91 | 48.44 | 56.45 | 54.20 | 52.00 | 51.12 |
 | | −24.76 | −20.59 | −29.67 | −66.34 | −27.34 | −42.92 | −30.00 | −28.25 | −32.05 | −16.43 |
 | | 52.82 | 46.65 | 47.47 | 48.67 | 50.09 | 51.56 | 43.55 | 45.80 | 48.00 | 48.88 |
 | | 28.15 | 21.94 | 32.19 | 27.28 | 30.17 | 33.27 | 23.98 | 20.81 | 23.60 | 13.31 |
 | | 3.40 | 1.35 | 2.52 | −39.06 | 2.83 | −9.65 | −6.02 | −7.44 | −8.45 | −3.12 |
Tails | | 55.79 | 60.10 | 53.17 | 53.88 | 53.21 | 51.65 | 54.55 | 54.38 | 53.37 | 55.03 |
 | | −92.42 | −67.83 | −80.61 | −154.69 | −108.50 | −129.77 | −54.81 | −54.21 | −63.78 | −37.31 |
 | | 44.21 | 39.90 | 46.83 | 46.07 | 46.79 | 48.35 | 45.45 | 45.62 | 46.63 | 44.97 |
 | | 43.91 | 29.69 | 67.52 | 118.70 | 63.10 | 91.44 | 49.99 | 50.09 | 55.09 | 32.97 |
 | | −48.51 | −38.14 | −13.08 | −35.99 | −45.41 | −38.33 | −4.82 | −4.11 | −8.69 | −4.34 |
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).