Solar Irradiation Forecasting Using Ensemble Voting Based on Machine Learning Algorithms
Abstract
1. Introduction
- Propose an ensemble voting model combining random forest, extreme gradient boosting, categorical boosting, and adaptive boosting, a combination not previously applied to solar irradiation forecasting (a minimal code sketch follows this list);
- Apply a clustering algorithm to group data with similar weather patterns;
- Propose an ensemble feature selection method to select the most significant input variables and their delay values;
- Evaluate the performance of algorithms for different forecasting horizons.
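To make the headline contribution concrete, the following is a minimal sketch of a voting ensemble built from the four base learners using scikit-learn's `VotingRegressor` together with the `xgboost` and `catboost` packages. The data, hyperparameter values, and random seed below are illustrative placeholders rather than the configuration used in this study.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, AdaBoostRegressor, VotingRegressor
from xgboost import XGBRegressor
from catboost import CatBoostRegressor

# Placeholder inputs: lagged meteorological features X and hourly irradiation y.
rng = np.random.default_rng(0)
X = rng.random((500, 10))
y = rng.random(500)

base_models = [
    ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
    ("xgbt", XGBRegressor(n_estimators=100)),
    ("catboost", CatBoostRegressor(iterations=200, verbose=0)),
    ("adaboost", AdaBoostRegressor(random_state=0)),
]

# Simple (unweighted) voting: the ensemble prediction is the average of the
# four base predictions; a `weights` argument can favour stronger learners.
voting = VotingRegressor(estimators=base_models)
voting.fit(X, y)
y_hat = voting.predict(X)
```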
2. Machine Learning Algorithms
2.1. Random Forest (RF)
2.2. Extreme Gradient Boosting (XGBT)
2.3. Categorical Boosting (CatBoost)
2.4. Adaptive Boosting (AdaBoost)
2.5. Ensemble Voting
3. Proposed Methodology
3.1. Data Description
3.2. Pre-Processing
3.3. Clustering
3.4. Feature Selection
3.5. Hyperparameter Optimization
4. Performance Metrics
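The results tables in Section 5 report four standard accuracy metrics. Their usual definitions, with $y_t$ the observed irradiation, $\hat{y}_t$ the forecast, $\bar{y}$ the mean of the observations, and $n$ the number of test samples, are:

$$
\mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n}\lvert y_t - \hat{y}_t\rvert, \qquad
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2},
$$

$$
\mathrm{MAPE} = \frac{100}{n}\sum_{t=1}^{n}\left\lvert \frac{y_t - \hat{y}_t}{y_t} \right\rvert, \qquad
R^2 = 1 - \frac{\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2}{\sum_{t=1}^{n}\left(y_t - \bar{y}\right)^2}.
$$

These are the textbook forms; the tables below report them separately for each cluster and, in Section 5.4, for different forecast horizons.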
5. Results and Discussion
5.1. Machine Learning Algorithms
5.2. Voting Ensemble
- VOA1: simple average of CatBoost + RF + XGBT + AdaBoost
- VOA2: simple average of CatBoost + RF + XGBT
- VOA3: simple average of CatBoost + RF
- VOWA: weighted average of CatBoost + RF (a code sketch of the four combinations follows this list)
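Assuming each base model has already produced a prediction vector for the test set, the four combinations above can be formed directly from those vectors. A minimal sketch follows; the array names are placeholders, and the VOWA weights shown are the per-cluster values reported in the weight table later in this section.

```python
import numpy as np

# Placeholder prediction vectors from the four already-trained base models.
pred_cat, pred_rf, pred_xgbt, pred_ada = (np.zeros(100) for _ in range(4))

voa1 = np.mean([pred_cat, pred_rf, pred_xgbt, pred_ada], axis=0)  # CatBoost + RF + XGBT + AdaBoost
voa2 = np.mean([pred_cat, pred_rf, pred_xgbt], axis=0)            # CatBoost + RF + XGBT
voa3 = np.mean([pred_cat, pred_rf], axis=0)                       # CatBoost + RF

# VOWA: weighted average of CatBoost and RF, using the weights reported in
# the weight table later in this section (2 and 1 for every cluster).
w_cat, w_rf = 2, 1
vowa = (w_cat * pred_cat + w_rf * pred_rf) / (w_cat + w_rf)
```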
5.3. Statistical Analysis
5.4. Different Forecast Horizons
5.5. Comparison with Benchmark Dataset
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Soulouknga, M.H.; Coban, H.H.; Falama, R.Z.; Mbakop, F.K.; Djongyang, N. Comparison of Different Models to Estimate Global Solar Irradiation in the Sudanese Zone of Chad. J. Elektron. Telekomun. 2022, 22, 63.
- IRENA. Renewable Capacity Highlights 2022. Available online: https://www.irena.org/publications/2022/Apr/Renewable-Capacity-Statistics-2022 (accessed on 29 September 2022).
- Wang, Y.; Millstein, D.; Mills, A.D.; Jeong, S.; Ancell, A. The Cost of Day-Ahead Solar Forecasting Errors in the United States. Sol. Energy 2022, 231, 846–856.
- Krishnan, N.; Kumar, K.R.; Inda, C.S. How Solar Radiation Forecasting Impacts the Utilization of Solar Energy: A Critical Review. J. Clean. Prod. 2023, 388, 135860.
- Wu, Y.-K.; Huang, C.-L.; Phan, Q.-T.; Li, Y.-Y. Completed Review of Various Solar Power Forecasting Techniques Considering Different Viewpoints. Energies 2022, 15, 3320.
- Qing, X.; Niu, Y. Hourly Day-Ahead Solar Irradiance Prediction Using Weather Forecasts by LSTM. Energy 2018, 148, 461–468.
- Voyant, C.; Notton, G.; Kalogirou, S.; Nivet, M.-L.; Paoli, C.; Motte, F.; Fouilloy, A. Machine Learning Methods for Solar Radiation Forecasting: A Review. Renew. Energy 2017, 105, 569–582.
- Amoura, Y.; Torres, S.; Lima, J.; Pereira, A.I. Combined Optimization and Regression Machine Learning for Solar Irradiation and Wind Speed Forecasting. In Optimization, Learning Algorithms and Applications; Communications in Computer and Information Science; Springer International Publishing: Cham, Switzerland, 2022; Volume 1754, pp. 215–228. ISBN 978-3-031-23235-0.
- Bae, K.Y.; Jang, H.S.; Sung, D.K. Hourly Solar Irradiance Prediction Based on Support Vector Machine and Its Error Analysis. IEEE Trans. Power Syst. 2016, 32, 935–945.
- Aslam, M.; Lee, J.-M.; Kim, H.-S.; Lee, S.-J.; Hong, S. Deep Learning Models for Long-Term Solar Radiation Forecasting Considering Microgrid Installation: A Comparative Study. Energies 2019, 13, 147.
- Khosravi, A.; Koury, R.N.N.; Machado, L.; Pabon, J.J.G. Prediction of Hourly Solar Radiation in Abu Musa Island Using Machine Learning Algorithms. J. Clean. Prod. 2018, 176, 63–75.
- Huang, X.; Li, Q.; Tai, Y.; Chen, Z.; Zhang, J.; Shi, J.; Gao, B.; Liu, W. Hybrid Deep Neural Model for Hourly Solar Irradiance Forecasting. Renew. Energy 2021, 171, 1041–1060.
- Aslam, M.; Lee, J.-M.; Altaha, M.; Lee, S.-J.; Hong, S. AE-LSTM Based Deep Learning Model for Degradation Rate Influenced Energy Estimation of a PV System. Energies 2020, 13, 4373.
- Guermoui, M.; Melgani, F.; Gairaa, K.; Mekhalfi, M.L. A Comprehensive Review of Hybrid Models for Solar Radiation Forecasting. J. Clean. Prod. 2020, 258, 120357.
- Park, J.; Moon, J.; Jung, S.; Hwang, E. Multistep-Ahead Solar Radiation Forecasting Scheme Based on the Light Gradient Boosting Machine: A Case Study of Jeju Island. Remote Sens. 2020, 12, 2271.
- Abdellatif, A.; Mubarak, H.; Ahmad, S.; Ahmed, T.; Shafiullah, G.M.; Hammoudeh, A.; Abdellatef, H.; Rahman, M.M.; Gheni, H.M. Forecasting Photovoltaic Power Generation with a Stacking Ensemble Model. Sustainability 2022, 14, 11083.
- Kumari, P.; Toshniwal, D. Extreme Gradient Boosting and Deep Neural Network Based Ensemble Learning Approach to Forecast Hourly Solar Irradiance. J. Clean. Prod. 2021, 279, 123285.
- Lee, J.; Wang, W.; Harrou, F.; Sun, Y. Reliable Solar Irradiance Prediction Using Ensemble Learning-Based Models: A Comparative Study. Energy Convers. Manag. 2020, 208, 112582.
- Pan, C.; Tan, J. Day-Ahead Hourly Forecasting of Solar Generation Based on Cluster Analysis and Ensemble Model. IEEE Access 2019, 7, 112921–112930.
- AlKandari, M.; Ahmad, I. Solar Power Generation Forecasting Using Ensemble Approach Based on Deep Learning and Statistical Methods. Appl. Comput. Inform. 2020, ahead of print.
- Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
- Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016.
- Prokhorenkova, L.; Gusev, G.; Vorobev, A.; Dorogush, A.; Gulin, A. CatBoost: Unbiased Boosting with Categorical Features. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montréal, QC, Canada, 3 December 2018.
- Schapire, R.E. The Boosting Approach to Machine Learning: An Overview. In Nonlinear Estimation and Classification; Denison, D.D., Hansen, M.H., Holmes, C.C., Mallick, B., Yu, B., Eds.; Lecture Notes in Statistics; Springer: New York, NY, USA, 2003; Volume 171, pp. 149–171. ISBN 978-0-387-95471-4.
- An, K.; Meng, J. Voting-Averaged Combination Method for Regressor Ensemble. In Advanced Intelligent Computing Theories and Applications; Huang, D.-S., Zhao, Z., Bevilacqua, V., Figueroa, J.C., Eds.; Springer: Berlin/Heidelberg, Germany, 2010; Volume 6215, pp. 540–546.
- INMET. Instituto Nacional de Meteorologia. Available online: https://portal.inmet.gov.br/ (accessed on 1 September 2022).
- Solargis. Solar Resource Data. © Solargis.
- Han, J.; Kamber, M.; Pei, J. Data Mining: Concepts and Techniques, 3rd ed.; Elsevier Inc.: Waltham, MA, USA, 2012.
- Vergara, J.R.; Estévez, P.A. A Review of Feature Selection Methods Based on Mutual Information. Neural Comput. Appl. 2014, 24, 175–186.
- Kira, K.; Rendell, L. A Practical Approach to Feature Selection. In Machine Learning Proceedings 1992; pp. 249–256.
- Agrawal, T. Hyperparameter Optimization Using Scikit-Learn. In Hyperparameter Optimization in Machine Learning; Apress: Berkeley, CA, USA, 2021; pp. 31–51. ISBN 978-1-4842-6578-9.
- Lago, J.; Marcjasz, G.; De Schutter, B.; Weron, R. Forecasting Day-Ahead Electricity Prices: A Review of State-of-the-Art Algorithms, Best Practices and an Open-Access Benchmark. Appl. Energy 2021, 293, 1–21.
- Anderson, O.D. Time Series Analysis and Forecasting: The Box-Jenkins Approach; Butterworth: London, UK; Boston, MA, USA, 1976; ISBN 978-0-408-70675-9.
Reference | Year | Forecasting Variable | Feature Selection | ML Algorithms | Cluster Analysis | Ensemble | Multi-Step Ahead Forecast |
---|---|---|---|---|---|---|---|
[9] | 2016 | Hourly solar irradiance | - | SVR, NAR and NN | ✓ | - | - |
[10] | 2019 | Hourly and daily solar radiation | - | NN, RNN, LSTM, GRU and SVR | - | - | - |
[11] | 2018 | Hourly solar radiation | - | NN, SVR, FIS and ANFIS | - | - | ✓ |
[12] | 2021 | Hourly solar irradiance | - | hybrid WPD, CNN, LSTM, and MLP | - | - | - |
[13] | 2020 | PV system energy | - | hybrid AE and LSTM | - | - | ✓ |
[15] | 2020 | Hourly global solar radiation | ✓ | LightGBM | - | Bagging and boosting | ✓ |
[16] | 2022 | PV generation | - | RF, XGBT, AdaBoost and ETR | - | Bagging, boosting, and stacking | - |
[17] | 2021 | Hourly solar irradiance | ✓ | Ensemble XGBT and DNN | - | Bagging and stacking | - |
[18] | 2020 | Global horizontal irradiance | - | Boosted trees, bagged trees, RF, and generalized RF | - | Bagging and boosting | - |
[19] | 2019 | Hourly solar generation | - | Ensemble of RF | ✓ | Bagging | - |
[20] | 2019 | Solar power generation | - | LSTM, GRU, AE LSTM, AE GRU, and Theta model | - | Voting | - |
This paper | 2023 | Hourly global solar irradiation | ✓ | AdaBoost, RF, XGBT, CatBoost, and voting average | ✓ | Bagging, boosting, and voting | ✓ |
Data | Abbrev. | Unit | Mean
---|---|---|---
Hour, day, month, year | Hr, D, M, Y | - | - |
Global solar irradiation | R | MJ/m² | 1.65
Maximum wind gust | Wg | m/s | 5.66 |
Wind speed | Ws | m/s | 1.60 |
Wind direction | Wd | ° | 130.80 |
Dry-bulb temperature | T | °C | 27.36 |
Hourly maximum temperature | Tmax | °C | 28.14 |
Hourly minimum temperature | Tmin | °C | 26.50 |
Dew point temperature | Td | °C | 21.53 |
Hourly maximum dew point temperature | Tdmax | °C | 22.28 |
Hourly minimum dew point temperature | Tdmin | °C | 20.85 |
Total precipitation | P | mm | 0.20 |
Station atmospheric pressure | A | mb | 1009.40 |
Hourly maximum atmospheric pressure | Amax | mb | 1009.71 |
Hourly minimum atmospheric pressure | Amin | mb | 1009.21 |
Relative humidity | H | % | 71.32 |
Hourly maximum relative humidity | Hmax | % | 75.05 |
Hourly minimum relative humidity | Hmin | % | 68.11 |
Variable | Cluster 1 | Cluster 2 | Cluster 3 |
---|---|---|---|
R | t − 1, t − 2, t − 23, t − 24, t − 25, t − 48, t − 49, t − 72 | t − 1, t − 2, t − 23, t − 24, t − 25, t − 48, t − 49, t − 72 | t − 1, t − 2, t − 23, t − 24, t − 25, t − 47, t − 48, t − 72 |
T | t − 1, t − 2, t − 23, t − 24, t − 25, t − 48, t − 49, t − 72 | t − 1, t − 2, t − 23, t − 24, t − 25, t − 48, t − 49, t − 72 | t − 1, t − 2, t − 23, t − 24, t − 25, t − 48, t − 49, t − 72 |
H | t − 1, t − 2, t − 23, t − 24, t − 25, t − 48, t − 49, t − 72 | t − 1, t − 2, t − 23, t − 24, t − 25, t − 48, t − 49, t − 72 | t − 1, t − 2, t − 23, t − 24, t − 25, t − 48, t − 49, t − 72 |
Ws | t − 1, t − 2, t − 24, t − 25 | t − 1, t − 2, t − 24, t − 48 | t − 1, t − 2, t − 24, t − 25 |
Wg | t − 1, t − 24 | t − 1 | t − 1, t − 2 |
Wd | t − 1, t − 2 | t − 1 | t − 1, t − 2
Td | t − 1 | - | - |
A | t − 1, t − 2, t − 24 | - | - |
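The selected lags can be materialized as model inputs by shifting the hourly series. Below is a minimal pandas sketch, assuming the data are held in a DataFrame with columns named as in the data description table; the lag list shown is the Cluster 1 selection for global solar irradiation (R).

```python
import pandas as pd

def add_lags(df: pd.DataFrame, column: str, lags: list[int]) -> pd.DataFrame:
    """Append lagged copies of `column` (e.g. R at t-1, t-24) as new feature columns."""
    out = df.copy()
    for lag in lags:
        out[f"{column}_t-{lag}"] = out[column].shift(lag)
    return out

# Example: Cluster 1 lags selected for global solar irradiation (R).
r_lags = [1, 2, 23, 24, 25, 48, 49, 72]
df = pd.DataFrame({"R": range(200)})           # placeholder hourly series
features = add_lags(df, "R", r_lags).dropna()  # drop rows without a full lag history
```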
Algorithm | Hyperparameter | Cluster 1 | Cluster 2 | Cluster 3 |
---|---|---|---|---|
RF | max_depth: depth of the tree | 12 | 11 | 14
 | n_estimators: number of trees | 400 | 500 | 600
XGBT | learning_rate: weighting factor for learning | 0.1 | 0.1 | 0.1
 | max_depth: depth of the tree | 4 | 6 | 5
 | n_estimators: number of trees | 80 | 80 | 80
 | subsample: subsample ratio of the training set | 0.9 | 0.9 | 0.6
CatBoost | depth: depth of the tree | 6 | 6 | 8
 | L2_reg: coefficient at the L2 regularization term of the cost function | 4 | 4 | 2
 | learning_rate: used to reduce the gradient step | 0.05 | 0.05 | 0.05
 | iterations: maximum number of trees that can be built | 2000 | 2000 | 2000
AdaBoost | learning_rate: weight applied to each regressor at each boosting iteration | 0.1 | 0.2 | 0.2
 | n_estimators: number of trees | 50 | 30 | 30
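For illustration, the Cluster 1 values in the table map onto the corresponding library APIs as sketched below; treating `L2_reg` as CatBoost's `l2_leaf_reg` parameter is an assumption, and every unspecified setting is left at the library default. Values such as these could be searched with scikit-learn tooling such as `GridSearchCV` (cf. [31]), although the exact procedure is the one described in Section 3.5.

```python
from sklearn.ensemble import RandomForestRegressor, AdaBoostRegressor
from xgboost import XGBRegressor
from catboost import CatBoostRegressor

# Cluster 1 hyperparameters from the table above; all other settings keep defaults.
rf = RandomForestRegressor(max_depth=12, n_estimators=400)
xgbt = XGBRegressor(learning_rate=0.1, max_depth=4, n_estimators=80, subsample=0.9)
catboost = CatBoostRegressor(depth=6, l2_leaf_reg=4, learning_rate=0.05,
                             iterations=2000, verbose=0)
adaboost = AdaBoostRegressor(learning_rate=0.1, n_estimators=50)
```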
Algorithm | MAE | RMSE | MAPE (%) | R²
---|---|---|---|---
Cluster 1 | ||||
CatBoost | 0.299 | 0.426 | 35.505 | 0.798 |
RF | 0.306 | 0.427 | 35.963 | 0.797 |
XGBT | 0.308 | 0.430 | 38.393 | 0.794 |
AdaBoost | 0.364 | 0.476 | 58.716 | 0.748 |
Cluster 2 | ||||
CatBoost | 0.241 | 0.352 | 25.773 | 0.852 |
RF | 0.248 | 0.356 | 26.457 | 0.848 |
XGBT | 0.249 | 0.361 | 27.068 | 0.844 |
AdaBoost | 0.304 | 0.403 | 47.741 | 0.806 |
Cluster 3 | ||||
CatBoost | 0.235 | 0.359 | 17.571 | 0.887 |
RF | 0.245 | 0.372 | 18.195 | 0.878 |
XGBT | 0.243 | 0.363 | 18.373 | 0.884 |
AdaBoost | 0.312 | 0.425 | 30.492 | 0.841 |
Ensemble | MAE | RMSE | MAPE (%) | R² | Learning Time (s)
---|---|---|---|---|---
Cluster 1 | |||||
VOA 1 | 0.309 | 0.427 | 40.099 | 0.797 | 134.40 |
VOA 2 | 0.297 | 0.423 | 34.457 | 0.801 | 125.96 |
VOA 3 | 0.297 | 0.422 | 34.598 | 0.802 | 123.96 |
VOWA | 0.296 | 0.422 | 34.394 | 0.801 | 49.43 |
Cluster 2 | |||||
VOA 1 | 0.250 | 0.355 | 29.889 | 0.849 | 57.20 |
VOA 2 | 0.239 | 0.351 | 25.336 | 0.852 | 53.43 |
VOA 3 | 0.240 | 0.350 | 25.402 | 0.854 | 55.95 |
VOWA | 0.239 | 0.350 | 25.269 | 0.854 | 58.16 |
Cluster 3 | |||||
VOA 1 | 0.246 | 0.365 | 19.977 | 0.883 | 130.60 |
VOA 2 | 0.235 | 0.359 | 17.570 | 0.887 | 127.54 |
VOA 3 | 0.234 | 0.358 | 17.425 | 0.886 | 120.34 |
VOWA | 0.233 | 0.358 | 17.314 | 0.888 | 105.37 |
VOWA Weights | CatBoost | RF
---|---|---
Cluster 1 | 2 | 1 |
Cluster 2 | 2 | 1 |
Cluster 3 | 2 | 1 |
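One way to realize this 2:1 weighting in code is to restrict a scikit-learn `VotingRegressor` to the two retained learners and pass the weights explicitly; a minimal, self-contained sketch with untrained placeholder models follows.

```python
from sklearn.ensemble import RandomForestRegressor, VotingRegressor
from catboost import CatBoostRegressor

# CatBoost receives twice the weight of RF in every cluster, as in the table above.
vowa_model = VotingRegressor(
    estimators=[
        ("catboost", CatBoostRegressor(verbose=0)),
        ("rf", RandomForestRegressor()),
    ],
    weights=[2, 1],  # prediction = (2 * catboost + 1 * rf) / 3
)
# Fitting with vowa_model.fit(X_train, y_train) and calling vowa_model.predict(X_test)
# then yields the weighted-average forecast for the cluster's feature sets.
```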
Comparison (p-Value) | Cluster 1 | Cluster 2 | Cluster 3
---|---|---|---
CatBoost-VOWA | 4.33 × 10⁻⁵ | 0.006 | 0.003
RF-VOWA | 0.003 | 0.001 | 1.98 × 10⁻⁴
XGBT-VOWA | 3.45 × 10⁻⁴ | 3.33 × 10⁻⁶ | 0.008
AdaBoost-VOWA | 1.14 × 10⁻³⁶ | 3.80 × 10⁻²⁷ | 3.31 × 10⁻²⁵
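The significance test behind these p-values is described in Section 5.3 and is not restated in this outline. Purely as an illustration of how pairwise p-values can be obtained from the per-hour errors of two models, a Wilcoxon signed-rank test (an assumed choice, not necessarily the one used by the authors) could be computed as follows.

```python
import numpy as np
from scipy.stats import wilcoxon

# Placeholder absolute errors of two models on the same test hours
# (in practice these come from the actual forecasts and observations).
rng = np.random.default_rng(0)
err_vowa = np.abs(rng.normal(0.23, 0.10, size=1000))
err_catboost = np.abs(rng.normal(0.24, 0.10, size=1000))

# Paired test of whether the two error samples differ systematically.
stat, p_value = wilcoxon(err_vowa, err_catboost)
print(f"p-value: {p_value:.4g}")
```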
Model | MAE | RMSE | MAPE (%) | R²
---|---|---|---|---
VOWA | 1779 | 2300 | 3759 | 0.929
CatBoost | 2518 | 3091 | 5206 | 0.871
RF | 2031 | 2592 | 4183 | 0.909
XGBT | 2187 | 2764 | 4567 | 0.897
AdaBoost | 1796 | 2365 | 3828 | 0.924