Forecasting Energy Consumption of a Public Building Using Transformer and Support Vector Regression
Abstract
1. Introduction
- The performance of the transformer-based model is evaluated in the energy forecasting domain, offering insights into how it compares with a conventional SVR model.
- The sensitivity analysis shows that lengthening the input sequence is unnecessary, as the prediction is insensitive to data points far in the past.
2. Materials and Methods
2.1. Pre-Process Data
2.2. Transformer
2.2.1. Self-Attention
2.2.2. Skip Connection and Layer Normalisation
2.3. SVR
2.4. Hyperparameter Tuning
2.5. Metrics
3. Results and Discussions
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Uniyal, S.; Paliwal, R.; Kaphaliya, B.; Sharma, R.K. Human Overpopulation: Impact on Environment. In Megacities and Rapid Urbanization: Breakthroughs in Research and Practice; IGI Global: Hershey, PA, USA, 2020; pp. 20–30. [Google Scholar]
- Mahi, M.; Phoong, S.W.; Ismail, I.; Isa, C.R. Energy–finance–growth nexus in ASEAN-5 countries: An ARDL bounds test approach. Sustainability 2019, 12, 5. [Google Scholar] [CrossRef] [Green Version]
- Alola, A.A.; Joshua, U. Carbon emission effect of energy transition and globalization: Inference from the low-, lower middle-, upper middle-, and high-income economies. Environ. Sci. Pollut. Res. 2020, 27, 38276–38286. [Google Scholar] [CrossRef] [PubMed]
- Gu, W.; Zhao, X.; Yan, X.; Wang, C.; Li, Q. Energy technological progress, energy consumption, and CO2 emissions: Empirical evidence from China. J. Clean. Prod. 2019, 236, 117666. [Google Scholar] [CrossRef]
- Xiang, X.; Ma, M.; Ma, X.; Chen, L.; Cai, W.; Feng, W.; Ma, Z. Historical decarbonization of global commercial building operations in the 21st century. Appl. Energy 2022, 322, 119401. [Google Scholar] [CrossRef]
- Zhao, H.-x.; Magoulès, F. A review on the prediction of building energy consumption. Renew. Sustain. Energy Rev. 2012, 16, 3586–3592. [Google Scholar] [CrossRef]
- Abiodun, O.I.; Jantan, A.; Omolara, A.E.; Dada, K.V.; Mohamed, N.A.; Arshad, H. State-of-the-art in artificial neural network applications: A survey. Heliyon 2018, 4, e00938. [Google Scholar] [CrossRef] [Green Version]
- Dash, S.K.; Roccotelli, M.; Khansama, R.R.; Fanti, M.P.; Mangini, A.M. Long Term Household Electricity Demand Forecasting Based on RNN-GBRT Model and a Novel Energy Theft Detection Method. Appl. Sci. 2021, 11, 8612. [Google Scholar] [CrossRef]
- Alzubaidi, L.; Zhang, J.; Humaidi, A.J.; Al-Dujaili, A.; Duan, Y.; Al-Shamma, O.; Santamaría, J.; Fadhel, M.A.; Al-Amidie, M.; Farhan, L. Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions. J. Big Data 2021, 8, 1–74. [Google Scholar] [CrossRef]
- Maldonado, S.; González, A.; Crone, S. Automatic time series analysis for electric load forecasting via support vector regression. Appl. Soft Comput. 2019, 83, 105616. [Google Scholar] [CrossRef]
- Wei, N.; Li, C.; Peng, X.; Zeng, F.; Lu, X. Conventional models and artificial intelligence-based models for energy consumption forecasting: A review. J. Pet. Sci. Eng. 2019, 181, 106187. [Google Scholar] [CrossRef]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 5998–6008. [Google Scholar]
- Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016. [Google Scholar]
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster r-cnn: Towards real-time object detection with region proposal networks. Adv. Neural Inf. Process. Syst. 2015, 28. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Carion, N.; Massa, F.; Synnaeve, G.; Usunier, N.; Kirillov, A.; Zagoruyko, S. End-to-end object detection with transformers. In European Conference on Computer Vision; Springer: Cham, Switzerland, 2020. [Google Scholar]
- Chen, M.; Radford, A.; Child, R.; Wu, J.; Jun, H.; Luan, D.; Sutskever, I. Generative pretraining from pixels. In Proceedings of the 37th International Conference on Machine Learning (PMLR), Virtual, 13–18 July 2020. [Google Scholar]
- Saoud, L.S.; Al-Marzouqi, H.; Hussein, R. Household Energy Consumption Prediction Using the Stationary Wavelet Transform and Transformers. IEEE Access 2022, 10, 5171–5183. [Google Scholar] [CrossRef]
- Huang, J.; Algahtani, M.; Kaewunruen, S. Energy Forecasting in a Public Building: A Benchmarking Analysis on Long Short-Term Memory (LSTM), Support Vector Regression (SVR), and Extreme Gradient Boosting (XGBoost) Networks. Appl. Sci. 2022, 12, 9788. [Google Scholar] [CrossRef]
- Yu, Y.; Si, X.; Hu, C.; Zhang, J. A review of recurrent neural networks: LSTM cells and network architectures. Neural Comput. 2019, 31, 1235–1270. [Google Scholar] [CrossRef]
- Chen, T.; He, T. Xgboost: Extreme Gradient Boosting, R package version 0.4-2; The XGBoost Contributors: San Francisco, CA, USA, 2015. [Google Scholar]
- European Union. Directive 2010/31/Eu of the European Parliament and of the Council of 19 May 2010 on the Energy Performance of Buildings; European Union: Brussels, Belgium, 2010; pp. 13–35. [Google Scholar]
- Salvalai, G.; Sesana, M.M. Monitoring Approaches for New-Generation Energy Performance Certificates in Residential Buildings. Buildings 2022, 12, 469. [Google Scholar] [CrossRef]
- Hochreiter, S. The vanishing gradient problem during learning recurrent neural nets and problem solutions. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 1998, 6, 107–116. [Google Scholar] [CrossRef] [Green Version]
- Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 6–11 July 2015. [Google Scholar]
- Ba, J.L.; Kiros, J.R.; Hinton, G.E. Layer normalization. arXiv 2016, arXiv:1607.06450. [Google Scholar]
- Vapnik, V. The Support Vector Method of Function Estimation. In Nonlinear Modeling; Springer: Boston, MA, USA, 1998; pp. 55–85. [Google Scholar]
- Awad, M.; Khanna, R. Support Vector Regression. In Efficient Learning Machines; Apress Open: Berkeley, CA, USA, 2015; pp. 67–80. [Google Scholar]
- LaValle, S.M.; Branicky, M.S.; Lindemann, S.R. On the relationship between classical grid search and probabilistic roadmaps. Int. J. Robot. Res. 2004, 23, 673–692. [Google Scholar] [CrossRef]
- Bergstra, J.; Bengio, Y. Random search for hyper-parameter optimization. J. Mach. Learn. Res. 2012, 13, 281–305. [Google Scholar]
- Ying, X. An overview of overfitting and its solutions. J. Phys. Conf. Ser. 2019, 1168, 022022. [Google Scholar] [CrossRef]
- Kreider, J.F.; Haberl, J.S. Predicting hourly building energy use: The great energy predictor shootout—Overview and discussion of results. In Proceedings of the 1994 American Society of Heating, Refrigerating, and Air Conditioning Engineers (ASHRAE) Annual Meeting, Orlando, FL, USA, 25–29 June 1994. [Google Scholar]
- Chai, T.; Draxler, R.R. Root mean square error (RMSE) or mean absolute error (MAE)?–Arguments against avoiding RMSE in the literature. Geosci. Model Dev. 2014, 7, 1247–1250. [Google Scholar] [CrossRef] [Green Version]
- Renaud, O.; Victoria-Feser, M.-P. A robust coefficient of determination for regression. J. Stat. Plan. Inference 2010, 140, 1852–1862. [Google Scholar] [CrossRef]
- Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar]
- Gulli, A.; Pal, S. Deep Learning with Keras; Packt Publishing Ltd.: Birmingham, UK, 2017. [Google Scholar]
- Abadi, M.; Barham, P.; Chen, J.; Chen, Z.; Davis, A.; Dean, J.; Devin, M.; Ghemawat, S.; Irving, G.; Isard, M.; et al. {TensorFlow}: A system for {Large-Scale} machine learning. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation, Savannah, GA, USA, 2–4 November 2016. [Google Scholar]
- Buitinck, L.; Louppe, G.; Blondel, M.; Pedregosa, F.; Mueller, A.; Grisel, O.; Niculae, V.; Prettenhofer, P.; Gramfort, A.; Grobler, J.; et al. API design for machine learning software: Experiences from the scikit-learn project. arXiv 2013, arXiv:1309.0238. [Google Scholar]
- Edwards, R.E.; New, J.; Parker, L.E. Predicting future hourly residential electrical consumption: A machine learning case study. Energy Build. 2012, 49, 591–603. [Google Scholar] [CrossRef]
- Kelly, J.; Knottenbelt, W. The UK-DALE dataset, domestic appliance-level electricity demand and whole-house demand from five UK homes. Sci. Data 2015, 2, 150007. [Google Scholar] [CrossRef] [Green Version]
- Dimopoulos, T.; Bakas, N. Sensitivity Analysis of Machine Learning Models for the Mass Appraisal of Real Estate. Case Study of Residential Units in Nicosia, Cyprus. Remote Sens. 2019, 11, 3047. [Google Scholar] [CrossRef] [Green Version]
- Gevrey, M.; Dimopoulos, I.; Lek, S. Review and comparison of methods to study the contribution of variables in artificial neural network models. Ecol. Model. 2003, 160, 249–264. [Google Scholar] [CrossRef]
- Olden, J.D.; Jackson, D.A. Illuminating the “black box”: A randomization approach for understanding variable contributions in artificial neural networks. Ecol. Model. 2002, 154, 135–150. [Google Scholar] [CrossRef]
- Kaewunruen, S.; Sussman, J.M.; Matsumoto, A. Grand Challenges in Transportation and Transit Systems. Front. Built Environ. 2016, 2, 4. [Google Scholar] [CrossRef]
Algorithms | Hyperparameters | Searching Space
---|---|---
Transformer | No. of attention heads | 1–256
 | No. of units in the feedforward layer | 32–256
 | No. of attention blocks | 1–20
 | Dropout rate | 1 × 10−1–9 × 10−1
 | Learning rate | 1 × 10−6–1 × 10−1
SVR | Epsilon | 1 × 10−2–2 × 10−1
 | C | 1–2000
 | Kernel | RBF
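Section 2.4 tunes the hyperparameters above over these ranges by search (the paper cites both grid and random search). The following is a minimal sketch of drawing one random trial from the Transformer search space; the dictionary keys and the log-uniform sampling of the learning rate are illustrative assumptions, not the authors' exact tuning script.

```python
import random

# Search space taken from the table above; the sampling code itself
# is an illustrative sketch, not the authors' tuning implementation.
TRANSFORMER_SPACE = {
    "attention_heads": lambda: random.randint(1, 256),
    "ff_units": lambda: random.randint(32, 256),
    "attention_blocks": lambda: random.randint(1, 20),
    "dropout": lambda: random.uniform(0.1, 0.9),
    # assumption: learning rates are usually sampled log-uniformly
    "learning_rate": lambda: 10 ** random.uniform(-6, -1),
}

def sample(space):
    """Draw one random hyperparameter configuration from the space."""
    return {name: draw() for name, draw in space.items()}

random.seed(0)
trial = sample(TRANSFORMER_SPACE)
```

Each trial would then be used to build and evaluate one candidate model; random search typically finds good configurations with far fewer trials than an exhaustive grid over the same ranges.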
Models | CV(RMSE) | R² | RMSE
---|---|---|---
Transformer | 17.0652% | 0.8238 | 76.9611
SVR | 12.1949% | 0.9196 | 54.9363
SVR in [18] | N/A | N/A | 0.0296
Transformer in [18] | N/A | N/A | 0.0182
Proposed method in [18] | N/A | N/A | 0.009
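The metrics in the table above (Section 2.5) can be computed from first principles. In this sketch the toy arrays are invented for illustration, and CV is computed as the coefficient of variation of the RMSE (the RMSE normalised by the mean observation, in percent), which is consistent with the tabulated CV and RMSE pairs.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def cv_rmse(y_true, y_pred):
    """Coefficient of variation of the RMSE, in percent."""
    return 100.0 * rmse(y_true, y_pred) / float(np.mean(y_true))

# Toy hourly loads (kWh), invented purely to exercise the functions.
y_true = np.array([500.0, 520.0, 480.0, 510.0])
y_pred = np.array([505.0, 515.0, 470.0, 512.0])
```

Because CV(RMSE) is scale-free, it allows the comparison across datasets with different load magnitudes, whereas the raw RMSE values from [18] (computed on normalised data) are not directly comparable with those of this study.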
Algorithms | Hyperparameters | Tuned Value
---|---|---
Transformer | No. of attention heads | 256
 | No. of units in the feedforward layer | 224
 | No. of attention blocks | 4
 | Dropout rate | 0.2
 | Learning rate | 0.0013
SVR | Epsilon | 0.089
 | C | 19
 | Kernel | RBF
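To show how the tuned SVR configuration from the table above plugs into scikit-learn (the library used in the study), the sketch below fits it to a synthetic sine-wave load series standing in for the building's consumption data; the data, the 24-step input window, and the scaling pipeline are assumptions for illustration, not the paper's Section 2.1 pre-processing.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic daily-cycle load series standing in for the real data.
rng = np.random.default_rng(42)
t = np.arange(500)
load = 500 + 50 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 5, t.size)

# Sliding-window features: predict the next step from the previous 24.
window = 24  # hypothetical input length
X = np.stack([load[i:i + window] for i in range(load.size - window)])
y = load[window:]

# SVR configured with the tuned hyperparameters from the table above.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=19, epsilon=0.089))
model.fit(X[:400], y[:400])
pred = model.predict(X[400:])
```

The epsilon-insensitive tube (epsilon = 0.089) controls how large an error is tolerated without penalty, while C = 19 sets the trade-off between model flatness and fitting the training residuals.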
Share and Cite
Huang, J.; Kaewunruen, S. Forecasting Energy Consumption of a Public Building Using Transformer and Support Vector Regression. Energies 2023, 16, 966. https://doi.org/10.3390/en16020966