Data Enrichment as a Method of Data Preprocessing to Enhance Short-Term Wind Power Forecasting
Abstract
1. Introduction
- It can be combined with other data preprocessing methods and is also applicable to various modeling algorithms;
- It serves the purpose of adding as much valuable weather prediction information as possible into the inputs of the wind power forecasting models.
- To propose the concept and a methodological framework of data enrichment to improve short-term wind power forecasting performance;
- To put forward a set of executable data enrichment steps and validate the effectiveness of each step in improving wind power forecasting performance;
- To verify the general applicability of the proposed data enrichment method by applying it to one machine learning model and one deep learning model for short-term wind power forecasting at three different real wind farms.
2. Literature Review
2.1. Data Preprocessing Methods Used in Wind Power Forecasting
- Data organization methods reorganize raw datasets into datasets of different sizes, scales, and sampling frequencies;
- Data cleaning methods correct abnormal data and errors;
- Dimensionality reduction methods reduce the number of features or transform the original features to shrink the feature space and prevent overfitting.
2.2. Data Enrichment
2.3. Summary
3. Wind Data Enrichment Method
3.1. Concept
3.2. Overall Framework
- Add error features of the weather prediction sources;
- Add features of neighboring weather prediction data nodes to take advantage of the spatial continuity of the atmosphere;
- Add time series features of the weather prediction sources to take advantage of the temporal continuity of the atmosphere;
- Add multiple complementary weather prediction sources.
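As a minimal sketch of how the error-feature and time-series-feature steps might look in practice, the snippet below derives both from an hourly forecast series. The error definition (forecast minus measurement at the latest hour with a known measurement), the window sizes, and all names are assumptions made for illustration; they are not taken from the formulation in this paper.

```python
from typing import List, Optional


def lag_features(forecast: List[float], t: int, n_before: int, n_after: int) -> List[Optional[float]]:
    """Forecast values from hour t - n_before to hour t + n_after.

    Positions that fall outside the series are filled with None.
    """
    window = []
    for k in range(t - n_before, t + n_after + 1):
        window.append(forecast[k] if 0 <= k < len(forecast) else None)
    return window


def recent_error(forecast: List[float], measured: List[float], t: int, horizon: int) -> Optional[float]:
    """Forecast error at the latest hour with a known measurement.

    Assumption: when forecasting hour t with a lead time of `horizon` hours,
    measurements are available only up to hour t - horizon.
    """
    k = t - horizon
    if k < 0:
        return None
    return forecast[k] - measured[k]


forecast_ws = [5.0, 6.0, 7.5, 8.0, 6.5]  # hypothetical forecast wind speed, m/s
measured_ws = [4.5, 6.2, 7.0, 8.4, 6.0]  # hypothetical measured wind speed, m/s

# Enriched input row for hour t = 3 with a two-hour lead time.
row = {
    "ws_window": lag_features(forecast_ws, t=3, n_before=1, n_after=1),
    "ws_error": recent_error(forecast_ws, measured_ws, t=3, horizon=2),
}
```

Forecast values after hour t can be used in the window because weather predictions for those hours already exist at forecast time, whereas measurements do not.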
3.3. Add Error Features of Weather Prediction Sources
3.4. Add Features of Weather Prediction at Neighboring Nodes
- Determine the adjacent nodes: As Figure 3 shows, for models forecasting at the wind turbine level the adjacent nodes are the eight nodes surrounding the box where the turbine is located. For models forecasting at the wind farm level, the adjacent nodes can be the eight areas surrounding the wind farm area. Each adjacent node is as large as the wind farm area. Neighboring nodes are selected based on physical proximity to reflect the physical continuity of the atmosphere;
- Calculate the combined weather prediction feature of each adjacent node: If the weather variable is a vector, such as wind speed, the new feature can be formulated by projecting the vector onto the direction from the center of the wind farm node to the center of the adjacent node, as Equation (2) shows. If the weather variable is a scalar, the new feature can be formulated by first calculating the gradient of the scalar and then projecting the gradient in the same way as a vector variable.
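The projection described above can be sketched as follows. This is an illustrative reading of the step, not a reproduction of Equation (2): it assumes a regular grid, wind forecasts given as (u, v) components (eastward, northward), and a finite-difference approximation for scalar variables. All names are hypothetical.

```python
import math
from typing import Dict, Tuple

# Grid offsets (east, north) from the wind farm node to its eight adjacent nodes.
NEIGHBOR_OFFSETS: Dict[str, Tuple[int, int]] = {
    "N": (0, 1), "NE": (1, 1), "E": (1, 0), "SE": (1, -1),
    "S": (0, -1), "SW": (-1, -1), "W": (-1, 0), "NW": (-1, 1),
}


def project(vec: Tuple[float, float], offset: Tuple[int, int]) -> float:
    """Scalar projection of `vec` onto the unit vector pointing along `offset`."""
    dx, dy = offset
    norm = math.hypot(dx, dy)
    return (vec[0] * dx + vec[1] * dy) / norm


def neighbor_features(wind_by_node: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """One projected feature per adjacent node, from that node's (u, v) forecast."""
    return {name: project(vec, NEIGHBOR_OFFSETS[name])
            for name, vec in wind_by_node.items()}


def scalar_feature(center_value: float, neighbor_value: float, distance: float) -> float:
    """Finite-difference estimate of a scalar's gradient along the direction
    from the wind farm node to the adjacent node (an assumed approximation)."""
    return (neighbor_value - center_value) / distance
```

With this sign convention, a positive feature means the forecast wind at the adjacent node has a component pointing away from the wind farm, and a negative one means it points toward the farm.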
3.5. Add Time Series Features of Weather Prediction Sources
3.6. Add Complementary Weather Prediction Sources
- Calculate the average forecast accuracy and forecast data availability of each weather prediction source for the problem period. The problem period can be all the historical periods when the weather prediction and WTV for the problem location are available, or for artificial intelligence algorithms, the time covering the training and test dataset. The average forecast accuracy can be measured by the root mean square error (RMSE) between the predicted and observed values;
- Select the weather prediction source(s) that the wind power forecasting model already uses as inputs, or the single weather prediction source with the lowest RMSE and highest data availability, as the first weather prediction source in the weather prediction base. Calculate the accuracy of the wind power forecasting model with this base as the reference accuracy;
- Add one more weather prediction source to the wind power forecasting model as a new input, and calculate the accuracy of the model with the enlarged weather prediction base. If the accuracy is higher than that obtained with the previous base, the newly added weather prediction source is kept as one of the sources in the weather prediction base;
- Repeat the previous step until all of the available weather prediction sources have been tried. The resulting weather prediction base then contains all of the available complementary weather prediction sources.
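The selection procedure above can be sketched as a greedy loop. The snippet is a simplified illustration under stated assumptions: `model_accuracy` is a hypothetical callback that trains and evaluates the wind power forecasting model for a given list of sources, candidates are tried in order of increasing RMSE, and data availability is ignored for brevity.

```python
import math
from typing import Callable, Dict, List, Sequence


def rmse(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Root mean square error between two equally long series."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


def select_sources(
    forecasts: Dict[str, Sequence[float]],
    observed: Sequence[float],
    model_accuracy: Callable[[List[str]], float],
) -> List[str]:
    """Greedily build a weather prediction base of complementary sources."""
    # Rank sources by the accuracy of their weather forecast (lowest RMSE first).
    ranked = sorted(forecasts, key=lambda name: rmse(forecasts[name], observed))

    # The best-ranked source seeds the base and sets the reference accuracy.
    base = [ranked[0]]
    best = model_accuracy(base)

    # Keep a candidate only if it improves the wind power forecast accuracy.
    for candidate in ranked[1:]:
        trial = model_accuracy(base + [candidate])
        if trial > best:
            base.append(candidate)
            best = trial
    return base
```

Because each candidate requires retraining the forecasting model, the cost of this loop grows linearly with the number of available sources.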
4. Experimental Setting
4.1. Baseline Wind Power Forecasting Models
4.2. Datasets
4.3. Performance Evaluation Metric
5. Results and Discussion
5.1. General Effectiveness of the Data Enrichment Method
- The review in [68] published in 2021 listed the percentage error reduction in 41 hybrid wind power forecasting models compared to their benchmarks. There were 26 models designed for short-term wind power or wind speed forecasting, from which eight were evaluated by RMSE. The average RMSE reduction in the models proposed in the eight studies was 24.0%, which is comparable to the error reduction achieved by the proposed data enrichment method in this paper;
- The improvement brought by other published methods that also use XGBoost and LSTM as benchmarks is comparable to the improvement brought by the proposed data enrichment method. Xiong et al. [73] proposed an improved XGBoost algorithm via Bayesian hyperparameter optimization (BH-XGBoost) and verified the efficacy of the improvement relative to XGBoost on a 200 MW wind farm. The verification results showed that BH-XGBoost reduced RMSE by 10.2% to 21.4%. Qin et al. proposed an improved LSTM algorithm that combines variational mode decomposition (VMD), the maximum relevance and minimum redundancy algorithm (mRMR), the long short-term memory neural network (LSTM), and the firefly algorithm (FA). Compared to LSTM, the combined method achieved an RMSE reduction of 27.9%;
- Table 6 shows that the proposed data enrichment method can help wind farm operators avoid 1.4% to 5.5% of the penalized power. Assuming a wind farm with an installed capacity of 180 MW, 2000 annual full-load hours, and a feed-in tariff of 0.4 RMB/kWh, the annual savings could amount to CNY 2 to 7 million.
- XGBoost outperformed LSTM for all three wind farms, regardless of whether data enrichment was applied;
- For all of the wind farms, the proposed data enrichment method improved the accuracy of LSTM more than that of XGBoost;
- The accuracies of the two models were closer with the aid of the proposed data enrichment method, as can be observed in Figure 4. For wind farms № 1 and № 2, the forecast accuracy of LSTM caught up with that of XGBoost and became almost the same after the introduction of the data enrichment step.
- The data enrichment enables the original information in the input data to be better learned by LSTM; XGBoost has already learned this prior to data enrichment;
- LSTM can better learn the additional information brought by the data enrichment than XGBoost.
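The annual savings estimate quoted from Table 6 can be checked with a few lines of arithmetic. The sketch assumes that the avoided penalized power is a share of the annual energy production and that it is valued at the feed-in tariff; how the quoted figure was rounded is not stated here.

```python
installed_capacity_mw = 180    # installed capacity, MW
full_load_hours = 2000         # annual full-load hours, h
tariff_cny_per_kwh = 0.4       # feed-in tariff, CNY/kWh

# Annual energy production in kWh (1 MW = 1000 kW).
annual_energy_kwh = installed_capacity_mw * 1000 * full_load_hours
annual_revenue_cny = annual_energy_kwh * tariff_cny_per_kwh


def avoided_penalty_cny(penalty_reduction: float) -> float:
    """Value of the avoided penalized power for a given reduction share."""
    return annual_revenue_cny * penalty_reduction


low = avoided_penalty_cny(0.014)   # smallest reduction in Table 6
high = avoided_penalty_cny(0.055)  # largest reduction in Table 6
print(f"{low / 1e6:.1f} to {high / 1e6:.1f} million CNY")
```

Under these assumptions the product gives roughly CNY 2.0 million at the low end and CNY 7.9 million at the high end, which is of the same order as the range quoted above.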
5.2. Effectiveness of Each Step of the Data Enrichment Method
- While adding error features of weather prediction sources did little to improve the forecast accuracy of XGBoost, it significantly increased the forecast accuracy of LSTM. Conversely, adding time series features of weather prediction sources considerably increased the forecast accuracy of XGBoost but hardly improved that of LSTM. The contrast might well reflect the merits and demerits of the two models: XGBoost is good at analyzing the relationship between different variables simultaneously, while the advantage of LSTM lies in learning the relationship between the historical and future values of time series. That being the case, XGBoost might already infer the error features of different weather prediction sources from the historical weather prediction and wind power production, whereas LSTM might not. The step of adding error features explicitly expresses the error characteristics of weather prediction for LSTM to learn. Similarly, LSTM might easily learn the temporal continuity of wind from time series prior to data enrichment, making the step of adding time series features of numerical weather prediction (NWP) almost redundant for LSTM. However, adding the time series features of NWP is a helpful complement that allows XGBoost to learn the time series characteristics. Similar phenomena have also been observed and discussed in studies on forecasting models in other areas [2,74,75];
- Adding features of neighboring weather prediction nodes only slightly improved the accuracy for both models;
- Adding complementary weather prediction sources could be instrumental in improving the forecast accuracy of both models since the step supplied more weather prediction information to the model.
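The per-step contributions discussed above can be read directly from the accuracies reported for wind farm № 1 in the step-by-step table. The short sketch below computes the gain in percentage points added by each step, assuming the steps are applied cumulatively in the order listed in that table.

```python
steps = [
    "baseline",
    "error features",
    "neighboring nodes",
    "time series features",
    "complementary sources",
]

# Accuracy (%) after each cumulative step for wind farm № 1.
accuracy = {
    "XGBoost": [69.1, 69.1, 69.3, 71.2, 72.6],
    "LSTM": [62.0, 68.9, 69.0, 69.1, 72.3],
}


def step_gains(acc):
    """Accuracy gain in percentage points contributed by each successive step."""
    return [round(after - before, 1) for before, after in zip(acc, acc[1:])]


gains = {
    model: dict(zip(steps[1:], step_gains(acc)))
    for model, acc in accuracy.items()
}

for model, per_step in gains.items():
    print(model, per_step)
```

The largest single gain for XGBoost comes from the time series features (1.9 points), while for LSTM it comes from the error features (6.9 points); complementary sources help both models (1.4 and 3.2 points).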
6. Limitations
7. Conclusions and Recommendations
- Application of the data enrichment method to more types of wind power forecasting models to further identify the adaptability and performance of the method;
- Extension of the data enrichment method to long-term and very-short-term wind power forecasting models or other forecasting problems related to weather data;
- Exploration of the relationship between the intrinsic strengths and weaknesses of forecasting models and the data enrichment method;
- In-depth study into the optimized calculation of each data enrichment step;
- Design of other possible methods of data enrichment.
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Alkhayat, G.; Mehmood, R. A review and taxonomy of wind and solar energy forecasting methods based on deep learning. Energy AI 2021, 4, 100060.
- Hanifi, S.; Liu, X.; Lin, Z.; Lotfian, S. A critical review of wind power forecasting methods—Past, present and future. Energies 2020, 13, 3764.
- Tascikaraoglu, A.; Uzunoglu, M. A review of combined approaches for prediction of short-term wind speed and power. Renew. Sustain. Energy Rev. 2014, 34, 243–254.
- Soman, S.S.; Zareipour, H.; Malik, O.; Mandal, P. A review of wind power and wind speed forecasting methods with different time horizons. In Proceedings of the North-American Power Symposium (NAPS) 2010, Arlington, TX, USA, 26–28 September 2010; pp. 1–8.
- Zhao, E.; Sun, S.; Wang, S. New developments in wind energy forecasting with artificial intelligence and big data: A scientometric insight. Data Sci. Manag. 2022, 5, 84–95.
- Wang, Y.; Zou, R.; Liu, F.; Zhang, L.; Liu, Q. A review of wind speed and wind power forecasting with deep neural networks. Appl. Energy 2021, 304, 117766.
- Liu, H.; Chen, C. Data processing strategies in wind energy forecasting models and applications: A comprehensive review. Appl. Energy 2019, 249, 392–408.
- Lipu, M.S.H.; Miah, M.S.; Hannan, M.A.; Hussain, A.; Sarker, M.R.; Ayob, A.; Saad, M.H.M.; Mahmud, M.S. Artificial intelligence based hybrid forecasting approaches for wind power generation: Progress, challenges and prospects. IEEE Access 2021, 9, 102460–102489.
- Reichstein, M.; Camps-Valls, G.; Stevens, B.; Jung, M.; Denzler, J.; Carvalhais, N.; Prabhat. Deep learning and process understanding for data-driven Earth system science. Nature 2019, 566, 195–204.
- Knapp, E.D.; Langill, J.T. Exception, anomaly, and threat detection. In Industrial Network Security—Securing Critical Infrastructure Networks for Smart Grid, SCADA, and Other Industrial Control Systems; Elsevier: Amsterdam, The Netherlands, 2015.
- Buckinx, W.; Verstraeten, G.; Van den Poel, D. Predicting customer loyalty using the internal transactional database. Expert Syst. Appl. 2007, 32, 125–134.
- Azad, S.A.; Wasimi, S.; Ali, A.B.M.S. Business Data Enrichment: Issues and Challenges. In Proceedings of the 2018 5th Asia-Pacific World Congress on Computer Science and Engineering (APWC on CSE), Nadi, Fiji, 10–12 December 2018; pp. 98–102.
- Reis Filho, I.J.; Marcacini, R.M.; Rezende, S.O. On the enrichment of time series with textual data for forecasting agricultural commodity prices. MethodsX 2022, 9, 101758.
- Pombo, D.V.; Gocmen, T.; Das, K.; Sorensen, P. Multi-horizon data-driven wind power forecast: From nowcast to 2 days-ahead. In Proceedings of the 2021 International Conference on Smart Energy Systems and Technologies (SEST), Vaasa, Finland, 6–8 September 2021; pp. 1–6.
- He, Y.; Cao, C.; Wang, S.; Fu, H. Nonparametric probabilistic load forecasting based on quantile combination in electrical power systems. Appl. Energy 2022, 322, 119507.
- Chen, X.; Zhao, J.; Jia, X.; Li, Z. Multi-step wind speed forecast based on sample clustering and an optimized hybrid system. Renew. Energy 2021, 165, 595–611.
- Takahashi, Y.; Fujimoto, Y.; Hayashi, Y. Forecast of infrequent wind power ramps based on data sampling strategy. Energy Procedia 2017, 135, 496–503.
- Wang, J.; Li, Q.; Zhang, H.; Wang, Y. A deep-learning wind speed interval forecasting architecture based on modified scaling approach with feature ranking and two-output gated recurrent unit. Expert Syst. Appl. 2023, 211, 118419.
- Liu, Z.; Hara, R.; Kita, H. Hybrid forecasting system based on data area division and deep learning neural network for short-term wind speed forecasting. Energy Convers. Manag. 2021, 238, 114136.
- Qiao, B.; Liu, J.; Wu, P.; Teng, Y. Wind power forecasting based on variational mode decomposition and high-order fuzzy cognitive maps. Appl. Soft Comput. 2022, 129, 109586.
- Manero, J.; Béjar, J.; Cortés, U. “Dust in the wind...”, deep learning application to wind energy time series forecasting. Energies 2019, 12, 2385.
- Huang, Y.; Liu, G.; Hu, W. Priori-guided and data-driven hybrid model for wind power forecasting. ISA Trans. 2022, in press.
- Zhang, Y.; Li, Y.; Zhang, G. Short-term wind power forecasting approach based on Seq2Seq model using NWP data. Energy 2020, 213, 118371.
- Qian, Z.; Pei, Y.; Zareipour, H.; Chen, N. A review and discussion of decomposition-based hybrid models for wind energy forecasting applications. Appl. Energy 2019, 235, 939–953.
- Chen, Y.; Yu, S.; Islam, S.; Lim, C.P.; Muyeen, S.M. Decomposition-based wind power forecasting models and their boundary issue: An in-depth review and comprehensive discussion on potential solutions. Energy Rep. 2022, 8, 8805–8820.
- Dong, W.; Sun, H.; Tan, J.; Li, Z.; Zhang, J.; Zhao, Y.Y. Short-term regional wind power forecasting for small datasets with input data correction, hybrid neural network, and error analysis. Energy Rep. 2021, 7, 7675–7692.
- Chen, H.; Birkelund, Y.; Zhang, Q. Data-augmented sequential deep learning for wind power forecasting. Energy Convers. Manag. 2021, 248, 114790.
- Niu, D.; Sun, L.; Yu, M.; Wang, K. Point and interval forecasting of ultra-short-term wind power based on a data-driven method and hybrid deep learning model. Energy 2022, 254, 124384.
- Yan, J.; Zhang, H.; Liu, Y.; Han, S.; Li, L.; Lu, Z. Forecasting the high penetration of wind power on multiple scales using multi-to-multi mapping. IEEE Trans. Power Syst. 2018, 33, 3276–3284.
- Delle Monache, L.; Nipen, T.; Liu, Y.; Roux, G.; Stull, R. Kalman filter and analog schemes to postprocess numerical weather predictions. Mon. Weather Rev. 2011, 139, 3554–3570.
- Cassola, F.; Burlando, M. Wind speed and wind energy forecast through Kalman filtering of Numerical Weather Prediction model output. Appl. Energy 2012, 99, 154–166.
- Hur, S. Short-term wind speed prediction using Extended Kalman filter and machine learning. Energy Rep. 2021, 7, 1046–1054.
- Louka, P.; Galanis, G.; Siebert, N.; Kariniotakis, G.; Katsafados, P.; Pytharoulis, I.; Kallos, G. Improvements in wind speed forecasts for wind power prediction purposes using Kalman filtering. J. Wind. Eng. Ind. Aerodyn. 2008, 96, 2348–2362.
- Zhang, C.; Zhou, J.; Li, C.; Fu, W.; Peng, T. A compound structure of ELM based on feature selection and parameter optimization using hybrid backtracking search algorithm for wind speed forecasting. Energy Convers. Manag. 2017, 143, 360–376.
- Li, Y.; Peng, T.; Zhang, C.; Sun, W.; Hua, L.; Ji, C.; Muhammad Shahzad, N. Multi-step ahead wind speed forecasting approach coupling maximal overlap discrete wavelet transform, improved grey wolf optimization algorithm and long short-term memory. Renew. Energy 2022, 196, 1115–1126.
- Zha, W.; Liu, J.; Li, Y.; Liang, Y. Ultra-short-term power forecast method for the wind farm based on feature selection and temporal convolution network. ISA Trans. 2022, 129, 405–414.
- Lu, P.; Ye, L.; Zhao, Y.; Dai, B.; Pei, M.; Li, Z. Feature extraction of meteorological factors for wind power prediction based on variable weight combined method. Renew. Energy 2021, 179, 1925–1939.
- Lu, P.; Ye, L.; Pei, M.; Zhao, Y.; Dai, B.; Li, Z. Short-term wind power forecasting based on meteorological feature extraction and optimization strategy. Renew. Energy 2022, 184, 642–661.
- Ai, X.; Li, S.; Xu, H. Short-term wind speed forecasting based on two-stage preprocessing method, sparrow search algorithm and long short-term memory neural network. Energy Rep. 2022, 8, 14997–15010.
- Wang, Y.; Wang, J.; Li, Z.; Yang, H.; Li, H. Design of a combined system based on two-stage data preprocessing and multi-objective optimization for wind speed prediction. Energy 2021, 231, 121125.
- Yang, Z.; Wang, J. A combination forecasting approach applied in multistep wind speed forecasting based on a data processing strategy and an optimized artificial intelligence algorithm. Appl. Energy 2018, 230, 1108–1125.
- Niu, X.; Wang, J. A combined model based on data preprocessing strategy and multi-objective optimization algorithm for short-term wind speed forecasting. Appl. Energy 2019, 241, 519–539.
- Deng, Y.; Wang, B.; Lu, Z. A hybrid model based on data preprocessing strategy and error correction system for wind speed forecasting. Energy Convers. Manag. 2020, 212, 112779.
- Tian, C.; Hao, Y.; Hu, J. A novel wind speed forecasting system based on hybrid data preprocessing and multi-objective optimization. Appl. Energy 2018, 231, 301–319.
- Li, C.; Zhu, Z.; Yang, H.; Li, R. An innovative hybrid system for wind speed forecasting based on fuzzy preprocessing scheme and multi-objective optimization. Energy 2019, 174, 1219–1237.
- Needham, D. The Enrichment Game: A Story about Making Data Powerful, 1st ed.; Technics Publications: Basking Ridge, NJ, USA, 2021.
- Allen, M.; Cervo, D. Chapter 9—Data Quality Management. In Multi-Domain Master Data Management; Allen, M., Cervo, D., Eds.; Morgan Kaufmann: Boston, MA, USA, 2015; pp. 131–160.
- Schön, C.; Dittrich, J.; Müller, R. The Error is the Feature: How to Forecast Lightning using a Model Prediction Error. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Anchorage, AK, USA, 4–8 August 2019.
- Miao, L.; Yu, D.; Pang, Y.; Zhai, Y. Temperature Prediction of Chinese Cities Based on GCN-BiLSTM. Appl. Sci. 2022, 12, 11833.
- Xie, H.; Zheng, R.; Lin, Q. Short-Term Intensive Rainfall Forecasting Model Based on a Hierarchical Dynamic Graph Network. Atmosphere 2022, 13, 703.
- Ozdemir, S. Feature Engineering Bookcamp; Manning Publications: Shelter Island, NY, USA, 2022.
- Hamilton, J.D. Time Series Analysis; Princeton University Press: Princeton, NJ, USA, 1994.
- Dabernig, M. Comparison of different numerical weather prediction models as input for statistical wind power forecasts. Master’s Thesis, University of Innsbruck, Innsbruck, Austria, 2013.
- Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016.
- Cai, R.; Xie, S.; Wang, B.; Yang, R.; Xu, D.; He, Y. Wind speed forecasting based on extreme gradient boosting. IEEE Access 2020, 8, 175063–175069.
- Browell, J.; Gilbert, C.; McMillan, D. Use of turbine-level data for improved wind power forecasting. In Proceedings of the 2017 IEEE Manchester PowerTech, Manchester, UK, 18–22 June 2017; pp. 1–6.
- Gebin, L.G.G.; Salgado, R.M.; Nogueira, D.A. Wind power forecast: Ensemble model based in statistical and machine learning models. Res. Soc. Dev. 2020, 9, e38291211251.
- Bochenek, B.; Jurasz, J.; Jaczewski, A.; Stachura, G.; Sekuła, P.; Strzyżewski, T.; Wdowikowski, M.; Figurski, M. Day-ahead wind power forecasting in Poland based on numerical weather prediction. Energies 2021, 14, 2164.
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
- Shokouhifar, M.; Ranjbarimesan, M. Multivariate time-series blood donation/demand forecasting for resilient supply chain management during COVID-19 pandemic. Clean. Logist. Supply Chain. 2022, 5, 100078.
- AlRassas, A.M.; Al-qaness, M.A.A.; Ewees, A.A.; Ren, S.; Sun, R.; Pan, L.; Abd Elaziz, M. Advance artificial time series forecasting model for oil production using neuro fuzzy-based slime mould algorithm. J. Pet. Explor. Prod. Technol. 2022, 12, 383–395.
- Kader, A.; Izzati, N. A review of long short-term memory approach for time series analysis and forecasting. In Proceedings of the 2nd International Conference on Emerging Technologies and Intelligent System: ICETIS 2022, Virtual, 2–3 September 2022; Springer Nature: Cham, Switzerland, 2023.
- Puspita Sari, A.; Suzuki, H.; Kitajima, T.; Yasuno, T.; Arman Prasetya, D.; Rabi’, A. Deep convolutional long short-term memory for forecasting wind speed and direction. SICE J. Control. Meas. Syst. Integr. 2021, 14, 30–38.
- Moharm, K.; Eltahan, M.; Elsaadany, E. Wind speed forecast using LSTM and BILSTM algorithms over Gabal El-Zayt wind farm. In Proceedings of the 2020 International Conference on Smart Grids and Energy Systems (SGES), Perth, Australia, 23–26 November 2020; pp. 922–927.
- Su, Y.; Yu, J.; Tan, M.; Wu, Z.; Xiao, Z.; Hu, J. A LSTM based wind power forecasting method considering wind frequency components and the wind turbine states. In Proceedings of the 2019 22nd International Conference on Electrical Machines and Systems (ICEMS), Harbin, China, 11–14 August 2019; pp. 1–6.
- Li, G.; Shi, J. On comparing three artificial neural networks for wind speed forecasting. Appl. Energy 2010, 87, 2313–2320.
- González-Sopeña, J.M.; Pakrashi, V.; Ghosh, B. An overview of performance evaluation metrics for short-term statistical wind power forecasting. Renew. Sustain. Energy Rev. 2021, 138, 110515.
- Yang, B.; Zhong, L.; Wang, J.; Shu, H.; Zhang, X.; Yu, T.; Sun, L. State-of-the-art one-stop handbook on wind forecasting technologies: An overview of classifications, methodologies, and analysis. J. Clean. Prod. 2021, 283, 124628.
- China Energy Regulatory Bureau. Implementation Rules for Grid-Connected Operation Management of Wind Farms in North China; No. 381; China Energy Regulatory Bureau: Beijing, China, 2015. (In Chinese)
- Entsoe Enhanced RES Infeed Forecasting-Wind. Available online: https://www.entsoe.eu/Technopedia/techsheets/enhanced-res-infeed-forecasting-wind (accessed on 3 June 2022).
- European Wind Energy Association. Powering Europe: Wind Energy and the Electricity Grid; European Wind Energy Association: Brussels, Belgium, 2010.
- Shcherbakov, M.V.; Brebels, A.; Shcherbakova, N.L.; Tyukov, A.P.; Janovsky, T.A.; Kamaev, V.A.E. A Survey of Forecast Error Measures. World Appl. Sci. J. 2013, 24, 171–176.
- Xiong, X.; Guo, X.; Zeng, P.; Zou, R.; Wang, X. A Short-Term Wind Power Forecast Method via XGBoost Hyper-Parameters Optimization. Front. Energy Res. 2022, 10, 574.
- Liu, Y.; Zhang, S.; Zhang, J.; Tang, L.; Bai, Y. Assessment and comparison of six machine learning models in estimating evapotranspiration over croplands using remote sensing and meteorological factors. Remote Sens. 2021, 13, 3838.
- Du, Q.; Yin, F.; Li, Z. Base station traffic prediction using XGBoost-LSTM with feature enhancement. IET Netw. 2020, 9, 29–37.
Purpose | Data Preprocessing Method | Occasion of Use and Recent Research Applications |
---|---|---|
Data organization | Data sampling | |
Data organization | Data division/Data splitting | |
Data organization | Data standardization/Data normalization | |
Data organization | Data clustering | |
Data organization | Data decomposition | |
Data organization | Data augmentation | |
Data cleaning | Data correction/denoising | |
Data cleaning | Data filtering | |
Dimensionality reduction | Feature selection | |
Dimensionality reduction | Feature extraction | |
Keywords | Number of Results 1 | Topic |
---|---|---|
Data preprocessing wind forecast or data processing wind forecast | 7 | All of the seven studies proposed a specific combination of hybrid algorithms. In the data preprocessing part, the authors in [39] proposed singular spectrum analysis (SSA) for data denoising and variational mode decomposition (VMD) as the data decomposition method. The authors in [40] proposed VMD as the data decomposition method and PSR as the feature extraction method. The authors in [41] proposed complete ensemble empirical mode decomposition (CEEMD) as the data decomposition method. The authors in [42] proposed a complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) as the data decomposition method. The authors in [43] proposed empirical wavelet transform (EWT) as the data decomposition method. The authors in [44] proposed combining CEEMD and VMD as the data decomposition method. The authors in [45] proposed combining ICEEMDAN and fuzzy time series as the data decomposition and feature extraction methods. |
Data preprocessing wind forecast or data processing wind forecast | 1 | A review paper of the data processing methods used in wind energy forecasting models [7]. |
Abbreviation | Description of the Weather Prediction Sources |
---|---|
IBM | Global high-resolution atmospheric forecasting by IBM Weather Operations Center |
GFS | Global forecast system operated by the United States’ National Weather Service |
ECMWF | European Centre for Medium-Range Weather Forecasts |
CMC | Canadian Meteorological Centre |
DWD | Deutscher Wetterdienst |
Wind Farm | № 1 | № 2 | № 3 |
---|---|---|---|
Number of turbines | 45 | 51 | 23 |
Dataset start time | 2019/4/26 00:00:00 | 2019/4/26 00:00:00 | 2019/4/26 00:00:00 |
Dataset end time | 2020/5/28 23:00:00 | 2020/5/28 23:00:00 | 2020/5/28 23:00:00 |
Wind Farm | XGBoost ACC without DE ¹ | XGBoost ACC with DE | XGBoost ACC△% | XGBoost NRMSE△% | LSTM ACC without DE | LSTM ACC with DE | LSTM ACC△% | LSTM NRMSE△% |
---|---|---|---|---|---|---|---|---|
№ 1 | 69.1% | 72.6% | 5.0% | 11.2% | 62.0% | 72.3% | 16.7% | 27.2% |
№ 2 | 72.1% | 77.2% | 7.2% | 18.5% | 63.0% | 76.7% | 21.6% | 36.9% |
№ 3 | 63.4% | 73.4% | 15.9% | 27.5% | 61.0% | 71.6% | 17.55% | 27.3% |
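The relative changes in the accuracy table above can be approximately reproduced from the two accuracy columns. The sketch below assumes that accuracy is defined as ACC = 1 − NRMSE; this relation is an assumption made here because it is consistent with the tabulated values, and it is not stated in the surviving text.

```python
from typing import Tuple


def relative_changes(acc_without: float, acc_with: float) -> Tuple[float, float]:
    """Return (relative ACC increase in %, relative NRMSE reduction in %).

    Assumes ACC = 1 - NRMSE, with both accuracies given as fractions.
    """
    acc_gain = (acc_with - acc_without) / acc_without * 100
    nrmse_without = 1 - acc_without
    nrmse_with = 1 - acc_with
    nrmse_reduction = (nrmse_without - nrmse_with) / nrmse_without * 100
    return acc_gain, nrmse_reduction


# XGBoost, wind farm № 1: accuracy 69.1% without and 72.6% with data enrichment.
acc_gain, nrmse_cut = relative_changes(0.691, 0.726)
print(f"ACC +{acc_gain:.1f}%, NRMSE -{nrmse_cut:.1f}%")
```

For this row the sketch gives about 5.1% and 11.3%, close to the tabulated 5.0% and 11.2%; the small differences presumably arise because the table was computed from unrounded accuracies.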
Wind Farm | XGBoost Penalty without DE | XGBoost Penalty with DE | XGBoost Penalty△ | LSTM Penalty without DE | LSTM Penalty with DE | LSTM Penalty△ |
---|---|---|---|---|---|---|
№ 1 | 6.4% | 5.0% | 1.4% | 9.2% | 5.1% | 4.1% |
№ 2 | 5.2% | 3.1% | 2.1% | 8.8% | 3.3% | 5.5% |
№ 3 | 8.8% | 4.6% | 4.0% | 9.6% | 5.4% | 4.3% |
Steps | XGBoost for Wind Farm № 1 | LSTM for Wind Farm № 1 |
---|---|---|
Without data enrichment (baseline) | 69.1% | 62.0% |
With data enrichment: add error features of weather prediction sources | 69.1% | 68.9% |
With data enrichment: add features of neighboring nodes | 69.3% | 69.0% |
With data enrichment: add time series features of weather prediction sources | 71.2% | 69.1% |
With data enrichment: add complementary weather prediction sources | 72.6% | 72.3% |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhou, Y.; Ma, L.; Ni, W.; Yu, C. Data Enrichment as a Method of Data Preprocessing to Enhance Short-Term Wind Power Forecasting. Energies 2023, 16, 2094. https://doi.org/10.3390/en16052094