Article

Forecasting Wind–Photovoltaic Energy Production and Income with Traditional and ML Techniques

by Giovanni Masala 1,* and Amelie Schischke 2
1 Department of Economics and Business Sciences, University of Cagliari, 09123 Cagliari, Italy
2 Institute of Materials Resource Management, University of Augsburg, Universitätsstraße 2, 86159 Augsburg, Bavaria, Germany
* Author to whom correspondence should be addressed.
Econometrics 2024, 12(4), 34; https://doi.org/10.3390/econometrics12040034
Submission received: 20 September 2024 / Revised: 29 October 2024 / Accepted: 5 November 2024 / Published: 12 November 2024

Abstract

Hybrid production plants harness diverse climatic sources for electricity generation, playing a crucial role in the transition to renewable energies. This study aims to forecast the profitability of a combined wind–photovoltaic energy system. Here, we develop a model that integrates predicted spot prices and electricity output forecasts, incorporating relevant climatic variables to enhance accuracy. The jointly modeled climatic variables and the spot price constitute one of the innovative aspects of this work. Regarding practical application, we considered a hypothetical wind–photovoltaic plant located in Italy and used the relevant climate series to determine the quantity of energy produced. We forecast the quantity of energy as well as income through machine learning techniques and more traditional statistical and econometric models. We evaluate the results by splitting the dataset into estimation windows and test windows, and using a backtesting technique. In particular, we found evidence that ML regression techniques outperform results obtained with traditional econometric models. Regarding the models used to achieve this goal, the objective is not to propose original models but to verify the effectiveness of the most recent machine learning models for this important application, and to compare them with more classic linear regression techniques.

1. Introduction

The energy transition and the consequent increase in the production of energy from renewable sources is currently of considerable importance and is destined to develop significantly in the near future. Leaving aside regulatory aspects that are outside the scope of our work, we will deal with forecasting energy from renewable sources, both in terms of the quantity produced and profitability, through quantitative techniques. The introduction of renewable energy into production processes also has a significant impact on the determination of electricity prices (see Tselika 2022; Macedo et al. 2020).
Another important driver is the 2015 Paris Agreement, adopted under the “United Nations Framework Convention on Climate Change” (UNFCCC 2015) to reduce greenhouse gas (GHG) emissions and therefore mitigate the “greenhouse effect”. The main goal of this study is to forecast the energy production as well as the profitability of a hybrid wind/PV system. To this end, we apply machine learning techniques and compare them with more classic regression techniques, modeling both the climate series involved in the production process of renewable energy sources (RES) and the price of electricity.
The development of wind/PV energies has recently been addressed in the specialized literature under various aspects. To model and perform forecasting analyses, we can, on the one hand, consider the climatic factors involved in the process and exploit the characteristics of wind/photovoltaic plants to determine the energy produced. On the other hand, we can also exploit data on the empirical production of existing plants. In the latter case, the availability of data is more problematic. Therefore, we will use the first path.
In the literature, previous studies focus on the impact of wind speed, solar radiation, or temperature on either the electricity price or electricity production.
Regarding wind speed, the following key studies can be highlighted in the existing literature. Caporin and Preś (2012) show that the ARFIMA-FIGARCH process models wind speed efficiently. Chang (2011) uses Weibull distributions, a typical characteristic of wind speed distributions. D’Amico et al. (2015a) model wind intensity with innovative indexed semi-Markov chains. Sim et al. (2019) first apply a transformation of the data and then an ARIMA process. Wind power depends essentially on wind speed, and we deduce the energy produced with the power curve associated with the turbines.
Regarding solar radiation, Saoud et al. (2018) use a quaternion-valued neural network to forecast with a short-term horizon. As examples of the application of neural networks to model photovoltaic power production, we list Monteiro et al. (2017), Yousif et al. (2017), and Graditi et al. (2016). Furthermore, to model photovoltaic power production, Benth and Ibrahim (2017) use a continuous-time process while Lingohr and Müller (2019) use a non-linear continuous-time AR. The efficiency of a PV panel has been studied by several authors. The module temperature of a photovoltaic panel was investigated by Faiman (2008). Finally, we refer to the works of Huld et al. (2011), Koehl et al. (2011), and Urraca et al. (2018) concerning the features of the PV panels used in our study.
Finally, some references regarding temperature modeling include Benth and Benth (2011), Huang et al. (2018), Lee and Craine (2012), Türkvatan et al. (2020), and Zapranis and Alexandridis (2011).
The second essential point of our analysis concerns the economic profitability of the energy produced. We model the electricity spot price on the day-ahead market and compare several forecasting techniques. In addition to classic econometric models, we apply more advanced ML models.
Models and forecasts of electricity prices are widely present in the literature. Detailed contributions can be found in Weron (2014) and Nowotarski and Weron (2018) with associated bibliographies. In addition, Giordano and Morale (2021) use a fractional Brownian–Hawkes model for the Italian market. However, the profitability of RES is only sparsely covered in the literature. We list an application of Benth et al. (2018) for wind farms by using Ornstein–Uhlenbeck processes for wind intensity. Casula et al. (2020) use a VAR process to model wind intensity and electricity price. It is evident that the relationship between climatic variables and electricity prices is of primary importance. In the same vein, Matsumoto and Endo (2021) forecast the electricity price from forecasts of climatic variables.
Finally, regarding hybrid plants and their profitability, we list the following contributions. Cucchiella et al. (2017) analyze the profitability of RES investments in the Italian context. deLlano-Paz et al. (2017) propose a literature review oriented toward portfolio theory applied to power production. Neto et al. (2017) investigate portfolio theory with several RES assets oriented to the Brazilian market. Furthermore, Lucheroni and Mari (2017, 2018) study energy portfolios by including the minimization of the Levelized Cost of Energy. Li et al. (2017) study the influence of RES production in the Chinese market. Yang et al. (2016) also investigate the impact of RES on electricity production. Mahesh and Sandhu (2015) investigate the positive effects of a mixed wind–PV system production. Carpio (2021) addresses the issue of production discontinuity of photovoltaic power. This work is also a continuation of Casula et al. (2022) regarding the modeling of the production of a hybrid plant with classical econometric techniques.
Overall, previous studies focus on the impact of wind speed, solar radiation, or temperature on electricity price or production. In contrast, we analyze the profitability of a hybrid system, combining models for energy production, using climate variables as well as models for the spot price of electricity. Specifically, here are the main tasks we are investigating.
(1) We aim to forecast the main climatic variables essential to determine the production of wind/photovoltaic energy (wind speed, solar radiation, and temperature). In parallel, we want to predict the spot price of electricity. To achieve this goal, we apply both a classical and an advanced regression model using machine learning techniques. The decision variables are described in the next section.
(2) Once the climatic variables are known, we consider, as a numerical application, hybrid wind/photovoltaic plants located in Sardinia (Italy) with given technical characteristics. We are then able to deduce the overall power produced by the plant with hourly granularity. Furthermore, knowledge of the electricity price allows us to deduce the profitability of the plant in a given time horizon.
(3) To verify the reliability of the results with respect to the empirical values (both regarding energy production and income generated), we apply a backtesting technique. The model parameters are estimated in a given “estimation window” while the simulated values concern a subsequent “test window” in which the traditional risk measures (MAPE, SMAPE, MAE, RMSE) are determined with respect to the empirical values. Finally, using a rolling window technique, we obtain a succession of risk measures, which allows us to verify the validity of each model and to compare the performance of traditional statistical models with that of ML models.
The main innovative aspect of our work is to predict the profitability of a wind–photovoltaic system including climatic variables as well as electricity prices. To our knowledge, this aspect has not yet been addressed in the literature and therefore allows us to cover an important gap in the existing literature. Regarding the models used to achieve the goal, the objective is not to propose original models but to verify the effectiveness of the most recent machine learning models and compare them with more classic linear regression techniques. Another interesting aspect that we propose is to test the correctness of the results by varying the width of both the estimation windows and the test windows used in the backtesting technique, for which we propose a comparative analysis. Through a practical application exploiting different locations, we also tested the robustness of the results using an estimation of the most traditional risk measures. Lastly, we highlighted that neural networks for regression constitute the most efficient model, among those tested, for forecasting both the total energy produced and the income of a hybrid wind–photovoltaic system.
The paper continues as follows. In Section 2 we describe the data, the production of RES energy, and the expected income. Section 3 discusses the models concerning the stochastic variables. Then, in Section 4, we present the findings of the described models. Section 5 is the conclusion.

2. Data and RES Production

In this Section, we analyze the variables involved. Then, we determine the solar energy produced by a photovoltaic panel and the wind power produced by a wind turbine with known features and location. Finally, we deduce the profit obtained by the total production, taking into account electricity prices.

2.1. Dataset Characteristics

We retrieve from NASA’s MERRA-2 project (see Note 1) the time series of the climatic variables (wind speed, solar radiation, and temperature with hourly granularity). The MERRA-2 site allows us to download a large amount of climate data, identified with certain codes, by entering the desired geographical coordinates. We have selected three locations in Sardinia (Italy) with the following geographical coordinates: location 1 (Medio Campidano, 39°35′ N, 8°40′ E), location 2 (Macchiareddu, 39°12′ N, 8°58′ E), and location 3 (Fiume Santo, 40°50′ N, 8°17′ E). In these locations (see Figure 1), energy production from renewable sources is currently being developed.
The primary purpose is to work with real data; the specific location is irrelevant to the proposed modeling. Note that solar radiation is entered in MERRA-2 with the SWGDN code (surface_incoming_shortwave_flux). The wind components are cataloged with codes U50M (“50-m eastward wind”) and V50M (“50-m northward wind”) from which we can deduce wind speed and direction. Furthermore, the wind intensity at a fixed height can be deduced through the following formula (see, e.g., D’Amico et al. 2015b):
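$$v_h = v_{h_0} \left( \frac{h}{h_0} \right)^{\vartheta} \quad \text{with} \quad \vartheta = \left[ \ln \left( \frac{h}{z_0} \right) \right]^{-1} \tag{1}$$
where $v_h$ is the wind intensity at the height $h$ of the turbine, $v_{h_0}$ represents the value of the wind intensity at height $h_0$ (here, $h_0 = 50$ m), and $z_0$ represents the characteristics of the location’s morphology.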
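As a minimal sketch of how Equation (1) might be applied to the MERRA-2 components, the snippet below first combines U50M and V50M into a horizontal wind speed and then rescales it to hub height. The function names, the hub height of 80 m, and the value z0 = 0.1 m are illustrative assumptions; none of them is taken from the article.

```python
import math


def wind_speed_50m(u50m: float, v50m: float) -> float:
    """Horizontal wind speed (m/s) from eastward (U50M) and northward (V50M) components."""
    return math.hypot(u50m, v50m)


def wind_speed_at_height(v_ref: float, h: float, z0: float, h_ref: float = 50.0) -> float:
    """Equation (1): v_h = v_ref * (h / h_ref) ** theta with theta = 1 / ln(h / z0)."""
    if z0 <= 0 or h <= z0:
        raise ValueError("requires 0 < z0 < h so that ln(h / z0) is positive")
    theta = 1.0 / math.log(h / z0)
    return v_ref * (h / h_ref) ** theta


# Hypothetical inputs: hub height and z0 are placeholders, not values from the article.
v50 = wind_speed_50m(3.0, 4.0)
v_hub = wind_speed_at_height(v50, h=80.0, z0=0.1)
```

When the hub height equals the reference height of 50 m, the scaling factor is 1 and the input speed is returned unchanged, which is a quick sanity check on the implementation.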
The electricity spot price dataset (and load values) from the day-ahead auction market can be retrieved from the “Gestore Mercati Energetici” website (see Note 2). We used the PUN price (“National Single Price”). The data range from 1 January 2005 to 31 December 2019 (131,400 records); some statistics are in Table 1a,b.
We highlight the high kurtosis of electricity prices, which is due to their numerous price peaks.

2.2. RES Production

We show how to estimate the power produced by a wind turbine and a photovoltaic panel starting from the climatic sources and the features of the plant.
Regarding the wind energy, we consider turbines with the following features:
Rated power 2 MW; cut-in wind intensity 4 m/s; rated wind intensity 13 m/s; cut-out wind intensity 25 m/s (see Casula et al. 2020). To convert the wind intensity $x$ into power, we apply the specific power curve $\Psi(x)$ of the turbine:
$$\Psi(x) = \begin{cases} 0 & \text{if } 0 < x < 4 \\ 21.78\,x^2 - 147.96\,x + 243.42 & \text{if } 4 \le x \le 13 \\ 2000 & \text{if } 13 < x \le 25 \\ 0 & \text{if } x > 25 \end{cases} \tag{2}$$
where $x$ is expressed in m/s and $\Psi(x)$ is expressed in kWh. The power curve equation can be deduced from empirical data and follows the Betz law (see Figure 2).
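The piecewise power curve can be written directly as a small function. The sketch below uses the coefficients and thresholds stated in the text; the function name and the treatment of a zero wind speed (mapped to zero output) are assumptions made for illustration.

```python
def power_curve(x: float) -> float:
    """Hourly energy (kWh) of the 2 MW turbine for wind intensity x (m/s), Equation (2)."""
    if x < 0:
        raise ValueError("wind intensity must be non-negative")
    if x < 4.0 or x > 25.0:
        # Below cut-in or above cut-out: the turbine does not produce.
        return 0.0
    if x <= 13.0:
        # Quadratic ramp between cut-in and rated wind intensity.
        return 21.78 * x**2 - 147.96 * x + 243.42
    # Between rated and cut-out wind intensity: rated output.
    return 2000.0


hourly_wind = [2.5, 6.0, 13.0, 18.0, 27.0]  # hypothetical hourly wind intensities
hourly_energy = [power_curve(v) for v in hourly_wind]
```

The quadratic branch evaluates to roughly 0 at the cut-in intensity of 4 m/s and to roughly 2000 at the rated intensity of 13 m/s, so the curve is approximately continuous at both junctions.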
Next, we move on to estimating photovoltaic energy for a panel with fixed features and the necessary climatic factors given. For this purpose, we apply the procedure in Urraca et al. (2018). “Standard Test Conditions” (STC) refers to $T = 25$ °C and solar radiation $1000$ W/m$^2$. The power produced by a photovoltaic panel depends on the in-plane radiation $G_{\mathrm{eff}}$ and the module temperature $T_{\mathrm{mod}}$ according to the Faiman relation (Faiman 2008):
$$T_{\mathrm{mod}} = T_{\mathrm{amb}} + \frac{G_{\mathrm{eff}}}{u_0 + u_1\, WS_{\mathrm{mod}}} \tag{3}$$
where $u_0$ is the impact of the radiation on the module temperature, $u_1$ is the wind cooling effect, $T_{\mathrm{amb}}$ is the ambient temperature, and $WS_{\mathrm{mod}}$ denotes the wind intensity. The solar power obtained in general conditions is as follows (Urraca et al. 2018):
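$$P'_{DC} = G'_{\mathrm{eff}} \left( 1 + k_1 \ln G'_{\mathrm{eff}} + k_2 \ln^2 G'_{\mathrm{eff}} + k_3 T'_{\mathrm{mod}} + k_4 T'_{\mathrm{mod}} \ln G'_{\mathrm{eff}} + k_5 T'_{\mathrm{mod}} \ln^2 G'_{\mathrm{eff}} + k_6 T'^{2}_{\mathrm{mod}} \right) \tag{4}$$
with
$$P'_{DC} = P_{DC} / P_{STC}, \quad P_{STC} = \text{nominal power}, \quad G'_{\mathrm{eff}} = G_{\mathrm{eff}} / G_{STC}, \quad G_{STC} = 1000\ \mathrm{W/m^2}, \quad T'_{\mathrm{mod}} = T_{\mathrm{mod}} - T_{STC}, \quad T_{STC} = 25\ {}^{\circ}\mathrm{C}$$
so that $P_{DC}$ is the energy produced (“DC” stands for direct current).
The parameters $k_1, \ldots, k_6$ characterize the type of panels (see Huld et al. 2011).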
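The two photovoltaic relations can be chained as in the sketch below: the Faiman relation gives the module temperature, which then enters the normalized power model. The coefficient values for u0, u1 and k1 to k6 are hypothetical placeholders chosen only to make the example run; the article does not report the values it uses. Returning zero when there is no radiation and clamping negative outputs to zero are additional assumptions of this sketch.

```python
import math

G_STC = 1000.0  # W/m^2, radiation at Standard Test Conditions
T_STC = 25.0    # deg C, module temperature at Standard Test Conditions


def module_temperature(t_amb: float, g_eff: float, ws_mod: float, u0: float, u1: float) -> float:
    """Faiman relation, Equation (3): T_mod = T_amb + G_eff / (u0 + u1 * WS_mod)."""
    return t_amb + g_eff / (u0 + u1 * ws_mod)


def pv_power(g_eff: float, t_mod: float, p_stc: float, k: tuple) -> float:
    """Equation (4), rescaled by the nominal power: P_DC = P_STC * P'_DC."""
    if g_eff <= 0.0:
        return 0.0  # assumption: no production without radiation (log undefined at 0)
    g = g_eff / G_STC          # normalized radiation G'_eff
    t = t_mod - T_STC          # temperature deviation T'_mod
    ln_g = math.log(g)
    relative = g * (1.0 + k[0] * ln_g + k[1] * ln_g**2 + k[2] * t
                    + k[3] * t * ln_g + k[4] * t * ln_g**2 + k[5] * t**2)
    return max(0.0, p_stc * relative)  # assumption: negative values are clamped to zero


# Hypothetical placeholder coefficients, NOT the values used in the article.
U0, U1 = 25.0, 6.84
K = (-0.017, -0.040, -0.0047, 0.00015, 0.00017, 0.000005)

t_mod = module_temperature(t_amb=20.0, g_eff=800.0, ws_mod=2.0, u0=U0, u1=U1)
p_dc = pv_power(g_eff=800.0, t_mod=t_mod, p_stc=250.0, k=K)
```

At Standard Test Conditions the logarithm and the temperature deviation both vanish, so the function returns exactly the nominal power regardless of the coefficients; this gives a simple consistency check for any parameter set.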

2.3. Expected Profit

We determine the expected profit obtained from the total electricity production of the hybrid plant in a fixed period by including electricity spot prices. The expected income of the plant from a time $t_0 \ge 0$ up to time $t_0 + \tau$ ($\tau > 0$) is given as:
$$V_{t_0,\, t_0+\tau} = E_{t_0} \left[ \sum_{k=1}^{\tau} \frac{P_{t_0+k}\; z_{t_0+k}}{(1+r)^k} \right] \tag{5}$$
By $r$ we denote the constant risk-free interest rate, $P_{t_0+k}$ the electricity spot price at the time $t_0+k$, and $z_{t_0+k}$ the overall energy produced by the plant at the time $t_0+k$. An estimator of the expected income is:
$$\hat{V}_{t_0,\, t_0+\tau} = \frac{1}{n} \sum_{i=1}^{n} \sum_{k=1}^{\tau} \frac{P^{i}_{t_0+k}\; z^{i}_{t_0+k}}{(1+r)^k} \tag{6}$$
where $P^{i}_{t_0+k}$ represents the simulated price process value at $t_0+k$ for the $i$-th simulated path, $z^{i}_{t_0+k}$ denotes the overall energy process value at $t_0+k$ for the $i$-th simulated path, and $n$ denotes the number of simulated paths. Finally, note that the risk-free rate is constant as we apply a very short test window.
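The estimator in Equation (6) is an average of discounted price-times-energy sums over simulated paths. A minimal sketch follows; the function names and the toy numbers are assumptions, and the rate r is taken to be expressed per time step (per hour for hourly data).

```python
def discounted_income(prices, energy, r):
    """Inner sum of Equation (6) for one path: sum_k P_k * z_k / (1 + r)**k, k = 1..tau."""
    if len(prices) != len(energy):
        raise ValueError("price and energy paths must have the same length")
    return sum(p * z / (1.0 + r) ** k
               for k, (p, z) in enumerate(zip(prices, energy), start=1))


def expected_income(price_paths, energy_paths, r):
    """Equation (6): average of the discounted income over the n simulated paths."""
    n = len(price_paths)
    if n == 0 or n != len(energy_paths):
        raise ValueError("need the same positive number of price and energy paths")
    return sum(discounted_income(p, z, r)
               for p, z in zip(price_paths, energy_paths)) / n


# Toy example with two simulated paths over two hours (hypothetical numbers).
# Units must be made consistent by the caller, e.g. prices in EUR/MWh with energy in MWh.
price_paths = [[50.0, 60.0], [40.0, 70.0]]
energy_paths = [[1.0, 2.0], [1.0, 2.0]]
income = expected_income(price_paths, energy_paths, r=0.0)
```

With a zero rate the discount factors equal one, so the result reduces to the plain average of the per-path revenues, which makes small cases easy to verify by hand.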

3. Mathematical Models

Below we describe the main steps.
(1) We forecast the main variables (climate series and spot electricity prices). We use Formulas (2)–(4) to deduce the wind–photovoltaic production (and consequently the total production) and Formula (5) to determine the income in a given period.
(2) To model the series, we use as exogenous variables the Boolean calendar variables (hour, day, month), a list of public holidays in the case of the price of electricity (hol), the load (consumption) with some lags, and some lags of the variables themselves. The main model involves a machine learning regression model, which is compared with a classic regression model (given in Formula (7)).
$$y(t) = \sum_{i=1}^{24} hour_i(t) + \sum_{i=1}^{7} day_i(t) + \sum_{i=1}^{12} month_i(t) + hol(t) + \sum_{i=1}^{L} y(t-i) + \sum_{i=0}^{3} load(t-i) + load(t-24) + load(t-168) \tag{7}$$
In Equation (7), the variable t is expressed on an hourly basis and belongs to the specific test window. For example, load(t − 24) represents the value of the load at the same time the previous day. Boolean variables are worth 1 if the condition is satisfied, otherwise they take on a value of zero. Finally, the natural number L represents the number of delays for the variable y.
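To make the regressor list of Equation (7) concrete, the sketch below assembles one row of explanatory variables for a given hour. It is an illustration only: the function names, the ordering of the columns, and the way holidays are supplied (a set of dates) are assumptions, and the article performs its estimations in Matlab rather than Python.

```python
from datetime import datetime, timedelta


def one_hot(index: int, size: int) -> list:
    """Boolean (0/1) indicator vector with a single 1 at position index."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def feature_row(t: int, times: list, y: list, load: list, holidays: set, n_lags: int) -> list:
    """Regressors of Equation (7) for hourly index t (requires t >= max(n_lags, 168))."""
    if t < max(n_lags, 168):
        raise ValueError("not enough history for the requested lags")
    ts = times[t]
    row = []
    row += one_hot(ts.hour, 24)          # hour_1 ... hour_24
    row += one_hot(ts.weekday(), 7)      # day_1 ... day_7
    row += one_hot(ts.month - 1, 12)     # month_1 ... month_12
    row.append(1.0 if ts.date() in holidays else 0.0)   # hol(t)
    row += [y[t - i] for i in range(1, n_lags + 1)]     # y(t-1) ... y(t-L)
    row += [load[t - i] for i in range(0, 4)]           # load(t) ... load(t-3)
    row += [load[t - 24], load[t - 168]]                # same hour: previous day, previous week
    return row


# Hypothetical hourly series used only to exercise the function.
start = datetime(2019, 1, 1, 0)
times = [start + timedelta(hours=h) for h in range(200)]
y = [float(h) for h in range(200)]
load = [1000.0 + h for h in range(200)]
holidays = {datetime(2019, 1, 1).date()}
row = feature_row(180, times, y, load, holidays, n_lags=5)
```

Each row has 24 + 7 + 12 + 1 + L + 6 entries. Because every complete group of dummies sums to one, a least-squares fit on these columns would typically need a reference category dropped from some groups, or a solver that tolerates collinear columns.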
Besides the classical regression model, we aim to analyze whether ML techniques improve predictability. Therefore, we evaluate (see Note 3) a wide range of common ML techniques (trained regression ensemble model; regression tree; k-nearest neighbor classification model; and feedforward, fully connected neural network for regression). We normalize the data for a more efficient estimation of the parameters (see Singh and Singh 2020). The optimal choice is the NN for regression, which consists of a feedforward, fully connected neural network for regression. Specifically, the first fully connected layer has a connection from the network input (predictor data), and each subsequent layer has a connection from the previous layer. Next, each fully connected layer multiplies the input by a weight matrix and then adds a bias vector. An activation function follows each fully connected layer except the last (the so-called ReLU activation function is applied to the first fully connected layer). The final fully connected layer gives the network’s output, representing the predicted response values.
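The forward pass described above (weight matrix, bias vector, ReLU after every layer except the last) can be sketched in a few lines, together with a z-score normalization of a predictor column. This is a didactic illustration with hand-picked weights, not the trained network of the article; the choice of z-score normalization and all names are assumptions.

```python
def zscore(column):
    """Normalize a column to zero mean and unit variance; returns (values, mean, std)."""
    n = len(column)
    mean = sum(column) / n
    std = (sum((v - mean) ** 2 for v in column) / n) ** 0.5
    if std == 0.0:
        std = 1.0  # constant column: avoid division by zero
    return [(v - mean) / std for v in column], mean, std


def relu(values):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]


def dense(x, weights, bias):
    """Fully connected layer: weights (one row per output unit) times x, plus bias."""
    return [sum(w * xi for w, xi in zip(w_row, x)) + b
            for w_row, b in zip(weights, bias)]


def forward(x, layers):
    """Feedforward pass; ReLU follows every fully connected layer except the last."""
    for idx, (weights, bias) in enumerate(layers):
        x = dense(x, weights, bias)
        if idx < len(layers) - 1:
            x = relu(x)
    return x


# Hand-picked toy weights: 2 inputs -> 2 hidden units -> 1 output.
layers = [
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),
    ([[1.0, 2.0]], [0.5]),
]
prediction = forward([2.0, 1.0], layers)
normalized, mean, std = zscore([1.0, 2.0, 3.0])
```

In practice the weights and biases would come from training on the estimation window, and the normalization constants computed there would be reused unchanged on the test window.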
(3) To assess the models, a backtesting technique is used (see the scheme in Figure 3). We set an estimation window to determine the model parameters. The series is then forecasted in a subsequent test window. The simulated series is then compared with the relevant empirical series and the usual adequacy measures are applied (MAPE, MAE, and RMSE). Afterward, we apply a rolling window of one week and repeat the procedure until the entire available dataset is covered. We will therefore have a sequence of risk measures that will be useful for establishing the correctness of the model and comparing competing models.
The lengths of the estimation window and the test window are of fundamental importance, and there is no consolidated method to determine their optimal size. The estimation window must be long enough to allow the model parameters to be estimated reliably. A “long” window allows medium/long-term seasonal effects to be taken into account while a “shorter” window may be more suitable for detecting sudden changes. For these reasons, we have chosen estimation windows of 5 years and 1 year. Finally, the size of the test window is linked to the forecaster’s objectives. Here, we have decided to consider a short-term horizon of one day to one week.
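The rolling-window scheme can be sketched as a generator of index ranges: each estimation window is followed by its test window, and both are then shifted forward by a fixed step. The function name and the convention that a year has 8760 hours (consistent with the 131,400 hourly records over 15 years) are assumptions of this sketch.

```python
def rolling_windows(n_obs: int, est_len: int, test_len: int, step: int) -> list:
    """Return (estimation_range, test_range) pairs of indices covering the dataset."""
    if min(n_obs, est_len, test_len, step) <= 0:
        raise ValueError("all arguments must be positive")
    windows = []
    start = 0
    while start + est_len + test_len <= n_obs:
        estimation = range(start, start + est_len)
        test = range(start + est_len, start + est_len + test_len)
        windows.append((estimation, test))
        start += step  # roll both windows forward
    return windows


HOURS_PER_WEEK = 168
HOURS_PER_YEAR = 8760
N_OBS = 131_400  # hourly records from 2005 to 2019, as stated in the text

one_year = rolling_windows(N_OBS, HOURS_PER_YEAR, HOURS_PER_WEEK, HOURS_PER_WEEK)
five_years = rolling_windows(N_OBS, 5 * HOURS_PER_YEAR, HOURS_PER_WEEK, HOURS_PER_WEEK)
```

For each pair, the model would be fitted on the estimation range and scored on the test range. Under these assumptions both configurations yield more than 500 windows with a one-week test window, so a run of 500 backtests fits within the available data.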
Concerning the risk measures, we applied the following definitions:
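$$\mathrm{MAPE} = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{x_i - \hat{x}_i}{x_i} \right|, \qquad \mathrm{SMAPE} = \frac{100}{n} \sum_{i=1}^{n} \frac{\left| x_i - \hat{x}_i \right|}{0.5 \left| x_i \right| + 0.5 \left| \hat{x}_i \right|},$$
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| x_i - \hat{x}_i \right|, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( x_i - \hat{x}_i \right)^2} \tag{8}$$
where $\hat{x}_i$ are the forecasted values and $x_i$ are the real values. In our application, risk measures are defined for each rolling window while forecasting climatic and price series. Next, we take their average value. As for the income and total energy, we estimate a forecasted value $\hat{x}_i$ in each rolling window and then apply the definition of each risk measure.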
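The four risk measures translate directly into code. In the sketch below, MAPE returns None when an actual value is zero, mirroring the "not defined" case that arises for series such as solar radiation; treating a SMAPE term as zero when both the actual and forecasted values are zero is an assumption of this sketch, not something stated in the article.

```python
import math


def mape(actual, forecast):
    """Mean absolute percentage error; None when it is undefined (a zero actual value)."""
    if any(x == 0 for x in actual):
        return None
    return 100.0 / len(actual) * sum(abs((x - f) / x) for x, f in zip(actual, forecast))


def smape(actual, forecast):
    """Symmetric MAPE; a term with x = f = 0 is counted as zero error (assumption)."""
    total = 0.0
    for x, f in zip(actual, forecast):
        denom = 0.5 * abs(x) + 0.5 * abs(f)
        if denom > 0.0:
            total += abs(x - f) / denom
    return 100.0 / len(actual) * total


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(x - f) for x, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(sum((x - f) ** 2 for x, f in zip(actual, forecast)) / len(actual))


# Hypothetical actual and forecasted values for one test window.
actual = [100.0, 200.0]
forecast = [110.0, 180.0]
scores = {
    "MAPE": mape(actual, forecast),
    "SMAPE": smape(actual, forecast),
    "MAE": mae(actual, forecast),
    "RMSE": rmse(actual, forecast),
}
```

Computing these scores once per rolling window and then averaging them reproduces the aggregation described in the text for the climatic and price series.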

4. Results

To better justify the choice of regressors, we apply linear regression to the entire dataset for the four variables used. The results of the correctness of the fit tests are reported in Table 2.
We note, in particular, from the high value of the R-squared that the explanatory variables explain very well the variability of the dependent variable. In addition, the application of the augmented Dickey–Fuller test and the Kwiatkowski, Phillips, Schmidt, and Shin (KPSS) test shows that these series are non-stationary.
Let us now look at an autocorrelation analysis of our series. We report the ACF functions in Figure 4a and the PACF functions in Figure 4b.
We also note that the first lags are certainly significant, which is why we included them in the list of explanatory variables. We also note the presence of seasonality.
In this study, we aim to apply the most suitable machine learning techniques. To do so, we will use a supervised learning approach; specifically, the various regression techniques. Note that ML techniques can handle these types of data. This way, we compare the performance of the traditional linear regression model with several ML techniques, in particular: trained regression ensemble model (hereafter denoted ensemble); regression tree (tree); k-nearest neighbor classification model (k-neigh); and feedforward, fully connected neural network for regression (NN).
For reasons of space, we present only the results for the PUN price and for the two main climatic variables (wind speed and solar radiation) related to location 1 (other results are available upon request). We used an estimation window of one year and a test window of one week (with 500 backtestings). The results of the associated risk measures are highlighted in Table 3.
We see that the ML regression model (NN) produces significantly better results, so from now on we always adopt it and compare it with the classical regression model. In some cases, the MAPE is not defined (this happens when some values are zero, specified in the Tables as ‘ND’). For this reason, we also add the SMAPE risk measure, which does not suffer from this problem. We have also entered the MAE and RMSE in percentage form (relative to the average value) to make comparisons more intuitive. The results of the climatic variables for the other two locations are very similar.
Let us now consider our main analysis. Through the described procedure, we conducted 500 backtesting tests for the three locations. We considered estimation windows of one and five years and test windows of 1, 2, 3, and 7 days. The results for income and total energy are shown in Table 4 and Table 5 (best-fitting results are highlighted in bold). These tables contain the average risk values calculated for all rolling windows. We then compare the income and total energy of the competing models (ML neural network for regression and traditional regression) with the empirical values. The best-performing regressors turned out to be the Boolean regressors representing hours, together with lags from 1 to 5 for all variables. Furthermore, we added load values and their lags for the PUN price and solar radiation. In a preliminary analysis, the calendar variables linked to months and years did not lead to an improvement in the fitting measures (MAE and RMSE), while for PUN price and solar radiation, the inclusion of the load variables actually produced an improvement in these measures. We do not report these results due to space constraints, but they are available upon request.
We have also added the MAE and RMSE in percentage form here to facilitate the comparison between the different situations. The MAPE is always well defined since income and total energy are always far from the zero value. A first element of comparison concerns the width of the estimation window (between one year and five years). The results highlight that the values of the risk measures (for both income and total energy) offer better results, with very few exceptions, in the case of a five-year estimation window. This result is not surprising, since with a greater width of the estimation window, the model is able to more effectively capture the seasonal aspects of the series under consideration (regardless of the three locations). Now, given the same estimation window, we carry out the comparison with respect to the length of the test window. We observe that the values of the risk measures decrease when we go from one day to a week. We can conjecture that in the case of a short test window (one day), the unpredictable variations in our variables are more difficult to grasp (again regardless of the location chosen). Finally, regarding the comparison between the two competing models among the various predicted situations, we have found evidence that the ML neural network for regression (NN) is more reliable in forecasting energy production and the related income with respect to the classical regression model. The results found with distinct locations and different lengths of estimation and test windows are consistent.

5. Discussion

It is crucial to adequately model climate variables when forecasting renewable electricity production. Furthermore, predicting the evolution of electricity prices allows us to determine the profitability of the energy produced. We have shown that applying ML regression models can achieve this goal, with better results than traditional regression models. The practical application was conducted considering three distinct locations, for which we estimated the energy produced and the income using different estimation windows and different test windows. The results proved to be robust concerning these different choices.
Our work has made it possible to fill a gap in the literature with the aim of jointly predicting both the production and profitability of a hybrid wind–photovoltaic system. This type of modeling can clearly be applied to systems with a more complex structure.
We have highlighted the need to use the most suitable models to achieve the intended goal. In this regard, we tested some machine learning models that have undergone enormous development in recent years and compared them with classic linear regression models. The models we used, although not original, allowed us to obtain very reliable results, in the context of an original application.
Finally, we observed the importance of the width of the estimation windows and the test windows. The importance of the width of the estimation windows derives from the need to capture any seasonal factors in the data. There are no strict rules for their determination, therefore we must test different amplitudes to achieve optimal results. The width of the test windows depends on the objectives of the forecasters, so we also have a wide choice here, and we have noticed that the reliability of the results depends on these widths.
The choice of locations and related climate data does not influence the choice of models to use in the slightest. However, we have observed, through the choice of different locations, that the results for traditional risk measures are robust for these choices.
Further developments involve including storage techniques in the model, which is fundamental for aligning energy supply and demand, and is therefore an essential aspect of forming energy prices through the well-known auction mechanisms.
Finally, for further research, we could introduce some derivatives, such as “quanto options”, to cover the market risk and the volumetric risk inherent to energy production from renewable sources.

Author Contributions

Conceptualization, G.M. and A.S.; methodology, G.M. and A.S.; software, G.M.; validation, G.M. and A.S.; formal analysis, G.M. and A.S.; investigation, G.M. and A.S.; resources, G.M. and A.S.; data curation, G.M.; writing—original draft preparation, G.M. and A.S.; writing—review and editing, G.M. and A.S.; visualization, G.M. and A.S.; supervision, G.M. and A.S.; project administration, G.M. and A.S.; funding acquisition, G.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Links to freely accessible data are given in notes 1 and 2.

Conflicts of Interest

The authors declare no conflicts of interest.

Notes

1
https://gmao.gsfc.nasa.gov/reanalysis/MERRA-2 (accessed on 1 September 2024).
2
3
Estimations performed with Matlab R2024b.

References

  1. Benth, Fred Espen, and Jūratė Šaltytė Benth. 2011. Weather Derivatives and Stochastic Modelling of Temperature. International Journal of Stochastic Analysis 2011: 576791. [Google Scholar] [CrossRef]
  2. Benth, Fred Espen, and N. ‘A. Ibrahim. 2017. Stochastic modeling of photovoltaic power generation and electricity prices. Journal of Energy Markets 10: 1–33. [Google Scholar] [CrossRef]
  3. Benth, Fred Espen, Luca Di Persio, and Silvia Lavagnini. 2018. Stochastic Modeling of Wind Derivatives in Energy Markets. Risks 6: 56. [Google Scholar] [CrossRef]
  4. Caporin, Massimiliano, and Juliusz Preś. 2012. Modeling and forecasting wind speed intensity for weather risk management. Computational Statistics and Data Analysis 56: 3459–76. [Google Scholar] [CrossRef]
  5. Carpio, Lucio G. T. 2021. Efficient spatial allocation of solar photovoltaic electric energy generation in different regions of Brazil: A portfolio approach. Energy Sources, Part B: Economics, Planning, and Policy 16: 542–57. [Google Scholar] [CrossRef]
  6. Casula, Laura, Guglielmo D’Amico, Giovanni Masala, and Filippo Petroni. 2020. Performance estimation of a wind farm with a dependence structure between electricity price and wind speed. The World Economy 43: 2803–22. [Google Scholar] [CrossRef]
  7. Casula, Laura, Guglielmo D’Amico, Giovanni Masala, and Filippo Petroni. 2022. A multivariate model for hybrid wind–photovoltaic power production with energy portfolio optimization. Journal of Energy Markets 15: 1–29. [Google Scholar] [CrossRef]
  8. Chang, Tian Pau. 2011. Performance comparison of six numerical methods in estimating Weibull parameters for wind energy application. Applied Energy 88: 272–82. [Google Scholar] [CrossRef]
  9. Cucchiella, Federica, Massimo Gastaldi, and M. Trosini. 2017. Investments and cleaner energy production: A portfolio analysis in the Italian electricity market. Journal of Cleaner Production 142: 121–32. [Google Scholar] [CrossRef]
  10. D’Amico, Guglielmo, Filippo Petroni, and Flavio Prattico. 2015a. Economic performance indicators of wind energy based on wind speed stochastic modeling. Applied Energy 154: 290–97. [Google Scholar] [CrossRef]
  11. D’Amico, Guglielmo, Filippo Petroni, and Flavio Prattico. 2015b. Wind speed prediction for wind farm applications by Extreme Value Theory and Copulas. Journal of Wind Engineering and Industrial Aerodynamics 145: 229–36. [Google Scholar] [CrossRef]
  12. deLlano-Paz, Fernando, Anxo Calvo-Silvosa, Susana Iglesias Antelo, and Isabel Soares. 2017. Energy planning and modern portfolio theory: A review. Renewable and Sustainable Energy Reviews 77: 636–51. [Google Scholar] [CrossRef]
  13. Faiman, David. 2008. Assessing the outdoor operating temperature of photovoltaic modules. Progress in Photovoltaics: Research and Applications 16: 307–15. [Google Scholar] [CrossRef]
  14. Giordano, Luca, and Daniela Morale. 2021. A fractional Brownian–Hawkes model for the Italian electricity spot market: Estimation and forecasting. Journal of Energy Markets 14: 65–109. [Google Scholar] [CrossRef]
  15. Graditi, Giorgio, Sergio Ferlito, and Giovanna Adinolfi. 2016. Comparison of Photovoltaic plant power production prediction methods using a large measured dataset. Renewable Energy 90: 513–19. [Google Scholar] [CrossRef]
  16. Huang, Jr-Wei, Sharon S. Yang, and Chuang-Chang Chang. 2018. Modeling temperature behaviors: Application to weather derivative valuation. Journal of Futures Markets 38: 1152–75. [Google Scholar] [CrossRef]
  17. Huld, Thomas, Gabi Friesen, Artur Skoczek, Robert P. Kenny, Tony Sample, Michael Field, and Ewan D. Dunlop. 2011. A power-rating model for crystalline silicon PV modules. Solar Energy Materials and Solar Cells 95: 3359–69. [Google Scholar] [CrossRef]
  18. Koehl, Michael, Markus Heck, Stefan Wiesmeier, and Jochen Wirth. 2011. Modeling of the nominal operating cell temperature based on outdoor weathering. Solar Energy Materials and Solar Cells 95: 1638–46. [Google Scholar] [CrossRef]
  19. Lee, Jeo, and Robert Craine. 2012. Temperature Modeling and Weather Derivative Pricing. American Journal of Scientific Research 77: 93–109. [Google Scholar]
  20. Li, Yanxia, Wenjia Cai, and Can Wang. 2017. Economic impacts of wind and solar photovoltaic power development in China. Energy Procedia 105: 3440–48. [Google Scholar] [CrossRef]
  21. Lingohr, Daniel, and Gernot Müller. 2019. Stochastic modeling of intraday photovoltaic power generation. Energy Economics 81: 175–86. [Google Scholar] [CrossRef]
  22. Lucheroni, Carlo, and Carlo Mari. 2017. CO2 volatility impact on energy portfolio choice: A fully stochastic LCOE theory analysis. Applied Energy 190: 278–90. [Google Scholar] [CrossRef]
  23. Lucheroni, Carlo, and Carlo Mari. 2018. Risk shaping of optimal electricity portfolios in the stochastic LCOE theory. Computers and Operations Research 96: 374–85. [Google Scholar] [CrossRef]
  24. Macedo, Daniela Pereira, Antònio Cardoso Marques, and Olivier Damette. 2020. The impact of the integration of renewable energy sources in the electricity price formation: Is the Merit-Order Effect occurring in Portugal? Utilities Policy 66: 101080. [Google Scholar] [CrossRef]
  25. Mahesh, Aeidapu, and Kanwarjit Singh Sandhu. 2015. Hybrid wind/photovoltaic energy system developments: Critical review and findings. Renewable and Sustainable Energy Reviews 52: 1135–47. [Google Scholar] [CrossRef]
  26. Matsumoto, Takuji, and Misao Endo. 2021. One-week-ahead electricity price forecasting using weather forecasts, and its application to arbitrage in the forward market: An empirical study of the Japan Electric Power Exchange. Journal of Energy Markets 14: 1–26. [Google Scholar] [CrossRef]
  27. Monteiro, Raul Vitor Arantes, Geraldo Caixeta Guimarães, Fabricio Augusto Matheus Moura, Madeleine Rocio Medrano Castillo Albertini, and Marcelo Keese Albertini. 2017. Estimating photovoltaic power generation: Performance analysis of artificial neural networks, Support Vector Machine and Kalman filter. Electric Power Systems Research 143: 643–56. [Google Scholar] [CrossRef]
  28. Neto, Daywes Pinheiro, Elder Geraldo Domingues, Antònio Paulo Coimbra, Anibal Traça de Almeida, Aylton José Alves, and Wesley Pacheco Calixto. 2017. Portfolio optimization of renewable energy assets: Hydro, wind, and photovoltaic energy in the regulated market in Brazil. Energy Economics 64: 238–50. [Google Scholar] [CrossRef]
  29. Nowotarski, Jakub, and Rafal Weron. 2018. Recent advances in electricity price forecasting: A review of probabilistic forecasting. Renewable and Sustainable Energy Reviews 81: 1548–68. [Google Scholar] [CrossRef]
  30. Saoud, Lyes Saad, Fayçal Rahmoune, Victor Tourtchine, and Kamel Baddari. 2018. A novel method to forecast 24 h of global solar irradiation. Energy Systems 9: 171–93. [Google Scholar] [CrossRef]
  31. Sim, So-Kumnet, Philipp Maass, and Pedro G. Lind. 2019. Wind Speed Modeling by Nested ARIMA Processes. Energies 12: 69. [Google Scholar] [CrossRef]
  32. Singh, Dalwinder, and Birmohan Singh. 2020. Investigating the impact of data normalization on classification performance. Applied Soft Computing Journal 97: 105524. [Google Scholar] [CrossRef]
  33. Tselika, Kyriaki. 2022. The impact of variable renewables on the distribution of hourly electricity prices and their variability: A panel approach. Energy Economics 113: 106194. [Google Scholar] [CrossRef]
  34. Türkvatan, Aysun, Hayfavi Azize, and Tolga Omay. 2020. A regime switching model for temperature modeling and applications to weather derivatives pricing. Mathematics and Financial Economics 14: 1–42. [Google Scholar] [CrossRef]
  35. United Nations Framework Convention on Climate Change. 2015. Available online: https://treaties.un.org/pages/ViewDetails.aspx?src=TREATY&mtdsg_no=XXVII-7-d&chapter=27&clang=_en (accessed on 1 September 2024).
  36. Urraca, Ruben, Thomas Huld, Anders Vilhelm Lindfors, Aku Riihelä, Francisco Javier Martinez-de-Pison, and Andres Sanz-Garcia. 2018. Quantifying the amplified bias of PV system simulations due to uncertainties in solar radiation estimates. Solar Energy 176: 663–77. [Google Scholar] [CrossRef]
  37. Weron, Rafal. 2014. Electricity price forecasting: A review of the state-of-the-art with a look into the future. International Journal of Forecasting 30: 1030–81. [Google Scholar] [CrossRef]
  38. Yang, Yingkui, Hans Stubbe Solgaard, and Wolfgang Haider. 2016. Wind, hydro or mixed renewable energy source: Preference for electricity products when the share of renewable energy increases. Energy Policy 97: 521–31. [Google Scholar] [CrossRef]
  39. Yousif, Jabar H., A. Kazem Hussein, and John Boland. 2017. Predictive Models for Photovoltaic Electricity Production in Hot Weather Conditions. Energies 10: 971. [Google Scholar] [CrossRef]
  40. Zapranis, Achilleas D., and Antonios Alexandridis. 2011. Modeling and forecasting cumulative average temperature and heating degree day indices for weather derivative pricing. Neural Computing and Applications 20: 787–801. [Google Scholar] [CrossRef]
Figure 1. Power plant location.
Figure 2. Power curve of the turbine.
Figure 3. Backtesting and rolling windows.
Figure 4. (a) Autocorrelation functions for the main variables. (b) Partial autocorrelation functions for the main variables.
Table 1. (a) Climatic series statistics: wind speed (W, m/s), temperature (T, °C) and solar radiation (R, W/m²) for each location (1, 2, 3). (b) Electricity price (Euro) statistics.

(a)

| | W1 | W2 | W3 | T1 | T2 | T3 | R1 | R2 | R3 |
|---|---|---|---|---|---|---|---|---|---|
| Mean | 5.78 | 6.29 | 6.03 | 16.95 | 17.65 | 17.73 | 206.68 | 207.81 | 201.88 |
| St. Dev. | 3.10 | 3.33 | 3.70 | 7.96 | 6.36 | 4.98 | 288.89 | 289.77 | 282.82 |
| Variance | 9.63 | 11.10 | 13.71 | 63.32 | 40.43 | 24.79 | 83,454.59 | 83,968.91 | 79,988.20 |
| Kurtosis | 0.67 | 0.44 | 0.67 | −0.34 | −0.63 | −1.06 | 0.20 | 0.17 | 0.22 |
| Skew. | 0.79 | 0.69 | 0.94 | 0.45 | 0.33 | 0.16 | 1.22 | 1.21 | 1.23 |
| Min | 0.01 | 0.02 | 0.01 | −3.30 | 1.82 | 4.89 | 0.00 | 0.00 | 0.00 |
| Max | 23.38 | 23.48 | 24.60 | 43.31 | 38.33 | 32.74 | 1025.50 | 1028.00 | 1017.00 |

(b)

| Variable | Mean | St. Dev. | Skewness | Kurtosis | Min. | Max. |
|---|---|---|---|---|---|---|
| Electricity price | 62.98 | 25.32 | 1.32 | 6.58 | 0.00 | 378.47 |
Table 2. Regression adequacy tests.

| Variable | RMSE | Adjusted R-Squared | F-Statistic | p-Value |
|---|---|---|---|---|
| PUN | 7.65 | 0.909 | 2.72 × 10^4 | 0 |
| Wind | 0.502 | 0.974 | 1.75 × 10^5 | 0 |
| Radiation | 34.9 | 0.985 | 1.84 × 10^5 | 0 |
| Temp | 0.523 | 0.996 | 1.08 × 10^6 | 0 |
Table 3. Machine learning risk measures for PUN, wind speed, and solar radiation.

| PUN | MAPE | SMAPE | MAE | RMSE | MAE (%) | RMSE (%) |
|---|---|---|---|---|---|---|
| NN | ND | 7.3703 | 4.8800 | 6.9204 | 7.15% | 10.14% |
| ensemble | ND | ND | 6.2179 | 8.9585 | 9.11% | 13.13% |
| tree | ND | 10.2368 | 7.0059 | 10.3933 | 10.27% | 15.23% |
| k-neigh | ND | 10.6641 | 7.0842 | 10.2674 | 10.38% | 15.05% |

| Solar | MAPE | SMAPE | MAE | RMSE | MAE (%) | RMSE (%) |
|---|---|---|---|---|---|---|
| NN | ND | ND | 12.8729 | 27.7394 | 6.17% | 13.29% |
| ensemble | ND | ND | 17.8562 | 37.0016 | 8.55% | 17.73% |
| tree | ND | ND | 15.9780 | 36.5979 | 7.65% | 17.53% |
| k-neigh | ND | ND | 15.8276 | 36.6132 | 7.58% | 17.54% |

| Wind | MAPE | SMAPE | MAE | RMSE | MAE (%) | RMSE (%) |
|---|---|---|---|---|---|---|
| NN | 9.2830 | 8.0261 | 0.3269 | 0.4782 | 5.61% | 8.21% |
| ensemble | 11.0619 | 9.9290 | 0.4133 | 0.5814 | 7.10% | 9.99% |
| tree | 11.9324 | 10.6167 | 0.4461 | 0.6291 | 7.66% | 10.81% |
| k-neigh | 16.7170 | 14.5373 | 0.6313 | 0.8514 | 10.84% | 14.62% |
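The error measures reported in Table 3 can be illustrated with a short Python sketch. This is a minimal illustration under stated assumptions, not the authors' code: the function name `forecast_errors` and the sample numbers are hypothetical, SMAPE is taken in its common 2|f − a| / (|a| + |f|) form, and the percentage columns are assumed to be MAE and RMSE divided by the mean of the observed series. The sketch returns NaN for MAPE when an observed value is zero, since the ratio is then undefined.

```python
import numpy as np


def forecast_errors(actual, forecast):
    """Return MAE, RMSE, MAPE, SMAPE and mean-normalized MAE/RMSE (percent).

    Assumed definitions (not confirmed by the tables):
      MAPE  = 100 * mean(|f - a| / |a|)
      SMAPE = 100 * mean(2 |f - a| / (|a| + |f|))
      MAE (%) and RMSE (%) = 100 * MAE (or RMSE) / mean(a)
    """
    a = np.asarray(actual, dtype=float)
    f = np.asarray(forecast, dtype=float)
    err = f - a

    mae = float(np.mean(np.abs(err)))
    rmse = float(np.sqrt(np.mean(err ** 2)))

    # MAPE is undefined as soon as one observed value is zero.
    if np.all(a != 0):
        mape = float(100.0 * np.mean(np.abs(err) / np.abs(a)))
    else:
        mape = float("nan")

    # SMAPE is undefined when observation and forecast are both zero.
    denom = np.abs(a) + np.abs(f)
    if np.all(denom != 0):
        smape = float(100.0 * np.mean(2.0 * np.abs(err) / denom))
    else:
        smape = float("nan")

    mean_a = float(np.mean(a))
    return {
        "MAPE": mape,
        "SMAPE": smape,
        "MAE": mae,
        "RMSE": rmse,
        "MAE_pct": 100.0 * mae / mean_a,
        "RMSE_pct": 100.0 * rmse / mean_a,
    }


# Hypothetical observed and forecast values, for illustration only.
metrics = forecast_errors([50.0, 60.0, 70.0], [55.0, 57.0, 70.0])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Normalizing MAE and RMSE by the mean of the observed series gives scale-free figures that remain defined when individual observations are zero, which is one reason such columns are useful alongside MAPE and SMAPE.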
Table 4. (1) 1-year estimation window and 1-day test window (locations 1-2-3). (2) 1-year estimation window and 2-day test window (locations 1-2-3). (3) 1-year estimation window and 3-day test window (locations 1-2-3). (4) 1-year estimation window and 7-day test window (locations 1-2-3). Optimal values of risk measures are highlighted in bold.

(1)

| L. | Income [0,T] | MAPE | SMAPE | MAE | RMSE | MAE (%) | RMSE (%) | Mean |
|---|---|---|---|---|---|---|---|---|
| 1 | ML | 3.3109 | 3.2497 | 71,069.82 | 101,686.89 | 2.83% | 4.05% | 2,512,285.78 |
| 1 | Regression | 3.9733 | 3.9090 | 86,121.34 | 119,288.85 | 3.43% | 4.75% | 2,514,355.22 |
| 1 | Empirical | | | | | | | 2,509,645.36 |
| 2 | ML | 3.1037 | 3.0511 | 75,596.56 | 108,280.89 | 2.76% | 3.95% | 2,748,182.78 |
| 2 | Regression | 3.7547 | 3.7063 | 93,082.66 | 130,889.12 | 3.39% | 4.77% | 2,752,205.15 |
| 2 | Empirical | | | | | | | 2,743,971.79 |
| 3 | ML | 2.8850 | 2.8477 | 68,325.38 | 98,127.25 | 2.69% | 3.87% | 2,541,424.27 |
| 3 | Regression | 3.5649 | 3.5268 | 81,420.64 | 111,648.13 | 3.21% | 4.40% | 2,548,533.23 |
| 3 | Empirical | | | | | | | 2,535,461.42 |

| L. | Total energy [0,T] | MAPE | SMAPE | MAE | RMSE | MAE (%) | RMSE (%) | Mean |
|---|---|---|---|---|---|---|---|---|
| 1 | ML | 2.1197 | 2.0880 | 636.72 | 882.92 | 1.73% | 2.40% | 36,763.80 |
| 1 | Regression | 2.4244 | 2.3754 | 706.60 | 913.24 | 1.92% | 2.48% | 36,781.06 |
| 1 | Empirical | | | | | | | 36,806.92 |
| 2 | ML | 1.8803 | 1.8574 | 632.13 | 892.84 | 1.57% | 2.22% | 40,218.14 |
| 2 | Regression | 2.0953 | 2.0647 | 694.35 | 932.39 | 1.72% | 2.32% | 40,281.14 |
| 2 | Empirical | | | | | | | 40,257.27 |
| 3 | ML | 1.4923 | 1.4816 | 494.56 | 742.43 | 1.32% | 1.99% | 37,324.73 |
| 3 | Regression | 1.8609 | 1.8387 | 563.88 | 732.58 | 1.51% | 1.96% | 37,423.39 |
| 3 | Empirical | | | | | | | 37,377.69 |

(2)

| L. | Income [0,T] | MAPE | SMAPE | MAE | RMSE | MAE (%) | RMSE (%) | Mean |
|---|---|---|---|---|---|---|---|---|
| 1 | ML | 2.7509 | 2.7184 | 131,586.18 | 187,493.84 | 2.46% | 3.51% | 5,340,661.02 |
| 1 | Regression | 3.2106 | 3.1597 | 152,019.89 | 213,138.78 | 2.84% | 3.99% | 5,366,347.48 |
| 1 | Empirical | | | | | | | 5,344,676.63 |
| 2 | ML | 2.6604 | 2.6384 | 143,986.17 | 204,564.73 | 2.47% | 3.51% | 5,826,446.71 |
| 2 | Regression | 3.1377 | 3.0934 | 165,670.00 | 231,909.08 | 2.84% | 3.98% | 5,856,686.19 |
| 2 | Empirical | | | | | | | 5,825,482.99 |
| 3 | ML | 2.5557 | 2.5381 | 128,250.22 | 181,422.73 | 2.40% | 3.39% | 5,350,202.03 |
| 3 | Regression | 3.0248 | 2.9831 | 147,686.81 | 204,431.46 | 2.76% | 3.82% | 5,382,739.87 |
| 3 | Empirical | | | | | | | 5,347,656.44 |

| L. | Total energy [0,T] | MAPE | SMAPE | MAE | RMSE | MAE (%) | RMSE (%) | Mean |
|---|---|---|---|---|---|---|---|---|
| 1 | ML | 1.5004 | 1.4854 | 942.91 | 1244.69 | 1.27% | 1.68% | 74,128.75 |
| 1 | Regression | 1.8691 | 1.8433 | 1142.45 | 1451.66 | 1.54% | 1.96% | 74,199.15 |
| 1 | Empirical | | | | | | | 74,234.88 |
| 2 | ML | 1.3576 | 1.3503 | 951.87 | 1298.07 | 1.17% | 1.60% | 81,199.48 |
| 2 | Regression | 1.7356 | 1.7163 | 1195.98 | 1523.54 | 1.47% | 1.87% | 81,350.57 |
| 2 | Empirical | | | | | | | 81,274.60 |
| 3 | ML | 1.1524 | 1.1475 | 770.09 | 1084.22 | 1.03% | 1.45% | 74,643.10 |
| 3 | Regression | 1.4607 | 1.4461 | 901.09 | 1180.19 | 1.21% | 1.58% | 74,845.80 |
| 3 | Empirical | | | | | | | 74,760.04 |

(3)

| L. | Income [0,T] | MAPE | SMAPE | MAE | RMSE | MAE (%) | RMSE (%) | Mean |
|---|---|---|---|---|---|---|---|---|
| 1 | ML | 2.4325 | 2.4231 | 187,939.55 | 273,544.16 | 2.27% | 3.30% | 8,279,679.35 |
| 1 | Regression | 2.7973 | 2.7611 | 216,991.50 | 304,648.74 | 2.62% | 3.67% | 8,334,649.45 |
| 1 | Empirical | | | | | | | 8,293,667.33 |
| 2 | ML | 2.2879 | 2.2888 | 199,156.87 | 292,839.59 | 2.21% | 3.25% | 8,998,723.06 |
| 2 | Regression | 2.7017 | 2.6688 | 230,484.00 | 327,821.84 | 2.56% | 3.64% | 9,062,070.85 |
| 2 | Empirical | | | | | | | 9,008,161.78 |
| 3 | ML | 2.2864 | 2.2882 | 180,198.51 | 271,369.19 | 2.18% | 3.29% | 8,247,010.73 |
| 3 | Regression | 2.6742 | 2.6422 | 209,401.73 | 295,252.95 | 2.54% | 3.58% | 8,310,164.88 |
| 3 | Empirical | | | | | | | 8,252,275.36 |

| L. | Total energy [0,T] | MAPE | SMAPE | MAE | RMSE | MAE (%) | RMSE (%) | Mean |
|---|---|---|---|---|---|---|---|---|
| 1 | ML | 1.2513 | 1.2369 | 1208.12 | 1556.88 | 1.07% | 1.38% | 112,483.61 |
| 1 | Regression | 1.5429 | 1.5273 | 1479.63 | 1868.63 | 1.31% | 1.66% | 112,613.72 |
| 1 | Empirical | | | | | | | 112,656.34 |
| 2 | ML | 1.1088 | 1.1060 | 1231.63 | 1655.96 | 1.00% | 1.35% | 122,826.47 |
| 2 | Regression | 1.4416 | 1.4295 | 1543.01 | 1941.96 | 1.25% | 1.58% | 123,078.14 |
| 2 | Empirical | | | | | | | 122,962.82 |
| 3 | ML | 0.9874 | 0.9841 | 1007.40 | 1411.65 | 0.89% | 1.25% | 112,564.05 |
| 3 | Regression | 1.2551 | 1.2465 | 1219.61 | 1584.34 | 1.08% | 1.41% | 112,849.13 |
| 3 | Empirical | | | | | | | 112,717.98 |

(4)

| L. | Income [0,T] | MAPE | SMAPE | MAE | RMSE | MAE (%) | RMSE (%) | Mean |
|---|---|---|---|---|---|---|---|---|
| 1 | ML | 1.9483 | 1.9728 | 357,940.06 | 578,554.23 | 1.89% | 3.06% | 18,882,869.19 |
| 1 | Regression | 2.2189 | 2.2007 | 413,048.90 | 601,741.02 | 2.18% | 3.18% | 19,008,567.81 |
| 1 | Empirical | | | | | | | 18,931,019.95 |
| 2 | ML | 1.9112 | 1.9316 | 382,972.19 | 614,764.93 | 1.87% | 3.00% | 20,445,413.32 |
| 2 | Regression | 2.1912 | 2.1716 | 440,970.16 | 630,597.16 | 2.15% | 3.08% | 20,584,935.45 |
| 2 | Empirical | | | | | | | 20,475,215.83 |
| 3 | ML | 1.8820 | 1.9026 | 357,036.96 | 582,253.03 | 1.88% | 3.07% | 18,959,278.21 |
| 3 | Regression | 2.1574 | 2.1385 | 408,559.83 | 584,362.54 | 2.15% | 3.08% | 19,105,834.49 |
| 3 | Empirical | | | | | | | 18,985,906.24 |

| L. | Total energy [0,T] | MAPE | SMAPE | MAE | RMSE | MAE (%) | RMSE (%) | Mean |
|---|---|---|---|---|---|---|---|---|
| 1 | ML | 0.8570 | 0.8559 | 2056.39 | 2603.03 | 0.78% | 0.99% | 262,743.75 |
| 1 | Regression | 1.1156 | 1.1105 | 2648.60 | 3261.37 | 1.01% | 1.24% | 263,012.48 |
| 1 | Empirical | | | | | | | 263,096.01 |
| 2 | ML | 0.8277 | 0.8266 | 2195.44 | 2908.64 | 0.77% | 1.02% | 285,680.44 |
| 2 | Regression | 1.0905 | 1.0846 | 2822.00 | 3448.84 | 0.99% | 1.21% | 286,173.30 |
| 2 | Empirical | | | | | | | 285,836.51 |
| 3 | ML | 0.6729 | 0.6722 | 1686.19 | 2357.08 | 0.64% | 0.89% | 264,648.54 |
| 3 | Regression | 0.9060 | 0.9015 | 2168.18 | 2747.58 | 0.82% | 1.04% | 265,319.11 |
| 3 | Empirical | | | | | | | 264,905.85 |
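The MAE (%) and RMSE (%) columns of Table 4 appear consistent with dividing MAE and RMSE by the empirical mean of the same location. The following sketch checks this reading against the ML income rows of panel (1). It is an arithmetic check under that assumption, not a statement of how the authors computed the columns; the variable names are hypothetical, and the numbers are copied from the table.

```python
# Rows from Table 4, panel (1), Income, ML forecasts:
# (location, MAE, RMSE, empirical mean, reported MAE %, reported RMSE %)
rows = [
    (1, 71069.82, 101686.89, 2509645.36, 2.83, 4.05),
    (2, 75596.56, 108280.89, 2743971.79, 2.76, 3.95),
    (3, 68325.38, 98127.25, 2535461.42, 2.69, 3.87),
]

# Assumption: percentage = 100 * error measure / empirical mean.
checks = []
for loc, mae, rmse, emp_mean, mae_pct_rep, rmse_pct_rep in rows:
    mae_pct = 100.0 * mae / emp_mean
    rmse_pct = 100.0 * rmse / emp_mean
    # Allow for rounding of the reported values to two decimals.
    ok = abs(mae_pct - mae_pct_rep) < 0.01 and abs(rmse_pct - rmse_pct_rep) < 0.01
    checks.append(ok)
    print(f"location {loc}: MAE% = {mae_pct:.3f}, RMSE% = {rmse_pct:.3f}, matches = {ok}")
```

Because the forecast means differ from the empirical means by well under one percent, dividing by the forecast mean would give nearly the same figures, so the check cannot discriminate between the two denominators.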
Table 5. (1) 5-year estimation window and 1-day test window (locations 1-2-3). (2) 5-year estimation window and 2-day test window (locations 1-2-3). (3) 5-year estimation window and 3-day test window (locations 1-2-3). (4) 5-year estimation window and 7-day test window (locations 1-2-3). Optimal values of risk measures are highlighted in bold.

(1)

| L. | Income [0,T] | MAPE | SMAPE | MAE | RMSE | MAE (%) | RMSE (%) | Mean |
|---|---|---|---|---|---|---|---|---|
| 1 | ML | 3.0625 | 3.0254 | 55,760.96 | 78,812.82 | 2.69% | 3.80% | 2,070,321.59 |
| 1 | Regression | 3.9522 | 3.8580 | 75,078.85 | 112,694.68 | 3.62% | 5.43% | 2,094,152.99 |
| 1 | Empirical | | | | | | | 2,074,539.68 |
| 2 | ML | 2.8399 | 2.8084 | 55,121.87 | 75,038.26 | 2.46% | 3.34% | 2,241,656.37 |
| 2 | Regression | 3.7752 | 3.6938 | 76,839.09 | 111,105.24 | 3.42% | 4.95% | 2,266,829.99 |
| 2 | Empirical | | | | | | | 2,244,770.36 |
| 3 | ML | 2.7634 | 2.7198 | 51,464.32 | 72,939.25 | 2.48% | 3.51% | 2,076,542.86 |
| 3 | Regression | 3.5620 | 3.4866 | 70,040.28 | 109,250.16 | 3.37% | 5.26% | 2,100,109.66 |
| 3 | Empirical | | | | | | | 2,075,465.30 |

| L. | Total energy [0,T] | MAPE | SMAPE | MAE | RMSE | MAE (%) | RMSE (%) | Mean |
|---|---|---|---|---|---|---|---|---|
| 1 | ML | 2.1078 | 2.0709 | 681.81 | 1058.93 | 1.78% | 2.76% | 38,281.40 |
| 1 | Regression | 2.4797 | 2.4317 | 765.20 | 1145.92 | 1.99% | 2.99% | 38,330.61 |
| 1 | Empirical | | | | | | | 38,365.07 |
| 2 | ML | 1.8507 | 1.8202 | 611.27 | 831.17 | 1.48% | 2.01% | 41,287.43 |
| 2 | Regression | 2.2052 | 2.1665 | 703.72 | 938.77 | 1.70% | 2.27% | 41,334.16 |
| 2 | Empirical | | | | | | | 41,350.67 |
| 3 | ML | 1.6401 | 1.5933 | 508.71 | 781.57 | 1.33% | 2.04% | 38,266.17 |
| 3 | Regression | 2.0212 | 1.9871 | 604.26 | 822.36 | 1.58% | 2.15% | 38,341.19 |
| 3 | Empirical | | | | | | | 38,234.57 |

(2)

| L. | Income [0,T] | MAPE | SMAPE | MAE | RMSE | MAE (%) | RMSE (%) | Mean |
|---|---|---|---|---|---|---|---|---|
| 1 | ML | 2.6224 | 2.5775 | 97,231.98 | 142,213.57 | 2.35% | 3.44% | 4,150,449.52 |
| 1 | Regression | 3.1723 | 3.1105 | 119,417.33 | 166,145.76 | 2.89% | 4.01% | 4,175,415.98 |
| 1 | Empirical | | | | | | | 4,139,248.26 |
| 2 | ML | 2.4952 | 2.4567 | 100,774.21 | 143,816.28 | 2.24% | 3.20% | 4,511,966.74 |
| 2 | Regression | 3.0662 | 3.0084 | 125,162.22 | 170,032.42 | 2.78% | 3.78% | 4,537,742.89 |
| 2 | Empirical | | | | | | | 4,496,982.08 |
| 3 | ML | 2.3770 | 2.3330 | 89,503.02 | 136,595.59 | 2.16% | 3.30% | 4,165,383.87 |
| 3 | Regression | 3.0070 | 2.9507 | 115,949.98 | 163,313.90 | 2.80% | 3.94% | 4,190,133.79 |
| 3 | Empirical | | | | | | | 4,144,815.88 |

| L. | Total energy [0,T] | MAPE | SMAPE | MAE | RMSE | MAE (%) | RMSE (%) | Mean |
|---|---|---|---|---|---|---|---|---|
| 1 | ML | 1.4328 | 1.4214 | 975.15 | 1415.75 | 1.26% | 1.84% | 77,018.92 |
| 1 | Regression | 1.8159 | 1.7969 | 1185.88 | 1645.39 | 1.54% | 2.13% | 77,122.66 |
| 1 | Empirical | | | | | | | 77,151.17 |
| 2 | ML | 1.3082 | 1.2979 | 934.40 | 1248.05 | 1.12% | 1.50% | 83,301.51 |
| 2 | Regression | 1.6614 | 1.6439 | 1140.21 | 1473.11 | 1.37% | 1.77% | 83,405.57 |
| 2 | Empirical | | | | | | | 83,369.13 |
| 3 | ML | 1.1162 | 1.1069 | 756.91 | 1058.52 | 0.98% | 1.38% | 76,926.69 |
| 3 | Regression | 1.4878 | 1.4730 | 945.36 | 1278.85 | 1.23% | 1.66% | 77,070.94 |
| 3 | Empirical | | | | | | | 76,879.79 |

(3)

| L. | Income [0,T] | MAPE | SMAPE | MAE | RMSE | MAE (%) | RMSE (%) | Mean |
|---|---|---|---|---|---|---|---|---|
| 1 | ML | 2.0196 | 1.9975 | 120,184.52 | 175,930.85 | 1.89% | 2.77% | 6,368,058.64 |
| 1 | Regression | 2.6389 | 2.6030 | 156,777.74 | 212,127.78 | 2.47% | 3.34% | 6,395,295.76 |
| 1 | Empirical | | | | | | | 6,346,629.74 |
| 2 | ML | 1.9381 | 1.9176 | 124,171.38 | 179,420.86 | 1.80% | 2.60% | 6,924,823.12 |
| 2 | Regression | 2.5153 | 2.4797 | 161,734.31 | 219,243.11 | 2.35% | 3.18% | 6,955,129.45 |
| 2 | Empirical | | | | | | | 6,895,378.90 |
| 3 | ML | 1.9106 | 1.8873 | 113,812.15 | 172,416.70 | 1.79% | 2.71% | 6,394,809.44 |
| 3 | Regression | 2.5087 | 2.4715 | 150,574.78 | 208,634.78 | 2.37% | 3.28% | 6,426,759.92 |
| 3 | Empirical | | | | | | | 6,362,097.53 |

| L. | Total energy [0,T] | MAPE | SMAPE | MAE | RMSE | MAE (%) | RMSE (%) | Mean |
|---|---|---|---|---|---|---|---|---|
| 1 | ML | 1.1826 | 1.1767 | 1228.21 | 1724.05 | 1.07% | 1.51% | 114,261.30 |
| 1 | Regression | 1.5378 | 1.5254 | 1520.43 | 2010.98 | 1.33% | 1.76% | 114,392.40 |
| 1 | Empirical | | | | | | | 114,459.96 |
| 2 | ML | 1.0450 | 1.0396 | 1138.12 | 1524.20 | 0.92% | 1.23% | 123,691.17 |
| 2 | Regression | 1.3711 | 1.3594 | 1435.77 | 1837.55 | 1.16% | 1.48% | 123,879.07 |
| 2 | Empirical | | | | | | | 123,772.33 |
| 3 | ML | 0.9682 | 0.9641 | 1004.31 | 1357.56 | 0.88% | 1.19% | 114,403.78 |
| 3 | Regression | 1.2768 | 1.2660 | 1233.14 | 1615.75 | 1.08% | 1.41% | 114,653.30 |
| 3 | Empirical | | | | | | | 114,404.57 |

(4)

| L. | Income [0,T] | MAPE | SMAPE | MAE | RMSE | MAE (%) | RMSE (%) | Mean |
|---|---|---|---|---|---|---|---|---|
| 1 | ML | 1.3830 | 1.3784 | 209,606.65 | 288,837.38 | 1.37% | 1.88% | 15,366,617.29 |
| 1 | Regression | 2.1157 | 2.0893 | 319,380.22 | 439,922.77 | 2.08% | 2.87% | 15,533,379.76 |
| 1 | Empirical | | | | | | | 15,345,187.95 |
| 2 | ML | 1.3291 | 1.3236 | 218,349.90 | 306,154.94 | 1.31% | 1.83% | 16,762,819.14 |
| 2 | Regression | 2.0642 | 2.0384 | 341,080.65 | 467,218.89 | 2.04% | 2.79% | 16,935,591.00 |
| 2 | Empirical | | | | | | | 16,725,320.45 |
| 3 | ML | 1.3356 | 1.3302 | 204,598.50 | 286,718.67 | 1.32% | 1.86% | 15,488,348.00 |
| 3 | Regression | 2.1047 | 2.0772 | 320,221.14 | 437,506.62 | 2.07% | 2.83% | 15,655,172.88 |
| 3 | Empirical | | | | | | | 15,442,105.88 |

| L. | Total energy [0,T] | MAPE | SMAPE | MAE | RMSE | MAE (%) | RMSE (%) | Mean |
|---|---|---|---|---|---|---|---|---|
| 1 | ML | 0.7604 | 0.7599 | 1908.44 | 2520.27 | 0.73% | 0.96% | 261,238.45 |
| 1 | Regression | 1.1228 | 1.1178 | 2667.80 | 3344.85 | 1.02% | 1.28% | 261,655.00 |
| 1 | Empirical | | | | | | | 261,726.66 |
| 2 | ML | 0.6685 | 0.6680 | 1788.09 | 2266.63 | 0.63% | 0.80% | 283,967.78 |
| 2 | Regression | 1.0531 | 1.0478 | 2680.69 | 3304.99 | 0.94% | 1.16% | 284,452.01 |
| 2 | Empirical | | | | | | | 284,180.11 |
| 3 | ML | 0.6265 | 0.6260 | 1597.69 | 2086.60 | 0.61% | 0.79% | 263,106.60 |
| 3 | Regression | 0.9584 | 0.9542 | 2262.80 | 2777.54 | 0.86% | 1.06% | 263,605.40 |
| 3 | Empirical | | | | | | | 263,226.32 |
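The estimation-window and test-window design named in the captions of Tables 4 and 5 can be sketched as a rolling split of the sample. The following is a minimal sketch, assuming hourly observations, 365-day years, and windows that advance by one test window at a time. The function name `rolling_windows` and these choices are illustrative assumptions, not details confirmed by the tables.

```python
def rolling_windows(n_obs, est_len, test_len, step=None):
    """Yield (estimation_range, test_range) index pairs for a rolling backtest.

    n_obs    : total number of observations in the series
    est_len  : length of the estimation window (in observations)
    test_len : length of the test window that follows it
    step     : shift between consecutive windows (assumed: test_len)
    """
    if step is None:
        step = test_len
    start = 0
    # Stop as soon as the test window would run past the end of the sample.
    while start + est_len + test_len <= n_obs:
        est = range(start, start + est_len)
        test = range(start + est_len, start + est_len + test_len)
        yield est, test
        start += step


# Assumed hourly data: 1-year estimation window, 1-day test window.
HOURS_PER_DAY = 24
est_len = 365 * HOURS_PER_DAY
test_len = 1 * HOURS_PER_DAY
n_obs = est_len + 3 * test_len  # hypothetical sample with room for three test days

windows = list(rolling_windows(n_obs, est_len, test_len))
print(len(windows))  # 3
first_est, first_test = windows[0]
print(first_est.start, first_est.stop, first_test.start, first_test.stop)
```

In each window, a model would be fitted on the estimation range and its forecasts compared with the observations in the test range; aggregating the errors over all windows gives measures of the kind reported in the tables.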
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Masala, G.; Schischke, A. Forecasting Wind–Photovoltaic Energy Production and Income with Traditional and ML Techniques. Econometrics 2024, 12, 34. https://doi.org/10.3390/econometrics12040034
