3.2. Data Analysis
The dataset was analyzed once it had been cleaned. First, Figure 5 shows the probability density function (PDF) of the PV energy production data, taking into account only values between sunrise and sunset. For the Australian facilities, the data present a bimodal distribution, with one peak close to zero and another at high PV energy production. For the installations located in Germany, by contrast, the distributions are skewed towards low values, with their highest point at low PV energy production. In both cases, the first peak arises from the hours near sunrise and sunset, when the global horizontal radiation is low.
Figure 6 shows the autocorrelation function (ACF) of PV energy production for two of the time series considered, one for each location. The ACF measures the similarity between observations as a function of the time lag between them. In this case, the results indicate that PV energy production follows a clear periodic pattern with a period of 24 h. The results are equivalent for the rest of the facilities.
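As an illustration of how a 24 h periodicity appears in the ACF, the following minimal sketch computes the sample autocorrelation of a synthetic hourly series. The series (a clipped daily sinusoid), the function name autocorrelation, and the 72 h maximum lag are assumptions made for the example and are not taken from the study data.

```python
import numpy as np

def autocorrelation(series, max_lag):
    """Sample ACF for lags 0..max_lag (biased estimator, normalized by lag 0)."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    n = len(x)
    return np.array([np.dot(x[: n - k], x[k:]) / denom for k in range(max_lag + 1)])

# Synthetic hourly series (illustration only): 30 days of a daily sinusoid,
# clipped at zero to mimic the absence of production at night.
hours = np.arange(24 * 30)
synthetic_pv = np.clip(np.sin(2 * np.pi * (hours - 6) / 24), 0, None)

acf = autocorrelation(synthetic_pv, max_lag=72)

# Lags that are multiples of 24 h should stand out against half-day lags.
for lag in (12, 24, 36, 48):
    print(f"lag {lag:2d} h: ACF = {acf[lag]:+.3f}")
```

For a series with a daily cycle, the ACF is high at lags of 24 h and 48 h and low or negative at 12 h and 36 h, which is the kind of pattern described for Figure 6.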
The Pearson correlation coefficient was also calculated to measure the correlation between the PV energy production and both the meteorological variables and the solar angles (Table 4).
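For reference, the Pearson coefficient between two variables can be computed as in the sketch below. The synthetic irradiation, humidity, and production arrays are hypothetical and only illustrate the contrast between a strongly and a weakly correlated feature; they do not reproduce the values in Table 4.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally sized 1-D arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float(np.dot(xc, yc) / np.sqrt(np.dot(xc, xc) * np.dot(yc, yc)))

# Hypothetical data: production depends linearly on irradiation plus noise,
# while humidity is generated independently of production.
rng = np.random.default_rng(0)
irradiation = rng.uniform(0, 1000, size=500)            # W/m^2 (assumed range)
production = 0.004 * irradiation + rng.normal(0, 0.3, size=500)
humidity = rng.uniform(20, 90, size=500)                # % (assumed range)

r_irr = pearson_r(production, irradiation)
r_hum = pearson_r(production, humidity)
print(f"production vs irradiation: r = {r_irr:.3f}")
print(f"production vs humidity:    r = {r_hum:.3f}")
```

The same value can be obtained with np.corrcoef(x, y)[0, 1]; the explicit version is shown only to make the definition visible.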
The correlation between PV energy production and the different meteorological variables varies between locations, showing that the influence of these features depends on the climatic conditions. The correlation with the horizontal global irradiation is stronger for the facilities in Australia. Among the facilities in Germany, the highest correlation was found for GE_2. The differences in the correlations obtained may be due to several factors. First, the sky conditions in Konstanz are much more diverse than those in Alice Springs, which affects the relation between horizontal irradiation and plane-of-array (POA) irradiation [63] and, therefore, energy production. Second, the GE_1 and GE_3 installations are not oriented towards the south but slightly towards the southwest, which produces a certain lag between the points of maximum generation and maximum irradiation, as shown in Figure 7. Third, the tilt of the systems is not the same: the lower the tilt, the greater the correlation. Finally, the facilities in Germany are located on roofs in urbanized areas, which results in a more obstructed horizon that also decreases the correlation.
Finally, Figure 8 represents the relation between the PV energy production and the horizontal global irradiation for the different facilities, highlighting the differences between the months of the year.
3.4. Forecasting Results
Using the selected variables, the hourly day-ahead PV power generation was predicted with different models for the two locations. The hyperparameters of MLP, LSTM, and XGBoost were set based on the best models in [28,48,65]. For TFT, the hyperparameters were tuned with NNI, and the best performance was obtained for the values in Table 5.
Figure 9 shows the results for different combinations of hyperparameters.
The variables related to the complexity of the neural-network-based models are shown in Table 6. TFT is the most complex model with respect to all three variables (number of parameters, model size, and training time per epoch), showing increases of 82.6% in size and 86.6% in time complexity compared to LSTM.
The forecasting errors of the different models are shown in Table 7. As can be seen, TFT outperforms the other models, with the lowest values for all the indicators in both locations. Comparing TFT with LSTM, the second-best model, the RMSE, MAE, and MASE for Australia are 46%, 48%, and 48% lower, respectively. The same behavior can be observed for Germany, with reductions of 20%, 26%, and 54% in RMSE, MAE, and MASE for TFT with respect to LSTM. These results may be explained by the different components integrated in TFT. The algorithm incorporates a temporal self-attention decoder to capture long-term dependencies, allowing it to learn relationships at different scales. It also handles each input type efficiently: static or time-invariant features, past-observed time-varying inputs, and future-known time-varying inputs. It has also been found to outperform other prediction models in other areas, such as the forecasting of wind speed or freeway traffic speed [66,67].
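The error indicators and the percentage reductions quoted above can be sketched as follows. The exact MASE scaling used in the study is not stated in this section; the version below assumes the common definition that scales the MAE by the in-sample MAE of a seasonal naive forecast, with an assumed season of 24 h. All function names are illustrative.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error."""
    err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(err ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error."""
    err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(err)))

def mase(y_true, y_pred, y_train, season=24):
    """MAE scaled by the in-sample MAE of a seasonal naive forecast.

    Assumption: the naive forecast repeats the value observed `season`
    steps earlier (24 h for hourly data with a daily cycle).
    """
    y_train = np.asarray(y_train, dtype=float)
    naive_mae = np.mean(np.abs(y_train[season:] - y_train[:-season]))
    return mae(y_true, y_pred) / float(naive_mae)

def relative_reduction(candidate, baseline):
    """Percentage by which the candidate error is lower than the baseline."""
    return 100.0 * (baseline - candidate) / baseline

# Hypothetical errors: a model with RMSE 0.54 against a baseline with RMSE 1.00
# corresponds to a 46% reduction.
print(round(relative_reduction(0.54, 1.00), 1))
```

Because MASE is a ratio of two errors in the same units, it can be compared across facilities of different capacity, which is not the case for RMSE or MAE.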
LSTM showed good accuracy and better performance than XGBoost, MLP, and ARIMA. This behavior was also found in [47,48]. Unlike XGBoost and MLP, both LSTM and TFT can retain previous information and learn temporal correlations between consecutive data points, which gives these models better performance. In addition, whereas the ARIMA model assumes only a linear relationship between the exogenous variables and the target variable, both LSTM and TFT can capture non-linear relationships. These results therefore indicate that both mechanisms are important for forecasting day-ahead solar production.
The performance of TFT on each facility was compared using MASE and R², the two metrics that are scale invariant. The results for Germany are always worse than those for Australia, with a significant reduction in accuracy for both metrics. The forecasting errors with TFT for the different indicators and the different series considered are shown in Table 8.
The highest accuracy in forecasting the day-ahead PV power generation is always achieved in AU_1, although the performance of TFT is very similar for the three Australian PV plants. In this case, the coefficient of variation, which expresses the variability relative to the mean, is 3.73% and 0.05% for MASE and R², respectively. For Germany, the coefficients of variation are 27.62% and 0.29% for MASE and R², showing considerable variability in the performance of the model across the German facilities.
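The coefficient of variation is the standard deviation divided by the mean, expressed as a percentage. The sketch below shows the calculation for a set of per-facility scores. The input values are hypothetical, and whether the study used the sample or the population standard deviation is not stated, so the ddof parameter is left explicit as an assumption.

```python
import numpy as np

def coefficient_of_variation(values, ddof=1):
    """Standard deviation relative to the mean, in percent.

    ddof=1 uses the sample standard deviation and ddof=0 the population one;
    which of the two applies to the reported figures is an assumption.
    """
    values = np.asarray(values, dtype=float)
    return float(100.0 * values.std(ddof=ddof) / values.mean())

# Hypothetical MASE values for three facilities (not taken from Table 8).
example_mase = [0.30, 0.31, 0.32]
print(f"CV = {coefficient_of_variation(example_mase):.2f}%")
```

A low coefficient of variation indicates that the model performs similarly across the facilities of a location, whereas a high one indicates facility-dependent performance.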
The forecast accuracy in the German facilities is still not as favorable as in the Australian facilities; this may be due to the greater variability in the weather conditions experienced by the solar facilities in Konstanz.
However, the forecast in the facilities in Germany has improved drastically, especially in GE_1 and GE_3. As mentioned, the other factors that could be weakening the Pearson correlation between horizontal irradiation and energy production (the different orientation and tilt of the panels and the more obstructed horizon) may have been compensated for by including in the model the solar angles (zenith and azimuth), the meteorological variables (temperature and humidity), and the calendar data. The solar angles help to predict the relationship between the horizontal irradiation and the irradiation in the plane of the panels [68], which is reinforced by the meteorological variables [63]; these also influence other factors, such as the efficiency of the panels [69].
Figure 10 and Figure 11 show the TFT-based PV power generation forecasts and the real solar energy production for representative rainy and sunny days in both locations. The graphs represent the global horizontal irradiation and the observed values from the previous 72 h up to the prediction length. The predicted values are shown together with their prediction intervals, which convey the uncertainty of the forecast. This information is important when managing the operation of the facilities.
For AU_1 on a rainy day, the maximum predicted solar energy production is 3.50 kWh, with 3.22 kWh and 3.76 kWh as the 0.1 and 0.9 quantiles, respectively. The interval between the tenth and the ninetieth percentiles is therefore 0.54 kWh wide. Considering the hours with energy production, the mean and standard deviation of half of this interval are 0.19 kWh and 0.07 kWh, respectively, with a maximum of 0.27 kWh and a minimum of 0.07 kWh. For the sunny-day example, the value predicted for the hour with maximum solar output is 4.05 kWh. In this case, the mean and standard deviation of the half-interval between the same percentiles over the day are 0.12 kWh and 0.03 kWh, respectively, indicating lower variation in these values. These results suggest that, for this facility, the uncertainty is higher and more variable on rainy days.
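The half-interval statistics quoted above can be reproduced from the quantile forecasts as in the following sketch. The function names and the production mask are illustrative assumptions, and the population standard deviation (NumPy default) is assumed. Only the peak-hour quantiles (3.22 kWh and 3.76 kWh) come from the text.

```python
import numpy as np

def half_interval_width(q_low, q_high):
    """Half of the width of the interval between two quantile forecasts."""
    return (np.asarray(q_high, dtype=float) - np.asarray(q_low, dtype=float)) / 2.0

def summarize_half_width(q_low, q_high, production_mask):
    """Mean, std, max, and min of the half-width over hours with production."""
    mask = np.asarray(production_mask, dtype=bool)
    hw = half_interval_width(q_low, q_high)[mask]
    return {
        "mean": float(hw.mean()),
        "std": float(hw.std()),   # population std (ddof=0), an assumption
        "max": float(hw.max()),
        "min": float(hw.min()),
    }

# Peak hour of the rainy-day example for AU_1: (3.76 - 3.22) / 2 is about 0.27 kWh.
peak_half_width = float(half_interval_width(3.22, 3.76))
print(round(peak_half_width, 2))
```

The peak-hour half-width of about 0.27 kWh coincides with the maximum half-interval reported for the rainy day.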
For GE_2, the maximum production in the rainy-day example is 1.39 kWh, and the mean and standard deviation of half of the interval between the 0.1 and 0.9 quantiles over the day are 0.20 kWh and 0.07 kWh, respectively, with a maximum of 0.35 kWh and a minimum of 0.08 kWh. For the sunny day, the mean and standard deviation are 0.17 kWh and 0.02 kWh, respectively, with a maximum solar energy production of 3.2 kWh. In this case, the uncertainty is higher than in the case of Australia, although it is constant throughout the day. Comparing the rainy-day and sunny-day examples, the mean values are similar, although on the rainy day they correspond to a peak production of 1.39 kWh, whereas on the sunny day the maximum output is 3.2 kWh. This indicates that, as in the case of Australia, the relative uncertainty is higher on rainy days.
One possible explanation for the lower accuracy during cloudy or rainy days could be the higher variability of the irradiance levels recorded during such days. This implies that the training of the network, and therefore, the prediction of PV energy production, becomes less reliable under these circumstances.
3.5. TFT Interpretability
TFT improves the interpretability of time series forecasting by computing variable importance scores for the different types of features and by representing the attention weight patterns. These results can be seen in Figure 12, which shows the importance of the past-observed time-varying inputs; Figure 13, which shows the importance of the future-known time-varying features; and Figure 14, which shows the attention weight patterns for one-step-ahead forecasts.
These datasets also have three static inputs: target center, target scale, and the identifier of the facility, which provide additional information and context to the model. The identifier is needed to distinguish each series, since TFT allows all of them to be trained together, thereby also learning general patterns shared among series. The first two are related to the standardization of the target variable and are included as static variables in the model. For these series, target center has a value of 0 and target scale is the median of the time series. The importance of these variables is ordered as follows: target center > series id > target scale.
The encoder variables contain the features for which past values are known at prediction time, and consist of the previously selected features plus an index indicating the relative time. In this case, the relative time takes values between −72 (input length) and 0. The importance of each of these variables can be seen in Figure 12. The horizontal solar irradiation received the highest weight (almost 25%), followed by the target variable and the solar zenith (around 12.5% each). Relative humidity and the sine and cosine transformations of the month have limited weight on the encoder side.
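The sine and cosine transformation of the month mentioned above is a common way of encoding a cyclic calendar variable so that December and January are close in feature space. The exact formula used in the study is not given in this section; the sketch below assumes the usual mapping of months 1 to 12 onto a circle, and the function name encode_month is illustrative.

```python
import numpy as np

def encode_month(month):
    """Map a month number (1..12) to a (sine, cosine) pair on the unit circle.

    Assumption: January corresponds to angle 0 and each month advances the
    angle by 2*pi/12.
    """
    angle = 2.0 * np.pi * (np.asarray(month, dtype=float) - 1.0) / 12.0
    return np.sin(angle), np.cos(angle)

def cyclic_distance(month_a, month_b):
    """Euclidean distance between two months in the encoded space."""
    sa, ca = encode_month(month_a)
    sb, cb = encode_month(month_b)
    return float(np.hypot(sa - sb, ca - cb))

# December to January is as close as January to February, unlike with raw month numbers.
print(cyclic_distance(12, 1), cyclic_distance(1, 2))
```

With the raw month number, December (12) and January (1) would be eleven units apart, which would misrepresent the seasonal continuity between them.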
Decoder variables are those features for which future values are known at prediction time. For the decoder, the relative time index takes values between −72 and 24 (prediction length). The importance of each of these variables for the prediction model can be observed in Figure 13. Solar zenith and horizontal solar irradiation play a significant role, together accounting for almost 60% of the weighted importance.
The results from both the encoder and decoder weighted importance highlight the necessity of having a good representation of solar zenith and horizontal solar irradiation to achieve high prediction performance.
Finally, Figure 14 represents the attention weight patterns for one-step-ahead forecasts, which can be used to identify the past time steps on which the TFT model focused most. The attention displays a cyclic pattern, with clear peaks at daily intervals. Thus, the attention is focused on the closest values and on values at the same hour of previous days. This behavior is understandable, since the PV energy production follows the same periodic pattern, as can be seen in Figure 6.