1. Introduction
The ongoing energy transition is progressively redefining the structure and arrangement of the current energy system. A crucial challenge is the large penetration of Renewable Energy Sources (RES) into the existing power supply structure. A grid operator must be able to ensure the balance between electricity production and consumption at any moment, accommodating expected and unexpected changes on both sides. RES are dynamic and highly variable, depending on geographical location and weather conditions. For instance, the power output of photovoltaic (PV) plants depends on several meteorological variables, such as solar irradiance, air temperature, cloud cover, and wind speed, which are intrinsically intermittent and non-controllable: these aspects imply problems of reliability, stability, and scheduling of the power supply structure [1].
Reliable forecast tools allow the prediction of the expected power production and its fluctuations, leading to more efficient grid management [2]: for this reason, the power forecast research field is presently receiving unprecedented attention from the scientific community. The current work focuses specifically on day-ahead PV power forecasting.
According to the literature, solar forecast methods can be categorized into statistical methods, physical methods, Machine Learning (ML) methods, and hybrid methods [2,3,4]. Statistical methods are capable, given a time series of historical data, of reconstructing the relationship between solar irradiance or PV power output and meteorological parameters. Moreover, they do not require physical knowledge about a system to model it [3]. Physical methods, mainly consisting of Numerical Weather Predictions (NWP), model the interactions between solar radiation and atmospheric components by means of differential equations and do not require historical data [5,6]. ML methods mimic the capability of the human brain to learn from experience and can solve even problems which cannot be represented explicitly. As with statistical methods, they require historical data to perform a prediction, but no physical knowledge of the modeled system [2]. Artificial Neural Networks (ANNs) are an ML method commonly employed in PV power forecasting. Finally, hybrid methods consist of combinations of other forecast methods, with the purpose of compensating for the weaknesses of individual ones and benefiting from their advantages [7,8].
In the current work, several combinations of ANNs and clustering techniques are proposed as forecast models. Clustering is an unsupervised machine learning technique that partitions a dataset into groups of samples presenting similarities [9,10]. In the following, different clustering criteria are applied to divide the days in the dataset into different classes according to their weather conditions. Once a partition is defined, a specific ANN is developed for every cluster: each ANN is trained using only samples belonging to a certain cluster and is used to forecast PV power production only in the weather conditions typical of that cluster. The similarity between PV power curves recorded in similar weather conditions is therefore exploited to construct optimized forecast models.
The aim of this paper is to assess whether it is possible to improve the training of artificial neural networks for day-ahead PV power forecasting by dividing a dataset through clustering techniques and, in the case of a positive answer, to identify the best-performing dataset partition among the proposed ones in terms of forecast accuracy.
2. Case Study and Procedure
Different combinations of clustering techniques and artificial neural networks are tested, validated, and compared on a real case study: a PV facility in SolarTechLAB, at Politecnico di Milano [11]. However, the proposed procedure is valid for PV plants of all sizes. The available dataset contains historical data about measured power and predicted weather parameters, namely temperature, global horizontal irradiance, global plane-of-array irradiance, and wind speed. The predicted weather parameters are provided as input to the proposed prediction models, whose output is compared with the measured power data to assess the forecast performance. Data are recorded on an hourly basis for a total of 840 days in the time span between January 2017 and September 2020.
The overall PV power forecast process can be summarized as the iterative multi-step procedure represented in Figure 1. In the following, details about every single step are provided.
2.1. Clustering Phase
In the clustering phase, to obtain a proper partition, the daily clearness index ($K_T$) is employed as a meaningful parameter for day type estimation [12,13,14]. It is defined as:

$$K_T = \frac{G}{G_0}$$

In the equation, $G$ is the daily global horizontal irradiation, while $G_0$ represents the corresponding daily extraterrestrial horizontal irradiation. Hence, $K_T$ is a dimensionless quantity employed in day type clustering thanks to its capability to remove the seasonal dependence from solar irradiation, isolating the information content about weather conditions [9]. Large values of the clearness index indicate clear sky conditions, while low values represent overcast sky conditions. Starting from the previously described dataset, the $K_T$ value for each day is computed by means of the same procedure applied by ESRA (the European Solar Radiation Atlas) [15].
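As a minimal illustration of this computation, the sketch below obtains $K_T$ from the hourly global horizontal irradiance of a single day, using the classic textbook expression for the daily extraterrestrial horizontal irradiation rather than the full ESRA procedure, so the numerical values may differ slightly; the function names, the latitude, and the example call are illustrative.

```python
import numpy as np

GSC = 1367.0  # solar constant [W/m^2]

def daily_extraterrestrial_irradiation(day_of_year: int, lat_deg: float) -> float:
    """Daily extraterrestrial horizontal irradiation G0 [Wh/m^2].
    Textbook formula, simplified with respect to the ESRA procedure."""
    phi = np.radians(lat_deg)
    delta = np.radians(23.45 * np.sin(2 * np.pi * (284 + day_of_year) / 365))  # declination
    ws = np.arccos(-np.tan(phi) * np.tan(delta))                               # sunset hour angle
    e0 = 1 + 0.033 * np.cos(2 * np.pi * day_of_year / 365)                     # eccentricity correction
    g0_joule = (24 * 3600 / np.pi) * GSC * e0 * (
        np.cos(phi) * np.cos(delta) * np.sin(ws) + ws * np.sin(phi) * np.sin(delta)
    )
    return g0_joule / 3600.0  # J/m^2 -> Wh/m^2

def daily_clearness_index(ghi_hourly: np.ndarray, day_of_year: int, lat_deg: float) -> float:
    """K_T = G / G0, with G the daily global horizontal irradiation [Wh/m^2]
    obtained as the sum of the 24 hourly mean GHI values (1 h steps)."""
    g = float(np.sum(ghi_hourly))
    g0 = daily_extraterrestrial_irradiation(day_of_year, lat_deg)
    return g / g0

# Example (hypothetical values): a summer day in Milan, latitude ~45.5 N
# kt = daily_clearness_index(ghi_24_hourly_values, day_of_year=172, lat_deg=45.5)
```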
Then, four different dataset division criteria are proposed, namely: FT-A, FT-B, KM-3, and KM-2. All the approaches are based on the clearness index and, as previously mentioned, aim to divide the dataset into classes according to the weather conditions of single days.
FT-A (Fixed Threshold set A) and FT-B (Fixed Threshold set B) are not clustering algorithms in a strict sense; rather, they perform a partition relying on fixed threshold values of the clearness index defined in the scientific literature [16,17]. In detail, they divide the dataset into three different weather classes based on the thresholds summarized in Table 1.
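A fixed-threshold partition reduces to a simple lookup on the daily clearness index; a sketch follows, where the numerical thresholds are placeholders and not the actual FT-A/FT-B values of Table 1.

```python
import numpy as np

def classify_by_thresholds(kt_daily: np.ndarray,
                           low: float = 0.3, high: float = 0.6) -> np.ndarray:
    """Assign each day to a weather class from its daily clearness index.
    `low` and `high` are illustrative placeholders; the actual FT-A / FT-B
    thresholds are those reported in Table 1."""
    labels = np.full(kt_daily.shape, "partially cloudy", dtype=object)
    labels[kt_daily < low] = "cloudy"
    labels[kt_daily >= high] = "sunny"
    return labels
```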
Both KM-3 and KM-2 are based on the k-means clustering algorithm. The choice of k-means instead of other possible clustering algorithms is related to its simplicity of implementation and its efficiency. It is worth noticing that the application of the k-means algorithm to a single parameter (i.e., the clearness index) corresponds to a fixed-thresholds-based partition where the thresholds are set automatically by the algorithm instead of by an external intervention (as in FT-A and FT-B). The difference between KM-3 and KM-2 lies in the choice of the number of clusters (K). KM-3 adopts K = 3 for a homogeneous comparison with the fixed-thresholds-based approaches (i.e., FT-A and FT-B). KM-2 exploits proper indexes to select the best possible dataset partition in terms of clustering quality, namely: the Silhouette index, the Davies-Bouldin index, and the Calinski–Harabasz index.
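To make the equivalence with an automatic-threshold partition concrete, the following sketch (based on scikit-learn, as an assumption about the implementation) runs a one-dimensional k-means on the daily clearness index and recovers the implicit thresholds as the midpoints between sorted centroids.

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_partition(kt_daily: np.ndarray, k: int = 3, seed: int = 0):
    """Cluster days by clearness index and return the labels together with the
    implicit thresholds (midpoints between consecutive sorted centroids)."""
    km = KMeans(n_clusters=k, n_init=10, random_state=seed)
    labels = km.fit_predict(kt_daily.reshape(-1, 1))
    centroids = np.sort(km.cluster_centers_.ravel())
    thresholds = (centroids[:-1] + centroids[1:]) / 2.0
    return labels, thresholds

# Example with synthetic clearness indexes for 840 days:
# labels, thr = kmeans_partition(np.random.default_rng(0).uniform(0.05, 0.75, 840), k=3)
```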
Given a generic dataset $X = \{x_1, \ldots, x_N\}$, containing $N$ elements and partitioned into $K$ clusters $C_1, \ldots, C_K$, these indexes can be computed as follows.
The Silhouette index [18] (computed as a global value) is defined as:

$$S = \frac{1}{N} \sum_{k=1}^{K} \sum_{i=1}^{n_k} \frac{b_{i,k} - a_{i,k}}{\max\left(a_{i,k},\, b_{i,k}\right)}$$

In the equation: $n_k$ is the number of elements in the generic cluster $C_k$; $a_{i,k}$ is the average distance between the $i$-th element in the cluster $C_k$ and the other elements in the same cluster; $b_{i,k}$ is the minimum average distance between the $i$-th element in the cluster $C_k$ and all the elements belonging to clusters $C_j$, with $j = 1, \ldots, K$ and $j \neq k$. The optimal number of clusters is the one that maximizes the value of the Silhouette index.
The Davies-Bouldin index [18] is defined as:

$$DB = \frac{1}{K} \sum_{k=1}^{K} \max_{j \neq k} \frac{s_k + s_j}{d_{k,j}}$$

In the equation: $s_k$ is the within-cluster distance of cluster $C_k$; $d_{k,j}$ is the between-cluster distance between clusters $C_k$ and $C_j$. The optimal clustering solution is the one that minimizes the Davies-Bouldin index value.
The Calinski–Harabasz index [19] is defined as:

$$CH = \frac{\displaystyle \sum_{k=1}^{K} n_k \, \lVert g_k - G \rVert^2 \,/\, (K - 1)}{\displaystyle \sum_{k=1}^{K} \sum_{x_i \in C_k} \lVert x_i - g_k \rVert^2 \,/\, (N - K)}$$

In the equation: $n_k$ is the number of elements in the cluster $C_k$; $g_k$ is the barycentre of the cluster $C_k$ (in the case of k-means clustering, it corresponds to the centroid); and $G$ is the barycentre of the entire dataset (the overall mean of the data). The optimal clustering solution is the one that maximizes the value of the Calinski–Harabasz index.
These indexes are computed as a function of different numbers of clusters and, applying a majority voting procedure, K = 2 is selected as the optimal dataset partition.
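A possible implementation of this selection, relying on the ready-made index implementations available in scikit-learn (the candidate range of K and the use of k-means on the one-dimensional clearness index are assumptions consistent with the description above):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import (silhouette_score, davies_bouldin_score,
                             calinski_harabasz_score)

def select_k_by_majority_vote(kt_daily: np.ndarray, candidate_ks=(2, 3, 4, 5)) -> int:
    """Run k-means for each candidate K, score the partitions with the three
    clustering-quality indexes, and let each index vote for its preferred K."""
    x = kt_daily.reshape(-1, 1)
    scores = {"silhouette": {}, "davies_bouldin": {}, "calinski_harabasz": {}}
    for k in candidate_ks:
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(x)
        scores["silhouette"][k] = silhouette_score(x, labels)                 # higher is better
        scores["davies_bouldin"][k] = davies_bouldin_score(x, labels)         # lower is better
        scores["calinski_harabasz"][k] = calinski_harabasz_score(x, labels)   # higher is better
    votes = [
        max(scores["silhouette"], key=scores["silhouette"].get),
        min(scores["davies_bouldin"], key=scores["davies_bouldin"].get),
        max(scores["calinski_harabasz"], key=scores["calinski_harabasz"].get),
    ]
    return max(set(votes), key=votes.count)  # most voted K (ties resolved arbitrarily)
```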
2.2. Extraction Phase
The extraction phase corresponds to the extraction of a test day, consisting of 24 consecutive hourly samples, from the initial dataset. This day constitutes the test set on which the prediction performance is computed. The cluster of origin of the extracted day is assumed to be unknown, as it would be in a real day-ahead power forecast. For a complete and reliable prediction performance assessment, all days available in the dataset are extracted one by one in different iterations.
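In code, the extraction phase amounts to a leave-one-day-out loop over the 840 available days; the sketch below is schematic, and `classify_day` and `predict_day` are placeholders for the classification and prediction phases described in the next subsections.

```python
def leave_one_day_out(dataset_days, classify_day, predict_day):
    """Iterate over all days, using each one in turn as the test day.
    `dataset_days` is a list of daily records (24 hourly samples each)."""
    results = []
    for i, test_day in enumerate(dataset_days):
        train_days = dataset_days[:i] + dataset_days[i + 1:]  # all remaining days
        cluster = classify_day(test_day, train_days)           # cluster of origin unknown a priori
        results.append(predict_day(test_day, cluster, train_days))
    return results
```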
2.3. Classification Phase
In the classification phase, the most suitable cluster for the test day is identified. Once labeled, the test day is assigned to the proper ANN, which performs the power prediction in the test day weather conditions. Therefore, this phase represents an additional step with respect to the single-network-based forecast, where the inputs are directly provided to the unique available ANN. As classifier, the random forest model is chosen, among all the possible algorithms, thanks to its flexibility, fast implementation, and easy tuning [20]. The classifier optimization consists of a proper selection of the number of trees and of the input features based on the out-of-bag classification error. The optimal configuration consists of a structure with 60 trees that takes global horizontal irradiance and global plane-of-array irradiance as input features.
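A minimal sketch of such a classifier with scikit-learn, assuming daily aggregates of the forecast global horizontal and plane-of-array irradiance as input features and the clustering-phase labels as targets:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def train_day_type_classifier(features: np.ndarray, cluster_labels: np.ndarray):
    """Fit a random forest on daily features (e.g., forecast GHI and POA
    irradiance) to predict the weather cluster of an unseen day.
    60 trees, as in the optimal configuration reported above."""
    clf = RandomForestClassifier(n_estimators=60, oob_score=True, random_state=0)
    clf.fit(features, cluster_labels)
    oob_error = 1.0 - clf.oob_score_  # out-of-bag classification error used for tuning
    return clf, oob_error

# Day-ahead use: cluster = clf.predict(test_day_features.reshape(1, -1))[0]
```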
2.4. Prediction Phase
Lastly, in the prediction step, different neural networks are developed to predict the PV power output in the extracted test days. Two different approaches are adopted, namely NN-Clust and NN-Std.
NN-Clust represents the clustering-based approach. In this approach, only days belonging to the same cluster as the test day are used for the training of each ANN. Then, the trained ANN predicts the PV power output for the test day, characterized by weather conditions similar to those of the samples involved in training. For the training of each ANN, 10% of the samples contained in a given cluster are randomly extracted as validation set, while the remaining 90% constitute the training set. Moreover, an ensemble of 10 independent trials is implemented to enhance the generalization capability of the model. To optimize the hidden layer size, a sensitivity analysis is carried out for every ANN corresponding to a different cluster. In practical terms, the sensitivity analysis studies the trade-off between performance and computational cost, analyzing the value of the Mean Square Error as a function of a variable number of hidden neurons. The predicted weather parameters available in the dataset, namely temperature, global horizontal irradiance, global plane-of-array irradiance, and wind speed, are provided as input features to all the networks.
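The following sketch outlines the NN-Clust training for a single cluster, using scikit-learn's MLPRegressor as a stand-in for the ANN implementation actually adopted; the averaging of the 10 trials at prediction time is an assumption, and the hidden layer size is the one selected by the sensitivity analysis.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def train_cluster_ensemble(x_cluster: np.ndarray, y_cluster: np.ndarray,
                           hidden_neurons: int, n_trials: int = 10):
    """Train an ensemble of 10 independently initialized ANNs on the samples of
    a single cluster; 10% of the samples are held out as validation set.
    Inputs: hourly weather forecasts; target: measured PV power."""
    ensemble = []
    for trial in range(n_trials):
        net = MLPRegressor(hidden_layer_sizes=(hidden_neurons,),
                           early_stopping=True, validation_fraction=0.1,
                           max_iter=1000, random_state=trial)
        ensemble.append(net.fit(x_cluster, y_cluster))
    return ensemble

def predict_ensemble(ensemble, x_test_day: np.ndarray) -> np.ndarray:
    """Average the trials to obtain the day-ahead hourly power forecast (assumption)."""
    return np.mean([net.predict(x_test_day) for net in ensemble], axis=0)
```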
On the other hand, NN-Std represents the most common forecast approach in the scientific literature, involving a single neural network, and it is developed for comparison with the previously described clustering-based approach. For the sake of a fair comparison, NN-Std must present several similarities with NN-Clust: the same number of neurons in the hidden layer, the same input features, and the same days predicted as test. The crucial difference between NN-Clust and NN-Std is that the latter is trained with days extracted from all clusters.
3. Error Metrics
Given a forecast power output $P_{f,h}$ and an observed power output $P_{m,h}$ at the generic hour $h$, several error metrics are defined and adopted in this work for performance evaluation.
The Normalized Mean Absolute Error (NMAE) estimates the average magnitude of the errors for a set of $N$ predictions, divided by the plant net capacity $C$:

$$\mathrm{NMAE} = \frac{1}{N \cdot C} \sum_{h=1}^{N} \left| P_{m,h} - P_{f,h} \right| \cdot 100\%$$

The Root Mean Square Error (RMSE) is computed using the square of the difference between observed and predicted values, and therefore penalizes large gaps:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{h=1}^{N} \left( P_{m,h} - P_{f,h} \right)^{2}}$$

The normalized Root Mean Square Error (nRMSE) corresponds to the ratio between the RMSE and the maximum observed power output in the considered time frame:

$$\mathrm{nRMSE} = \frac{\mathrm{RMSE}}{\max_{h} P_{m,h}} \cdot 100\%$$

The Weighted Mean Absolute Error (WMAE) is based on the total energy production:

$$\mathrm{WMAE} = \frac{\sum_{h=1}^{N} \left| P_{m,h} - P_{f,h} \right|}{\sum_{h=1}^{N} P_{m,h}} \cdot 100\%$$

Finally, the Envelope-weighted Mean Absolute Error (EMAE), introduced in [21], aims to provide a measure of forecast accuracy bounded in the interval between 0% and 100%:

$$\mathrm{EMAE} = \frac{\sum_{h=1}^{N} \left| P_{m,h} - P_{f,h} \right|}{\sum_{h=1}^{N} \max\left( P_{m,h},\, P_{f,h} \right)} \cdot 100\%$$
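For reference, all five metrics can be computed directly from the hourly measured and forecast power arrays; a minimal sketch consistent with the definitions above:

```python
import numpy as np

def forecast_error_metrics(p_meas: np.ndarray, p_fcst: np.ndarray,
                           plant_capacity: float) -> dict:
    """Compute NMAE, RMSE, nRMSE, WMAE, and EMAE for a set of hourly samples.
    `plant_capacity` is the plant net capacity C used to normalize the NMAE."""
    abs_err = np.abs(p_meas - p_fcst)
    rmse = float(np.sqrt(np.mean((p_meas - p_fcst) ** 2)))
    return {
        "NMAE [%]": float(np.mean(abs_err) / plant_capacity * 100),
        "RMSE": rmse,
        "nRMSE [%]": float(rmse / np.max(p_meas) * 100),
        "WMAE [%]": float(np.sum(abs_err) / np.sum(p_meas) * 100),
        "EMAE [%]": float(np.sum(abs_err) / np.sum(np.maximum(p_meas, p_fcst)) * 100),
    }
```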
4. Results and Discussion
The groups identified by the different dataset partitioning methods proposed differ and are quite unbalanced in terms of numerosity, as reported in Table 2. In general, the cluster corresponding to sunny conditions is the largest, while the others, in comparison, contain far fewer elements. The only exception is FT-B, which provides a more homogeneous grouping where the sunny days and partially cloudy days clusters have comparable size. The numerosity of a cluster is relevant from the point of view of the forecast: ANNs trained using too few elements could present poor generalization performance.
Concerning the forecast accuracy, several comparisons are performed. First, the NN-Clust models developed are compared with the corresponding NN-Std models to evaluate the performance enhancement allowed by the proposed methodology. The performance improvements computed according to all the evaluation metrics are reported in Table 3.
Independently of the error metric and the dataset partition considered, the approach involving clustering (i.e., NN-Clust) outperforms the one based on a single-network prediction (i.e., NN-Std). The largest improvement recorded consists of an error reduction of 7.9% in nRMSE with KM-3, while the smallest one consists of an error reduction of 1.9% in RMSE with KM-2. Therefore, weather type clustering is demonstrated to be effective and beneficial when combined with ANNs to optimize their training.
Then, a comparison between the different dataset partitioning criteria, again in terms of prediction performance, is carried out and visually represented in Figure 2. The spider-web chart is drawn after normalizing all the error metrics, i.e., dividing them by the corresponding maximum recorded value.
Comparing all the approaches that divide the dataset into 3 clusters, it is observed that the clustering-based approach, i.e., KM-3, outperforms both FT-B and FT-A, which are based on fixed threshold values of the clearness index. Therefore, for an equal number of identified clusters, the clustering-based methods exhibit better performance.
On the other hand, comparing the clustering-based approaches, i.e., KM-3 and KM-2, four error metrics out of five highlight the superiority of KM-3, even if the error reduction with respect to KM-2 is always limited. This means that the optimal dataset partition in terms of clustering quality does not necessarily imply the best prediction performance of the forecast model.
Among all the dataset partitioning methods considered, KM-3 proves to be the best-performing one from the point of view of forecast accuracy.
Lastly, the “best” and “worst” days in terms of forecast performance, corresponding respectively to the minimum and maximum recorded values of EMAE, are extracted and analyzed for each cluster identified by KM-3, i.e., the best-performing partitioning criterion. For these days, the actual power curve ($P_m$) and the ones forecast by the NN-Clust and NN-Std approaches are depicted and compared in Figure 3.
In the “best” case for sunny days, both the NN-Clust and NN-Std approaches accurately approximate the smooth power curve typical of sunny days. The “best” partially cloudy day presents an actual power trend not as smooth as that of a typical sunny day, but not excessively irregular either. Indeed, this day shows one of the highest clearness index values (0.53) within the partially cloudy days cluster. The forecast curves accurately approximate the actual one except for a small region around the central hours of the day, where NN-Clust clearly outperforms NN-Std. The “best” cloudy day presents the irregular PV power trend typical of overcast sky conditions. NN-Clust outperforms NN-Std in terms of forecast error, but both models are capable of accurately approximating the actual trend.
The “worst” days always correspond to errors in the weather forecast, when the real weather characteristics of a given day turn out to be completely different from the expected ones. In this condition, the forecast power either strongly overestimates or underestimates the measured one. It is therefore observed that, with inaccurately predicted weather parameters as input to an ANN, the forecast performance exhibits a heavy deterioration.