1. Introduction
The ongoing energy transition is progressively redefining the structure and arrangement of the current energy system. A crucial challenge is the large penetration of Renewable Energy Sources (RES) into the existing power supply structure. A grid operator must be able to ensure the balance between electricity production and consumption at any moment, accommodating expected and unexpected changes on both sides. RES are dynamic and highly variable, depending on geographical location and weather conditions. For instance, the power output of photovoltaic (PV) plants depends on several meteorological variables, such as solar irradiance, air temperature, cloud cover, and wind speed, which are intrinsically intermittent and non-controllable: these aspects imply problems of reliability, stability, and scheduling of the power supply structure [1].
Reliable forecast tools allow the prediction of the expected power production and its fluctuations, leading to more efficient grid management [2]: for this reason, the power forecast research field is presently receiving unprecedented attention from the scientific community. The current work focuses specifically on day-ahead PV power forecasting.
According to the literature, solar forecast methods can be categorized into statistical methods, physical methods, Machine Learning (ML) methods, and hybrid methods [2,3,4]. Statistical methods are capable, given a time series of historical data, of reconstructing the relationship between solar irradiance or PV power output and meteorological parameters. Moreover, they do not require physical knowledge about a system to model it [3]. Physical methods, mainly consisting of Numerical Weather Predictions (NWP), model the interactions between solar radiation and atmospheric components by means of differential equations and do not require historical data [5,6]. ML methods mimic the capability of the human brain to learn from experience and can solve even problems which cannot be represented explicitly. As with statistical methods, they require historical data to perform a prediction, but no physical knowledge of the modeled system [2]. Artificial Neural Networks (ANNs) are an ML method commonly employed in PV power forecasting. Finally, hybrid methods consist of combinations of other forecast methods, with the purpose of compensating for the weaknesses of individual ones and benefiting from their advantages [7,8].
In the current work, several combinations of ANNs and clustering techniques are proposed as forecast models. Clustering is an unsupervised machine learning technique that partitions a dataset into groups of samples presenting similarities [9,10]. In the following, different clustering criteria are applied to divide the days in the dataset into different classes according to their weather conditions. Once a partition is defined, a specific ANN is developed for every cluster: each ANN is trained using only samples belonging to a certain cluster and is used to forecast PV power production only in the weather conditions typical of that cluster. The similarity between PV power curves recorded in similar weather conditions is therefore exploited to construct optimized forecast models.
The aim of this paper is to assess whether it is possible to improve the training of artificial neural networks for day-ahead PV power forecasting by dividing a dataset through clustering techniques and, in the case of a positive answer, to identify the best-performing dataset partition among the proposed ones in terms of forecast accuracy.
2. Case Study and Procedure
Different combinations of clustering techniques and artificial neural networks are tested, validated, and compared on a real case study: a PV facility in SolarTechLAB, at Politecnico di Milano [11]. However, the proposed procedure is valid for PV plants of all sizes. The available dataset contains historical data about measured power and predicted weather parameters, namely temperature, global horizontal irradiance, global plane-of-array irradiance, and wind speed. The predicted weather parameters are provided as input to the proposed prediction models, whose output is compared with the measured power data to assess the forecast performance. Data are recorded on an hourly basis for a total of 840 days in the time span between January 2017 and September 2020.
The overall PV power forecast process can be summarized as the iterative multi-step procedure represented in Figure 1. In the following, details about every single step are provided.
2.1. Clustering Phase
In the clustering phase, to obtain a proper partition, the daily clearness index ($K_T$) is employed as a meaningful parameter for day type estimation [12,13,14]. It is defined as:

$$K_T = \frac{G}{G_0}$$

In the equation, $G$ is the daily global horizontal irradiation, while $G_0$ represents the corresponding daily extraterrestrial horizontal irradiation. Hence, $K_T$ is a dimensionless quantity employed in day type clustering thanks to its capability to remove the seasonal dependence from solar irradiation, isolating the information content about weather conditions [9]. Large values of the clearness index indicate clear sky conditions, while low values represent overcast sky conditions. Starting from the previously described dataset, the $K_T$ value for each day is computed by means of the same procedure applied by ESRA (the European Solar Radiation Atlas) [15].
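As a minimal illustration of this computation, the sketch below obtains $K_T$ from the hourly global horizontal irradiance of a single day, using the classic textbook expression for the daily extraterrestrial horizontal irradiation rather than the full ESRA procedure, so the numerical values may differ slightly; the function names, the latitude, and the example call are illustrative.

```python
import numpy as np

GSC = 1367.0  # solar constant [W/m^2]

def daily_extraterrestrial_irradiation(day_of_year: int, lat_deg: float) -> float:
    """Daily extraterrestrial horizontal irradiation G0 [Wh/m^2].
    Textbook formula, simplified with respect to the ESRA procedure."""
    phi = np.radians(lat_deg)
    delta = np.radians(23.45 * np.sin(2 * np.pi * (284 + day_of_year) / 365))  # declination
    ws = np.arccos(-np.tan(phi) * np.tan(delta))                               # sunset hour angle
    e0 = 1 + 0.033 * np.cos(2 * np.pi * day_of_year / 365)                     # eccentricity correction
    g0_joule = (24 * 3600 / np.pi) * GSC * e0 * (
        np.cos(phi) * np.cos(delta) * np.sin(ws) + ws * np.sin(phi) * np.sin(delta)
    )
    return g0_joule / 3600.0  # J/m^2 -> Wh/m^2

def daily_clearness_index(ghi_hourly: np.ndarray, day_of_year: int, lat_deg: float) -> float:
    """K_T = G / G0, with G the daily global horizontal irradiation [Wh/m^2]
    obtained as the sum of the 24 hourly mean GHI values (1 h steps)."""
    g = float(np.sum(ghi_hourly))
    g0 = daily_extraterrestrial_irradiation(day_of_year, lat_deg)
    return g / g0

# Example (hypothetical values): a summer day in Milan, latitude ~45.5 N
# kt = daily_clearness_index(ghi_24_hourly_values, day_of_year=172, lat_deg=45.5)
```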
Then, four different dataset division criteria are proposed, namely: FT-A, FT-B, KM-3, and KM-2. All the approaches are based on the clearness index and, as previously mentioned, aim to divide the dataset into classes according to the weather conditions of single days.
FT-A (Fixed Threshold set A) and FT-B (Fixed Threshold set B) are not clustering algorithms in a strict sense; rather, they perform a partition relying on fixed threshold values of the clearness index defined in the scientific literature [16,17]. In detail, they divide the dataset into three different weather classes based on the thresholds summarized in Table 1.
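A fixed-threshold partition reduces to a simple lookup on the daily clearness index; a sketch follows, where the numerical thresholds are placeholders and not the actual FT-A/FT-B values of Table 1.

```python
import numpy as np

def classify_by_thresholds(kt_daily: np.ndarray,
                           low: float = 0.3, high: float = 0.6) -> np.ndarray:
    """Assign each day to a weather class from its daily clearness index.
    `low` and `high` are illustrative placeholders; the actual FT-A / FT-B
    thresholds are those reported in Table 1."""
    labels = np.full(kt_daily.shape, "partially cloudy", dtype=object)
    labels[kt_daily < low] = "cloudy"
    labels[kt_daily >= high] = "sunny"
    return labels
```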
Both KM-3 and KM-2 are based on the k-means clustering algorithm. The choice of k-means instead of other possible clustering algorithms is related to its simplicity of implementation and its efficiency. It is worth noticing that the application of the k-means algorithm to a single parameter (i.e., the clearness index) corresponds to a fixed-thresholds-based partition where the thresholds are set automatically by the algorithm instead of by an external intervention (as in FT-A and FT-B). The difference between KM-3 and KM-2 lies in the choice of the number of clusters (K). KM-3 adopts K = 3 for a homogeneous comparison with the fixed-thresholds-based approaches (i.e., FT-A and FT-B). KM-2 exploits proper indexes to select the best possible dataset partition in terms of clustering quality, namely: the Silhouette index, the Davies-Bouldin index, and the Calinski–Harabasz index.
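To make the equivalence with an automatic-threshold partition concrete, the following sketch (based on scikit-learn, as an assumption about the implementation) runs a one-dimensional k-means on the daily clearness index and recovers the implicit thresholds as the midpoints between sorted centroids.

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_partition(kt_daily: np.ndarray, k: int = 3, seed: int = 0):
    """Cluster days by clearness index and return the labels together with the
    implicit thresholds (midpoints between consecutive sorted centroids)."""
    km = KMeans(n_clusters=k, n_init=10, random_state=seed)
    labels = km.fit_predict(kt_daily.reshape(-1, 1))
    centroids = np.sort(km.cluster_centers_.ravel())
    thresholds = (centroids[:-1] + centroids[1:]) / 2.0
    return labels, thresholds

# Example with synthetic clearness indexes for 840 days:
# labels, thr = kmeans_partition(np.random.default_rng(0).uniform(0.05, 0.75, 840), k=3)
```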
Given a generic dataset $X = \{x_1, \ldots, x_N\}$, containing $N$ elements and partitioned into $K$ clusters $C_1, \ldots, C_K$, these indexes can be computed as follows.
The Silhouette index [18] (computed as a global value) is defined as:

$$S = \frac{1}{N} \sum_{k=1}^{K} \sum_{i=1}^{n_k} \frac{b_{i,k} - a_{i,k}}{\max\left(a_{i,k},\, b_{i,k}\right)}$$

In the equation: $n_k$ is the number of elements in the generic cluster $C_k$; $a_{i,k}$ is the average distance between the $i$-th element in the cluster $C_k$ and the other elements in the same cluster; $b_{i,k}$ is the minimum average distance between the $i$-th element in the cluster $C_k$ and all the elements belonging to clusters $C_j$, with $j = 1, \ldots, K$ and $j \neq k$. The optimal number of clusters is the one that maximizes the value of the Silhouette index.
The Davies-Bouldin index [18] is defined as:

$$DB = \frac{1}{K} \sum_{k=1}^{K} \max_{j \neq k} \frac{s_k + s_j}{d_{k,j}}$$

In the equation: $s_k$ is the within-cluster distance of cluster $C_k$; $d_{k,j}$ is the between-cluster distance between clusters $C_k$ and $C_j$. The optimal clustering solution is the one that minimizes the Davies-Bouldin index value.
The Calinski–Harabasz index [19] is defined as:

$$CH = \frac{\displaystyle \sum_{k=1}^{K} n_k \, \lVert g_k - G \rVert^2 \,/\, (K - 1)}{\displaystyle \sum_{k=1}^{K} \sum_{x_i \in C_k} \lVert x_i - g_k \rVert^2 \,/\, (N - K)}$$

In the equation: $n_k$ is the number of elements in the cluster $C_k$; $g_k$ is the barycentre of the cluster $C_k$ (in the case of k-means clustering, it corresponds to the centroid); and $G$ is the barycentre of the entire dataset (the overall mean of the data). The optimal clustering solution is the one that maximizes the value of the Calinski–Harabasz index.
These indexes are computed as a function of different numbers of clusters and, applying a majority voting procedure, K = 2 is selected as the optimal dataset partition.
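A possible implementation of this selection, relying on the ready-made index implementations available in scikit-learn (the candidate range of K and the use of k-means on the one-dimensional clearness index are assumptions consistent with the description above):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import (silhouette_score, davies_bouldin_score,
                             calinski_harabasz_score)

def select_k_by_majority_vote(kt_daily: np.ndarray, candidate_ks=(2, 3, 4, 5)) -> int:
    """Run k-means for each candidate K, score the partitions with the three
    clustering-quality indexes, and let each index vote for its preferred K."""
    x = kt_daily.reshape(-1, 1)
    scores = {"silhouette": {}, "davies_bouldin": {}, "calinski_harabasz": {}}
    for k in candidate_ks:
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(x)
        scores["silhouette"][k] = silhouette_score(x, labels)                 # higher is better
        scores["davies_bouldin"][k] = davies_bouldin_score(x, labels)         # lower is better
        scores["calinski_harabasz"][k] = calinski_harabasz_score(x, labels)   # higher is better
    votes = [
        max(scores["silhouette"], key=scores["silhouette"].get),
        min(scores["davies_bouldin"], key=scores["davies_bouldin"].get),
        max(scores["calinski_harabasz"], key=scores["calinski_harabasz"].get),
    ]
    return max(set(votes), key=votes.count)  # most voted K (ties resolved arbitrarily)
```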
2.2. Extraction Phase
The extraction phase corresponds to the extraction of a test day, consisting of 24 consecutive hourly samples, from the initial dataset. This day constitutes the test set on which the prediction performance is computed. The cluster of origin of the extracted day is assumed to be unknown, as it would be in a real day-ahead power forecast. For a complete and reliable prediction performance assessment, all days available in the dataset are extracted one by one in different iterations.
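In code, the extraction phase amounts to a leave-one-day-out loop over the 840 available days; the sketch below is schematic, and `classify_day` and `predict_day` are placeholders for the classification and prediction phases described in the next subsections.

```python
def leave_one_day_out(dataset_days, classify_day, predict_day):
    """Iterate over all days, using each one in turn as the test day.
    `dataset_days` is a list of daily records (24 hourly samples each)."""
    results = []
    for i, test_day in enumerate(dataset_days):
        train_days = dataset_days[:i] + dataset_days[i + 1:]  # all remaining days
        cluster = classify_day(test_day, train_days)           # cluster of origin unknown a priori
        results.append(predict_day(test_day, cluster, train_days))
    return results
```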
2.3. Classification Phase
In the classification phase, the most suitable cluster for the test day is identified. Once labeled, the test day is assigned to the proper ANN, which performs the power prediction in the test day weather conditions. Therefore, this phase represents an additional step with respect to the single-network-based forecast, where the inputs are directly provided to the unique available ANN. As classifier, the random forest model is chosen, among all the possible algorithms, thanks to its flexibility, fast implementation, and easy tuning [20]. The classifier optimization consists of a proper selection of the number of trees and of the input features based on the out-of-bag classification error. The optimal configuration consists of a structure with 60 trees that takes global horizontal irradiance and global plane-of-array irradiance as input features.
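A minimal sketch of such a classifier with scikit-learn, assuming daily aggregates of the forecast global horizontal and plane-of-array irradiance as input features and the clustering-phase labels as targets:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def train_day_type_classifier(features: np.ndarray, cluster_labels: np.ndarray):
    """Fit a random forest on daily features (e.g., forecast GHI and POA
    irradiance) to predict the weather cluster of an unseen day.
    60 trees, as in the optimal configuration reported above."""
    clf = RandomForestClassifier(n_estimators=60, oob_score=True, random_state=0)
    clf.fit(features, cluster_labels)
    oob_error = 1.0 - clf.oob_score_  # out-of-bag classification error used for tuning
    return clf, oob_error

# Day-ahead use: cluster = clf.predict(test_day_features.reshape(1, -1))[0]
```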
2.4. Prediction Phase
Lastly, in the prediction step, different neural networks are developed to predict the PV power output in the extracted test days. Two different approaches are adopted, namely NN-Clust and NN-Std.
NN-Clust represents the clustering-based approach. In this approach, only days belonging to the same cluster as the test day are used for the training of each ANN. Then, the trained ANN predicts the PV power output for the test day, characterized by weather conditions similar to those of the samples involved in training. For the training of each ANN, 10% of the samples contained in a given cluster are randomly extracted as validation set, while the remaining 90% constitute the training set. Moreover, an ensemble of 10 independent trials is implemented to enhance the generalization capability of the model. To optimize the hidden layer size, a sensitivity analysis is carried out for every ANN corresponding to a different cluster. In practical terms, the sensitivity analysis studies the trade-off between performance and computational cost, analyzing the value of the Mean Square Error as a function of a variable number of hidden neurons. The predicted weather parameters available in the dataset, namely temperature, global horizontal irradiance, global plane-of-array irradiance, and wind speed, are provided as input features to all the networks.
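The following sketch outlines the NN-Clust training for a single cluster, using scikit-learn's MLPRegressor as a stand-in for the ANN implementation actually adopted; the averaging of the 10 trials at prediction time is an assumption, and the hidden layer size is the one selected by the sensitivity analysis.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def train_cluster_ensemble(x_cluster: np.ndarray, y_cluster: np.ndarray,
                           hidden_neurons: int, n_trials: int = 10):
    """Train an ensemble of 10 independently initialized ANNs on the samples of
    a single cluster; 10% of the samples are held out as validation set.
    Inputs: hourly weather forecasts; target: measured PV power."""
    ensemble = []
    for trial in range(n_trials):
        net = MLPRegressor(hidden_layer_sizes=(hidden_neurons,),
                           early_stopping=True, validation_fraction=0.1,
                           max_iter=1000, random_state=trial)
        ensemble.append(net.fit(x_cluster, y_cluster))
    return ensemble

def predict_ensemble(ensemble, x_test_day: np.ndarray) -> np.ndarray:
    """Average the trials to obtain the day-ahead hourly power forecast (assumption)."""
    return np.mean([net.predict(x_test_day) for net in ensemble], axis=0)
```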
On the other hand, NN-Std represents the most common forecast approach in the scientific literature, involving a single neural network, and it is developed for comparison with the previously described clustering-based approach. For the sake of a fair comparison, NN-Std must present several similarities with NN-Clust: the same number of neurons in the hidden layer, the same input features, and the same days predicted as test. The crucial difference between NN-Clust and NN-Std is that the latter is trained with days extracted from all clusters.
3. Error Metrics
Given a forecast power output $P_{f,h}$ and an observed power output $P_{m,h}$ at the generic hour $h$, several error metrics are defined and adopted in this work for performance evaluation.
The Normalized Mean Absolute Error (NMAE) estimates the average magnitude of the errors for a set of $N$ predictions, divided by the plant net capacity $C$:

$$\mathrm{NMAE} = \frac{1}{N \cdot C} \sum_{h=1}^{N} \left| P_{m,h} - P_{f,h} \right| \cdot 100\%$$

The Root Mean Square Error (RMSE) is computed using the square of the difference between observed and predicted values, and therefore penalizes large gaps:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{h=1}^{N} \left( P_{m,h} - P_{f,h} \right)^{2}}$$

The normalized Root Mean Square Error (nRMSE) corresponds to the ratio between the RMSE and the maximum observed power output in the considered time frame:

$$\mathrm{nRMSE} = \frac{\mathrm{RMSE}}{\max_{h} P_{m,h}} \cdot 100\%$$

The Weighted Mean Absolute Error (WMAE) is based on the total energy production:

$$\mathrm{WMAE} = \frac{\sum_{h=1}^{N} \left| P_{m,h} - P_{f,h} \right|}{\sum_{h=1}^{N} P_{m,h}} \cdot 100\%$$

Finally, the Envelope-weighted Mean Absolute Error (EMAE), introduced in [21], aims to provide a measure of forecast accuracy bounded in the interval between 0% and 100%:

$$\mathrm{EMAE} = \frac{\sum_{h=1}^{N} \left| P_{m,h} - P_{f,h} \right|}{\sum_{h=1}^{N} \max\left( P_{m,h},\, P_{f,h} \right)} \cdot 100\%$$
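For reference, all five metrics can be computed directly from the hourly measured and forecast power arrays; a minimal sketch consistent with the definitions above:

```python
import numpy as np

def forecast_error_metrics(p_meas: np.ndarray, p_fcst: np.ndarray,
                           plant_capacity: float) -> dict:
    """Compute NMAE, RMSE, nRMSE, WMAE, and EMAE for a set of hourly samples.
    `plant_capacity` is the plant net capacity C used to normalize the NMAE."""
    abs_err = np.abs(p_meas - p_fcst)
    rmse = float(np.sqrt(np.mean((p_meas - p_fcst) ** 2)))
    return {
        "NMAE [%]": float(np.mean(abs_err) / plant_capacity * 100),
        "RMSE": rmse,
        "nRMSE [%]": float(rmse / np.max(p_meas) * 100),
        "WMAE [%]": float(np.sum(abs_err) / np.sum(p_meas) * 100),
        "EMAE [%]": float(np.sum(abs_err) / np.sum(np.maximum(p_meas, p_fcst)) * 100),
    }
```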
4. Results and Discussion
The groups identified by the different dataset partitioning methods proposed differ and are quite unbalanced in terms of numerosity, as reported in Table 2. In general, the cluster corresponding to sunny conditions is the largest, while the others, in comparison, contain far fewer elements. The only exception is FT-B, which provides a more homogeneous grouping where the sunny days and partially cloudy days clusters have comparable size. The numerosity of a cluster is relevant from the point of view of the forecast: ANNs trained using too few elements could present poor generalization performance.
Concerning the forecast accuracy, several comparisons are performed. First, the NN-Clust models developed are compared with the corresponding NN-Std models to evaluate the performance enhancement allowed by the proposed methodology. The performance improvements computed according to all the evaluation metrics are reported in Table 3.
Independently of the error metric and the dataset partition considered, the approach involving clustering (i.e., NN-Clust) outperforms the one based on a single-network prediction (i.e., NN-Std). The largest improvement recorded consists of an error reduction of 7.9% in nRMSE with KM-3, while the smallest one consists of an error reduction of 1.9% in RMSE with KM-2. Therefore, weather type clustering is demonstrated to be effective and beneficial when combined with ANNs to optimize their training.
Then, a comparison between the different dataset partitioning criteria, again in terms of prediction performance, is carried out and visually represented in Figure 2. The spider-web chart is drawn after normalizing all the error metrics, i.e., dividing them by the corresponding maximum recorded value.
Comparing all the approaches that divide the dataset into 3 clusters, it is observed that the clustering-based approach, i.e., KM-3, outperforms both FT-B and FT-A, which are based on fixed threshold values of the clearness index. Therefore, for an equal number of identified clusters, the clustering-based methods exhibit better performance.
On the other hand, comparing the clustering-based approaches, i.e., KM-3 and KM-2, four error metrics out of five highlight the superiority of KM-3, even if the error reduction with respect to KM-2 is always limited. This means that the optimal dataset partition in terms of clustering quality does not necessarily imply the best prediction performance of the forecast model.
Among all the dataset partitioning methods considered, KM-3 proves to be the best-performing one from the point of view of forecast accuracy.
Lastly, the “best” and “worst” days in terms of forecast performance, corresponding respectively to the minimum and maximum recorded values of EMAE, are extracted and analyzed for each cluster identified by KM-3, i.e., the best-performing partitioning criterion. For these days, the actual power curve ($P_m$) and the ones forecast by the NN-Clust and NN-Std approaches are depicted and compared in Figure 3.
In the “best” case for sunny days, both the NN-Clust and NN-Std approaches accurately approximate the smooth power curve typical of sunny days. The “best” partially cloudy day presents an actual power trend not as smooth as that of a typical sunny day, but not excessively irregular either. Indeed, this day shows one of the highest clearness index values (0.53) within the partially cloudy days cluster. The forecast curves accurately approximate the actual one except for a small region around the central hours of the day, where NN-Clust clearly outperforms NN-Std. The “best” cloudy day presents the irregular PV power trend typical of overcast sky conditions. NN-Clust outperforms NN-Std in terms of forecast error, but both models are capable of accurately approximating the actual trend.
The “worst” days always correspond to errors in the weather forecast, when the real weather characteristics of a given day turn out to be completely different from the expected ones. In this condition, the forecast power either strongly overestimates or underestimates the measured one. It is therefore observed that, with inaccurately predicted weather parameters as input to an ANN, the forecast performance exhibits a heavy deterioration.