1. Introduction
The growth of the world population, coupled with the accelerated depletion of fossil fuel reserves, which, at the current rate of exploitation, are expected to be completely exhausted within a few centuries [1], has motivated researchers to evaluate the impact of integrating sources of clean and environmentally friendly energy, such as solar and wind, into electric power systems [2]. In line with this, the United Nations’ 2030 Agenda established 17 Sustainable Development Goals, one of which is Affordable and Clean Energy. This goal encourages countries to invest in renewable energy, setting targets such as increasing the share of renewable energy in the global energy mix and promoting international cooperation to facilitate access to clean energy research and technology [3].
Wind energy represents an inexhaustible natural resource with the potential to combat climate change [2]. By 2023, the total installed global wind capacity reached 1021 GW, reflecting a growth of 13% compared to 2022 [4]. In Ecuador, the current regulatory framework includes the utilization of renewable energy sources, with a short-term feasible wind potential of 884 MW. To develop new renewable generation projects, a policy was established that allows the implementation of power and energy blocks from various primary sources, to be covered by projects proposed by the private sector [5] through Public Selection Processes (PPSs). Within this context, the PPS for the Non-Conventional Renewable Energy Block (ERNC I) was launched and fully awarded in 2023. The block comprises six photovoltaic solar projects, three hydroelectric projects, and one wind farm project, totaling 511 MW with private investment exceeding USD 800 million [6]. One of the awarded projects is the Yanahurcu wind farm project, with 52.8 MW installed capacity and 44.81 MW effective capacity, to be installed in the province of Loja, at a price of 60.63 USD/MWh [7].
As regards wind farm projects, the estimation of energy production is critically important for the successful planning and operation of renewable energy systems, and stochastic analysis plays a key role in enhancing the accuracy of these estimates. Wind power is inherently variable and influenced by several unpredictable factors, such as meteorological conditions and grid topology availability, which introduces significant uncertainty into production forecasts. Traditional deterministic models fall short of capturing this variability, often leading to mismatches between expected and actual output. Stochastic methods, on the other hand, generate multiple scenarios of wind power production based on probability distributions of wind speeds, wind directions, and other variables, offering a more realistic and comprehensive understanding of possible outcomes. By incorporating stochastic analysis, decision-makers can better manage the risks associated with wind farm production variability, optimize system performance, and improve grid integration. Moreover, it allows for the assessment of different weather scenarios and their impact on supply–demand balance, enhancing the resilience and reliability of power systems.
A method that integrates wind power and temperature correlations to generate stochastic scenarios for wind power production is presented in [8]. It emphasizes capturing both regular and extreme conditions, such as multi-day periods without wind, to perform supply–demand balance analysis. This method helps improve the accuracy of long-term wind power forecasts and optimize planning under stochastic conditions. In [9], a hybrid method is proposed that uses Vector Autoregressive Moving Average (VARMA) models and copula techniques to generate wind power scenarios with spatiotemporal correlations. This approach captures both the spatial and temporal dependencies between multiple wind farms, offering a robust solution for short-term wind power forecasting. A comprehensive review of probabilistic forecasting models used in wind power generation is presented in [10]. It discusses the differences between deterministic and probabilistic models, with a particular focus on how the latter can capture the inherent uncertainty in wind energy production. The review also identifies future research directions in the field of stochastic wind power forecasting. These contributions help researchers address the problem of stochastic wind farm energy production forecasting and represent important advances in the field; however, most of the approaches are not yet available in mass-market commercial software suitable for sufficiently reliable studies of the connection of renewable energy projects. In this regard, the Ecuadorian entity responsible for planning the operation of power generation, the National Electricity Operator CENACE, optimizes economic dispatch in medium- and long-term scenarios using the SDDP (Stochastic Dual Dynamic Programming) tool, a modularly licensed software package that requires significant investment [11]. As a free software alternative, the Instituto de Energía Eléctrica in Uruguay has developed the SimSEE platform for the Simulation of Electric Power Systems, which obtains an Optimal Operation Policy by solving a Stochastic Dynamic Programming (SDP) problem [12]. Whereas SDDP allows researchers to handle problems with large numbers of decision variables and constraints, SimSEE uses an SDP algorithm and offers the possibility of implementing user models, so it is more versatile for modeling new energy sources. Although both packages have great modeling and simulation capacity, SimSEE surpasses SDDP in three important aspects: (i) it is free, open-source software, (ii) it accepts user-defined models, and (iii) it is more flexible for including intermittent renewable generation such as wind farms and photovoltaic plants, based on its Histogram-Based Gaussian Space Correlation (CEGH) model. Accordingly, SimSEE is the software selected to develop the proposed methodology, since it has the capabilities and sufficient references to be accepted by the Electricity Regulation and Control Agency in Ecuador (ARCONEL) as a replacement for SDDP.
In [11], a database has been created containing all the power plants of the National Interconnected System (SNI), modeled with parameters provided by CENACE to simulate a single-node dispatch over a 10-year horizon.
Based on the above, this paper evaluates the power generation of the Yanahurcu wind project by integrating the plant into the SimSEE database. A CEGH model is generated from wind data to analyze the wind resource in the project’s installation area. To this end, a data reanalysis strategy is first applied to obtain corrected data. Subsequently, simulations are carried out with different time horizons to obtain the estimated production of the plant, which is finally analyzed from a probabilistic perspective. The main contribution of this paper is the integration of a free tool like SimSEE with an open-access meteorological database like MERRA-2. This enables researchers to conduct renewable energy studies even when measurement data are not available, and orients the forecasting toward the development of actual wind farm projects, since SimSEE could be accepted by ARCONEL as a replacement for SDDP.
3. Materials and Methods
To account for the stochastic nature of the wind in simulations, it is necessary to define a CEGH model created from measurement data or, in this case, from historical data obtained through reanalysis tools and pre-processed to improve result quality. Then, a model of the wind farm must be generated in SimSEE, with two options: one representing wind speed as a magnitude and another representing the two components of wind speed, allowing the representation of its direction. For simulations where the dispatch problem is solved using SDP, two databases are generated in SimSEE in order to represent different case studies that aim to emulate the dispatch stages performed by the system operator [17]. Finally, a probabilistic analysis of the results is performed to evaluate the energy impact of the wind project’s connection. This methodology is described in the flowchart illustrated in Figure 2. This comprehensive methodology constitutes a practical contribution to current practices for evaluating wind farm power forecasts with mass-market software suitable for sufficiently reliable studies of the connection of renewable energy projects when measured data are not available. To this end, a CEGH model is generated from data that have first been corrected through the proposed data reanalysis strategy. Thus, the main contribution is a methodology that integrates a free tool like SimSEE with an open-access meteorological database like MERRA-2 while ensuring sufficiently reliable data correction. This also enables power forecasting for actual wind farm projects, since SimSEE might be accepted as a replacement for SDDP or other commercial software.
3.1. Analysis of Historical Wind Data
Wind measurement data at the installation site were not available for modeling the wind resource, so the POWER DAVe platform was used to access meteorological information from the MERRA-2 reanalysis database. With the approximate location of the project as the known datum, wind speed and direction data can be obtained from the POWER DAVe website. The available data correspond to wind speed, in meters per second, at 50 m above ground level and wind direction, in degrees. These data are first reanalyzed to define the input data for the stochastic wind farm power production forecast. Reliable energy analyses ideally require data series spanning 25 to 30 years; when such records are unavailable, the recommendation is to use periods of at least 5 to 10 years [18]. Therefore, the historical data used cover the period from 2015 to 2023.
Subsequently, three corrections are applied to the historical data, following the procedure described in [19]:
Vertical Wind Speed Profile Correction (1), related to the height above the ground at which the wind speed value was obtained and the height at which the wind turbine hub will be located.
$$v_2 = v_1 \left( \frac{h_2}{h_1} \right)^{\alpha} \tag{1}$$
where $h_1$ is the height at which MERRA-2 data were obtained; $h_2$ is the turbine hub height; $\alpha$ is the surface roughness coefficient; $v_1$ is the wind speed at height $h_1$; and $v_2$ is the wind speed at height $h_2$.
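As an illustration of the vertical correction, the following minimal sketch assumes the common power-law wind profile, in which the speed scales with the height ratio raised to the surface roughness coefficient. The function name, the 100 m hub height, and the coefficient value of 0.143 are hypothetical placeholders, not values reported for the Yanahurcu project; only the 50 m reference height comes from the MERRA-2 data described above.

```python
def extrapolate_wind_speed(v_ref, h_ref, h_hub, alpha):
    """Extrapolate a wind speed from h_ref to h_hub with the power-law profile.

    Assumption: v_hub = v_ref * (h_hub / h_ref) ** alpha, where alpha is the
    surface roughness coefficient.
    """
    if h_ref <= 0 or h_hub <= 0:
        raise ValueError("heights must be positive")
    return v_ref * (h_hub / h_ref) ** alpha


# Hypothetical example: 8 m/s at the 50 m reanalysis height, moved to an
# assumed 100 m hub height with an assumed roughness coefficient of 0.143.
v_hub = extrapolate_wind_speed(v_ref=8.0, h_ref=50.0, h_hub=100.0, alpha=0.143)
print(f"Hub-height wind speed: {v_hub:.2f} m/s")
```

Applying the function to every hourly value of the series would give the hub-height series that feeds the subsequent corrections.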
Air Density Correction (2), which adjusts the atmospheric pressure at the wind farm height.
where $P_1$ is the air pressure at the reanalysis data height $h_1$; $P_2$ is the air pressure at the actual project site height $h_2$; and $k$ is a factor calculated as indicated in Equation (3).
Bias Correction or Downscaling (4), which corrects the bias of reanalysis data by adjusting them around a known mean [20].
The two correction parameters are obtained as follows:
where $\bar{v}$ is the known average speed for the project installation site, and $v_i$ corresponds to each wind speed value obtained from MERRA-2.
Additionally, it is necessary to decompose the wind speed into its Vx and Vy components, which SimSEE uses as the wind source for the actor that considers wind direction.
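The decomposition can be sketched as follows. This is only an illustration under an assumed convention: the direction is taken as the meteorological direction (the angle the wind blows from, measured clockwise from north), Vx points east, and Vy points north. The axis convention actually expected by the SimSEE Vxy actor is not stated here and should be checked before use; the function name is hypothetical.

```python
import math


def wind_components(speed, direction_deg):
    """Split a wind speed into (vx, vy) components in m/s.

    Assumption: direction_deg is the meteorological direction (where the wind
    comes from, clockwise from north); vx is eastward and vy is northward.
    """
    theta = math.radians(direction_deg)
    vx = -speed * math.sin(theta)
    vy = -speed * math.cos(theta)
    return vx, vy


# Hypothetical example: a 10 m/s wind blowing from the east moves westward,
# so vx is negative and vy is approximately zero.
vx, vy = wind_components(10.0, 90.0)
print(f"Vx = {vx:.2f} m/s, Vy = {vy:.2f} m/s")
```

Whatever convention is adopted, the magnitude recovered from the two components should equal the original speed, which is a quick consistency check on the converted series.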
With the corrected data, the next phase is to analyze them statistically. Two classifications are distinguished: hourly analysis and monthly analysis. For the hourly analysis, clusters were identified to group hours with similar behavior using a hierarchical clustering procedure, resulting in the dendrogram shown in Figure 3. The statistical measures calculated for each hourly cluster in the hourly analysis and for each month in the monthly analysis were the mean, median, standard deviation, kurtosis, skewness coefficient, and the shape and scale parameters of the Weibull distribution.
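As a rough illustration of how the Weibull parameters can be estimated for one cluster or month, the sketch below uses the empirical moment approximation (shape from the ratio of standard deviation to mean, scale from the mean and the gamma function). This is one widely used estimator, chosen here as an assumption; the fitting method actually used in the study is not specified. The sample values are invented for the example.

```python
import math
import statistics


def weibull_moment_fit(speeds):
    """Estimate Weibull (shape, scale) from the sample mean and std deviation.

    Assumption: empirical approximation shape = (std / mean) ** -1.086 and
    scale = mean / gamma(1 + 1 / shape).
    """
    mean = statistics.mean(speeds)
    std = statistics.stdev(speeds)
    if mean <= 0 or std <= 0:
        raise ValueError("speeds must have positive mean and spread")
    shape = (std / mean) ** -1.086
    scale = mean / math.gamma(1.0 + 1.0 / shape)
    return shape, scale


# Hypothetical corrected wind speeds (m/s) for a single hourly cluster.
sample_speeds = [4.2, 6.8, 7.5, 9.1, 5.3, 8.4, 10.2, 6.1, 7.9, 3.6]
shape_k, scale_c = weibull_moment_fit(sample_speeds)
print(f"Weibull shape k = {shape_k:.2f}, scale c = {scale_c:.2f} m/s")
```

By construction, the fitted distribution reproduces the sample mean, so comparing the remaining statistics (for example, the median) gives a first indication of fit quality.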
3.2. Modeling in SimSEE
To simulate the behavior of the wind farm, it is necessary to have a wind source. In this case, the source will be of the CEGH Synthesizer type, associated with a CEGH model generated from the historical data. The Serial Analysis tool is used to create the wind source, generating a .txt file with the wind CEGH model from the historical data. Once this file is obtained, it is loaded into the SimSEE Edit database, and the corresponding wind source is generated. Since the wind farm was modeled using two different actors, two sources had to be created, considering the input variables for each actor. First, the SimSEE wind farm actor allows for the incorporation of a source with two terminals: wind speed [m/s] and temperature [°C]. On the other hand, the SimSEE wind farm Vxy actor allows for the incorporation of a source with two terminals, Vx and Vy, which are the wind speed components in [m/s].
3.3. Definition of Case Studies
Considering the stages of operation scheduling, four case studies have been defined. For each case study, specific parameters will be considered, such as different time steps and time horizons, which allow each planning stage to be better represented.
3.3.1. Case 1: Daily Dispatch
Simulating the daily production of a wind farm can be useful for observing resource behavior throughout the day. Although intermittent energy plants do not participate in solving the economic dispatch problem, it is still valuable to estimate their expected production during peak and valley hours. The results were analyzed for the days with maximum and minimum resource availability in a year. This case should be simulated with an hourly time step, so the simulation was carried out in a simplified database where only the wind farm and a demand are modeled. The simulation time horizon was set to one year (2026), and one additional year was included in the optimization horizon.
3.3.2. Case 2: Weekly Dispatch
This type of dispatch is used for Unit Commitment. In this case, the wind farm’s production was analyzed over a week of high wind resource availability and another week of low availability. It was also developed in the simplified database. Additionally, daily simulation blocks were established. The duration of the posts was set according to the hourly clustering of the wind data. The same optimization and simulation horizons as in the previous case were used.
3.3.3. Case 3: Monthly Dispatch for One Year
For this case, the annual production was analyzed with an appropriate simulation time step. The results allow the identification of production periods throughout the year and estimate the annual energy output of the wind farm. Additionally, this type of study helps determine the fuel requirements for thermal power plants and establish the maintenance schedule for power system components. The simulations were carried out in the SimSEE database that contains the entire National Interconnected System (SNI). Weekly simulation steps were considered, where the hourly duration of each post was multiplied by 7 to represent the 7 days of the week. The same optimization and simulation horizons as in the previous cases were used.
3.3.4. Case 4: Long-Term Dispatch
Long-term dispatch was used to analyze the wind farm’s production over 25 years with an appropriate time step. This long-term planning helps determine the needs to satisfy demand and to implement system expansion plans. As in the previous case, this case was developed in the complete SNI database, where monthly simulation posts were established. For the time horizons, 25 years were considered for the simulation and 29 years for the optimization.
3.4. Simulation in SimSEE
Once the simulation parameters for each case study have been established, the simulation is run from the SimSEE Edit Simulator window. First, the optimization process is executed to obtain the Optimal Operation Policy. Then, the simulation is executed, which has a shorter duration than the previous process.
3.5. Probabilistic Analysis of Results
After obtaining the simulation results for each of the case studies, a specific treatment of the output data from each case is necessary.
3.5.1. Determination of Days and Weeks of Maximum and Minimum Energy Production
For case studies 1 and 2, it is necessary to determine, based on the results for a year, the corresponding days and weeks of highest and lowest energy production.
3.5.2. Obtaining Monthly Clusters
The hierarchical clustering procedure was used to determine two periods in the year: one of high wind resource availability and another of low availability. Considering that planning in Ecuador distinguishes between two semiannual periods in the year (rainy and dry), the aim was to establish something similar for the wind resource in the project installation area, based on the results of case study 4.
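A minimal sketch of how months can be grouped into two periods is given below. It implements a plain agglomerative procedure with average linkage on one value per month; the linkage criterion and the monthly values are assumptions made for illustration only and are not the simulation results of case study 4.

```python
def agglomerative_groups(values, n_clusters=2):
    """Merge the closest clusters (average linkage) until n_clusters remain.

    Returns lists of indices into `values`, one list per cluster.
    """
    clusters = [[i] for i in range(len(values))]

    def average_linkage(a, b):
        total = sum(abs(values[i] - values[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                dist = average_linkage(clusters[p], clusters[q])
                if best is None or dist < best[0]:
                    best = (dist, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]


# Hypothetical mean monthly energy (GWh), January to December.
monthly_energy = [12, 11, 10, 13, 22, 26, 28, 27, 23, 14, 12, 11]
month_names = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
groups = agglomerative_groups(monthly_energy, n_clusters=2)
for group in groups:
    print([month_names[i] for i in group])
```

Stopping the merging at two clusters plays the same role as choosing a cutoff line on the dendrogram.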
3.5.3. Exceedance Probability
The exceedance probability refers to the probability of exceeding a certain value over a given period [21], and it can be obtained using Equation (7).
$$P(X > x) = 1 - F_X(x) \tag{7}$$
where $P(X > x)$ is the probability that the random variable $X$ exceeds the value $x$, and $F_X(x)$ is the cumulative probability distribution function.
Thus, for each case study, the exceedance probability of the energy produced during each analyzed period was assessed, obtaining the adjusted curves based on the results provided by SimSEE.
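The exceedance calculation can be sketched empirically as follows, using the relation between exceedance probability and the cumulative distribution function. The sketch works directly on samples and does not reproduce the curve fitting applied to the SimSEE results; the function names and the daily energy values are hypothetical.

```python
def exceedance_probability(samples, x):
    """Empirical P(X > x) = 1 - F(x), with F(x) the share of samples <= x."""
    if not samples:
        raise ValueError("samples must not be empty")
    cdf = sum(1 for s in samples if s <= x) / len(samples)
    return 1.0 - cdf


def value_at_exceedance(samples, prob):
    """Value exceeded with probability `prob` (0.9 gives the P90 value).

    Uses linear interpolation between sorted samples.
    """
    if not samples or not 0.0 <= prob <= 1.0:
        raise ValueError("need samples and a probability in [0, 1]")
    ordered = sorted(samples)
    position = (1.0 - prob) * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] + weight * (ordered[upper] - ordered[lower])


# Hypothetical daily energy outcomes (MWh) across simulated scenarios.
daily_energy_mwh = [310.0, 455.0, 520.0, 610.0, 480.0, 390.0, 700.0, 560.0]
p50 = value_at_exceedance(daily_energy_mwh, 0.5)
p90 = value_at_exceedance(daily_energy_mwh, 0.9)
print(f"P50 = {p50:.1f} MWh, P90 = {p90:.1f} MWh")
```

The P90 value is never larger than the P50 value, since it must be exceeded in a greater share of the scenarios.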
4. Results
The statistical analysis of the historical wind data provided useful information about the behavior of this resource in the project installation area. In Figure 4a, the hourly behavior throughout the day is illustrated using the clusters obtained according to Figure 3, and the monthly behavior throughout the year is illustrated in Figure 4b. These plots allow for a better appreciation of the statistical measures than simply presenting the numerical results.
Therefore, the results of this analysis provide insights into wind behavior in the project area and allow researchers to assess whether the conditions are favorable for installing a wind power plant. This constitutes an important contribution to generation expansion planning in Ecuador.
Subsequently, the energy production results obtained through the SimSEE simulation are presented. First, case study 1 is presented, where the exceedance probability for the daily energy of the wind farm Vxy model is shown in Figure 5.
The values obtained from the original and fitted curves were extracted and are shown in Table 1, where the energy corresponding to the 50% and 90% exceedance probabilities for the days of maximum and minimum production, respectively, is provided.
These daily energy values provide power system operators with the estimated daily production limits for the wind plant, which should be considered in the daily dispatch.
With the results from case 4, hierarchical clustering was performed, whose dendrogram is illustrated in Figure 6. Since the results show two clear wind production periods per year, a suitable cutoff line was chosen, resulting in one cluster formed by the months of May to September and the other by the months of October to April.
The identified periods closely resemble those of the country’s hydrology. Therefore, maintenance planning for this wind plant should align with the maintenance schedules of hydroelectric plants. In addition, during these periods of maintenance, fuel for thermal plants should be secured to ensure a reliable energy supply.
Thus, for case 3, the annual production as well as the production for each of the monthly periods was analyzed. In Figure 7, the adjusted exceedance probability curves for these three periods are presented, both for the wind farm model and for the wind farm Vxy model.
In Figure 8, the expected production for the entire life of the plant is compared between the two SimSEE wind farm models.
With the results of the entire life of the plant, the expected annual energy, full load equivalent hours, and capacity factor, which are important parameters when evaluating power generation projects, can be calculated, as shown in Table 2.
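The relation between these three indicators can be illustrated with a short calculation. The sketch below uses the expected annual energy of 211.35 GWh reported in the conclusions and assumes, as a simplification, that the installed capacity of 52.8 MW is the reference; whether the study refers these indicators to installed or effective capacity is not stated, so the printed figures are indicative and are not the values of Table 2.

```python
HOURS_PER_YEAR = 8760


def full_load_hours(annual_energy_mwh, capacity_mw):
    """Equivalent hours per year at rated output."""
    return annual_energy_mwh / capacity_mw


def capacity_factor(annual_energy_mwh, capacity_mw):
    """Share of the theoretical maximum annual energy actually produced."""
    return full_load_hours(annual_energy_mwh, capacity_mw) / HOURS_PER_YEAR


annual_energy_mwh = 211.35 * 1000  # 211.35 GWh/year expressed in MWh
installed_capacity_mw = 52.8       # assumption: installed capacity as reference

flh = full_load_hours(annual_energy_mwh, installed_capacity_mw)
cf = capacity_factor(annual_energy_mwh, installed_capacity_mw)
print(f"Full load hours: {flh:.0f} h, capacity factor: {cf:.1%}")
```

Using the effective capacity of 44.81 MW instead would raise both indicators, so the reference capacity should always be reported alongside them.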
This information should be factored into system expansion plans to forecast future energy requirements, considering the estimated energy production from existing plants with a certain level of confidence.
5. Discussion and Conclusions
This paper provides a comprehensive solution for conducting energy impact studies of intermittent renewable energy production when measurement data are unavailable, using free software tools, which is a significant advantage that allows for broad application.
To this end, a methodology has been developed that allows for the estimation of expected renewable generation output considering the stochasticity of the primary resource, demonstrating good performance for the analyzed wind farm project. The main contribution is a methodology that integrates a free tool like SimSEE with an open-access meteorological database like MERRA-2 while ensuring sufficiently reliable data correction. This enables power forecasting for actual wind farm projects, since SimSEE might be accepted as a replacement for SDDP or other commercial software. It is important to mention that, although there are other novel proposals for stochastic wind farm power forecasting, such as the application of VARMA models or copulas, the proposed methodology focuses on well-known software, since the purpose is to define a procedure capable of being accepted for assessing actual renewable projects in Ecuador.
As a main result, it has been demonstrated that the connection of the Yanahurcu wind farm project will have a considerable impact on the Ecuadorian electric power system, since this plant will deliver 211.35 GWh/year, which equals the wind energy production in Ecuador in 2023; this means that through the implementation of this project, the current annual production from wind sources would be doubled.
The proposed methodology has been developed as part of the project PII-DEE-2023-04, “Analysis of steady and dynamic state microgrid operation considering the implementation of Battery Energy Storage Systems (BESS)”. Further research is under development for evaluating the performance of this methodology with other types of renewable sources.