1. Introduction
The use of wind energy has increased significantly since 2000, reaching 906 GW of installed capacity in 2022 [1]. As a result, electricity production from wind power has risen substantially and accounted for approximately 7.5% of global electricity production in 2022 [2]. According to data from the International Renewable Energy Agency (IRENA), global wind energy generation capacity has increased approximately 98 times over the last two decades [3]. In the case of onshore wind turbines, the installed capacity increased from 22 GW in 2001 to 842 GW in 2022. Meanwhile, the installed capacity of offshore wind farms increased from zero in 2002 to 64 GW in 2022 [1].
Newly installed wind turbines are mainly three-blade horizontal-axis wind turbines (HAWTs), with a peak efficiency of approximately 50% [4]. In the coming years, a very rapid increase in new capacity installed in both onshore and offshore wind farms is expected [1,5].
Despite these optimistic forecasts of an increase in the number of new installations, wind farms are performing worse than producers expected. Results from existing wind farms indicate that capacity factors are often overestimated by 10% to 30%. The average realized value for Europe in 2004–2009 was less than 21%, which reduced expected profits by over 60% and resulted in a 40% lower than expected reduction in CO₂ emissions [6].
One of the key reasons for this is the underestimation of the deterioration of the aerodynamic properties of turbine blades over time, caused by changes in their roughness due to erosion and contamination with foreign bodies, peeling of the coating, and icing, as well as the wind energy deficit in the wake behind the wind turbines.
The study in [7] showed that wind turbine blades can accumulate significant leading edge roughness, which adversely affects aerodynamic parameters. The results showed that increasing roughness decreased lift and increased drag force. Similarly, the work in [8] showed that the periodic change in the roughness of the airfoils was caused by the accumulation of dead insects on the leading edge, which reduced electricity production by as much as 25%, despite good wind conditions. This drop was attributed to accelerated transition and separation of boundary layers compared to clean airfoils.
In [9,10], field experiments were carried out to investigate the accumulation of ice on 50 m long turbine blades and the resulting power losses of multi-megawatt wind turbines. It was shown that, despite strong winds, iced wind turbines rotated much more slowly and even shut down frequently. As a consequence, icing caused power losses of up to 80%.
The review paper in [11] summarized and comprehensively analyzed typical types of damage to wind turbine blades, such as trailing edge cracking, lightning strike, leading edge corrosion contamination, icing, and delamination, as well as the mechanisms of their formation. That review can serve as a guide to analyzing changes in the shape and roughness of aerodynamic profiles during the operation of wind turbines. Based on the information it provided, in [12] the shapes of iced and eroded airfoil profiles were recreated and used to estimate their impact on the generation of aerodynamic forces. It was shown that in the case of ice-covered, dirty, and eroded turbine blades, the power production can decrease by approximately 50%, 16%, and 12.5%, respectively.
The above literature review draws attention to a very important aspect related to permanent or temporary change in the shape of profiles during the use of a wind turbine and the need to take into account their impact on the generation of electricity.
In this context, it is still important to study the aerodynamic properties of the airfoils that make up turbine blades, especially in the context of reducing the impact of roughness changes on their operation, as well as the possibility of developing new aerodynamic shapes with specific properties. Moreover, this may also be a very interesting and important research topic regarding the use of artificial intelligence algorithms.
In recent years, many articles have been published on the use of artificial intelligence in research on turbines and wind farms, especially regarding the generation of aerodynamic forces by airfoils. The authors of these works have used various machine learning methods, such as linear regression, support vector regression, radial basis functions, K-nearest neighbors, decision trees, gradient boosting trees, random forest, AdaBoost, multi-gene genetic programming, and neural networks. In most cases, so many methods were used in order to check their efficiency and suitability for specific research problems related to the performance of aerodynamic shapes.
In the research presented in [13], automatic programming was investigated as a means of predicting the aerodynamic coefficients and power efficiency of the AH 93-W-145 wind turbine blade at different Reynolds numbers and angles of attack. The results showed that the multi-gene genetic programming (GP) method achieved the highest accuracy. Meanwhile, the article [14] presented an intensive comparative study of the approximation performance of three artificial intelligence methods: an artificial neural network, radial basis function, and support vector regression. Finally, the support vector regression model was selected and combined with the Monte Carlo simulation method for uncertainty analysis of a wind turbine airfoil. It was shown that this algorithm could capture the uncertainty propagation from the surface roughness to the airfoil aerodynamic performance.
In the work in [15], the Elman neural network method was used to estimate the aerodynamic coefficients of a delta wing. Meanwhile, in the research reported in the article [16], a neural network was used to predict the aerodynamic forces acting on a NACA 2415 airfoil. The results showed a good estimate of the lift force and a slightly poorer estimate of the drag force.
Especially important for the current study is the investigation presented in the paper [17], where five machine learning algorithms were studied for lift and drag force prediction: random forest, gradient boosting regression, decision tree regressor, AdaBoost algorithm, and linear regression. It was shown that the random forest method was superior in its predictive performance. Similarly, the article in [18] examined a Darrieus-type wind turbine with standard and modified S1046 airfoils and found that, of four different machine learning models (decision tree, linear regression, K-nearest neighbors, and random forest), the last performed the best.
The above literature review shows that the prediction of airfoil aerodynamic properties is most promising when performed using random forest regression and that in most cases a graphical representation of airfoil shapes was used as input. A possible alternative to the graphical representation of an airfoil may be the use of dedicated numerical parameterization, which unambiguously describes its shape and can be applied to a wide range of airfoils of various types. One example of such parameterization is PARSEC, which defines 11 parameters describing the shape of an aerodynamic profile [19,20].
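To make the idea of a purely numerical airfoil description concrete, the sketch below stores the 11 PARSEC parameters in a small Python record and flattens them into a fixed-order feature vector of the kind a regression model could take as input. The field names follow the naming commonly used in the PARSEC literature and are an assumption here, as are the numerical values, which are arbitrary placeholders rather than data from this study.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class ParsecParams:
    """Eleven PARSEC shape parameters (field names are illustrative assumptions)."""
    r_le: float      # leading-edge radius
    x_up: float      # chordwise position of the upper crest
    z_up: float      # height of the upper crest
    zxx_up: float    # curvature at the upper crest
    x_lo: float      # chordwise position of the lower crest
    z_lo: float      # height of the lower crest
    zxx_lo: float    # curvature at the lower crest
    z_te: float      # trailing-edge vertical position
    dz_te: float     # trailing-edge thickness
    alpha_te: float  # trailing-edge direction angle
    beta_te: float   # trailing-edge wedge angle

    def as_features(self) -> list[float]:
        """Return the parameters as a fixed-order feature vector."""
        return list(astuple(self))


# Arbitrary placeholder values, not taken from the paper's dataset.
example = ParsecParams(0.01, 0.3, 0.06, -0.45, 0.3, -0.06, 0.45, 0.0, 0.0, 0.0, 0.3)
features = example.as_features()
print(len(features))
```

Keeping the order of the fields fixed matters, because a regression model treats each position of the vector as one input feature.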
The main novelty of the current research is the use of PARSEC parameterization together with the random forest regression algorithm to determine the ability to predict the lift force of the aerodynamic shape described by this parameterization. Although the research focused only on the lift force for a given angle of attack, this indicates a potential direction in developing a methodology for proposing new aerodynamic shapes with the desired aerodynamic characteristics.
3. Results and Discussion
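The random forest regressor (RFR) method solves regression tasks by minimizing the mean squared error (MSE), formulated as

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2, \tag{2}$$

where $N$ is the number of data points, $\hat{y}_i$ is the value of $C_L$ predicted by the RFR algorithm, and $y_i$ is the actual value of $C_L$ (calculated by the XFOIL software).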
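The role of the mean squared error can be illustrated with a minimal sketch in plain Python. The first function evaluates the error for a set of predictions; the second shows, under the assumption that each tree node chooses its split by squared error, how a single threshold on one feature could be selected. The function names and the toy values are hypothetical, and a library implementation would normally be used in practice.

```python
def mse(predicted, actual):
    """Mean squared error between predicted and reference values."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def best_split(feature, target):
    """Threshold on one feature that minimizes the summed squared error
    of the two resulting groups around their own means."""
    def sse(values):
        m = sum(values) / len(values)
        return sum((v - m) ** 2 for v in values)

    pairs = sorted(zip(feature, target))
    best_threshold, best_cost = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold separates identical feature values
        cost = sse([t for _, t in pairs[:i]]) + sse([t for _, t in pairs[i:]])
        if cost < best_cost:
            best_threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_cost = cost
    return best_threshold, best_cost


# Toy values for illustration only (not XFOIL results).
print(mse([1.0, 2.0], [1.0, 4.0]))
threshold, cost = best_split([0.1, 0.2, 0.8, 0.9], [1.0, 1.0, 2.0, 2.0])
print(threshold, cost)
```

A random forest repeats this kind of split search over many features and many resampled trees, and averages the predictions of the trees.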
In order to check the predictive ability of the used RFR model, the considered dataset was divided into two subsets, the first for training purposes, the second for testing. The training set had 60% and the test set had 40% of the records of the base dataset.
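A random 60/40 division of this kind can be sketched as follows. The helper name `split_indices` and the seed are assumptions made for illustration; the record count of 688 is the dataset size stated in the Conclusions.

```python
import random


def split_indices(n_records, train_fraction=0.6, seed=0):
    """Randomly partition record indices into training and test subsets."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = round(n_records * train_fraction)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_indices(688)
print(len(train_idx), len(test_idx))
```

Changing the seed gives a different random assignment of airfoils to the two subsets while keeping the same proportions.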
Table 3 presents sample results of the comparison of the actual values and the values predicted by the RFR algorithm of the lift coefficient $C_L$ of airfoils described by the PARSEC parameters presented in Table 2. Randomly selected sample results from Table 3 indicate that the method used has the potential to produce satisfactory outcomes. The relative error, marked in the table as Rel Error, varied in a range from 0.0001 to 0.0009, while the absolute error, marked as Abs Error, varied in a range from 0.0002 to 0.0011. Both of these readings were at a satisfactory level.
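For reference, the two error measures can be computed as in the sketch below. The definitions used here (absolute difference, and absolute difference divided by the magnitude of the actual value) are the conventional ones and are assumed rather than taken from the paper; the numerical values are hypothetical.

```python
def abs_error(predicted, actual):
    """Absolute difference between the predicted and actual value."""
    return abs(predicted - actual)


def rel_error(predicted, actual):
    """Absolute difference scaled by the magnitude of the actual value."""
    if actual == 0:
        raise ZeroDivisionError("relative error is undefined for a zero actual value")
    return abs(predicted - actual) / abs(actual)


# Hypothetical lift coefficients, not values from Table 3.
actual_cl, predicted_cl = 1.2000, 1.2011
print(abs_error(predicted_cl, actual_cl), rel_error(predicted_cl, actual_cl))
```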
In this work, the RFR algorithm was run for five different cases, with airfoils randomly assigned to the training and testing datasets in proportions of 60% and 40%, respectively. The prediction results of the RFR model for each case are presented in Figure 5, Figure 6, Figure 7, Figure 8 and Figure 9. In these figures, the results are arranged in ascending order according to the values of the lift coefficients for the considered airfoils. Attention should be paid to the significant range of lift force coefficients, from approximately 0.6 to almost 2.2, and the non-linear character of the curve. This means that the profiles considered in the study were characterized by a considerably large variety of shapes and, consequently, of intended uses. It can be seen that the prediction algorithm performed best in the linear part of the curve and worst in the area of low $C_L$ values, which is particularly visible in Figure 5.
Summarizing the results from Figure 5, Figure 6, Figure 7, Figure 8 and Figure 9, it can be seen that the choice of aerodynamic profiles for the training set had a certain impact on the quality of the results obtained. For example, comparing Figure 6 and Figure 7 with Figure 5 and Figure 8, one can notice a difference in the quality of predicting the $C_L$ value for large values of this coefficient. This may mean that, in order to reduce the sensitivity of the model to the selection of elements of the training set, the number of airfoils in the training set should be further increased.
In order to check the predictive power of the model, the coefficient of determination, $R^2$, was defined as

$$R^2 = 1 - \frac{\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2}, \tag{3}$$

where $\bar{y}$ is the arithmetic mean of the actual values of $C_L$ (for explanation of the other symbols see Equation (2)).
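A minimal sketch of the coefficient of determination is given below. It assumes the standard definition, in which the residual sum of squares is compared with the spread of the reference values around their own mean; the function name and the toy values are hypothetical.

```python
def r_squared(actual, predicted):
    """Coefficient of determination for paired reference and predicted values."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need at least two paired values")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("undefined when all reference values are equal")
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only.
print(r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # perfect prediction
print(r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))  # always predicts the mean
```

A value of 1 corresponds to a perfect prediction, while a model that always returns the mean of the reference values scores 0.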
The values of $R^2$ calculated using Equation (3) for each considered case are presented in Table 4. It can be seen that the value of the $R^2$ coefficient was approximately 0.85. The obtained $R^2$ values were comparable to the values reported in [17], which ranged from 0.863 to 0.944, depending on the ratio of the size of the training set and the test set (a train/test ratio ranging from 0.2 to 0.8 was tested). It should be added that, in the work [17], the aerodynamic airfoils were represented graphically and not as a set of parameters as in the current work, and that the train/test ratio in the current work was 1.5.
On this basis, it can be concluded that the RFR machine learning method coped satisfactorily with predicting the lift force, even though the training set consisted of airfoils with very different geometries, each represented by 11 independent PARSEC parameters.
It should be emphasized that the methodology used was relatively simple, and yet it produced promising results.
Figure 10 shows that about 60% of predictions had a relative error below 5%, and more than 85% had a relative error below 10%. In Figure 5, Figure 6, Figure 7, Figure 8 and Figure 9, it can be seen that the results with the worst predictability occurred quite sporadically and were randomly distributed in the middle part of the presented curves. As mentioned before, the few cases with higher relative errors may have resulted from the large diversity of the geometries of the considered airfoils, which, despite a relatively large dataset, could have led to a situation where some characteristic geometric features were not represented a sufficient number of times. With this in mind, it can be concluded that the dataset should be selected so that each PARSEC parameter is represented a sufficient number of times over a sufficiently wide range.
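The shares of predictions falling below a given relative error, such as those reported for Figure 10, can be obtained with a short calculation. The sketch below uses a hypothetical helper name and a hypothetical list of relative errors, not the errors from this study.

```python
def share_below(rel_errors, threshold):
    """Fraction of predictions whose relative error is below the threshold."""
    if not rel_errors:
        raise ValueError("rel_errors must not be empty")
    return sum(1 for e in rel_errors if e < threshold) / len(rel_errors)


# Hypothetical relative errors, for illustration only.
errors = [0.01, 0.03, 0.04, 0.07, 0.12]
print(share_below(errors, 0.05), share_below(errors, 0.10))
```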
To find which of the PARSEC parameters could potentially have the greatest impact on the quality of prediction, selected statistics from 5% of profiles with the largest relative prediction error and 5% of profiles with the smallest prediction error were compared. These values of the statistics are listed in Table 5 and Table 6, respectively.
Table 7 shows the relative comparison of the selected statistics of the 5% worst and 5% best predictions, calculated according to the following formula:

$$\delta_s = \left|\frac{p_s^{w} - p_s^{b}}{p_s^{b}}\right|, \qquad s \in \{\min, \max, \mathrm{median}, \mathrm{mean}\}, \tag{4}$$

where $p_s^{w}$ is the value of an individual PARSEC parameter from Table 5 for a given statistic $s$ (min, max, median, or mean), and $p_s^{b}$ is the value of the corresponding PARSEC parameter from Table 6. Note that Equation (4) is in the form of a relative error formula, so the largest values can be interpreted as corresponding to the greatest impact of a given PARSEC parameter on the prediction accuracy. A particularly influential PARSEC parameter for the prediction quality is one for which all the values of $\delta_s$ are simultaneously relatively large. One such parameter is the one responsible for the thickness of the trailing edge of the airfoil; see Figure 3.
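The comparison described above can be sketched in a few steps: sort the test records by relative error, take the 5% tails, compute the four statistics of one parameter in each tail, and form a relative difference. The helper names, the record layout, and the choice of the best-prediction group as the reference in the denominator are assumptions made for illustration; the records are synthetic.

```python
from statistics import mean, median

STATS = {"min": min, "max": max, "median": median, "mean": mean}


def tail_groups(records, fraction=0.05):
    """Split (rel_error, params) records into worst and best tails."""
    ordered = sorted(records, key=lambda r: r[0])
    k = max(1, round(len(ordered) * fraction))
    return ordered[-k:], ordered[:k]


def group_stats(group, name):
    """Min, max, median and mean of one parameter within a group."""
    values = [params[name] for _, params in group]
    return {s: f(values) for s, f in STATS.items()}


def relative_comparison(worst_stats, best_stats):
    """Relative difference of each statistic, with the best group as reference."""
    result = {}
    for s in STATS:
        ref = best_stats[s]
        # Undefined when the reference value is zero.
        result[s] = abs(worst_stats[s] - ref) / abs(ref) if ref != 0 else float("nan")
    return result


# Synthetic records: (relative error, {parameter name: value}).
records = [(i / 100, {"p": float(i + 1)}) for i in range(40)]
worst, best = tail_groups(records)
comparison = relative_comparison(group_stats(worst, "p"), group_stats(best, "p"))
print(comparison)
```

Repeating the last step for each of the 11 parameters would give one row of relative differences per parameter, which can then be scanned for rows that are uniformly large or uniformly small.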
On the contrary, the PARSEC parameters having the least impact on the quality of prediction by the RFR algorithm are those for which all values of the relative comparison from Equation (4) are simultaneously relatively small. Analyzing the data collected in Table 7, such parameters include the two pairs of parameters that correspond to the location of the curvature of the lower and upper crest of the airfoil; see Figure 3.
4. Conclusions
This work used a machine learning algorithm based on the random forest regression method to predict the lift force of aerodynamic airfoils. In the study, 688 airfoils belonging to different families and characterized by very diverse geometries were used as a dataset. The PARSEC parameterization was used to describe the shape of the airfoils, resulting in 11 independent parameters that were used as input to the machine learning algorithm.
A series of calculations were performed for the selected Reynolds number and angle of attack. Despite very diverse geometries, the prediction results turned out to be satisfactory, giving $R^2$ values in the range between 0.83 and 0.87. This means that the random forest regression method can be used for these types of calculations and predictions.
It can be concluded that the presented research could be successfully extended to a wide range of Reynolds numbers and angles of attack, and used to develop a methodology based on the reverse engineering principle, in which one could potentially search for a wind turbine or a wing airfoil shape that gives specific aerodynamic properties.
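One simple way such a search could be organized is sketched below: candidate parameter vectors are sampled within given bounds, and the one whose predicted lift is closest to a target is kept. This is only an illustration of the idea, not a method used in this work; `predict` stands in for a trained model, and the toy predictor, bounds, and target are hypothetical.

```python
import random


def inverse_search(predict, bounds, target, n_samples=2000, seed=0):
    """Random search for a parameter vector whose prediction is closest to target."""
    rng = random.Random(seed)
    best_params, best_gap = None, float("inf")
    for _ in range(n_samples):
        candidate = [rng.uniform(lo, hi) for lo, hi in bounds]
        gap = abs(predict(candidate) - target)
        if gap < best_gap:
            best_params, best_gap = candidate, gap
    return best_params, best_gap


def toy_predict(params):
    """Toy stand-in for a trained surrogate model (not an aerodynamic model)."""
    return params[0] + params[1]


bounds = [(0.0, 1.0), (0.0, 1.0)]
params, gap = inverse_search(toy_predict, bounds, target=1.0)
print(params, gap)
```

In a real application, the bounds would need to be restricted to parameter ranges covered by the training data, since a regression model of this kind cannot be expected to extrapolate reliably beyond them.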
The future direction of research related to the presented methodology should include calculations for a wide range of angles of attack (including those above aerodynamic stall), both for lift and drag forces, and for a range of Reynolds numbers taking into account the flow regime of the planned application.
Possible applications of the developed method could concern both the optimization of the aerodynamic profiles of wind turbines and the profiles of wings or propellers of flying vehicles. In the case of wind turbines, care should be taken to ensure that the set of profiles contains a broad representation of the families of aerodynamic profiles used for wind turbine blades.
The main limitation of the presented research is that the operation of the random forest regression algorithm was demonstrated only for one selected Reynolds number and angle of attack. Therefore, there is no guarantee that the algorithm would provide consistent and appropriate results if the training set were expanded to include a wide range of Reynolds numbers and angles of attack. Nevertheless, in another work [17], where a graphical representation of profiles was used, which seems more challenging than using PARSEC parameters, satisfactory predictions were obtained for a wide range of angles of attack. This can be treated as a good prognosis for the methodology presented in the current research.
Another limitation is related more to the use of PARSEC parameterization, as there is no guarantee that all possible airfoil shapes, both existing and proposed, can be described using PARSEC.