1. Introduction
Numerical optimization has been widely applied during the aerodynamic design of High-Pressure Turbine (HPT) components. In this context, Machine Learning (ML) tools play a key role in identifying complex relations between input variables and output objectives [
1], providing fast predictions after an adequate training phase. Several studies proved that estimations from Computational Fluid Dynamics (CFD) can be used to collect reliable data for training surrogate models [
2,
3,
4,
5]. Recent approaches suggest faster solutions that merge the knowledge from an existing dataset of 2D simulations with high-fidelity 3D CFD simulations [
6], thus reducing the overall number of simulations required. Among the machine learning tools, Artificial Neural Networks (ANN) are usually preferred for their flexible architecture and wide versatility. In the aerodynamic optimization context, Mengistu and Ghaly [
7] used ANN as a surrogate model to optimize a transonic turbine vane and a subsonic compressor rotor in terms of adiabatic efficiency and pressure loss coefficient. Zhang et al. [
8] instead predict the airfoil lift coefficient with a convolutional neural network. Du et al. [
9] proposed a deep-learning tool for performance prediction and turbine blade profile design. This methodology uses a dual Convolutional Neural Network (CNN) trained over Reynolds-Averaged Navier–Stokes (RANS) simulations to predict the physical field distribution based on design variables and recognize aerodynamic performance parameters from physical field information. Random Forest (RF) models also represent a valid alternative to an ANN for regression problems, even if they are used more rarely. However, Dasari et al. [
10] proved that if RF undergoes proper hyperparameter tuning, it can be used as a surrogate model to support design space exploration. Hyperparameter optimization represents a fundamental step for both defining the ideal architecture of a metamodel and calibrating the metamodel’s parameters based on the specific problem that must be solved. Among the different techniques, Bergstra and Bengio [
11] demonstrated that random search can find models that are similar or even better than the one found by grid search, within a small fraction of the computational time. Once trained and tuned, surrogate machine learning models can strongly facilitate design optimization by reducing the necessary prediction time of the desired output. Thanks to this property, metamodels are often combined with evolutionary optimization algorithms such as genetic algorithms to identify the maximum or the minimum within the design space. Genetic algorithms are biologically inspired optimization methods that follow the principle of selection, crossover, and mutation to find the best individual starting from a large initial population. Thanks to these characteristics, they are widely recognized as excellent methods for hard optimization problems [
12]. This strategy is usually contrasted with gradient-based approaches (i.e., adjoint methods), which use deterministic methods and exploit the information of the gradient of the output parameter with respect to the input variables to produce a progressive increase in the objective function at each iteration of the process. Gradient-based approaches have the advantage of converging within a limited number of iterations but the optimization can easily stop in a local optimum if the initial solution is not chosen carefully [
13,
14]. Apart from optimization purposes, a properly trained machine learning model is able to provide fast and accurate predictions for any value of the input data within the training design range, thus avoiding the computational effort of a traditional numerical simulation used to predict the desired output.
Component interaction has always been a paramount topic in turbomachinery due to the coupling between a reactive module and the HPT stage [
15]. This is especially true in Rotating Detonation Engines (RDE), where the unsteady supersonic combustion products are entrained into the turbine vane, which is usually designed for a steady inflow. Therefore, particular interest should be dedicated to the aerodynamic optimization of HPT vanes for RDE) applications [
16], as they are particularly subjected to strong spatial and temporal variations of the inlet conditions. In this context, Liu et al. [
17] studied two axial turbine designs exposed to a pulsating inlet with an inlet Mach number of 0.3 and 0.6 through Unsteady Reynolds-Averaged Navier–Stokes (URANS) simulations. They concluded that the aerodynamic efficiency is significantly penalized for an inlet Mach number of 0.6 due to the local flow separations induced by the endwall diffusion. Moreover, for constant endwall shape the turbine results unstarted. As regards the total pressure damping, the
solution provides better attenuation. A further optimization step was presented by Grasa and Paniagua [
18], who parametrized a diffusive vane in the high-subsonic regime to reduce pressure loss and averaged pressure distortion. This analysis considers also three different inlet angles to account for the effect of the vane incidence. Later, Gallis et al. [
19] proposed a parametric optimization for both the diffusive endwalls and the airfoil including a flow control system through an array of cooling holes located upstream of the leading edge. The effect of the flow control system was to mitigate the oscillating inflow and to better guide the outflow for the subsequent rotor.
The current work proposes an efficient optimization process for designing an HPT vane that operates at using machine learning tools such as ANN and RF. The original blade and diffusive endwall profiles are first parametrized using splines and control points, then each parameter is varied within its design range to achieve new configurations. Subsequently, the machine learning tools are trained on a Design Of Experiments (DOE) composed of 885 samples selected through the Latin Hypercube Sampling (LHS) approach. Each sample is tested with a RANS simulation aimed at calculating the vane efficiency, the exit flow angle, and the inlet Mach number. Based on these outputs, the metamodel hyperparameters are tuned with a Random Search (RS) approach to optimize the prediction accuracy. The ANN and the RF are then coupled with a genetic algorithm that identifies the optimal solution. The effects of the optimized geometrical features on the internal flow field are also extensively discussed.
2. Parametric Design and Data Collection
The vane profile that is used as a baseline case for the optimization is the one described by Sieverding et al. [
20] and analyzed by Denos et al. [
21]. However, even though the original vane by Sieverding et al. [
20] was inserted into a straight annular channel, in the present work the endwalls are diffusive to allow for a higher inlet Mach number value of ≈0.6 with respect to the original working condition of ≈0.2. The nominal endwall configuration considered in this activity is the one located at the center of the design space described by Gallis et al. [
19]. The baseline numerical domain is represented in
Figure 1.
The lateral view of the coupled geometry with the endwall and the vane is visible in
Figure 2, while the nominal profile is visible from the top of the channel in
Figure 2b. Both figures also show the control points of the splines used to create both the vane and the diffusive endwall profiles. More specifically, black points are fixed, red points can only translate in one direction (either horizontally or vertically), and blue points have 2 degrees of freedom and can move both vertically and horizontally. For the sake of simplicity, the endwall profile is considered symmetrical with respect to the mid-span, and the area expansion is controlled by four control points whose positions are free to translate in the
Z-direction. The vane profile is deformed by four points on the suction side and three on the pressure side, for an overall number of 14 variables including the stagger angle. The blade profile is later closed using an elliptic curve at the leading edge and a circular arc at the trailing edge, taking care of maintaining a continuous derivative of the curves in the conjunction points. The above-mentioned parametrization was kept equal to the one presented by Gallis et al. [
19]. However, the number of samples in the DOE was increased from ≈300 to ≈900 to improve the design space coverage.
The sampling approach used to create the DOE is the Latin Hypercube Sampling (LHS). LHS was originally proposed by McKay et al. [
22] and is a nearly random statistical instrument used to generate samples in case of a multi-dimensional design space. It consists of dividing the range of variability of each design variable
into
N sub-intervals of equal marginal probability
, and randomly sampling once from each sub-interval. In the current work, the number of samples and the corresponding sub-intervals were selected to be equal to
to obtain an extensive coverage of the entire design space. Out of the 900 samples, 885 design points successfully completed both the automatic meshing process and the calculation, thus producing the overall dataset. The design space exploration for the endwall and the vane profile is represented in
Figure 3a,b. The aerodynamic performance of each sample is compared with the one obtained for the baseline geometry that is composed by the nominal airfoil [
20] and by the symmetrical diffusive endwall profile located at the center of the design space explored by Gallis et al. [
19].
3. Numerical Methodology
In the current work, the commercial solver ANSYS CFX
™ (2022R1) is used to run the RANS numerical analysis of the vane. The boundary conditions applied to the domain are summarized in
Table 1.
It is important to stress that all the simulations are performed at an equal inlet total pressure. As a consequence, the overall mass flow rate changes according to the variation of the minimum throat area. This approach was chosen in order to control the inlet Mach number with the mass flow rate since the objective of the study is to target
. CFD simulations are performed under steady-state conditions. The “high resolution” scheme is adopted for advection and turbulence discretization, while the k-
SST model is used for turbulence closure. The solver uses a coupled pressure-based approach, and the viscous work term is included in the calculation. The simulation of the baseline domain is performed on an unstructured mesh with ≈2,000,000 tetrahedral elements (
Figure 4a,b). Furthermore, a detailed view of the 20 inflation layers that are used to keep
in the wall regions is provided in
Figure 4c. The mesh size was selected after the evaluation of the Grid Convergence Index (GCI) [
23]. This analysis was conducted by considering the inlet mass flow rate, the inlet mass-weighted averaged Mach number, and the mass-weighted averaged total pressure at the outlet of the vane for three different levels of grid size. The refinement ratio between coarse (C), medium (M), and fine (F) mesh was kept equal to ≈1.25. The results in
Table 2 suggest that the medium refinement provides mesh-independent results with an asymptotic range of convergence
. Consequently, the medium level of mesh was adopted for all the CFD simulations presented in this research work. More details about the mesh sensitivity are described by Gallis et al. [
19].
The convergence of the simulation is controlled through the maximum number of iterations, which is set equal to 250. This approach ensures residual minimization in the order of . Additionally, solution reports are used to check the stabilization of the desired outputs during the simulation.
Concerning the numerical method, RANS-based simulations introduce the idea of Reynolds-averaging to split an instantaneous quantity into the sum of a time-averaged and a fluctuating term. The instantaneous velocity
is thus decomposed into a time-averaged component
and a time-varying component
. According to this idea, the governing Navier–Stokes equations can be reformulated as in Equation (
1) for continuity, Equation (
2) for momentum, and Equation (
3) for energy.
The
term in Equations (
2) and (
3) indicates the stress tensor and it is calculated according to Equation (
4), while
indicates the momentum sources and
is the density. The energy Equation (
3) is instead formulated considering the total enthalpy
, the energy sources are accounted for by the term
, and
indicates the thermal conductivity.
The Reynolds stresses
can be instead modeled thanks to the introduction of an appropriate turbulence model, that closes the Reynolds-averaged equations. More specifically, the
SST model by Menter [
24] introduces two transport equations for the turbulent kinetic energy
k (Equation (
5)) and for the specific turbulent dissipation rate
(Equation (
6)).
The term
in Equations (
5) and (
6) is the production rate of turbulence, while if buoyancy model is enabled,
and
are the turbulence buoyancy terms. Additionally,
,
,
,
, and
are constants whose values can be found in the ANSYS CFX
™ theory guide [
25]. The turbulent viscosity
is computed from Equation (
7) and contains the blending function
(Equation (
8)), where
y is the distance to the nearest wall,
is the kinematic viscosity,
S is an invariant measure of the strain rate, and
.
Once modeled,
can be finally combined with the Reynolds stresses term using Equation (
9).
Regarding the working fluid, air with an ideal gas hypothesis and constant specific heat
is assumed. The aerodynamic performance of the vane is estimated with the vane adiabatic efficiency
in Equation (
10), where
and
are the static and the total outlet pressure, while
is the inlet total pressure and
is the specific heat ratio.
The efficiency equation can also be expressed by referring to the outlet real velocity
and the outlet isentropic velocity
. The efficiency alone gives only an index of the aerodynamic performance of the vane, without caring about the operating conditions of the machine. To overcome this issue, the Root Squared Index
was introduced:
Equation (
11) combines the deviation of the actual measurement with respect to the reference values in terms of inlet Mach number
, efficiency, and exit flow yaw angle
. More specifically,
,
and
is the nominal exit metal angle of the vane. Each squared term in Equation (
11) was then multiplied by a weight coefficient
to ensure equal importance of each objective. Equation (
10) for
and Equation (
11) for
are considered objective functions for two independent optimization processes. In addition, the vane outlet conditions are estimated through the total pressure loss coefficient
, which is calculated with Equation (
12), where the outlet total and static pressure
and
are both measured at an axial distance equal to the
of chord
C downstream from the vane trailing edge.
Moreover, the Contraction Ratio (CR) is introduced during the post-processing step to quantify the variation of the minimum cross-sectional area and it is defined in Equation (
13), where
is the inlet area, which is fixed, and
is the vane throat area that is calculated from Equation (
14).
5. Optimization
The first step of the optimization process consists of training each machine learning model using the best hyperparameters identified during the random search process. This step was performed using a 70%–15%–15% split for train, validation, and test. Starting from the neural network models, the training history for the single and the combined objective function are reported in
Figure 5a and
Figure 5b, respectively. These graphs show the evolution of training and validation loss with training epochs. In both cases, the models do not suffer from the overfitting phenomenon as the validation loss does not deviate from the training one. Moreover, the
convergence is approached after approximately 1000 epochs.
More details about the generalization capability of the models can be appreciated by looking at the parity plots in
Figure 6a–f. The parity plot of each sub-dataset compares the predictions with the true values, and predictions that perfectly match the true values lie along the diagonal red line. Both the single and the combined objective function problems in
Figure 6a–c and in
Figure 6d–f, respectively, show a narrow distribution of the samples along the diagonal line. This results in a
in all sub-dataset of the model used to predict
and
for the model used to predict
. Furthermore, a very important aspect is the ability to accurately predict the results at the extremes of the parity plot, as these data points represent the area of greatest interest for identifying the minima or maxima of the Objective Function (
).
Similar considerations can also be derived for the RF case. As discussed before, despite the hyperparameter tuning, the optimal RF model does not achieve the same performance as the neural network. The validation and test parity plots above all (
Figure 7b,c) show a more sparse distribution of points along the diagonal line, and the
index drops from
to
when the model needs to generalize on data that was not used for training. This behavior confirms the cross-validated results reported in
Table 6 and adds useful information regarding the presence of overfitting in the RF model.
The lack of accuracy of the RF model can be likely attributable to insufficient data. To prove this idea, the model was trained for an increasing number of samples, and the evolution of the training and test
index shows a progressive improvement of the model generalization ability with increasing data size (
Figure 8). However, improvements for the numbers of samples greater than 600 are quite moderate and the model is not able to achieve the same performance as the ANN, given the same number of samples.
Once the optimal machine learning model was obtained, it was used to predict the vane geometry that maximizes the single and the combined objective functions. However, the optimization of
was conducted only for the ANN case, as the RF model proved to perform worse for this kind of problem. The optimization process uses a genetic algorithm implemented with the Python library PyGAD [
36]. The genetic algorithm starts with an initial population of 200 individuals that evolves through 350 consecutive generations. Each individual is represented by a chromosome whose length (number of genes) is proportional to the number of input variables. Performance is instead calculated via a fitness function and the most qualified individuals are selected for mating at each generation. The mating process consists of crossover and mutation. “Crossover” means that the chromosomes are split and swapped between the two parents, while during “mutation” some genes in the chromosome are replaced with random values to increase the disparity between individuals. Eventually, an elitism mechanism is employed to preserve the best 10 individuals of each generation. In the current application, the size of the mating pool was set equal to 120 parents, while
of the genes undergo mutation by replacement. The crossover strategy is instead based on a “single-point” approach. Each individual in the genetic algorithm evolution corresponds to a sample whose performance is predicted by the machine learning model. Moreover, the same GA parameters are considered for both the ANN and the RF to obtain their best predictions.
Figure 9a,b reproduces the objective function evolution with the generations, and the optima are identified with a corresponding case ID for simplicity.
OPT-1 and
OPT-2 represent the optimal geometry predicted, respectively, by ANN and RF for
, while
OPT-3 refers to the geometry optimized with ANN in terms of
. The plot underlines that most improvements are achieved within the first 100 generations. Subsequently, only slight enhancements occur until the end of the evolutionary process. The evolution of the objective function in
Figure 9a shows a limit of the RF model in exceeding
. This behavior can be confirmed by the lack of accuracy of the model shown in the parity plot (
Figure 7), as predictions in the upper extreme of the plot underestimate the corresponding true values.
The optimized diffusive endwall and vane profiles found by the GA are shown in
Figure 10a,b. This graphical comparison reveals that ANN and RF predict the same optimal geometry for maximizing
. Both
OPT-1 and
OPT-2 deviate significantly from the baseline geometry, especially at the leading edge of the vane and at the endwall. The trailing edge instead is almost identical to the original one. Minor differences emerge instead from the comparison of
OPT-1 and
OPT-2 with
OPT-3, visible only in the first portion of the suction side up to
and in the central zone of the pressure side.
The optimal geometries were then tested using RANS simulations to determine the reliability of the predicted objectives and to investigate the impact of these geometrical features on the physics of the problem. Results in terms of model predictions and CFD validation are summarized in
Table 7.
Regarding the vane efficiency, the prediction error of both OPT-1 and OPT-2 is close to , confirming the two models’ great prediction accuracy. CFD validation demonstrated that OPT-1 is the best solution for optimization. Given the similarities between OPT-1 and OPT-2, only the first one was examined in depth through post-processing to study the internal flow field. Both the solutions OPT-1 and OPT-3 reach the maximum result in terms of . Moreover, using a combined objective function does not penalize the overall stator efficiency, which is almost unchanged between OPT-1 and OPT-3.
6. CFD Results
In this section, the CFD simulations are post-processed to estimate the impact of the optimized geometries on the internal flow field. The helicity maps in
Figure 11a–f shows the intensity of the secondary flows induced by the fluid separation from the walls. A first Separation Bubble (SB) occurs as the endwall starts to diffuse. Subsequently, the impingement with the leading edge produces the horseshoe vortex, which is split into Pressure Side Horseshoe Leg (PSHL) and Suction Side Horseshoe Leg (SSHL). Due to the pressure difference between the pressure and the suction side, the PSHL is rapidly pushed toward the suction side of the adjacent airfoil.
In the baseline geometry (
Figure 11a,d), the faster endwall diffusion intensifies the magnitude of the PSHL. The latter subsequently merges with the horseshoe vortices creating strong PSHL that push the SSHL to the upper and lower corners of the suction side of the vane, as visible in
Figure 11a. The optimized geometries
OPT-1 and
OPT-3 in
Figure 11e,f behave similarly to each other, and in both cases, the PSHL almost disappears. This can be motivated by the smoother endwall diffusion and by the elongated vane profiles, which minimize the local variation of the cross-sectional area. As a consequence, the PSHL is attenuated and the SSHL results are bigger with respect to the baseline case.
Figure 11b,c shows that the secondary structures in the optimized geometries are closer to the extremes in the span-wise direction, while the mid-span is almost not perturbated by recirculation zones.
These observations find consistency in the analysis of the outlet conditions, calculated on a plane extracted at an axial distance equal to the
of
C downstream from the vane trailing edge. Here, the total pressure loss coefficient
in
Figure 12 is plotted along the span-wise position. For the baseline design, the upper and lower vortices merge at the mid-span position creating a local peak in pressure loss. The optimized geometries
OPT-1 and
OPT-2 are instead characterized by the presence of two independent vortices at
and ≈0.7. Moreover, the exit flow yaw angle
in
Figure 12b shows a similar trend. The maximum flow deviation from
occurs at the mid-span position
for the baseline geometry while coinciding for
OPT-1 and
OPT-2 with a peak of
located at the
of the span. The pitch angle
is also reported in
Figure 12c to provide insight into the intensity of the vertical component of the velocity, which can cause instability in the turbine. Fluctuations in
are always lower than
, and once again the optimized geometries present a smoother profile with smaller deviations. Finally, the
profile in
Figure 12d confirms that the outlet plane of the baseline geometry is strongly affected by secondary flows, which creates a local deceleration of the flow at the mid-span position. Both
OPT-1 and
OPT-3 are instead characterized by an almost uniform
profile.
The inlet Mach number was found to be proportional to the overall mass flow rate, which changes according to the variation in the minimum throat area. This aspect can be quantified by introducing the contraction ratio
, which is measured as the ratio between the inlet and the throat area. The baseline geometry in
Figure 13a ingests the inlet flow rate at
with a
. When
is considered within the objective function (
OPT-3 in
Figure 13c), the
reduces up to
and
. In this context, solution
OPT-1 (
Figure 13c) serves as a middle ground as it has
and
. The isentropic Mach number in
Figure 13d shows the occurrence of a shock at the
of the span and
of the axial chord for the baseline geometry, which is a source of aerodynamic losses. Moreover, the span-wise distribution of the load is not uniform. In the
OPT-1 case represented in
Figure 13e instead, the load is evenly distributed over the span, as well as occurs for
OPT-3 in
Figure 13e. The latter presents a stronger peak in terms of
at the
of the axial chord which coincides with the sudden curvature change in the suction side that produces a local over-acceleration. Eventually,
OPT-1 geometry in
Figure 13e shows a better load distribution and this is conformal with the higher vane efficiency achieved by this geometry.
The pressure and velocity contour maps extracted at the mid-span are represented in
Figure 14a–c and
Figure 15a–c. Both figures confirm the behavior showed by the Mach contours represented in
Figure 13a–c, as the inlet flow initially decelerates, thus producing a local increase in the static pressure and then accelerates for the presence of the airfoil.
Figure 15a–c represents a detailed view of the suction side in the
region, where the changes in the curvature affect the flow field behavior. In fact, the
OPT-3 solution (
Figure 15c) is influenced by the strong curvature variation occurring close to
Z/C = 1, which produces a local acceleration and a subsequent thickening of the boundary layer in the downstream region, where the pressure gradient becomes adverse as reported in
Figure 13f in the
region. This phenomenon is greatly mitigated in
OPT-1 (
Figure 15b), and almost absent in the baseline case (
Figure 15a).
7. Conclusions
The current framework focuses on the optimization of an HPT vane designed to ingest an inlet gas flow rate at . The vane and the diffusive endwall profiles were first parametrized with splines and then deformed by moving the position of the control points. This method was used to generate 885 different samples using an LHS approach that was tested through RANS CFD simulations. The aerodynamics of each sample was quantified with the vane efficiency. Furthermore, the Root Square Index was introduced to quantify the impact of the new geometry on , , and simultaneously. The overall dataset was then used to train an ANN and an RF model aimed at predicting and as objective functions. Models were first tuned with a random search approach and then coupled with a genetic algorithm to search for optimal solutions.
The ANN model proved superior to the RF in generalizing predictions from test data, achieving and for the prediction of and , respectively, against by RF. Moreover, the ANN model performance does not suffer from train-test split dependency, since cross-validated results are independent from the k-fold index. In the RF case, a significant dependency from data split can be noticed both during cross-validation and when the model is trained for increasing data size. Regarding the machine learning tools studied in this paper, it can be concluded that a properly tuned ANN does not encounter difficulties in managing the relation between the objective functions and the 18 input variables in their current range of variability. An increase in the design space or in the number of variables could be considered for a future study, in order to increase the complexity of the design and look elsewhere for a new optimum point. Since most of the computational effort of the present work is represented by data collection from CFD analysis, it is important to find a compromise between the predictive accuracy and the computational time. Analyzing the results from the ANN and the RF models, it is possible to conclude that the 885 samples are enough to achieve the desired level of prediction accuracy for the ANN model ( 0.98–0.99). Moreover, the index for the RF model is strongly affected by the dataset size for . For a slight increase in is still observed, so it is not suggested to use a restricted dataset.
The current optimal geometries predicted by the models were thus simulated through CFD to verify the reliability of the predictions and to study the physics of the problem. The results demonstrated the excellent accuracy of the models, with the predictions matching the simulations within an error of approximately for both ANN and RF. The major sources of aerodynamic losses can be identified in the generation of separation bubbles induced by the endwall diffusion angle, in the horseshoe structures created at the vane leading edge, and in the presence of shocks after the vane throat area. The optimized geometries in terms of present strong mitigation of aerodynamic losses were obtained through a smoother endwall diffusion and an elongated vane shape. The combination of these geometrical features avoids sudden cross-sectional variation in the streamwise direction, preventing the flow separation from the diffusive endwalls. Similar considerations can be also extended to OPT-3, as the position of the leading edge and the shape of the endwalls coincide with OPT-1. However, OPT-3 achieves by slightly increasing the vane throat area and thus the overall mass flow rate through the system.