1. Introduction
Numerical simulations are one of the backbones of modern-day aerospace engineering, spanning structural and aerodynamic analyses. The level of accuracy that can be achieved, often referred to as the level of fidelity, and its affordability have increased significantly over the years thanks to remarkable advances in software and hardware capabilities. For example, in the field of computational fluid dynamics (CFD), Reynolds-averaged Navier-Stokes (RANS) or even large eddy simulations are nowadays feasible even for complex configurations, whereas two decades ago, only Euler simulations were affordable for the same geometrical complexity. Hence, engineers frequently encounter situations where data from different fidelity levels are available. Traditionally, one would replace lower-fidelity data with higher-fidelity, more accurate solutions as they become available. Alternatively, efforts may be directed towards improving lower-fidelity solutions through a correction scheme; a notable example is enhancing panel methods with CFD solutions for more accurate flutter predictions. Another option is to leverage multi-fidelity or variable-fidelity techniques to create continuous models that contain as much information as possible.
Multi-fidelity approaches describe methods that facilitate the integration of scalar quantities from different fidelity levels, such as lift coefficient values for varying angles of attack obtained from different numerical simulations or even experimental measurements. Several methods have been proposed for this task, including bridge functions [1,2], co-Kriging [3,4,5,6,7], and hierarchical Kriging [8]. Even though all such models enable combining data from different levels of fidelity, their underlying formulations vary. Bridge functions require preprocessing the data to compute deltas between the fidelity levels, which are subsequently described by the variable-fidelity model; these differences can be accounted for through additive or multiplicative terms. Co-Kriging models take a slightly different path and can be viewed as a general extension of single-fidelity Kriging to a method that is assisted by auxiliary variables or secondary information, typically employing an autoregressive formulation [4]. Finally, hierarchical Kriging models use lower-fidelity data models as trend functions for the next higher level [8]. Lately, neural networks have also been employed to combine data from different fidelity levels [9,10]. Each modeling approach offers the capability to combine data from different fidelity levels, with only minor advantages and disadvantages relative to one another. Hence, there is no clearly superior candidate, and the choice of a specific modeling type depends more on availability and the individual use case at hand.
Especially for tasks that require a significant number of function evaluations, e.g., optimization or uncertainty quantification (UQ), relying on the highest available fidelity often remains unfeasible. Different approaches have been proposed and actively employed to tackle such multi-query scenarios. The simplest one is to reduce the fidelity level that is investigated, i.e., to simplify the governing equations that are solved or to reduce the spatial and temporal resolutions of the simulation until the scenario becomes feasible. A more sophisticated approach uses surrogate models that treat the numerical simulation as a black box and emulate the input-to-output relation within certain bounds. Such models can then be employed during multi-query scenarios instead of the numerical simulation itself and lead to so-called surrogate-based or surrogate-assisted approaches, which are the de facto state of the art for the optimization of expensive black-box functions [3] as well as for uncertainty propagation [11]. To ensure that models meet the accuracy requirements, it is common to iteratively refine them through adaptive sampling, sometimes also labeled infill or active learning. Various criteria are available, and the interested reader is referred to the literature for a more in-depth introduction [12,13,14]. Accounting for all fidelity levels at hand during the adaptive sampling stage should reduce the overall computational cost necessary to perform a given investigation, as the available information is used efficiently.
Even though both aforementioned fields, variable-fidelity modeling as well as surrogate-based techniques, are well established, the intersection between them has only recently become a research focus. Nevertheless, a few approaches have already been proposed. A straightforward approach is to collect data from all fidelity levels during the design-of-experiment (DoE) phase while restricting the adaptive sampling stage to only the highest available fidelity level [3,15]. In the last decade, some multi-fidelity sampling techniques have been proposed in the context of surrogate-based optimization [16,17,18,19]. The methods proposed by Huang [16] and later extended by Di Fiore [18] are based on a multiplicative acquisition function, where the expected improvement metric [20] is multiplied by other factors in order to take into consideration the evaluation cost and the accuracy of the different fidelity levels. Shu [17] introduces an alternative aggregate acquisition function; here, the acquisition function is divided into two components based on whether the highest or lowest fidelity level is being considered, and the evaluation cost is taken into account as a multiplicative factor. Foumani [19] proposes a similar approach, wherein the two components of the acquisition function serve two distinct objectives: exploration and exploitation. Specifically, the exploration and exploitation parts are active when the low or the high fidelity level is under consideration, respectively. All these multi-fidelity sampling techniques identify the next infill sample and fidelity level by maximizing their acquisition function through the solution of a single-objective mixed-integer optimization problem. In particular, all the aggregate acquisition functions (whether additive or multiplicative) are characterized by a strong coupling between the effectiveness of the infill point and the accuracy and evaluation cost of the different fidelity levels. As a result, the algorithm may persistently sample at lower fidelity levels, especially in scenarios where a fidelity level is significantly less expensive than its higher-fidelity counterpart at the price of only a limited reduction in prediction accuracy. The two-step multi-fidelity sampling criterion presented in this manuscript is expected to address this limitation by decoupling the identification of the next infill point from the selection of the fidelity level.
In this paper, we introduce a novel multi-fidelity infill criterion that can be combined with any single-fidelity criterion, either for optimization or uncertainty quantification tasks. In particular, the selection of the fidelity level acts as a second step after the next sampling location has been computed, and it relies on the Jensen–Shannon divergence. The efficacy of the proposed multi-fidelity sampling technique is successfully demonstrated on an aerodynamics application, addressing both optimization and uncertainty quantification challenges. Specifically, the application problem uses four aerodynamic solvers of increasing fidelity: a panel code, an Euler solver, and a RANS solver applied to two computational meshes of different sizes. The results show how the multi-fidelity sampling technique achieves results comparable to those of its single-fidelity counterpart, albeit at a significantly reduced computational cost.
The manuscript is organized as follows: the two-step sampling methodology and its integration into optimization and uncertainty quantification schemes are described in Section 2. Section 3 presents the analysis and results of applying the resulting algorithms to the aerodynamics problem. Finally, the conclusion section, Section 4, summarizes the key findings and outlines potential future extensions of the presented algorithm.
2. Methodology
All surrogate-based techniques share the same iterative architecture, which can be summarized in three phases: fit the surrogate models based on the available data, identify the next sample (or samples in the case of batch-sampling techniques) via an infill criterion, and evaluate the resulting design by means of the selected analysis tools, as shown in Figure 1. The process is repeated until a prescribed convergence criterion is reached or the whole computational budget is consumed. Even though there might be some differences in the implementation of the iterative algorithm, the choice and definition of the infill criterion are the main distinctive characteristics of each surrogate-based technique [3]. In particular, infill criteria specifically designed for multi-fidelity sampling determine not only the next infill sample but also the fidelity level to consider for the design evaluation.
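As a schematic illustration only, the iterative loop of Figure 1 can be condensed into a few lines of Python; the callables fit_model, infill_criterion, and evaluate_design are placeholders introduced here for clarity and do not correspond to any specific toolbox interface.

```python
# Schematic of the generic surrogate-based loop (cf. Figure 1). The callables
# stand for the user's modeling toolbox, infill criterion, and analysis tool.
def surrogate_based_loop(x_doe, y_doe, fit_model, infill_criterion,
                         evaluate_design, budget):
    X, y = list(x_doe), list(y_doe)
    for _ in range(budget):
        model = fit_model(X, y)            # 1. fit the surrogate on all data
        x_new = infill_criterion(model)    # 2. identify the next sample
        y_new = evaluate_design(x_new)     # 3. run the selected analysis tool
        X.append(x_new)
        y.append(y_new)
    return fit_model(X, y), X, y
```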
Consider a generic multi-fidelity sampling problem with a set of $L$ fidelity levels $l \in \{0, 1, \dots, L-1\}$ (with $l = 0$ and $l = L-1$ as the highest and lowest fidelity levels, respectively). At a generic iteration $i$, the identification of the infill sample location and fidelity level for iteration $i+1$ can be formalized as:

$$\mathbf{x}_{i+1},\; l_{i+1} = \underset{\mathbf{x},\, l}{\arg\max} \; A(\mathbf{x}, l) \qquad (1a)$$

$$\text{subject to} \quad \mathbf{x}_{lb} \le \mathbf{x} \le \mathbf{x}_{ub} \qquad (1b)$$

$$l \in \{0, 1, \dots, L-1\} \qquad (1c)$$

where $\mathbf{x}$ is the vector of the design variables, $A$ is the acquisition function of the selected infill criterion, and $\mathbf{x}_{lb}$ and $\mathbf{x}_{ub}$ are the lower and upper bounds, respectively, that define the design space. Given that the fidelity level is represented by the integer variable $l$, Equations (1a)–(1c) constitute a mixed-integer optimization problem (albeit a simple one), and it requires specific techniques to be solved efficiently.
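For intuition only, one straightforward (though potentially expensive) way of handling the integer variable is to enumerate the fidelity levels and solve a continuous sub-problem for each; the sketch below assumes the acquisition function is to be maximized, and all names are illustrative rather than part of the proposed method.

```python
# Brute-force treatment of Equations (1a)-(1c): enumerate the levels and
# solve a continuous maximization of A(x, l) for each fixed l.
from scipy.optimize import differential_evolution

def solve_mixed_integer_infill(acquisition, n_levels, bounds):
    candidates = []
    for level in range(n_levels):
        # Continuous sub-problem for a fixed fidelity level (maximize A).
        res = differential_evolution(lambda x: -acquisition(x, level), bounds)
        candidates.append((res.fun, res.x, level))
    # Lowest negated value corresponds to the largest acquisition value.
    best = min(candidates, key=lambda c: c[0])
    return best[1], best[2]   # next sample location and fidelity level
```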
Some important notes for the reader: henceforth, the term “multi-fidelity” is omitted before “surrogate model” unless explicitly necessary. Hence, any mention of “surrogate model” in the following text should be understood to refer to a multi-fidelity surrogate model. For instance, the term “highest fidelity surrogate model” denotes the multi-fidelity surrogate model trained by utilizing all available data from every fidelity level. In addition, the surrogate models are assumed to be statistical models, i.e., models that, when queried, return a statistical distribution representing the output uncertainty given the training data and the selected model functional form. Furthermore, different infill criteria may require either the minimization or the maximization of their respective acquisition functions. Without loss of generality, the acquisition function is assumed to be maximized in the remainder of this manuscript.
2.1. Two-Step Multi-Fidelity Sampling
The multi-fidelity sampling technique presented in this manuscript is derived from the general formulation in Equations (1a)–(1c) and is designed as an extension of any existing single-fidelity infill criterion. Specifically, the selection of the fidelity level is decoupled from the identification of the next sample, which is performed by optimizing the single-fidelity acquisition function on the highest fidelity surrogate model. The fidelity level is then chosen as the most cost-efficient one whose associated surrogate model is judged “sufficiently accurate” at the previously identified infill location. Every iteration of the proposed surrogate-based multi-fidelity sampling scheme consists of the following steps:
1. Determining the location of the next infill sample by maximizing the acquisition function over the highest fidelity surrogate model:

$$\mathbf{x}_{i+1} = \underset{\mathbf{x}_{lb} \le \mathbf{x} \le \mathbf{x}_{ub}}{\arg\max} \; A\big(\mathbf{x}, \hat{f}_0\big) \qquad (2)$$

where $\hat{f}_0$ is the multi-fidelity surrogate model of the highest fidelity, i.e., level 0.
2. Identifying the set $\mathcal{L}_{acc}$ of “accurate enough” fidelity levels at the next infill location by means of the selected accuracy metric $\mathrm{acc}$ and its user-defined threshold $\varepsilon$:

$$\mathcal{L}_{acc} = \big\{\, l \;:\; \mathrm{acc}\big(l, \mathbf{x}_{i+1}\big) \le \varepsilon \,\big\} \qquad (3)$$
3. Selecting the fidelity level to use for the evaluation of the next infill sample as the fastest one within the “accurate enough” list:

$$l_{i+1} = \underset{l \in \mathcal{L}_{acc}}{\arg\min} \; t_l \qquad (4)$$

where $t_l$ is the evaluation time of fidelity level $l$ (a sketch of these three steps is given below).
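A minimal sketch of the two steps is shown next, assuming that the surrogate predictions are returned as (mean, standard deviation) pairs; the function names and signatures (two_step_infill, predictors, eval_times, accuracy) are illustrative assumptions rather than the actual SMARTy interface.

```python
from scipy.optimize import differential_evolution

def two_step_infill(acquisition, predictors, eval_times, bounds,
                    accuracy, threshold):
    """predictors[l](x) -> (mean, std) of the level-l surrogate prediction,
    with level 0 the highest fidelity; accuracy(a, b) -> distance between two
    predictions (e.g., the Jensen-Shannon divergence of Section 2.2)."""
    # Step 1 (Equation (2)): locate the next sample using only the
    # highest-fidelity surrogate model.
    result = differential_evolution(lambda x: -acquisition(x, predictors[0]),
                                    bounds)
    x_new = result.x
    # Step 2 (Equations (3) and (4)): keep the fidelity levels whose
    # prediction at x_new is close enough to the highest-fidelity one,
    # then pick the cheapest of them.
    reference = predictors[0](x_new)
    accurate = [l for l, predict in enumerate(predictors)
                if accuracy(predict(x_new), reference) <= threshold]
    return x_new, min(accurate, key=lambda l: eval_times[l])
```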
Obviously, this is a generic description of the two-step multi-fidelity sampling scheme that has to be integrated into the specific surrogate-based implementation, e.g., surrogate-based optimization (SBO) in Section 2.4 and surrogate-based uncertainty quantification (SBUQ) in Section 2.5.
The sample location $\mathbf{x}_{i+1}$ resulting from the solution of Equation (2) has a couple of important properties. First of all, it is determined by leveraging the highest fidelity surrogate model, in other words, a surrogate model that is trained using all the available data. This sample is, by the definition of a surrogate-based sampling scheme, the most effective location in the domain with respect to the infill metric. Moreover, the choice of the evaluation fidelity level (which occurs in the second step) does not affect the location of the next sample, unlike in other multi-fidelity schemes [16,18]. This characteristic of the sampling scheme is based on the assumption that, if the location of the next sample is determined by the most accurate surrogate model, there is no rationale for altering it depending on the evaluation fidelity level and, thus, on the evaluation cost and accuracy.
The design performance of the infill sample is then evaluated using the fastest fidelity level (Equation (4)) for which the associated surrogate model is judged “sufficiently accurate” at the infill location with respect to an accuracy metric. In contrast to other multi-fidelity sampling techniques, where the accuracy metric (or a function depending on it) is directly multiplied by the acquisition function [16,18], the proposed two-step approach uses the accuracy metric to perform a binary classification of the fidelity levels into “accurate” and “inaccurate” (Equation (3)). A popular approach to assess the expected accuracy of a generic fidelity level $l$ at a given location $\mathbf{x}$ is to evaluate the surrogate model of level $l$ at $\mathbf{x}$ and compare the returned probability distribution with the one obtained from the evaluation of the surrogate model of the highest fidelity level (i.e., level 0). Several metrics have been proposed to compare the two distributions, with popular examples being the correlation factor and the Kullback–Leibler divergence (KLD or KL-divergence). However, both metrics have limitations. The correlation factor estimates the correlation between two sets of observations rather than the distance or similarity between two probability distributions, as the KL-divergence does. While the KL-divergence is a proper statistical distance, it is not symmetric and has no upper bound. The first limitation can be addressed by always computing the KL-divergence between the highest and the lower fidelity (and not vice versa), but the lack of an upper bound poses challenges in defining a threshold below which the considered fidelity level is deemed “accurate”. For these reasons, the Jensen–Shannon divergence (JSD or JS-divergence), a symmetric statistical distance bounded between 0 and 1, is adopted in this study to assess the accuracy of the multi-fidelity surrogate models, as described in the next section (Section 2.2). While the need to specify the accuracy metric threshold (Equation (3)) may initially seem like a drawback of the method, it also provides users with the flexibility to adjust the algorithm’s behavior according to their preferences. For example, the threshold might be actively reduced towards the end of the optimization in order to ensure that the majority of samples are evaluated at the highest fidelity. This adaptability allows users to tailor the algorithm to the specific application; a possible schedule of this kind is sketched below.
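As a purely illustrative example of such an adaptation (the specific schedule below is an assumption, not a recommendation of this work), the threshold could be tightened linearly over the sampling budget:

```python
def jsd_threshold(iteration, n_iterations, start=0.7, end=0.1):
    # Linearly tighten the accuracy requirement so that late iterations are
    # increasingly forced towards the highest fidelity level; the start and
    # end values are arbitrary choices made for illustration.
    fraction = iteration / max(n_iterations - 1, 1)
    return start + (end - start) * fraction
```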
2.2. Jensen–Shannon Divergence as Accuracy Metric
The choice of the accuracy metric is crucial for the behavior of the multi-fidelity sampling scheme (Equation (3)). However, both of the previously mentioned options have some limitations: the correlation factor is not formally a distance metric between two distributions, and the KL-divergence is not symmetric (i.e., $\mathrm{KLD}(P \,\|\, Q) \ne \mathrm{KLD}(Q \,\|\, P)$), a property that is relevant for the assessment of the prediction accuracy between two models. For this reason, the multi-fidelity sampling technique presented in this manuscript adopts the JS-divergence as the accuracy metric.
The Jensen–Shannon divergence between two probability distributions $P$ and $Q$ is a symmetric and smoothed version of the KL-divergence, and it is bounded between 0 (identical distributions) and 1 (maximally different distributions):

$$\mathrm{JSD}(P \,\|\, Q) = \frac{1}{2}\, \mathrm{KLD}(P \,\|\, M) + \frac{1}{2}\, \mathrm{KLD}(Q \,\|\, M) \qquad (5)$$

$$0 \le \mathrm{JSD}(P \,\|\, Q) \le 1 \qquad (6)$$

where $M = \frac{1}{2}(P + Q)$ is a mixture distribution of $P$ and $Q$ (the upper bound in Equation (6) depends on the logarithm base used in the evaluation of the KLD in Equation (5)). Depending on the formulation of the mixture distribution, it is possible to derive the closed form of the JSD for some probability distribution families [21]. However, a discrete approximation of the JSD is a valid and fast alternative to the closed form for the purposes of multi-fidelity sampling (Section 2.6). Combining Equation (3) with Equation (5), the list of expected accurate fidelity levels at the next infill sample $\mathbf{x}_{i+1}$ is:

$$\mathcal{L}_{acc} = \big\{\, l \;:\; \mathrm{JSD}\big(Y_l(\mathbf{x}_{i+1}) \,\|\, Y_0(\mathbf{x}_{i+1})\big) \le \mathrm{JSD}_{max} \,\big\} \qquad (7)$$

where $Y_l(\mathbf{x}_{i+1})$ is the probability distribution returned by a query of the surrogate model of fidelity level $l$ at $\mathbf{x}_{i+1}$, and $\mathrm{JSD}_{max}$ is the maximum acceptable JSD value to consider the fidelity level $l$ “accurate enough”.
The value of $\mathrm{JSD}_{max}$ is a parameter that has to be set by the user and that controls the minimum accuracy requirement for a fidelity level to be considered “accurate enough” for the next sampling iteration. Lower values of $\mathrm{JSD}_{max}$ make the accuracy requirement stricter, therefore reducing the probability that the algorithm selects a lower fidelity level for the next sampling iteration. Even though $\mathrm{JSD}_{max}$ is a user-defined parameter, the property of the JS-divergence of being bounded between 0 and 1 helps to identify target values that are “empirically” meaningful and valid irrespective of the magnitude of the probability distributions that are compared. For instance, the JS-divergence between two normal distributions sharing the same standard deviation and with mean values differing by two times the standard deviation (Figure 2) is approximately 0.7, regardless of the particular values assigned to the mean and standard deviation:

$$\mathrm{JSD}\big(\mathcal{N}(\mu, \sigma^2) \,\|\, \mathcal{N}(\mu + 2\sigma, \sigma^2)\big) \approx 0.7 \qquad (8)$$
2.3. Comparison with Multiplicative Sampling Schemes
The decoupling between the identification of the next infill sample and the selection of the fidelity level distinguishes the proposed two-step approach from other methods based on the multiplicative multi-fidelity expected improvement [16,18]. In these sampling schemes, both the location and the fidelity level of the next sample are determined simultaneously by maximizing a single acquisition function, built as the product of the single-fidelity acquisition function and other factors that take into consideration the accuracy and evaluation cost of the fidelity levels. Therefore, the location and fidelity level of the next sample result from the solution of a single-objective optimization problem obtained from the multiplication of three distinct and conflicting objectives, i.e., the maximization of the single-fidelity acquisition function, the maximization of the accuracy between the selected and the highest fidelity levels, and the minimization of the evaluation time of the selected fidelity level. As a consequence, this approach shares the primary drawback of the “weighted sum” technique applied to solve a multi-objective optimization problem as a single-objective one: if the multiplicative factors lack appropriate scaling, one factor may dominate and drive the whole optimization. For instance, consider a multi-fidelity sampling scenario where a mid-fidelity level has a decent prediction accuracy and significantly lower evaluation costs compared to the highest fidelity level. In such a case, the multiplicative algorithm may consistently favor the lower-fidelity solver because the multiplicative acquisition function tends to offset the penalty associated with the reduced accuracy with the benefit of decreased computational cost. This limitation is critical in a surrogate-based optimization application because a sample is considered optimal only if it is evaluated at the highest fidelity level. If no sample is evaluated at the highest fidelity level, it is impossible to update the knowledge about the current optimum.
The proposed two-step multi-fidelity sampling effectively addresses this limitation of multiplicative techniques. The difference in magnitude of the infill metric, accuracy metric, and evaluation cost is irrelevant in the proposed criterion because these quantities are considered in different stages of the infill sample and fidelity level selection, therefore removing any risk of interaction. Consequently, the two-step sampling scheme is safeguarded against always selecting the same fidelity level, as observed in multiplicative approaches [16,18].
2.4. Surrogate-Based Optimization
The popularity of surrogate-based optimization techniques has resulted in the development of numerous acquisition functions. Popular examples are the probability of improvement (PI) [22], the knowledge gradient (KG) [23], and the expected improvement (EI) [24]. Of these, the expected improvement is arguably the most popular given its elegant and simple formulation, and for this reason, it is adopted in the SBO analyses presented in this manuscript. Even though a complete derivation of the expected improvement is out of the scope of this article (the full derivation is available in [24]), a brief description is provided in the following paragraph. At a generic iteration $i$, the goal of the expected improvement technique is to identify the location in the domain that maximizes the expected reduction in the objective function value with respect to the current minimum, i.e., the expected improvement:

$$EI(\mathbf{x}) = \mathbb{E}\big[\max\big(y_{min} - Y(\mathbf{x}),\, 0\big)\big] \qquad (9)$$

where $y_{min}$ is the current minimum and $Y$ is the probability distribution returned by a query of the surrogate model at location $\mathbf{x}$. Under the assumption that $Y$ is a normal distribution ($Y \sim \mathcal{N}(\mu, \sigma^2)$), the closed form of Equation (9) is [24]:

$$EI(\mathbf{x}) = \big(y_{min} - \mu\big)\, \Phi\!\left(\frac{y_{min} - \mu}{\sigma}\right) + \sigma\, \phi\!\left(\frac{y_{min} - \mu}{\sigma}\right) \qquad (10)$$

where $\Phi$ and $\phi$ represent the cumulative distribution and probability density functions of the standard normal distribution, respectively. Equation (10) is the acquisition function that is maximized at each optimization iteration in order to identify the location of the next infill point, as described in Section 2, Equation (2).
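For reference, a compact, vectorized implementation of this closed form might look as follows (a sketch; returning zero where the predicted variance vanishes is a common convention rather than part of the formulation above):

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, y_min):
    # Closed-form EI of Equation (10) for a normal prediction N(mu, sigma^2)
    # against the current minimum y_min.
    mu, sigma = np.asarray(mu, float), np.asarray(sigma, float)
    with np.errstate(divide="ignore", invalid="ignore"):
        z = (y_min - mu) / sigma
        ei = (y_min - mu) * norm.cdf(z) + sigma * norm.pdf(z)
    return np.where(sigma > 0.0, ei, 0.0)
```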
Usually, real design optimization problems are characterized by several constraints that are introduced to guarantee a certain level of feasibility in the design space. Several techniques are available to handle constraints within surrogate-based optimization schemes [25,26], and popular choices are the penalty function and the probability of feasibility. The former is straightforward to implement, but its effectiveness is strongly dependent on the user-defined penalty values, which may artificially restrict the design space beyond what is necessary. For this reason, the constraints are handled in this manuscript by means of the probability of feasibility [27]. Assuming that the problem has $n_g$ constraints defined as:

$$g_j(\mathbf{x}) \le 0, \qquad j = 1, \dots, n_g \qquad (11)$$

and that a surrogate model is built for each of them, the probability of feasibility of a generic design $\mathbf{x}$ with respect to all the constraints is computed as:

$$PoF(\mathbf{x}) = \prod_{j=1}^{n_g} P\big[G_j(\mathbf{x}) \le 0\big] \qquad (12)$$

where $P\big[G_j(\mathbf{x}) \le 0\big]$ is easily computed from the probability distribution returned by the evaluation of the $j$-th constraint surrogate model. The resulting acquisition function based on the expected improvement and the probability of feasibility is obtained by multiplying Equations (10) and (12):

$$A(\mathbf{x}) = EI(\mathbf{x}) \cdot PoF(\mathbf{x}) \qquad (13)$$

and represents the acquisition function that is used to obtain all optimization results presented in this manuscript.
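Continuing the sketch above, and assuming constraints of the form $g_j(\mathbf{x}) \le 0$ with normal predictive distributions for each constraint surrogate, Equations (12) and (13) translate into a few additional lines:

```python
import numpy as np
from scipy.stats import norm

def probability_of_feasibility(mu_g, sigma_g):
    # Product over all constraints of P[G_j(x) <= 0] for normal predictions.
    mu_g, sigma_g = np.asarray(mu_g, float), np.asarray(sigma_g, float)
    return np.prod(norm.cdf(-mu_g / sigma_g))

def constrained_acquisition(mu, sigma, y_min, mu_g, sigma_g):
    # Expected improvement weighted by the probability of feasibility,
    # reusing the expected_improvement helper sketched above.
    return (expected_improvement(mu, sigma, y_min)
            * probability_of_feasibility(mu_g, sigma_g))
```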
Another important aspect to consider in multi-fidelity surrogate-based optimization is the potential presence of objective and constraint functions with a different number of fidelity levels. This scenario is quite common, especially in cases where multiple disciplines (e.g., aerodynamics and structures) are involved in the optimization process. It is important to remember that the two-step approach described in Section 2 independently selects the fidelity level for each function within the problem. Consequently, the correct implementation of a multi-fidelity surrogate-based optimization requires the definition of a logic to determine which fidelity level to use in situations where the selected fidelity levels are not identical. In a single-objective multi-fidelity problem, the minimum is obviously obtained by comparing only the feasible data obtained at the highest fidelity. Therefore, it is imperative that whenever the objective function is evaluated at the highest fidelity, all the constraints are also assessed at the highest fidelity. In contrast, when the objective function is not evaluated at the highest fidelity level, the constraints can be evaluated at the fidelity level selected by the scheme presented in Section 2, Equation (4). Denoting with $l_f$ and $l_g$ the fidelity levels selected by the two-step criterion for the objective and constraint functions, respectively, the combined logic for the selection of the constraint evaluation fidelity level $\tilde{l}_g$ can be formalized as:

$$\tilde{l}_g = \begin{cases} 0, & \text{if } l_f = 0, \\ l_g, & \text{otherwise}. \end{cases} \qquad (14)$$
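In code, this selection logic reduces to a single conditional (a sketch; the variable names are illustrative):

```python
def constraint_fidelity(l_objective, l_constraint):
    # If the objective is evaluated at the highest fidelity (level 0), force
    # the constraints to the highest fidelity as well; otherwise keep the
    # level selected by the two-step criterion.
    return 0 if l_objective == 0 else l_constraint
```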
2.5. Surrogate-Based Uncertainty Quantification
Similar to SBO, surrogate-based uncertainty quantification (SBUQ) aims to effectively transfer the uncertainties present in the inputs to the quantity of interest (QoI) and, subsequently, to accurately determine the relevant statistics of the QoI [28]. Instead of employing the Monte Carlo method directly on complex and resource-intensive black-box functions, this approach applies the method to a more straightforward and computationally inexpensive surrogate model that approximates the behavior of the original function. In the present work, this is achieved by constructing a Kriging surrogate model, which is formulated based on an initial design of experiments (DoE) in the stochastic domain, coupled with its black-box solutions. To improve the accuracy of the surrogate, an active infill criterion based on the (approximated) statistics is adopted. Here, the infill criterion focuses on the mean value of the QoI, seeking to achieve high global accuracy of the surrogate model.
To ensure a balanced space filling in the stochastic space, the infill criterion uses the prediction mean square error $s^2(\mathbf{x})$ of the Kriging surrogate at any given point $\mathbf{x}$. At every infill iteration, the surrogate model is (re)constructed at the new location $\mathbf{x}_{i+1}$ where the product of the joint probability density function of the input uncertainties $p(\mathbf{x})$ and the error estimate is maximized:

$$\mathbf{x}_{i+1} = \underset{\mathbf{x}}{\arg\max} \; p(\mathbf{x})\, s^2(\mathbf{x}) \qquad (15)$$

The probability component ensures sampling from the regions of high probability in the stochastic space, while the error term exploits regions where the surrogate model is inaccurate. We use differential evolution to search for the optimal location on the surrogate. The statistics of the QoI are then approximated using a large number of quasi-Monte Carlo sample evaluations of the surrogate model.
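A minimal sketch of this SBUQ infill and of the quasi-Monte Carlo statistics estimation is given below; joint_pdf and kriging_mse stand for the joint input density and the Kriging error predictor, and the assumption of independent normal input uncertainties is made purely for illustration.

```python
from scipy.optimize import differential_evolution
from scipy.stats import norm, qmc

def uq_infill(joint_pdf, kriging_mse, bounds):
    # Maximize the product of input density and predicted mean square error.
    result = differential_evolution(lambda x: -joint_pdf(x) * kriging_mse(x),
                                    bounds)
    return result.x

def qoi_statistics(surrogate_mean, means, stds, n_samples=2**14):
    # Quasi-Monte Carlo estimate of the QoI mean and standard deviation,
    # assuming (for this sketch) independent normal input uncertainties.
    u = qmc.Sobol(d=len(means), scramble=True).random(n_samples)
    x = norm.ppf(u, loc=means, scale=stds)   # unit hypercube -> input space
    y = surrogate_mean(x)
    return y.mean(), y.std(ddof=1)
```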
2.6. Practical Implementation Details
The presented multi-fidelity surrogate-based sampling architecture is implemented in the Surrogate Modeling for AeRo-data Toolbox in Python (SMARTy) [29], which is continually developed at the German Aerospace Center (DLR). While the implementation of the two-step multi-fidelity sampling technique is straightforward and can be readily adapted to existing surrogate-based algorithms, a couple of specific aspects should be highlighted.
As described in Section 2.2, the JS-divergence has a closed form for some families of probability distributions [21] (e.g., normal distributions). However, the closed forms might be tedious to implement, thereby increasing the risk of coding errors. For this reason, the JS-divergence is computed numerically in this manuscript using the ready-to-use spatial.distance.jensenshannon method of the SciPy Python library [30]. The method accepts as input two probability vectors representing the two probability distributions for which the JS-divergence is to be computed. In particular, if the two distributions are normal distributions ($\mathcal{N}(\mu_1, \sigma_1^2)$ and $\mathcal{N}(\mu_2, \sigma_2^2)$), the probability vectors are defined by evaluating the probability density functions of the two distributions at linearly spaced locations over a range that covers both distributions, defined from their means and standard deviations.
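A minimal implementation of this discrete approximation is sketched below; the grid resolution and the ±5σ range used to build the probability vectors are assumptions made for illustration. With predictions given as (mean, standard deviation) pairs, it could be wrapped, e.g., as lambda a, b: discrete_jsd(*a, *b) and plugged into the accuracy test of the two-step sketch in Section 2.1.

```python
import numpy as np
from scipy.spatial.distance import jensenshannon
from scipy.stats import norm

def discrete_jsd(mu_1, sigma_1, mu_2, sigma_2, n_points=512, n_sigma=5.0):
    # Probability vectors from the two normal predictions on a shared,
    # linearly spaced grid; the +/- n_sigma range and the grid size are
    # choices made for this sketch.
    lo = min(mu_1 - n_sigma * sigma_1, mu_2 - n_sigma * sigma_2)
    hi = max(mu_1 + n_sigma * sigma_1, mu_2 + n_sigma * sigma_2)
    grid = np.linspace(lo, hi, n_points)
    p = norm.pdf(grid, mu_1, sigma_1)
    q = norm.pdf(grid, mu_2, sigma_2)
    # SciPy normalizes the vectors internally and returns the square root of
    # the Jensen-Shannon divergence in the requested logarithm base.
    return jensenshannon(p, q, base=2.0)

# Means two standard deviations apart give a value of about 0.7 regardless
# of the actual mean and standard deviation (cf. the example in Section 2.2):
print(discrete_jsd(0.0, 1.0, 2.0, 1.0))     # ~0.70
print(discrete_jsd(50.0, 4.0, 58.0, 4.0))   # ~0.70
```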
The second aspect concerns the algorithm adopted for the optimization of the acquisition function (Equation (2)). As described in Section 2, the two-step multi-fidelity sampling approach decouples the identification of the next infill point from the selection of the fidelity level, and the former requires the solution of a single-objective global optimization problem. In all the applications presented in this manuscript, the acquisition function is optimized by means of a differential evolution algorithm. This choice is guided by the multi-modal nature of the optimization problem and the computationally inexpensive evaluation of the acquisition function, which together suggest the use of a stochastic global algorithm.
4. Conclusions
The manuscript introduces a novel infill technique for multi-fidelity sampling in the context of surrogate-based algorithms. In particular, the method decouples the identification of the new infill sample from the selection of the fidelity level to be used for its performance evaluation. This key characteristic distinguishes the proposed technique from the majority of existing multi-fidelity schemes, which typically rely on a single multiplicative acquisition function. Consequently, the proposed algorithm avoids the risk of continuously sampling from a lower fidelity level when its computational cost is significantly lower than that of the other levels. Another relevant novelty introduced in this study is the use of the Jensen–Shannon divergence to quantify the accuracy of the different fidelity levels.
The efficacy of the proposed sampling method was successfully tested in both surrogate-based optimization and uncertainty quantification scenarios. In the optimization test problem, a multi-fidelity surrogate-based optimization algorithm was used for aerodynamic shape optimization with the aim of maximizing the lift-to-drag ratio. The solution resulting from the proposed multi-fidelity sampling is compatible with the optimal solution obtained with standard single-fidelity sampling. Notably, the multi-fidelity strategy achieves approximately a five-fold reduction in computational cost compared to its single-fidelity counterpart. Similarly, the multi-fidelity sampling strategy was used to propagate operational and geometrical uncertainties in order to quantify the mean and standard deviation of the lift-to-drag ratio. The statistics obtained from the proposed multi-fidelity sampling approach are not only in accordance with those of the traditional single-fidelity sampling but also significantly cheaper to compute. The investigation also highlighted the interesting observation that the level of uncertainty in the inputs can significantly impact the accuracy of the surrogate model and, thereby, the estimated statistics.
It is important to clarify that the research outlined in this manuscript focuses solely on single-objective optimization; hence, the fidelity-selection logic described in Section 2.4 is tailored to this specific context. The extension to multi-objective optimization is currently under development and will likely be presented in a future publication.