1. Introduction
Metamodels, also referred to as surrogate models, are approximate mathematical models of real physical systems. In the past decade, metamodels have become increasingly popular in the energy field because of their significant advantages in reducing the computational cost of time-consuming tasks [1,2]. Melo et al. [3] pointed out that researchers in many countries are developing metamodels to estimate the energy performance of the building stock. Bornatico et al. [4] used a metamodel to optimize energy systems and found that it converged to the same solution 150 times faster than the fine model. Westermann and Evins [5] summarized and discussed recent studies on the application of metamodels in sustainable building design. Ferrero Bermejo et al. [6] reviewed and compared two typical metamodels, namely artificial neural networks and support vector machines, for energy forecasting and condition-based maintenance in PV plants.
The quality of a metamodel depends mainly on its accuracy and its generality across different design tasks. To enhance the performance of metamodels, researchers have carried out numerous studies over the past few decades [7,8,9,10,11]. As a result, a large number of metamodels have been proposed, several of which have gained wide acceptance in various applications: polynomial response surfaces (PRS) [12,13,14], support vector regression (SVR) [15,16,17], radial basis functions (RBF) [18,19], extended radial basis functions (E-RBF) [20], moving least squares (MLS) [21], artificial neural networks (ANN) [22,23], multivariate adaptive regression splines (MARS) [24], and Kriging (KRG) [25,26]. These different metamodels provide more options for different tasks. However, without prior knowledge of the actual physical system, it is challenging to identify a suitable metamodel in advance for a new task; in the worst case, the least suitable metamodel may be chosen.
A simple way to overcome this difficulty is to first build a series of metamodels from a given training dataset and then select the best one using a statistical technique such as cross-validation. Another popular way is to construct an ensemble metamodel, which combines two or more individual metamodels through weight factors. The basic idea of such an ensemble metamodel can be traced back to the 1990s [27,28], and it has since become an active research topic [8,29]. According to the characteristics of the weight factors, the techniques for building ensemble metamodels can be broadly categorized into methods based on local errors, methods based on global errors, and methods based on regression.
In the first category, the weight factors are functions of the design space, determined by the local errors of the individual metamodels at the point of interest. Zerpa et al. [30] introduced a locally weighted average model for the optimization of alkaline-surfactant-polymer flooding processes, using the prediction variances of three individual metamodels (PRS, KRG, and RBF). Sanchez, Pintos, and Queipo [31] proposed a general approach to ensembles of kernel-based models based on the local prediction variances. Acar [32] investigated the efficiency of methods based on local errors and developed a new approach that determines the weight factors from pointwise cross-validation errors instead of prediction variances. Zhang, Chowdhury, and Messac [33] proposed a new metamodeling technique called adaptive hybrid functions, whose weight factors are determined by a local measure of accuracy in the pertinent trust region. Lee and Choi [34] presented a new pointwise ensemble of metamodels whose weight factors are calculated from the cross-validation errors of the v nearest points.
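As a minimal sketch of the local-error idea (the inverse-error weighting rule and all function names below are our own illustrative assumptions, not the exact formulations of the cited works), the pointwise weights can be made inversely proportional to each metamodel's estimated local error, so that they stay positive and sum to 1:

```python
import numpy as np

def local_weights(local_errors, eps=1e-12):
    """Pointwise weights inversely proportional to each metamodel's
    estimated local error; weights are positive and sum to 1."""
    inv = 1.0 / (np.asarray(local_errors, dtype=float) + eps)
    return inv / inv.sum()

def pointwise_ensemble(predictions, local_errors):
    """Weighted average of the individual predictions at one point."""
    w = local_weights(local_errors)
    return float(np.dot(w, predictions))

# Two metamodels predict 1.0 and 2.0 at the point of interest; the first
# has the smaller estimated local error, so it dominates the average.
y = pointwise_ensemble([1.0, 2.0], [0.1, 0.4])
```

Because the weights are recomputed at every prediction point, the ensemble can lean on whichever individual metamodel is locally most trustworthy.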
In the second category, the weight factors are constant over the entire design space, determined by the global errors of the individual metamodels. Goel et al. [35] studied a global weight-factor selection approach based on the generalized mean square cross-validation error (GMSE). Acar and Rais-Rohani [36] developed an accurate ensemble of metamodels by solving an optimization problem that minimizes the GMSE or the root mean square error (RMSE). Viana, Haftka, and Steffen [37] obtained the optimal weight factors of this optimization problem by using Lagrange multipliers. The same method was employed by Toal and Keane [38] to construct an ensemble of ordinary, universal, non-stationary, and limit KRG models. Additionally, Acar [39] performed the simultaneous optimization of the weight factors and the shape parameters in an ensemble of RBFs.
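A minimal sketch of global-error weighting in the spirit of the heuristic studied by Goel et al. [35] (the specific alpha and beta values and the function name are our assumptions; readers should consult [35] for the exact formulation):

```python
import numpy as np

def global_weights(gmse, alpha=0.05, beta=-1.0):
    """Heuristic constant weights from cross-validation errors:
    w_i* = (E_i + alpha * mean(E))**beta, then normalized to sum to 1.
    With beta < 0, a smaller error yields a larger weight."""
    e = np.asarray(gmse, dtype=float)
    w_star = (e + alpha * e.mean()) ** beta
    return w_star / w_star.sum()

# Three individual metamodels with increasing cross-validation errors:
# the most accurate one receives the largest constant weight.
w = global_weights([0.2, 0.4, 0.8])
```

Unlike the local-error methods, these weights are computed once and then applied everywhere in the design space.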
It should be noted that in the first two categories the weight factors of the individual metamodels are restricted to positive values and must sum to 1. In contrast, the techniques in the third category mainly use regression methods (such as least squares) to determine the weight factors; accordingly, there is no longer any restriction on the weight factors, which may even take negative values. Polynkin and Toropov [40] introduced a mid-range metamodel assembly for large-scale optimization problems, constructed with the linear regression method. Ferreira and Serpa [41] developed an augmented least-squares approach for creating ensembles of metamodels, which can be extended to efficient global optimization. Zhou and Jiang [42] constructed an ensemble of four individual metamodels (PRS, KRG, SVR, and RBF) from the viewpoint of polynomial regression, and proposed a metamodel selection method based on stepwise regression to eliminate redundant candidates from the set of candidate metamodels.
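A minimal sketch of the regression idea (hypothetical data; the function name is ours): the weights are obtained by ordinary least squares on the individual metamodels' predictions, so they are unconstrained and may be negative:

```python
import numpy as np

def regression_weights(preds, y):
    """Unconstrained weights by least squares: each column of `preds`
    holds one individual metamodel's predictions at the training (or
    validation) points. Weights need not be positive or sum to 1."""
    w, *_ = np.linalg.lstsq(preds, y, rcond=None)
    return w

# Hypothetical data: the target is exactly 2*m1 - 0.5*m2, so least
# squares recovers weights near [2, -0.5] -- note the negative weight.
rng = np.random.default_rng(0)
P = rng.standard_normal((50, 2))
y = 2.0 * P[:, 0] - 0.5 * P[:, 1]
w = regression_weights(P, y)
```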
Motivated by these existing works, this paper proposes a different method for constructing ensemble metamodels that combines the advantages of regression-type and interpolation-type metamodels. Regression-type metamodels capture the global trend better than interpolation-type metamodels, while interpolation-type metamodels are more accurate in the vicinity of the sampling locations. By exploiting these complementary characteristics, the proposed method extracts useful information from the residuals of the regression-type metamodel at the sampling locations to further improve the fitting capability of the ensemble metamodel.
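The proposed idea can be sketched for a one-dimensional case as follows (a minimal sketch under our own assumptions: a quadratic polynomial stands in for the regression-type metamodel, a Gaussian RBF for the interpolation-type metamodel, and the shape parameter is an arbitrary illustrative choice):

```python
import numpy as np

def fit_two_stage(x, y, shape=50.0):
    """Two-stage ensemble: a regression-type model (here a quadratic
    polynomial) captures the global trend; an interpolation-type model
    (a Gaussian RBF) then interpolates the residuals at the samples."""
    coef = np.polyfit(x, y, deg=2)            # global trend (regression)
    resid = y - np.polyval(coef, x)           # deviations at the samples
    phi = np.exp(-shape * (x[:, None] - x[None, :]) ** 2)
    lam = np.linalg.solve(phi, resid)         # RBF interpolation weights

    def predict(xq):
        trend = np.polyval(coef, xq)
        corr = np.exp(-shape * (xq[:, None] - x[None, :]) ** 2) @ lam
        return trend + corr

    return predict

x = np.linspace(0.0, 1.0, 8)
y = np.sin(2 * np.pi * x) + 0.5 * x ** 2
model = fit_two_stage(x, y)
# By construction, the combined model reproduces the data exactly at
# the sampling locations while following the polynomial trend between them.
```

The trend term governs behavior far from the samples, while the RBF correction restores interpolation accuracy near them, mirroring the decomposition described above.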
4. Results and Discussion
4.1. RMSE
Figure 2 shows the boxplots of RMSE of the metamodels over 100 test sets for each benchmark problem with training points. It can be seen that: (1) for all the benchmark problems, the most accurate ensemble metamodels outperform the most accurate individual metamodels; (2) without exception, the least accurate individual metamodels perform worse than the least accurate ensemble metamodels; (3) for each benchmark problem, the performance differences among the four individual metamodels are greater than those among the four ensemble metamodels.
To provide a better comparison for these metamodels, the error values are normalized with respect to the most accurate individual metamodel for each benchmark problem.
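The error criterion and the normalization step can be sketched as follows (a minimal sketch; the function names are ours):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error over a test set."""
    d = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(d ** 2)))

def normalize(individual_means, ensemble_means):
    """Normalize mean errors by the most accurate individual metamodel,
    so that normalized values below 1 indicate an improvement over it."""
    ref = min(individual_means)
    return ([e / ref for e in individual_means],
            [e / ref for e in ensemble_means])

# Hypothetical mean RMSE values for two individual and two ensemble
# metamodels on one benchmark problem.
ind, ens = normalize([2.0, 4.0], [1.8, 2.2])
```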
Table 2 shows the normalized means of RMSE of the metamodels for each benchmark problem with training points. In Table 2, the bold values mark the most accurate individual/ensemble metamodels, the italic values mark the least accurate individual/ensemble metamodels, the underlined values mark the ensemble metamodels that perform better than all the individual metamodels, the "Best & Best" values denote the differences between the most accurate ensemble and individual metamodels, and the "Worst & Worst" values denote the differences between the least accurate ensemble and individual metamodels. From Table 2, it can be seen that: (1) compared with the most accurate individual metamodels, the means of RMSE of the most accurate ensemble metamodels are reduced by 1.1% to 22.2%; (2) compared with the least accurate individual metamodels, the means of RMSE of the least accurate ensemble metamodels are reduced by 21.1% to 52.5%; (3) except for BP3, more than two ensemble metamodels perform better than the most accurate individual metamodel; (4) for BP5, all four ensemble metamodels perform better than the most accurate individual metamodel.
Table 3 shows the frequency of the accuracy rankings (using RMSE) of the metamodels for the six benchmark problems with training points. It can be seen that: (1) the ensemble metamodels rank 1st or 2nd 11 times, whereas the individual metamodels do so only once; (2) the individual metamodels rank 7th or 8th 12 times, whereas the ensemble metamodels never do; (3) considering the frequency with which the metamodels rank in the top/bottom two, all the ensemble metamodels perform better than the individual metamodels; (4) PrsRbfm performs best among the four ensemble metamodels, followed by SvrRbfm, PrsRbfi, and SvrRbfi.
To clearly compare the accuracy of each ensemble metamodel with its corresponding individual metamodels, Figure 3 shows the normalized means of RMSE of each ensemble scheme for the six benchmark problems with training points. It can be seen that: (1) in Scheme 1, PrsRbfm ranks 1st among PRS, RBFM, and PrsRbfm for all the benchmark problems; (2) in Scheme 2, PrsRbfi ranks 1st for all the benchmark problems; (3) in Scheme 3, SvrRbfm ranks 1st for four benchmark problems and 2nd for the other two; although RBFM ranks 1st for two benchmark problems, it is the worst performer for three; (4) in Scheme 4, without exception, SvrRbfi outperforms both SVR and RBFI.
Table 4 shows the normalized standard deviations of RMSE of the metamodels for each benchmark problem with training points. It can be seen that: (1) compared with the most accurate individual metamodels, the standard deviations of RMSE of the most accurate ensemble metamodels are reduced for BP5 and BP6, yet increased for the other four benchmark problems; (2) compared with the least accurate individual metamodels, the standard deviations of RMSE of the least accurate ensemble metamodels are reduced by 8.4% to 35.5%.
According to the above experimental results, the proposed metamodeling approach reduces the risk of selecting the worst individual metamodel, and the constructed ensemble metamodels outperform their constituent individual metamodels in terms of accuracy. In particular, PrsRbfm performs best among the four ensemble metamodels, followed by SvrRbfm, PrsRbfi, and SvrRbfi.
To provide an explicit explanation for the better performance of the proposed approach, a low-dimensional problem (BP1) and an ensemble scheme (ensemble of SVR and RBFM) are selected as examples.
Figure 4 shows the contour plot of the actual function and the approximate functions of SVR, RBFM, and SvrRbfm. It can be seen that: (1) SVR has better global trend fitting capacity than RBFM, such as in the red box area; (2) RBFM performs better in the vicinity of the sampling locations, such as in the red ellipse region; (3) SvrRbfm combines the global trend of SVR and the local accuracy of RBFM, such as in the red box area and the red ellipse region.
Therefore, the reason for the better performance of the ensemble metamodels may be that the proposed metamodeling approach combines the advantages of the regression-type and interpolation-type metamodels. The actual model is regarded as the sum of a regression-type model and a deviation function. Some useful information is first extracted by the regression-type metamodel to capture the global trend of the actual model in the entire design space. Then, some other information is extracted from the deviations at the sampling locations by using the interpolation-type metamodel to achieve the local accuracy in the vicinity of sampling locations.
4.2. Effect of Performance Criteria
The choice of performance criterion may influence the results of the metamodels. To reduce this source of uncertainty as much as possible, the maximum absolute error (MAE) is selected as a second performance criterion.
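For reference, the MAE criterion can be sketched as follows (a minimal sketch; the function name is ours):

```python
import numpy as np

def max_abs_error(y_true, y_pred):
    """Maximum absolute error over a test set: unlike RMSE, it reflects
    the single worst prediction rather than the average behavior."""
    d = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.max(np.abs(d)))
```

Because MAE is driven by the worst-case error while RMSE averages over the whole test set, agreement between the two criteria is a useful robustness check.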
Figure 5 shows the boxplots of MAE of the metamodels over 100 test sets for each benchmark problem with training points, and Table 5 shows the corresponding normalized means of MAE. From Figure 5 and Table 5, it can be seen that: (1) for each benchmark problem, the performance differences among the four ensemble metamodels are smaller than those among the four individual metamodels; (2) except for BP6, more than two ensemble metamodels perform better than the most accurate individual metamodel; (3) compared with the most accurate individual metamodels, the means of MAE of the most accurate ensemble metamodels are reduced for five benchmark problems; (4) compared with the least accurate individual metamodels, the means of MAE of the least accurate ensemble metamodels are reduced by 14.2% to 48.9%.
Table 6 shows the frequency of the accuracy rankings (using MAE) of the metamodels for the six benchmark problems with training points. It can be seen that: (1) considering the frequency with which the metamodels rank in the top/bottom two, PrsRbfm, PrsRbfi, and SvrRbfm outperform all the individual metamodels; (2) although SvrRbfi is slightly worse than PRS, it still performs better than its corresponding individual metamodels (SVR and RBFI); (3) PrsRbfm is the best performer of the four ensemble metamodels, followed by SvrRbfm, PrsRbfi, and SvrRbfi.
In summary, the choice of performance criterion influences the results only slightly, and the conclusions obtained under the two criteria remain unchanged.
4.3. Effect of Sampling Densities
The choice of sampling density may also influence the results of the metamodels. To investigate this effect, this paper considers two additional schemes with different sampling densities, which are and .
Table 7 shows the normalized means of RMSE of the metamodels for each benchmark problem with training points. It can be seen that: (1) compared with the most accurate individual metamodels, the means of RMSE of the most accurate ensemble metamodels are reduced by 0.9% to 8.1%; (2) compared with the least accurate individual metamodels, the means of RMSE of the least accurate ensemble metamodels are reduced by 23.4% to 53.8%; (3) except for BP3, more than two ensemble metamodels perform better than the most accurate individual metamodel; (4) all the ensemble metamodels perform better than the four individual metamodels; (5) PrsRbfm is the best performer among the four ensemble metamodels, while SvrRbfi is the worst.
Table 8 shows the normalized means of RMSE of the metamodels for each benchmark problem with training points. It can be seen that: (1) compared with the most accurate individual metamodels, the means of RMSE of the most accurate ensemble metamodels are reduced for five benchmark problems, by 0.9% to 16.9%; (2) compared with the least accurate individual metamodels, the means of RMSE of the least accurate ensemble metamodels are reduced by 20.9% to 51.3%; (3) all the ensemble metamodels perform better than the four individual metamodels.
In summary, the choice of sampling density influences the results only slightly, and the conclusions obtained under the three sampling-density schemes remain unchanged.
4.4. Significance of Results
The results above have verified the effectiveness of the proposed method to some extent. To further demonstrate its advantages, the proposed method is compared with several other popular ensemble metamodels, namely BPS (best PRESS surrogate), PWS (PRESS weighted average surrogate), and OWSD (optimal weighted surrogate using the diagonal elements); detailed descriptions of these ensemble metamodels can be found in the relevant literature [35,37]. Additionally, Kriging with a first-order polynomial regression function (KRG1) and Kriging with a second-order polynomial regression function (KRG2) are also included in the comparison. It should be noted that the principle and modeling process of Kriging differ from those of the metamodeling approach proposed in this paper.
Figure 6 compares the performance of PrsRbfm, SvrRbfm, KRG1, KRG2, BPS, PWS, and OWSD. It can be seen that: (1) for BP1, PrsRbfm and SvrRbfm perform better than the other five metamodels; (2) for BP2, SvrRbfm and BPS are the two best performers; (3) for BP3, PrsRbfm and BPS are more accurate than the other metamodels; (4) for BP4, PrsRbfm and KRG2 are the two best performers; (5) for BP5, SvrRbfm and BPS are more accurate than the other metamodels; (6) for BP6, PrsRbfm and KRG2 perform better than the other metamodels.
In summary, the proposed metamodeling approach shows clear advantages when compared with KRG1, KRG2, BPS, PWS, and OWSD.