1. Introduction
Dam safety bears directly on the national economy and people's livelihoods. With the rapid development of information technology, big data, the Internet of Things, and electronic communication, dam operation and safety control are gradually moving towards automation and intelligence [1], and continuous, reliable dam safety monitoring data are a prerequisite for the scientific evaluation of dam operation and safety conditions. However, owing to short-term instrument abnormalities, instrument replacement, measurement errors, and external environmental disturbances [2], dam safety monitoring data are prone to omissions, oscillations, response misalignment, and other anomalies. These anomalies impair the continuity and reliability of the monitoring sequence, can cause misjudgment of the dam's operational state, and may even endanger the safe operation of the dam [3]. Therefore, a high-precision model for repairing dam safety monitoring data is essential for grasping the general behavior and development trend of a dam in real time and for the intelligent control of safe dam operation [4,5].
At present, the repair of safety monitoring data mainly starts from the time dimension and mostly adopts linear regression analysis [6,7], principal component analysis [8], machine learning algorithms [9,10], and so on. Among them, Vazifehdan et al. [11] proposed combining a naive Bayesian network with tensor decomposition to repair missing data. Stojanovic et al. [12] used linear regression and a genetic algorithm to construct an adaptive system for dam behavior modeling, which significantly improved the accuracy of data repair. Du et al. [13] proposed a deep-learning-based data repair model that performs well in terms of training time and robustness. Nevertheless, these methods repair data by constructing time series models, which achieve high accuracy for data sequences with good periodic regularity but fail to meet application requirements for sequences with long gaps and poor regularity [14]. Moreover, a single-point monitoring model can reflect only the local characteristics of the dam, not its overall state. Because the dam is an integral structure, the monitoring sequences for deformation, seepage, and stress-strain contain information in both the temporal and spatial dimensions; that is, they are correlated in time and space. Therefore, constructing a multi-dimensional spatial model that repairs data on the basis of the spatial coordination and consistency of dam effect variables is an effective path to solving this problem.
Since spatial information technology began to develop in the 1960s, scholars have conducted extensive research on spatial models. Models such as the Thiessen polygon [15], inverse distance weighting [16,17], and kriging [18,19] have been introduced into spatial model construction. In-depth studies on the construction of spatial models have achieved good results in fields such as meteorology [20,21] and geostatistics [22]. For example, Adhikary [23] used KGP, which combines a nonparametric variogram model based on genetic programming with kriging, as a feasible alternative technique for the spatial estimation and mapping of rainfall. Seo [24] proposed a hybrid model called RKNNRK that combines regression kriging and neural network residual kriging to determine the spatial distribution of precipitation. In water conservancy engineering, research on spatial models is still at the exploratory stage. Mao et al. [25] proposed a deep neural network multi-view learning method (DNN-MVL) based on inverse distance weighting to reveal the complex non-linear spatio-temporal relationships of dam deformation. Zhao et al. [26] developed a spatio-temporal monitoring model of the center of mass based on a least-squares support vector machine by introducing the initial coordinates of the center of mass. Lu et al. [27] used kriging interpolation to construct a high-precision horizontal space-time gradient expansion model of the core wall of an earth-rock dam, which effectively reflected the three-dimensional deformation trend of the dam. Dai et al. [28] combined Kalman filtering and kriging spatial interpolation to construct a spatio-temporal model of dam deformation that filters out the noise of deformation data in time and space. Yang et al. [29] proposed a geographically and temporally weighted regression (GTWR) model that improved the accuracy of data repair during the missing period of measuring points. These methods are limited by monitoring technology and analysis theory: they mainly consider the geometric location of measuring points and cannot reasonably account for the influence of important environmental factors on the monitored effect quantities. The accuracy and physical meaning of such models need to be improved [30], and there are relatively few application cases in the repair of dam safety monitoring data.
In summary, this paper proposes a cokriging spatial model for repairing data based on the variable importance for projection (VIP) method. The model identifies the important environmental impact factors of different parts of the dam and, combined with the spatial layout of the measuring points for the same monitored effect variable, constructs a one-, two-, or three-dimensional spatial model with the measured values of the measuring points as the main variable and the important environmental impact factors as covariates. Extending dam safety monitoring data to the whole spatial domain of the dam in this way addresses two shortcomings of existing spatial models: they cannot take the influence of environmental quantities into account, and the selection of environmental impact factors relies on experience. The approach can significantly improve the accuracy of repairing abnormal or missing data and ensure that safety monitoring data truly reflect the safe operation of the dam.
2. Methodology
Reasonable selection of the main variables and covariates is key to the accuracy of the cokriging spatial model. In general, the effect variables are chosen as the main variables, while the key factors affecting them, which for dam safety monitoring are the environmental impact factors, are chosen as covariates. Hence, this paper proposes the VIP-cokriging spatial model for repairing dam safety monitoring data, which combines the variable importance for projection (VIP) method with the basic principle of cokriging. Firstly, the dimension of the spatial model is determined according to the spatial arrangement of similar monitoring instruments. Secondly, the VIP method is used to identify the important environmental impact factors for the effect variables, which are then used as covariates. Then, following the principle of cokriging, cross-covariance and cross-semivariance functions between multiple regionalized variables are modeled to characterize their correlation and to obtain unbiased, optimal estimates of the regionalized variables. To calculate the main-variable covariance function, the covariate covariance function, and the cross-covariance function, the exponential model, which has better universality, is selected to fit the covariance functions. Finally, using the fitted covariance functions, the measured values of spatially similar measuring points, the covariates, and the spatial location parameters are used to calculate spatially repaired values for anomalous or unmeasured points. The flow is shown in Figure 1. The specific steps of the method are described as follows:
2.1. Decision of the Construction Dimension of Spatial Model
The spatial position of a monitoring instrument is usually expressed by its offset from the dam axis, stake number (chainage), and elevation, which are essential parameters for constructing the VIP-cokriging spatial model. Depending on the spatial arrangement of similar measuring points, one-dimensional, two-dimensional, or three-dimensional models are generally considered. One-dimensional models are usually constructed along a measuring line, such as a one-dimensional model of the horizontal displacement of a gravity dam crest or of the uplift pressure on a sluice foundation. Two-dimensional models are often constructed for typical dam safety monitoring sections, such as a two-dimensional model of the internal horizontal displacement of an earth-rock dam or of the temperature in a gravity dam section. Three-dimensional models mainly target the effect variables of the whole dam or the entire dam area, such as a three-dimensional model of the surface deformation of an earth-rock dam or of seepage around the dam.
Kriging interpolation in three-dimensional space is complex and difficult to implement, and the covariance is a function only of the distance between points; cokriging is therefore applied here only in two-dimensional plane coordinate systems. Consequently, when constructing a three-dimensional model, the three-dimensional coordinates of the embedded monitoring instruments must be transformed into two-dimensional plane coordinates; that is, the measuring points are projected onto the model-building plane such that the distances between measuring points after the transformation remain basically consistent with those in the original coordinates. In practical engineering, the method for reducing three-dimensional coordinates to two dimensions should be selected according to the arrangement of the instruments so that the model remains effective after the transformation.
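The distance-preservation requirement can be checked numerically. The following is a minimal sketch, not the projection used in this paper: it assumes the measuring points lie close to a known plane (here a hypothetical 1:2 slope), expresses them in an orthonormal frame of that plane, and reports the largest relative change in pairwise distance. The function names, the coordinates, and the choice of plane are all illustrative assumptions.

```python
import math
from itertools import combinations

def project_to_plane(points, origin, u, v):
    """Express 3D points in the 2D frame of a plane through `origin`
    spanned by the orthonormal vectors `u` and `v`."""
    projected = []
    for p in points:
        d = [p[k] - origin[k] for k in range(3)]
        projected.append((sum(d[k] * u[k] for k in range(3)),
                          sum(d[k] * v[k] for k in range(3))))
    return projected

def max_distance_distortion(points3d, points2d):
    """Largest relative change in pairwise distance caused by the projection."""
    worst = 0.0
    for i, j in combinations(range(len(points3d)), 2):
        d3 = math.dist(points3d[i], points3d[j])
        d2 = math.dist(points2d[i], points2d[j])
        worst = max(worst, abs(d2 - d3) / d3)
    return worst

# Hypothetical points (offset from dam axis, chainage, elevation) on a 1:2 slope.
points = [(0.0, 0.0, 0.0), (20.0, 0.0, 10.0), (20.0, 50.0, 10.0), (40.0, 100.0, 20.0)]
origin = (0.0, 0.0, 0.0)
along_axis = (0.0, 1.0, 0.0)                            # unit vector along the dam axis
up_slope = (2 / math.sqrt(5), 0.0, 1 / math.sqrt(5))    # unit vector up the slope

on_slope = project_to_plane(points, origin, along_axis, up_slope)
plan_view = project_to_plane(points, origin, (1.0, 0.0, 0.0), along_axis)

slope_distortion = max_distance_distortion(points, on_slope)   # essentially zero
plan_distortion = max_distance_distortion(points, plan_view)   # elevation is lost
print(slope_distortion, plan_distortion)
```

In this toy layout, projecting onto the slope plane leaves the distances unchanged, whereas simply dropping the elevation shortens some of them by about a tenth, which illustrates why the projection plane should follow the instrument arrangement.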
2.2. Selection of Covariates
The degree of correlation between the main variables and the covariates affects the accuracy of the cokriging model, so the factors with the higher degree of influence should be selected as the covariates. The influencing factors of dam deformation usually include water pressure, rainfall, temperature, aging, and so on. Taking the earth-rock dam as an example, the deformation of earth-rock dam caused by the change of water pressure is mainly formed by the compression and shear of soil and the displacement of dam foundation. The effect of temperature on the deformation of earth-rock dam is not obvious, and the change of external temperature generally only affects the surface soil. Rainfall infiltration raises water level and changes soil water content, thus affecting dam deformation. The creep of rockfill, the compression and plastic deformation of fissure joints and other weak structures at the bottom of reservoir under the action of water pressure, and the irreversible deformation of dam caused by cyclic loads such as the rise and fall of water level in front of dam also increase with the increase in dam operation life. Generally, three covariates are selected at most, and the more covariates are selected, the more complex and time-consuming the calculation is. Therefore, selecting appropriate covariates is the key to improving the accuracy and computational efficiency of the cokriging spatial model.
In this paper, the variable importance for projection (VIP) method, based on partial least squares regression [31], was used to identify the degree of influence of the environmental factors. It effectively overcomes the unstable identification results and lack of spatial consistency commonly found in currently used methods such as the mathematical model method, the weighted area method, and grey correlation analysis [32]. The central idea of the method is that the explanatory ability of a principal component $t_h$ for the dependent variable $y$ is measured by the $R^2$ of the linear regression of $y$ on $t_h$. Since the explanatory effect of $x_j$ on $y$ is transmitted through the principal component $t_h$, if the explanatory ability of $t_h$ for $y$ is strong and the effect of $x_j$ on $t_h$ is very significant, the explanatory ability of $x_j$ for $y$ can be assumed to be strong. Therefore, environmental factors such as water pressure, temperature, rainfall, and aging are regarded as the independent variables $x_j$, and effect variables such as dam deformation and seepage are regarded as the dependent variable $y$. The variable importance for projection index $\mathrm{VIP}_j$ is then computed for each environmental factor separately, as seen in Equation (1). The values of water pressure, temperature, and rainfall are measured directly by instruments, and the number of days from the initial observation time to the observation time is taken as the aging value.

$$\mathrm{VIP}_j = \sqrt{\frac{p \sum_{h=1}^{m} R^2(y, t_h)\, w_{hj}^2}{\sum_{h=1}^{m} R^2(y, t_h)}} \tag{1}$$

where $p$ is the number of independent variables; $t_h$ ($h = 1, 2, …, m$) is the principal component extracted from the related independent variables; $R(y, t_h)$ is the correlation coefficient between the dependent variable and the principal component, which indicates the explanatory ability of the principal component for $y$; $w_{hj}$ is the weight of the independent variable $x_j$ on the principal component $t_h$.

The $\mathrm{VIP}_j$ value represents the importance of an independent variable to the fit of the dependent variable: the higher the $\mathrm{VIP}_j$ value, the more significant the contribution of the independent variable to the dependent variable and the more critical it is in the interpretation. Generally, if the $\mathrm{VIP}_j$ value of an independent variable is less than or equal to 0.8, its explanatory ability for $y$ can be ignored [33]. In this paper, the analysis of several practical engineering cases showed that introducing environmental factors with a $\mathrm{VIP}_j$ value greater than 0.8 as covariates significantly improved the accuracy of the spatial model, while introducing environmental factors with a $\mathrm{VIP}_j$ value less than 0.8 had little effect; see Section 3 for details. Therefore, $\mathrm{VIP}_j = 0.8$ was taken as the control threshold for introducing covariates; that is, the environmental quantities with a $\mathrm{VIP}_j$ value greater than 0.8 were taken as covariates in the construction of the cokriging spatial model.
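The screening step can be illustrated with a short sketch. It assumes the common single-response form of the index, VIP_j = sqrt(p · Σ_h SS_h · w_hj² / Σ_h SS_h), where SS_h is the sum of squares of the response explained by component h (proportional to the squared correlation between the response and that component) and w_hj is the unit-length PLS weight. The NIPALS-style routine, the function name `vip_scores`, and the synthetic data are illustrative assumptions, not the implementation or data used in this paper.

```python
import math

def vip_scores(X, y, n_components):
    """VIP of each column of X for a single response y (PLS1, NIPALS-style).
    Assumes every extracted component still has something left to explain."""
    n, p = len(X), len(X[0])
    cols = list(zip(*X))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1))
            for c, m in zip(cols, means)]
    # Standardize the factors and center the response.
    E = [[(row[j] - means[j]) / stds[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    f = [v - y_mean for v in y]
    weights, explained = [], []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]                      # unit-length weight vector
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        c = sum(t[i] * f[i] for i in range(n)) / tt    # regression of y on t
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        # Deflate the factors and the response before the next component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - c * t[i] for i in range(n)]
        weights.append(w)
        explained.append(c * c * tt)   # sum of squares of y explained by t
    total = sum(explained)
    return [math.sqrt(p * sum(s * wh[j] ** 2 for s, wh in zip(explained, weights)) / total)
            for j in range(p)]

# Synthetic factors: a steadily growing "aging" term and two periodic terms.
names = ["aging", "factor_b", "factor_c"]
aging = [float(i) for i in range(12)]
factor_b = [1.0, -1.0] * 6
factor_c = [0.0, 1.0, 2.0, 1.0] * 3
X = [list(row) for row in zip(aging, factor_b, factor_c)]
y = [2.0 * a + 0.5 * b for a, b in zip(aging, factor_b)]   # response dominated by aging

vip = vip_scores(X, y, n_components=2)
covariates = [name for name, v in zip(names, vip) if v > 0.8]   # 0.8 threshold
print(vip, covariates)
```

Because each weight vector has unit length, the squared VIP values always sum to the number of factors, so a value above 1 marks a factor of above-average importance; the 0.8 cut-off used here is the threshold adopted in the text.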
2.3. Construction of Model
Cokriging interpolation is an improved form of kriging interpolation that greatly improves accuracy by introducing one or more covariates (also referred to as auxiliary variables) that are closely related to the main variable. The method is based on the theory of co-regionalized variables and carries out spatial interpolation accordingly. The cokriging interpolation formula is given in Equation (2).

$$\hat{Z}_0 = \sum_{i=1}^{n} \lambda_i Z_i + \sum_{j} \sum_{i=1}^{n} \mu_{ji} Y_{ji} \tag{2}$$

where $Z_1, Z_2, …, Z_n$ and $Y_{j1}, Y_{j2}, …, Y_{jn}$ are the $n$ sample data for the main variable and the covariates, respectively; $j$ indexes the covariates; $\lambda_1, \lambda_2, …, \lambda_n$ and $\mu_{j1}, \mu_{j2}, …, \mu_{jn}$ are the cokriging weighting coefficients to be determined; $\hat{Z}_0$ is the repaired value of the random variable at location 0.
Taking the case of two variables as an example, if there are two regionalized variables $Z(x_i)$ and $Y(x_i)$ ($i = 1, 2, …, n$) related to the attribute in a region, then $\hat{Z}_0 = \sum_{i=1}^{n} \lambda_i Z(x_i) + \sum_{i=1}^{n} \mu_i Y(x_i)$. To satisfy the unbiasedness condition of the estimator, the weights of the main variable must sum to 1 and the weights of the covariate must sum to 0, as seen in Equation (3).

$$\sum_{i=1}^{n} \lambda_i = 1, \qquad \sum_{i=1}^{n} \mu_i = 0 \tag{3}$$

Based on the unbiasedness of the estimate and the minimization of the estimation variance, the Lagrange multiplier method is used to derive the cokriging equations, as seen in Equation (4).
It is expressed in matrix form as shown in Equation (5).
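As an illustration of how such a system is solved, the sketch below assembles the textbook ordinary cokriging system for one covariate in covariance form: the main-variable, cross, and covariate covariance blocks are bordered by the two unbiasedness constraints, and the right-hand side holds the covariances with the target location. It is assumed, not confirmed, that this corresponds to the form of Equations (4) and (5). The covariance functions, coordinates, and values are hypothetical; the covariances follow a simple linear model of coregionalization so that the system is well posed.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def cokriging_weights(xs, x0, c_zz, c_yy, c_zy):
    """Weights (lambda, mu) for one covariate observed at the same 1D points."""
    n = len(xs)
    size = 2 * n + 2                      # 2n weights + 2 Lagrange multipliers
    A = [[0.0] * size for _ in range(size)]
    b = [0.0] * size
    for i in range(n):
        for j in range(n):
            h = abs(xs[i] - xs[j])
            A[i][j] = c_zz(h)             # main-variable block
            A[i][n + j] = c_zy(h)         # cross-covariance blocks
            A[n + i][j] = c_zy(h)
            A[n + i][n + j] = c_yy(h)     # covariate block
        A[i][2 * n] = A[2 * n][i] = 1.0               # sum(lambda) = 1
        A[n + i][2 * n + 1] = A[2 * n + 1][n + i] = 1.0   # sum(mu) = 0
        h0 = abs(xs[i] - x0)
        b[i] = c_zz(h0)
        b[n + i] = c_zy(h0)
    b[2 * n] = 1.0
    sol = solve(A, b)
    return sol[:n], sol[n:2 * n]

# Hypothetical covariance models (two nested exponential structures).
c_zz = lambda h: 1.0 * math.exp(-h / 30.0) + 0.5 * math.exp(-h / 10.0)
c_yy = lambda h: 1.0 * math.exp(-h / 30.0)
c_zy = lambda h: 0.8 * math.exp(-h / 30.0)

xs = [0.0, 10.0, 25.0, 40.0, 60.0]       # positions along a measuring line
z = [12.1, 13.4, 15.0, 14.2, 11.8]       # main variable (made-up values)
y = [3.0, 3.6, 4.1, 3.9, 2.8]            # covariate (made-up values)

lam, mu = cokriging_weights(xs, 30.0, c_zz, c_yy, c_zy)
repaired = sum(l * v for l, v in zip(lam, z)) + sum(m * v for m, v in zip(mu, y))
print(repaired)
```

Two properties are useful as sanity checks on any implementation: the main-variable weights sum to 1 and the covariate weights sum to 0, and, without a nugget, placing the target on a measured point returns that point's own value.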
The cokriging method is based on the theory of co-regionalized variables. It models the cross-covariance and cross-semivariance functions between multiple regionalized variables to characterize their correlation and to obtain unbiased, optimal estimates of the regionalized variables. Because each environmental component affects the effect variables of different parts of the dam differently, the measured environmental values cannot be used directly as covariates in the construction of the model. In this paper, using the time series of the measured values at each measuring point and of the water pressure, temperature, rainfall, and aging, a non-linear fitting method was adopted to search globally for the most suitable non-linear expression of the measured values at each measuring point with respect to each environmental variable. The measured value of the environmental variable at the moment to be calculated was then substituted into the fitted expression to infer the change in the effect variable caused by that environmental variable, which was taken as the covariate in the calculation, as seen in Equation (6).

$$Y_{h,i}^{k+1} = f_{h,i}(h_{k+1}), \quad Y_{t,i}^{k+1} = f_{t,i}(t_{k+1}), \quad Y_{p,i}^{k+1} = f_{p,i}(p_{k+1}), \quad Y_{\theta,i}^{k+1} = f_{\theta,i}(\theta_{k+1}) \tag{6}$$

where $h$, $t$, $p$, and $\theta$ are the values of water pressure, temperature, rainfall, and aging; $f_{h,i}$ is the most suitable nonlinear expression for the time series of measured values at measuring point $i$ with respect to the time series of water pressure, and likewise for $f_{t,i}$, $f_{p,i}$, and $f_{\theta,i}$; $h_{k+1}$ is the value of the water pressure at moment $k + 1$, and likewise for $t_{k+1}$, $p_{k+1}$, and $\theta_{k+1}$; $Y_{h,i}^{k+1}$ is the water level covariate value of measuring point $i$ at moment $k + 1$, and likewise for $Y_{t,i}^{k+1}$, $Y_{p,i}^{k+1}$, and $Y_{\theta,i}^{k+1}$.
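A minimal sketch of this covariate construction is given below for the aging factor only. It assumes a single candidate form, value ≈ a + b·ln(1 + θ), fitted by ordinary least squares; the paper searches for the most suitable non-linear expression and does not state its form, so the functional form, the function names, and the synthetic series here are assumptions made purely for illustration.

```python
import math

def fit_log_trend(theta, values):
    """Least-squares fit of values ≈ a + b * ln(1 + theta)."""
    g = [math.log1p(t) for t in theta]
    g_mean = sum(g) / len(g)
    v_mean = sum(values) / len(values)
    b = (sum((gi - g_mean) * (vi - v_mean) for gi, vi in zip(g, values))
         / sum((gi - g_mean) ** 2 for gi in g))
    a = v_mean - b * g_mean
    return a, b

def aging_covariate(a, b, theta_next):
    """Evaluate the fitted expression at the moment to be repaired."""
    return a + b * math.log1p(theta_next)

# Synthetic series for one measuring point: days since first observation
# and a settlement-like value that follows the assumed form exactly.
theta = [0.0, 30.0, 60.0, 90.0, 120.0, 150.0]
values = [2.0 + 3.0 * math.log1p(t) for t in theta]

a, b = fit_log_trend(theta, values)
covariate_next = aging_covariate(a, b, 180.0)   # covariate at moment k + 1
print(a, b, covariate_next)
```

The same two steps, fitting on the history and evaluating at moment k + 1, would be repeated per measuring point and per selected environmental factor, each with its own fitted expression.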
To calculate the covariances, the main-variable covariance function, the covariate covariance function, and the cross-covariance function need to be fitted by selecting appropriate function models. The covariance functions and the parameters to be fitted are shown in Figure 2.
Covariance function models commonly used in kriging theory include the exponential model, the spherical model, and the Gaussian model [34]. The exponential model generalizes better than the spherical model, and the Gaussian model may fail to interpolate at certain locations. On balance, the exponential model was chosen to fit the covariance function, as seen in Equation (7), written here in semivariance form.

$$\gamma(h) = c_0 + c\left(1 - e^{-h/a}\right) \tag{7}$$

where $h$ is the distance between two points; $c_0$ is the nugget; $c_0 + c$ is the sill; $c$ is the partial sill; $a$ is not the range here, because when $h = 3a$ there is $1 - e^{-3} \approx 0.95 \approx 1$, so $\gamma(3a) \approx c_0 + c$, and the range is approximately $3a$. When $c_0 = 0$ and $c = 1$, it is called a standard exponential model.
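The behavior of the exponential model at three times its distance parameter can be checked directly. The sketch below assumes the usual semivariance form γ(h) = c0 + c·(1 − exp(−h/a)) for h > 0 and the stationary relation C(h) = sill − γ(h); the function and parameter names are illustrative.

```python
import math

def exponential_semivariance(h, nugget, partial_sill, a):
    """gamma(h) = nugget + partial_sill * (1 - exp(-h / a)) for h > 0; gamma(0) = 0."""
    if h == 0:
        return 0.0
    return nugget + partial_sill * (1.0 - math.exp(-h / a))

def exponential_covariance(h, nugget, partial_sill, a):
    """Covariance under second-order stationarity: C(h) = sill - gamma(h)."""
    sill = nugget + partial_sill
    return sill - exponential_semivariance(h, nugget, partial_sill, a)

a = 10.0
standard_at_3a = exponential_semivariance(3 * a, nugget=0.0, partial_sill=1.0, a=a)
cov_at_zero = exponential_covariance(0.0, nugget=0.2, partial_sill=1.0, a=a)
cov_at_3a = exponential_covariance(3 * a, nugget=0.2, partial_sill=1.0, a=a)
print(standard_at_3a, cov_at_zero, cov_at_3a)
```

For the standard model the semivariance at h = 3a is about 0.95 of the sill, which is why 3a rather than a is treated as the practical range.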
2.4. Calculation of Unrepaired Value
The optimal main variable covariance function, co-variable covariance function, and cross-covariance function are fitted by the exponential model, and the values of main variable covariance, covariate covariance, and cross-covariance are calculated by Equation (8) according to the known distance between any two points. Then, the calculated results are brought into Equation (4), and the weights of each measuring point and each covariate are obtained by solving
where
is the variance in the region
.
Finally, using the measured values Z1, Z2, …, Zn and their weights λ1, λ2, …, λn, the covariate values Yj1, Yj2, …, Yjn, and their weights μj1, μj2, …, μjn for spatially similar intact measuring points, the repair values of data abnormal or missing measuring points can be calculated using Equation (2).
3. Discussion on the Threshold of Covariate Introduction
Reasonable selection of covariates affects the model’s accuracy and computational efficiency, which is the key to constructing a cokriging spatial model. In order to ensure universality, the one-dimensional and the two-dimensional cokriging spatial model for different projects were constructed in this section. At the same time, the influence of different covariate combinations on the accuracy of the model was deeply analyzed, and the rationality of setting the VIPj control threshold of 0.8 was demonstrated.
The error analysis of a single measuring point adopted the most commonly used cross-validation method [35]. For a given measurement time with n measuring points in total, n spatial models were obtained by leaving out the measured value of each measuring point in turn as the validation set and using the measured values of the other n − 1 measuring points as the training set. The cross-validation results of all measuring points for the monitored effect quantity were thus obtained, and the error analysis of each measuring point was carried out.
The overall error analysis summarized the cross-validation results of all measuring points in each measurement by the mean absolute error (MAE), treating the error sequence of each measurement as a whole, as seen in Equation (9). The lower the MAE, the higher the overall accuracy.

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| z_i - \hat{z}_i \right| \tag{9}$$

where $z_i$ is the measured value of measuring point $i$; $\hat{z}_i$ is the repaired value of measuring point $i$.
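The leave-one-out procedure and the MAE can be sketched as follows. The predictor is passed in as a function so that any spatial model can be evaluated the same way; the inverse distance weighting predictor used here is only a simple stand-in (it is not the VIP-cokriging model), and the positions and values are made up.

```python
def leave_one_out(positions, values, predict):
    """Repair each point in turn from the remaining n - 1 points."""
    repaired = []
    for i in range(len(values)):
        train_pos = positions[:i] + positions[i + 1:]
        train_val = values[:i] + values[i + 1:]
        repaired.append(predict(train_pos, train_val, positions[i]))
    return repaired

def mean_absolute_error(measured, repaired):
    """MAE = (1 / n) * sum(|z_i - z_hat_i|)."""
    return sum(abs(m - r) for m, r in zip(measured, repaired)) / len(measured)

def idw_predict(train_pos, train_val, target, power=2):
    """Stand-in predictor: inverse distance weighting on a 1D line.
    Assumes the target does not coincide with a training position."""
    weights = [1.0 / abs(target - p) ** power for p in train_pos]
    return sum(w * v for w, v in zip(weights, train_val)) / sum(weights)

positions = [0.0, 1.0, 2.0, 3.0, 4.0]       # hypothetical line of 5 points
values = [0.0, 10.0, 20.0, 30.0, 40.0]      # hypothetical measured values

repaired = leave_one_out(positions, values, idw_predict)
errors = [abs(m - r) for m, r in zip(values, repaired)]
mae = mean_absolute_error(values, repaired)
print(errors, mae)
```

Even in this toy case the end point, which has neighbors on one side only, is repaired far less accurately than the middle point.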
3.1. Analysis of the Influence of Covariates Selection on the Accuracy of the One-Dimensional Spatial Model
A survey line was arranged at an elevation of 476.5 m on the upstream side of a concrete face rockfill dam to monitor the vertical deformation of the surface, with a total of 10 measuring points numbered SA6-1 to SA6-10. All measuring points are intact, and their arrangement and time-history curves are shown in Figure 3 and Figure 4. The VIP method was used to identify the influencing factors of the effect variable at each measuring point. The important influencing factors were found to be the same at every measuring point: aging and temperature had a great influence on vertical deformation, with $\mathrm{VIP}_j$ values exceeding 0.8; water pressure followed, with $\mathrm{VIP}_j$ values ranging from 0.25 to 0.64; and rainfall had the least influence, with $\mathrm{VIP}_j$ values below 0.3, as shown in Figure 5.
For ease of comparison, spatial cokriging models with one factor (aging), two factors (aging and temperature), and three factors (aging, temperature, and water pressure) were constructed from the measured values of each measuring point. The single-point error analysis was conducted for the measured values on 24 December 2020, as shown in Table 1. Among the environmental variables, aging is a monotonically increasing function, whereas water level, temperature, and rainfall are periodic functions with a period of one year. Therefore, the overall error analysis was conducted for the measured values from January 2020 to December 2020 (owing to the COVID-19 pandemic, there was no measured value in February 2020), as shown in Figure 6. Except for SA6-1 and SA6-10, the repair accuracy of the measuring points was ideal, with absolute errors within 5 mm and relative errors within 3%. Depending on the position of the point to be repaired relative to the group of known points, repair can be divided into interpolation and extrapolation: interpolation repairs unknown measuring points within the range of the known points, while extrapolation repairs unknown measuring points outside that range. SA6-1 and SA6-10 were located at the leftmost and rightmost ends of the line, near the two banks. Hence, during cross-validation, their values had to be extrapolated from the other deformation measuring points, with fewer surrounding points to refer to, which led to larger errors. By contrast, SA6-3, SA6-7, and SA6-9 were located in the middle of the group of measuring points, and their repair errors were very small. According to the single-point error analysis, the accuracy of the aging-temperature two-factor model was clearly better than that of the aging single-factor model, and the absolute error of each measuring point was reduced by more than 35%. The accuracy of the aging-temperature-water pressure three-factor model was slightly lower than that of the two-factor model; for about 60% of the measuring points, the three-factor model was less accurate than the two-factor model. The overall error analysis showed that the mean absolute error of the two-factor model was 3.85~4.08 mm, on average 44% lower than that of the single-factor model (6.59~7.37 mm). The mean absolute error of the three-factor model, ranging from 3.90 mm to 4.20 mm, was slightly higher than that of the two-factor model. Therefore, introducing the temperature factor, with a $\mathrm{VIP}_j$ greater than 0.8, as a covariate significantly improved the accuracy of the model. Nevertheless, when the water level factor, with a $\mathrm{VIP}_j$ less than 0.8, was added on this basis, only measuring points SA6-2, SA6-4, and SA6-8 showed a small improvement in accuracy, while the other measuring points showed a decline.
To verify the stability of the aging-temperature two-factor VIP-cokriging model, one measurement per year was selected from the monitoring sequence for cross-validation. The calculation results are shown in Figure 4. Except for the edge measuring points SA6-1 and SA6-10, the repair accuracy of the measuring points was ideal, and the relative error was basically controlled within 5%. This further showed that the accuracy of the model was relatively stable: the model did not perform well in one period and poorly in another.
3.2. Analysis of the Influence of Covariates Selection on the Accuracy of Two-Dimensional Spatial Model
One internal horizontal displacement measuring line was set at each of the elevations 346.00 m, 379.00 m, 404.00 m, and 445.00 m on the left 0 + 008.20 m section of a concrete face rockfill dam, with a total of 20 measuring points, numbered EXa1-1~EXa1-2, EXa2-1~EXa2-4, EXa3-1~EXa3-6, and EXa4-1~EXa4-8, as shown in Figure 7.
The variable importance for projection method was used to identify the influencing factors of the effect variable at each measuring point. The important influencing factors were found to be consistent across the measuring points. The effects of aging and water pressure were more pronounced, with $\mathrm{VIP}_j$ values all greater than 1, while the effects of temperature and rainfall were relatively small, with $\mathrm{VIP}_j$ values mostly between 0.4 and 0.6, as shown in Figure 8.
As above, spatial cokriging models with one factor (aging), two factors (aging and water pressure), and three factors (aging, water pressure, and rainfall) were constructed from the measured values of each measuring point. The single-point error analysis was conducted for the measured values on 11 June 2020, as shown in Table 2. The overall error analysis was conducted for the measured values from January 2020 to December 2020, as shown in Figure 9. The maximum relative error appeared at EXa3-4, which was attributed to its small measured value. In terms of interpolation and extrapolation, although measuring points such as EXa1-1 and EXa4-1 were marginal points, there were more intact measuring points around them, so extrapolation with the two-dimensional model performed better than with the one-dimensional model. According to the single-point error analysis, the accuracy of the two-factor cokriging model was significantly better than that of the single-factor cokriging model, with the absolute error of each measuring point reduced by more than 80%. The accuracy of the three-factor cokriging model was comparable to that of the two-factor cokriging model: the reduction in relative error was not significant, and for measuring points EXa2-1 and EXa3-1 the two-factor model was slightly more accurate than the three-factor model. The overall error analysis showed that the mean absolute error of the two-factor model (0.07~0.23 mm) was significantly lower than that of the single-factor model (3.95~4.22 mm), while the mean absolute error of the three-factor model (0.06~0.23 mm) differed little from that of the two-factor model. Thus, the effects of introducing different covariates were consistent between the one-dimensional and two-dimensional models.
Based on the above, taking all environmental variables with a VIPj greater than 0.8 as covariates in the cokriging spatial model greatly improved the accuracy of the model. However, introducing a factor with a VIPj value less than 0.8 as a covariate had no obvious effect on accuracy and might even reduce it at some measuring points, while significantly increasing the amount of calculation and reducing efficiency. Therefore, considering both the accuracy and the computational efficiency of the model, it was reasonable to introduce only environmental factors with a VIPj value greater than 0.8 as covariates of the cokriging spatial model.
4. Verification and Analysis of Model Accuracy
To further analyze and validate the rationality and validity of the VIP-cokriging spatial model, the one-dimensional model case from Section 3.1 was used as an example in this section. The aging-temperature two-factor cokriging spatial model was compared with the inverse distance weighting model [36], the ordinary kriging model [37], and the universal kriging model [38], which are commonly used at present in the construction of spatial models for dam safety monitoring.
In this paper, the data of seven measurements from June 2020 to December 2020 on this measuring line were cross-validated. The single-point error statistics at typical moments are shown in Table 3, and the overall error is shown in Figure 10. The inverse distance weighting model performed worst, with the maximum relative error at the edge measuring points reaching 178.73%, the relative error at non-edge measuring points mostly exceeding 10%, and a mean absolute error of 22.09 to 23.54 mm. The likely reason is that the method considers only the influence of distance on the unknown points while ignoring the correlation between measuring points. The ordinary kriging model took into account the spatial autocorrelation between measuring points, and its accuracy was significantly better than that of the inverse distance weighting model. The relative errors of its non-edge measuring points were controlled within 10%, and its mean absolute error was 16.74~17.79 mm. This method assumes that the variable is stationary in the region, that is, that there is no trend, whereas the vertical displacement of the dam surface generally shows a growing trend. Therefore, the universal kriging model, which introduces a trend over the whole spatial distribution, performed relatively well, with a mean absolute error of 11.46 to 12.68 mm.
Overall, the VIP-cokriging model had the highest accuracy. The relative error of the non-edge measuring points was mostly controlled within 3%, and the relative error of the edge measuring points was lower than in the other three models. Its mean absolute error was 3.85 to 4.08 mm, which was 83%, 77%, and 67% lower than that of the inverse distance weighting model, the ordinary kriging model, and the universal kriging model, respectively. The main reason was that cokriging introduced one or more auxiliary environmental variables that were strongly correlated with the effect variables, and exploiting the cross-correlation between them effectively improved the accuracy of extending the main variable to unknown points, which compensated for the insufficient number of measuring points.
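The quoted reductions can be reproduced from the reported error ranges. The short check below assumes that each percentage compares the midpoints of the mean absolute error ranges; this is an assumption about how the figures were derived, and the helper names are illustrative.

```python
def midpoint(low, high):
    """Midpoint of a reported error range (mm)."""
    return (low + high) / 2.0

def reduction_percent(baseline, model):
    """Relative reduction of `model` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - model) / baseline

vip_cokriging = midpoint(3.85, 4.08)
baselines = {
    "inverse distance weighting": midpoint(22.09, 23.54),
    "ordinary kriging": midpoint(16.74, 17.79),
    "universal kriging": midpoint(11.46, 12.68),
}
reductions = {name: round(reduction_percent(mae, vip_cokriging))
              for name, mae in baselines.items()}
print(reductions)
```

Under that assumption the rounded reductions come out at 83%, 77%, and 67%, matching the values stated above.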
5. Engineering Application
The PBG gravel-soil core wall rockfill dam, with a maximum height of 186 m, was completed and began impounding water for power generation in 2009. Its internal vertical displacement was monitored by hydraulic overflow settlement gauges, and the measuring points of the typical profile (profile 0 + 240 m) were arranged as shown in Figure 11. Owing to the removal of the temporary dam crest and the construction of the permanent dam crest in 2012, many measurements were missing between 2012 and 2013. Since the automation upgrade was completed in August 2013, the continuity of the measured values has been good. However, some problems remained, such as unstable measured values at some measuring points and blockage of instrument pipelines. For example, CH10 and CH13 at an elevation of 725 m were damaged, and their measurements stopped in 2015. The measured value of CH14 was unstable, with an anomalous steep step from July 2013 to July 2014. At present, its measured value is close to the measuring range of the instrument, and its reliability is poor, as shown in Figure 12.
To grasp the overall distribution of vertical displacement at profile 0 + 240 m, the original monitoring data of all measuring points were used to draw an isoline map of settlement deformation at a typical high-water-level moment (12 October 2017), as shown in
Figure 13a. The figure shows that the settlement of the profile increased downward from the middle of the dam body, with the maximum settlement occurring just above the dam foundation. This is inconsistent with the general rule that the maximum settlement of a gravel-soil core wall rockfill dam occurs at 1/3 to 2/3 of the dam height. The likely causes are the lack of measuring points at the 731 m elevation and the unreliable measurements of CH14. To avoid misjudging the dam's operating behavior, a two-dimensional VIP-cokriging spatial model of the internal vertical displacement at profile 0 + 240 m was constructed. The model was used to fill in the missing measured values of CH10 and CH13 and to repair the measured values of CH14.
Based on the analysis of the variable importance for projection, shown in
Figure 14, aging and water pressure were selected as covariates to construct the cokriging spatial model. To further verify the accuracy and stability of the model, cross-validation was carried out at moments when all measuring points were intact. Except for CH8 and CH16, the relative errors of the measuring points ranged from 0.24% to 1.90%; in particular, the relative errors at CH10, CH13, and CH14 were small and within 2%. The calculation results are shown in
Figure 15 and
Table 4. The small errors at these points are likely because CH13 and CH14 are internal measuring points and CH10 is close to, and surrounded by, several known measuring points. It was therefore feasible to use the VIP-cokriging spatial model to repair the data at these measuring points.
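The cross-validation procedure described above can be sketched as a leave-one-out loop: each measuring point is withheld in turn, re-estimated from the remaining points, and scored by its relative error. The sketch is only an illustration of the procedure. It uses inverse distance weighting as a stand-in interpolator rather than VIP-cokriging, and the coordinates and settlement values are hypothetical, not the PBG dam data.

```python
import numpy as np

def idw_predict(xy, z, target, power=2.0):
    """Stand-in interpolator (inverse distance weighting), not VIP-cokriging."""
    d = np.linalg.norm(xy - target, axis=1)
    if np.any(d == 0.0):
        return float(z[np.argmin(d)])
    w = 1.0 / d**power
    return float(w @ z / w.sum())

def leave_one_out(xy, z, predict):
    """Withhold each point in turn, re-estimate it from the rest, and score it."""
    n = len(z)
    estimates = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        estimates[i] = predict(xy[keep], z[keep], xy[i])
    abs_err = np.abs(estimates - z)
    rel_err_percent = 100.0 * abs_err / np.abs(z)
    return rel_err_percent, float(abs_err.mean())

# Hypothetical profile: (horizontal offset in m, elevation in m) and settlement in mm.
xy = np.array([[-60.0, 700.0], [-20.0, 700.0], [20.0, 700.0],
               [60.0, 700.0], [-30.0, 730.0], [30.0, 730.0]])
z = np.array([850.0, 1200.0, 1150.0, 800.0, 1000.0, 950.0])

rel_err, mae = leave_one_out(xy, z, idw_predict)
for i, e in enumerate(rel_err):
    print(f"point {i}: relative error {e:.2f}%")
print(f"mean absolute error: {mae:.2f} mm")
```

Any interpolator with the same `predict(xy, z, target)` signature can be passed to `leave_one_out`, which makes it straightforward to compare models on identical withheld points.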
The time-history curves of the repaired measuring points are shown in
Figure 12, and the isoline map of the vertical displacement of the profile is shown in
Figure 13b. The repaired values followed the same pattern of change as the values measured before 2012, with overall convergence and stabilization and a high restoration accuracy. The overall pattern of vertical deformation of the profile showed that, at the same elevation, the internal settlement was greater than the settlement at the surface. The settlement was mainly controlled by the section thickness: the closer to the core wall, the greater the section thickness and the greater the settlement relative to other locations at the same elevation. The maximum settlement occurred in the middle and lower part of the dam, at about 1/3 to 2/3 of the dam height, which is consistent with the general settlement deformation law for gravel-soil core wall rockfill dams.
6. Conclusions
Existing spatial models consider only the geometric position of measuring points, and data repair mostly relies on time series models, both of which may lead to poor accuracy. To address these problems, this paper proposed a spatial model for repairing dam safety monitoring data based on VIP-cokriging. Constructing the model involved determining the construction dimension of the spatial model, selecting covariates, fitting the covariance function, and calculating the values to be repaired. Finally, the accuracy and applicability of the model were verified with engineering examples, and the following conclusions can be drawn:
(1) To address the problem that spatial models for repairing dam safety monitoring data do not consider the correlation with environmental variables, a cokriging spatial model based on the variable importance for projection was proposed. First, the construction dimension of the spatial model was determined according to the spatial arrangement of monitoring instruments of the same type. Second, on the basis of partial least squares regression, principal component analysis and canonical correlation analysis were introduced to decompose and filter the information, and the important environmental impact factors were identified and used as covariates. Finally, the weights of the intact measuring points of the same type and the weights of the covariates, both obtained by fitting the covariance function, were substituted into the model to calculate the repaired values at the abnormal or missing measuring points.
(2) Using engineering cases of the one-dimensional and two-dimensional models, the relationship between the accuracy of the VIP-cokriging model and the selection of covariates was analyzed. Comparing the repair accuracy of different covariate schemes demonstrated that a VIPj control threshold of 0.8 is reasonable. Including environmental variables with a VIPj value greater than 0.8 in the calculation significantly improved the model's accuracy, whereas including variables below the threshold increased the amount of calculation and reduced efficiency without improving accuracy. Moreover, the VIP-cokriging model showed strong applicability and high accuracy at the different moments examined.
(3) The VIP-cokriging spatial repair model greatly improves the data repair effect and has the advantages of high precision and strong applicability. By taking into account the influence of important environmental variables and using the cross-correlation between environmental variables and effect variables, VIP-cokriging effectively improved the spatial expansion accuracy of the main variable at the unknown points and reduced the error range at the repair points. Compared with the inverse distance weighting model, the ordinary kriging model, and the universal kriging model, the repair error was reduced by 83%, 77%, and 67%, respectively.
(4) To address the damaged monitoring points and unstable measured values at profile 0 + 240 m of the PBG dam, a two-dimensional VIP-cokriging spatial model was adopted to extend the monitoring effect variables to the whole area of the profile. The model related the existing monitoring points to unmonitored spatial locations in the dam on a sound basis and solved the problem of repairing missing or abnormal data at key measuring points. It helps to grasp the overall distribution of dam deformation reasonably and to avoid misjudging the safe operation of the dam, and it has good application value in engineering.
(5) Constructing a three-dimensional model requires the three-dimensional coordinates of the monitoring instruments, but the cokriging method used here is suited only to a two-dimensional plane coordinate system, and kriging interpolation in three-dimensional space is complex and difficult to implement. The covariance in kriging interpolation is a function of the distance between measuring points. When a three-dimensional dam safety monitoring model is constructed, the three-dimensional coordinates can be transformed into two-dimensional coordinates, but this transformation may lose the anisotropy of the monitoring effect variables, so that the calculated results may not reflect the real state of the dam. Therefore, follow-up studies should extend the cokriging method to three-dimensional space to build a more complete three-dimensional repair model.
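The covariate screening summarized in conclusions (1) and (2) can be illustrated with a minimal sketch of how variable importance for projection scores are obtained from a partial least squares regression and compared with the 0.8 threshold. It uses the standard VIP definition for a single-response model fitted with the NIPALS algorithm, which may differ in detail from the formulation used in this paper. The function name `pls1_vip`, the number of components, and the synthetic data are assumptions for illustration only.

```python
import numpy as np

def pls1_vip(x, y, n_components=2):
    """VIP scores from a single-response PLS regression fitted with NIPALS."""
    x = (x - x.mean(axis=0)) / x.std(axis=0)   # standardize predictors
    y = (y - y.mean()) / y.std()               # standardize response
    n_vars = x.shape[1]
    weights = np.empty((n_vars, n_components))
    explained = np.empty(n_components)         # response variance captured per component
    for h in range(n_components):
        w = x.T @ y
        w /= np.linalg.norm(w)                 # unit-length weight vector
        t = x @ w                              # component scores
        tt = t @ t
        p = x.T @ t / tt                       # predictor loadings
        q = y @ t / tt                         # response loading
        x = x - np.outer(t, p)                 # deflate predictors
        y = y - q * t                          # deflate response
        weights[:, h] = w
        explained[h] = q**2 * tt
    return np.sqrt(n_vars * (weights**2 @ explained) / explained.sum())

# Synthetic data, built so that only the first two factors drive the response.
rng = np.random.default_rng(0)
factors = rng.normal(size=(500, 4))
response = 3.0 * factors[:, 0] + 2.0 * factors[:, 1] + 0.1 * rng.normal(size=500)

vip = pls1_vip(factors, response)
selected = [j for j, v in enumerate(vip) if v > 0.8]   # threshold quoted in the text
print("VIP:", np.round(vip, 2))
print("selected factor indices:", selected)
```

Because each weight vector has unit length, the squared VIP scores sum to the number of candidate variables, so a score near 1 marks a variable of average importance. This is one way to read a threshold of 0.8 as a cut-off slightly below average importance.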