Water Level Simulation in River Network by Data Assimilation Using Ensemble Kalman Filter

Chen, Yifan; Cao, Feifeng; Meng, Xiangyong; Cheng, Weiping

doi:10.3390/app13053043

Open AccessArticle

Water Level Simulation in River Network by Data Assimilation Using Ensemble Kalman Filter

by

Yifan Chen

^1,2

,

Feifeng Cao

^2,*,

Xiangyong Meng

³ and

Weiping Cheng

⁴

¹

Zhejiang Provincial Key Laboratory of Hydraulic Disaster Prevention & Mitigation, Hangzhou 310020, China

²

College of Civil Engineering, Zhejiang University of Technology, Hangzhou 310023, China

³

Zhejiang Institute of Hydraulic & Estuary, Hangzhou 310020, China

⁴

College of Civil Engineering and Architecture, Zhejiang University, Hangzhou 310058, China

^*

Author to whom correspondence should be addressed.

Appl. Sci. 2023, 13(5), 3043; https://doi.org/10.3390/app13053043

Submission received: 21 January 2023 / Revised: 22 February 2023 / Accepted: 22 February 2023 / Published: 27 February 2023

(This article belongs to the Section Civil Engineering)

Download

Browse Figures

Versions Notes

Abstract

:

Featured Application

The new data assimilation method based on the ensemble Kalman filter with the optimized parameters produces accurate water level predictions in river networks; it is thus a potential model for the real-time prediction of water levels in river networks in practice.

Abstract

Water level simulation for complex water river networks is complex, and existing forecasting models are mainly used for single-channel rivers. In this paper, we present a new data assimilation model based on the ensemble Kalman filter (EnKF) for accurate water level simulation in complex river networks. The EnKF-based data model was tested on simulated water level data from a river network hydrodynamic model and optimized through parameter analysis. It was then applied to a real mountainous single-channel river and plain river network and compared with a data assimilation model based on the extended Kalman filter (EKF). The results showed that the EnKF-based model, with a medium ensemble sample size of 100–150, normal observation noise of 0.0001–0.01 m, and a high standard deviation of 0.01–0.1 m, outperformed the EKF-based model, with a 49% reduction in simulation errors, a 45% reduction in calculation cost, and a 43% reduction in filtering time. Furthermore, the EnKF-based data assimilation model predicted the water level in the plain river network better than the mountainous single-channel river. Around 5 to 8 h were required for data assimilation; afterwards, the model could make accurate predictions covering 20 to 30 h. The EnKF-based data assimilation model offers a potential solution for water level predictions in river networks.

Keywords:

water level simulation; ensemble Kalman filter; extended Kalman filter; data assimilation; water river network; parameter analysis

1. Introduction

River networks play an important role in watershed landscapes due to their contributions to the watershed’s hydrological processes. Compared to single open-channel rivers, they are more complex and more sophisticated hydraulic simulation models are required in order to solve unsteady flow problems in river networks. In addition, due to human activity, the river network structure of basins has changed significantly, causing spatial and temporal differences in hydrological connectivity [1]. The complex nature of river networks produces challenges in flow modeling, particularly in flood forecasting for a river network. As a result, the theories and corresponding algorithms for simulating and forecasting hydraulic flow in river networks have not been well established. Although flood forecasting with machine learning or deep learning models has been attempted in relation to flood simulations [2,3,4,5], a large amount of data are required as inputs for the sufficient training of model parameters. The large number of parameters and the overfitting of parameters are also great concerns when attempting to realize accurate real-time predictions [6].

The existing one-dimensional Saint-Venant equations, which include a series of nonlinear partial differential equations [7], are generally used to solve the hydrodynamic process by means of the river network hierarchical differential method [8]. However, many factors, such as the uncertainty of the model parameters and the complex boundary conditions, limit the application of the one-dimensional Saint-Venant equations for river networks; the corresponding models fail to adjust modeling errors with the data measured in real time. The use of Kalman filter (KF) real-time correction technology to improve the reliability of water simulations and the forecast accuracy of hydrodynamic models has become an important research topic in the current approach towards flood forecasting [9,10,11,12,13]. To improve the accuracy of the model, many studies have combined KF with other models. For example, Madsen and Skotner [14] developed a river flood correction model by introducing the form of the spatial distribution of the gain information into the model’s error correction process, which was used to improve the form of the dynamic noise interfering matrix of the original system. Ziliani et al. [15] combined the ensemble Kalman filter (EnKF) with a shallow water equation (SWE) to reconstruct and predict a dam break event that was experimentally reproduced. Their study indicated that filter results could be made more accurate via the estimation of parameters alongside the state. Furthermore, some studies have adopted different types of parameters to improve flood forecasting. Meng et al. [16] established a data assimilation scheme based on the EnKF and a smoother containing two parameters: soil moisture and streamflow, particularly considering the time delay. By exploiting the benefits of soil moisture and streamflow, the model showed great potential for short-term flood forecasting by merging observations from ground measurements and remote-sensing data retrieval.

In addition, EKF and EnKF have been used to estimate both states and parameters in stochastic differential mixed-effect models with observation noise. These algorithms are efficient, easy to implement, and require less computation time than similar methods, such as the SAEM algorithm [17]. Given that the forecasting accuracy of a hydraulic model for river networks is mainly influenced by hydraulic states, such as discharge and model parameters such as bed roughness, the combination of state and parameter assimilation has been used in hydrological data assimilation studies [12,17]. However, it is unlikely that the joint estimation of state and parameters will fully replace batch calibration.

Due to the advantages of EnKF and EKF, they are also commonly used in many other fields for numerical weather prediction, stochastic signal processing, and Bayesian state estimation. These fields often face high-dimensional problems and require scalable algorithms to process and make sense of the data. A comparison showed that EKF was more accurate and faster in optimizing the magnetic particle tracking (MPT) method in particle tracking [18]. Similarly, a modified EnKF was established to deal with nonlinearity in the multiple-target tracking (MTT) and multi-sensor fusion (MSF) scenarios, which was more accurate than traditional KF [19]. In stochastic signal processing (SP), high-dimensional problems are becoming more relevant in applications such as simultaneous localization and mapping (SLAM) and extended target tracking, where the state vector can contain many parameters. EnKF can be applied to these problems as it can handle nonlinear and non-Gaussian state-space models [20]. EnKF has also been applied in reservoir engineering, where the pressure, saturation, and temperature at different grid cells in the reservoir represent the state of the system. EnKF estimates the state of the system by integrating measurements from various sources such as production data, seismic data, and well-log data. EnKF has the benefit of allowing for the prediction of reservoir performance by running the simulator in a forward direction from the latest data assimilation time (analysis step) with each ensemble member [13].

One of the key aspects in river network research is flood forecasting. Flood simulation and forecasting in plain river network areas are greatly affected by the multi-level spatial river network structure, the complex water movement, the intervention of an upper dam or gate regulation, as well as the influence of uncertain factors such as boundary input conditions, riverbed variations, and model parameters. There are still more difficulties and challenges in such areas compared to single open-channel rivers in terms of calculation efficiency, the data treatment (e.g., data assimilation) process and degree, and forecasting periods (a few hours to days ahead). To date, very few studies have been conducted to investigate the performance of KF or its variants on the water flow simulation of river networks and, where available, they have simplified the river networks with the use of limited river branches and nodes [21], or short-term flow forecasting was carried out [9]. For example, Munier et al. [9] developed a lumped hydrological model to propagate the discharge through the Seine river basin in France and KM was used to assimilate the discharge data to improve the simulation accuracy of flow forecasting within 72 h. However, the simulation results could be significantly influenced by the degree of river network spatialization and the spatial distribution.

In this study, an EnKF algorithm was developed for performing data assimilation on simulated water level data from a river network hydraulic model for real-time water level prediction. A new data assimilation scheme was proposed for applying EnKF to simulated water level data to improve the simulation capacity, followed by a parameter analysis conducted on a simulated river network in order to understand and optimize the key parameter values. Based on the optimized parameter data, an experimental evaluation was conducted on a real single-channel river and a real plain river network with an area of 325 km². In addition, the data assimilation results obtained with EnKF were compared to those obtained with the classic Kalman filter for hydraulic simulation, i.e., the extended Kalman filter (EKF), with regard to their simulation accuracy, calculation cost, and data assimilation efficiency. In this study, we aimed to provide a data assimilation scheme using EnKF to improve the real-time simulation capacity of a river network hydraulic model.

2. One-Dimensional Hydrodynamic Model of a River

Saint-Venant equations are often used to describe one-dimensional flow, as follows.

The continuous equation:

\frac{\partial A}{\partial t} + \frac{\partial Q}{\partial x} = q_{L}

(1)

The momentum equation:

\frac{\partial Q}{\partial t} + \frac{\partial}{\partial x} (\frac{α Q^{2}}{A}) + g A \frac{\partial Z}{\partial x} + g A \frac{Q |Q|}{K^{2}} = q_{L} v_{x}

(2)

where A is the cross-section area; t is time; Q is the average flow of the cross-section; x is displacement along the water flow; q_L is the lateral streamflow per unit length; a is the momentum correction coefficient; g is the acceleration of gravity; Z is the water level of the section; K is the flow modulus, K = AR^2/3/n, where R is the hydraulic radius of the river and n is the Manning coefficient; and v_x is the velocity component of the lateral stream flow along the water flow, usually taken as v_x = 0.

Saint-Venant equations are hyperbolic partial differential equations for which analytical solutions are not yet available. The finite difference method is an effective numerical discretization method for solving the Saint-Venant equations and is divided into two schemes: the explicit difference scheme and the implicit difference scheme. Although the explicit difference scheme does not require the joint solution of equations, the time step and space step are limited by strict adherence to the “Courant condition” in order to maintain stability, whereas the implicit difference scheme is more flexible in the selection of the time and space step and is therefore popularly used. In this study, the Preissmann four-point implicit difference scheme were adopted to discretize the equations, and the space-time discretization schemes of the continuous and momentum equations were obtained as follows:

Difference form for the continuous equations:

\begin{array}{r} \frac{θ (Δ B_{i + 1} + Δ B_{i}) + (B_{i}^{t} + B_{i + 1}^{t})}{2} \cdot \frac{Δ Z_{i + 1} + Δ Z_{i}}{2 Δ t} + (θ \frac{Δ Q_{i + 1} - Δ Q_{i}}{Δ x} + \frac{Q_{i + 1}^{t} - Q_{i}^{t}}{Δ x}) \\ = \frac{θ (Δ q_{L i} + Δ q_{L i + 1}) + (q_{L i}^{t} + q_{L i + 1}^{t})}{2} \end{array}

(3)

Difference form for the momentum equations:

\begin{array}{l} \frac{Δ Q_{i + 1} + Δ Q_{i}}{2 Δ t} + \frac{θ}{Δ x} [\frac{{(Q_{i + 1}^{t} + Δ Q_{i + 1})}^{2}}{A_{i + 1}^{t} + Δ A_{i + 1}} - \frac{{(Q_{i}^{t} + Δ Q_{i})}^{2}}{A_{i}^{t} + Δ A_{i}}] + \frac{1 - θ}{Δ x} [\frac{{(Q_{i + 1}^{t})}^{2}}{A_{i + 1}^{t}} - \frac{{(Q_{i}^{t})}^{2}}{A_{i}^{t}}] + \\ [\frac{g θ}{2} (Δ A_{i + 1} + Δ A_{i}) + \frac{g}{2} (A_{i + 1}^{t} + A_{i}^{t})] [\frac{θ}{Δ x} (Δ Z_{i + 1} - Δ Z_{i}) + \frac{1}{Δ x} (Z_{i + 1}^{t} - Z_{i}^{t})] + \\ \frac{g θ}{2} [\frac{(A_{i + 1}^{t} + Δ A_{i + 1}) (Q_{i + 1}^{t} + Δ Q_{i + 1}) | Q_{i + 1}^{t} + Δ Q_{i + 1} |}{{(K_{i + 1}^{n} + Δ K_{i + 1})}^{2}} + \frac{(A_{i}^{t} + Δ A_{i}) (Q_{i}^{t} + Δ Q_{i}) | Q_{i}^{t} + Δ Q_{i} |}{{(K_{i}^{n} + Δ K_{i})}^{2}}] + \\ \frac{g (1 - θ)}{2} [\frac{A_{i + 1}^{t} Q_{i + 1}^{t} | Q_{i + 1}^{t} |}{{(K_{i + 1}^{n})}^{2}} + \frac{A_{i}^{t} Q_{i}^{t} | Q_{i}^{t} |}{{(K_{i}^{n})}^{2}}] = 0 \end{array}

(4)

where θ is the Preissmann weighting factor; superscripts t and t + 1 and subscripts i and i + 1 represent the time and one-dimensional spatial layers, respectively; and B is the river width. Thus, the following linear equations, expressed in increments, are obtained as follows:

\{\begin{cases} a_{1 i} Δ Q_{i} + b_{1 i} Δ Z_{i} + c_{1 i} Δ Q_{i + 1} + d_{1 i} Δ Z_{i + 1} = e_{1 i} \\ a_{2 i} Δ Q_{i} + b_{2 i} Δ Z_{i} + c_{2 i} Δ Q_{i + 1} + d_{2 i} Δ Z_{i + 1} = e_{2 i} \end{cases}

(5)

where

a_{1 i} = - 4 \frac{θ Δ t}{Δ x_{i}} \frac{1}{B_{i}^{t} + B_{i + 1}^{t}}; b_{1 i} = 1; c_{1 i} = 4 \frac{θ Δ t}{Δ x_{i}} \frac{1}{B_{i}^{t} + B_{i + 1}^{t}}; d_{1 i} = 1;

e_{1 i} = 2 \frac{Δ t (q_{L i}^{t} + q_{L i + 1}^{t})}{B_{i + 1}^{t} + B_{i}^{t}} - 4 \frac{Δ t}{Δ x_{i}} \frac{Q_{i + 1}^{t} - Q_{i}^{t}}{B_{i + 1}^{t} + B_{i}^{t}}; a_{2 i} = 1 - 4 \frac{θ Δ t}{Δ x_{i}} \frac{Q_{i}^{t}}{A_{i}^{t}} + 2 θ Δ t g \frac{A_{i}^{t} | Q_{i}^{t} |}{{(K_{i}^{t})}^{2}};

b_{2 i} = \frac{θ Δ t}{Δ x_{i}} [2 \frac{{(Q_{i}^{t})}^{2} B_{i}^{t}}{{(A_{i}^{t})}^{2}} - g (A_{i}^{t} + A_{i + 1}^{t}) + g (Z_{i + 1}^{t} - Z_{i}^{t}) B_{i}^{t}] + g θ Δ t \frac{Q_{i}^{t} | Q_{i}^{t} |}{{(K_{i}^{t})}^{2}} (B_{i}^{t} - 2 \frac{A_{i}^{t}}{K_{i}^{t}} \frac{d K_{i}^{t}}{d Z_{i}^{t}});

c_{2 i} = 1 + 4 \frac{θ Δ t}{Δ x_{i}} \frac{Q_{i + 1}^{t}}{A_{i + 1}^{t}} + 2 θ Δ t g \frac{A_{i + 1}^{t} | Q_{i + 1}^{t} |}{{(K_{i + 1}^{t})}^{2}}

\begin{array}{c} d_{2 i} = \frac{θ Δ t}{Δ x_{i}} [- 2 \frac{{(Q_{i + 1}^{t})}^{2} B_{i + 1}^{t}}{{(A_{i + 1}^{n})}^{2}} + g (A_{i}^{t} + A_{i + 1}^{t}) + g (Z_{i + 1}^{t} - Z_{i}^{t}) B_{i + 1}^{t}] + \\ g θ Δ t \frac{Q_{i + 1}^{t} | Q_{i + 1}^{t} |}{{(K_{i + 1}^{t})}^{2}} (B_{i + 1}^{t} - 2 \frac{A_{i + 1}^{t}}{K_{i + 1}^{t}} \frac{d K_{i + 1}^{t}}{d Z_{i + 1}^{t}}) \end{array};

e_{2 i} = \frac{Δ t}{Δ x_{i}} [- 2 \frac{{(Q_{i + 1}^{t})}^{2}}{A_{i + 1}^{t}} + 2 \frac{{(Q_{i}^{t})}^{2}}{A_{i}^{t}} - g (A_{i + 1}^{t} + A_{i}^{t}) (Z_{i + 1}^{t} - Z_{i}^{t})] - g Δ t [\frac{A_{i + 1}^{t} Q_{i + 1}^{t} | Q_{i +!}^{t} |}{{(K_{i + 1}^{t})}^{2}} + \frac{A_{i}^{t} Q_{i}^{t} | Q_{i}^{t} |}{{(K_{i}^{t})}^{2}}]

(6)

A_1i~E_2i are the expressions for each of the known quantities in time layer t. After the elimination of the elements, we can obtain

\{\begin{cases} Δ Z_{i + 1} = a_{1 i}^{'} Δ Z_{i} + b_{1 i}^{'} Δ Q_{i} + c_{1 i}^{'} \\ Δ Q_{i + 1} = a_{2 i}^{'} Δ Z_{i} + b_{2 i}^{'} Δ Q_{i} + c_{2 i}^{'} \end{cases}

(7)

where

a_{1 i}^{'} = \frac{c_{2 i} b_{1 i} - c_{1 i} b_{2 i}}{c_{1 i} d_{2 i} - c_{2 i} d_{1 i}}; b_{1 i}^{'} = \frac{c_{2 i} a_{1 i} - c_{1 i} a_{2 i}}{c_{1 i} d_{2 i} - c_{2 i} d_{1 i}}; c_{1 i}^{'} = \frac{c_{1 i} e_{2 i} - c_{2 i} e_{1 i}}{c_{1 i} d_{2 i} - c_{2 i} d_{1 i}};

a_{2 i}^{'} = \frac{d_{2 i} b_{1 i} - d_{1 i} b_{2 i}}{d_{1 i} c_{2 i} - d_{2 i} c_{1 i}}; b_{2 i}^{'} = \frac{d_{2 i} a_{1 i} - d_{1 i} a_{2 i}}{d_{1 i} c_{2 i} - d_{2 i} c_{1 i}}; c_{2 i}^{'} = \frac{d_{1 i} e_{2 i} - d_{2 i} e_{1 i}}{d_{1 i} c_{2 i} - d_{2 i} c_{1 i}}

(8)

The corresponding section recurrence equations are obtained by sequentially working backwards from the first section of the river to any section:

\{\begin{cases} Δ Z_{i + 1} = E M T (2 i + 1, 1) Δ Z_{1} + E M T (2 i + 1, 2) Δ Q_{1} + E M T (2 i + 1, 3) \\ Δ Q_{i + 1} = E M T (2 i + 2, 1) Δ Z_{1} + E M T (2 i + 2, 2) Δ Q_{1} + E M T (2 i + 2, 3) \end{cases}

(9)

where, when i = 0,

\{\begin{cases} E M T (1, 1) = 1, E M T (1, 2) = 0, E M T (1, 3) = 0 \\ E M T (2, 1) = 0, E M T (2, 2) = 1, E M T (2, 3) = 0 \end{cases}

(10)

and when i > 0,

\{\begin{cases} E M T (2 i + 1, 1) = a_{1 i}^{'} E M T (2 i - 1, 1) + b_{1 i}^{'} E M T (2 i, 1) \\ E M T (2 i + 1, 2) = a_{1 i}^{'} E M T (2 i - 1, 2) + b_{1 i}^{'} E M T (2 i, 2) \\ E M T (2 i + 1, 3) = a_{1 i}^{'} E M T (2 i - 1, 3) + b_{1 i}^{'} E M T (2 i, 3) + c_{1 i}^{'} \\ E M T (2 i + 2, 1) = a_{2 i}^{'} E M T (2 i - 1, 1) + b_{2 i}^{'} E M T (2 i, 1) \\ E M T (2 i + 2, 2) = a_{2 i}^{'} E M T (2 i - 1, 2) + b_{2 i}^{'} E M T (2 i, 2) \\ E M T (2 i + 2, 3) = a_{2 i}^{'} E M T (2 i - 1, 3) + b_{2 i}^{'} E M T (2 i, 3) + c_{2 i}^{'} \end{cases}

(11)

For a river with m reaches, the number of cross sections is m + 1. Based on Equation (7), we can obtain

\{\begin{cases} Δ Q_{m + 1} = a Δ Z_{1} + b Δ Z_{m + 1} + c \\ Δ Q_{1} = d Δ Z_{1} + e Δ Z_{m + 1} + f \end{cases}

(12)

where

a = E M T (2 m + 2, 1) - \frac{E M T (2 m + 1, 1) * E M T (2 m + 2, 2)}{E M T (2 m + 1, 2)}; b = \frac{E M T (2 m + 2, 2)}{E M T (2 m + 1, 2);}

c = E M T (2 m + 2, 3) - \frac{E M T (2 m + 1, 3) * E M T (2 m + 2, 2)}{E M T (2 m + 1, 2)}; d = - \frac{E M T (2 m + 1, 1)}{E M T (2 m + 1, 2)};

e = \frac{1}{E M T (2 m + 1, 2)}; f = - \frac{E M T (2 m + 1, 3)}{E M T (2 m + 1, 2)}

(13)

According to the above equations, we can see that the flow increment at the first and last cross-sections of the river can be expressed in terms of the increment of the water level. Therefore, the corresponding equations using the non-increment method can be obtained by expanding Equation (12). The equations are:

\{\begin{cases} Q_{m + 1}^{t + 1} = a Z_{1}^{t + 1} + b Z_{m + 1}^{t + 1} - a Z_{1}^{t} - b Z_{m + 1}^{t} + Q_{m + 1}^{t} + c \\ Q_{1}^{t + 1} = d Z_{1}^{t + 1} + e Z_{m + 1}^{t + 1} - d Z_{1}^{t} - e Z_{m + 1}^{t} + Q_{1}^{t} + f \end{cases}

(14)

In addition, in this study we adopted a three-stage joint solution method based on a large sparse matrix solver to solve the river network. The method divides the solution of the equations into three levels—the micro-segments, river segments, and nodes—and calculates and back-substitutes them step by step, in order to compress the size of the matrix and improve the calculation speed.

3. General Form of a River Network Hydrodynamic System

The general form of a one-dimensional hydrodynamic equation formed using the model described above can be expressed as

F (X, t) = 0

(15)

where F is defined as the hydrodynamic equation. X is the hydraulic state variable, including the discharge, water depth, or water level, and t is time.

Given the existence of white noise in relation to hydraulic state variables, the following discretized nonlinear hydraulic state equation can be obtained:

X_{k} = Φ_{k} (X_{k - 1}, k - 1) + Γ_{k} (X_{k - 1}, k - 1) W_{k}

(16)

where Φ_k refers to the nonlinear hydraulic state functional from the state at time k − 1 to the state at k, including the hydrodynamic model functions written in Section 2, and Γ_k is the weighting matrix of state noise, which is normally taken as a unit matrix. W_k is a dynamic noise with a mean value equal to zero.

In addition, the equation for the measured river network state variables

Y_{k}

can be expressed as:

Y_{k} = H_{k} (X_{k}) + V_{k}

(17)

where H_k is the linear measurement function at time k; V_k is the measurement noise with a mean value equal to zero.

By combining Equations (2) and (3), we obtain the equations describing the hydrodynamic nonlinear dynamic system of a river network:

\{\begin{cases} X_{k} = Φ_{k} (X_{k - 1}, k - 1) + Γ_{k} (X_{k - 1}, k - 1) W_{k} \\ Y_{k} = H_{k} (X_{k}) + V_{k} \end{cases},

(18)

4. Application of Ensemble Kalman Filter

In this study we established a new scheme based on EnKF and applied it to a river network hydrodynamic system. A three-stage joint solution method was used to solve the hydrodynamic model of the river network, and the water level was selected as the state variable of the nonlinear dynamic system of the river network. Before the start of the filtering treatment, the Box–Muller transform method was used to generate an initial sample ensemble of nodal water level randomly. The data followed the normal distribution

N (μ, σ^{2})

. To ensure the diversity of the sampling data, with reasonable dispersion, the average value μ was determined based on the simulated water level at the start of filtering, and the standard deviation was also set as a certain proportion of the nodal water depth at the start of filtering.

The initial sample ensemble of the nodal water level was generated by means of the Box–Muller transform method. Its average value

\bar{X}

is shown as:

\bar{X} = \frac{1}{m} \sum_{i = 1}^{m} X_{i},

(19)

where m represents the number of samples, i.e., the size of the ensemble. The sample values obey a normal distribution and have a standard deviation of

σ

.

The disturbance term ΔX_i for the ith ensemble member is

Δ X_{i} = X_{i} - \bar{X},

(20)

and ΔX is defined as the ensemble of disturbance terms:

Δ X = (Δ X_{1}, Δ X_{2}, \dots, Δ X_{m}) .

(21)

Then, the error variance matrix can be expressed as follows:

P = \frac{1}{m - 1} Δ X {(Δ X)}^{T},

(22)

The data assimilation process for river networks using the EnKF algorithm is shown in Figure 1. The details of the algorithm’s development are shown below.

When filtering, the first step is to calculate the state simulation values of each sample set, that is, the prediction values

{\hat{X}}_{k / k - 1}

:

{\hat{X}}_{k / k - 1} = Φ_{k} ({\hat{X}}_{k - 1})

(23)

Then, the prediction error variance matrix and gain matrix are calculated as in Equations (24) and (25), respectively. Calculating the gain matrix requires the introduction of the observation noise level, that is, the observation error. The Box–Muller method is also used to discretize the observation data to obtain the observation set, which is fed into the ensemble Kalman filtering algorithm, together with the forecast water level set for assimilation, to obtain a set of assimilated water levels, whose mean value is the optimal solution for data assimilation.

P_{k / k - 1} = \frac{1}{m - 1} Δ {\hat{X}}_{k / k - 1} {(Δ {\hat{X}}_{k / k - 1})}^{T},

(24)

K_{k} = P_{k / k - 1} H_{k}^{T} {(H_{k} P_{k / k - 1} H_{k}^{T} + R_{k})}^{- 1},

(25)

where R_k is the observation error covariance;

P_{k / k - 1} H_{k}^{T}

and

H_{k} P_{k / k - 1} H_{k}^{T}

are calculated according to the following equations:

P_{k / k - 1} H_{k}^{T} = \frac{1}{m - 1} \sum_{i = 1}^{m} \begin{array}{l} [{\hat{X}}_{i}^{k / k - 1} - \bar{{\hat{X}}_{k / k - 1}}] \times \\ [H_{k} ({\hat{X}}_{i}^{k / k - 1}) - \bar{H_{k} ({\hat{X}}_{k / k - 1})}]^{T} \end{array},

(26)

H_{k} P_{k / k - 1} H_{k}^{T} = \frac{1}{m - 1} \sum_{i = 1}^{m} \begin{array}{l} [H_{k} ({\hat{X}}_{i}^{k / k - 1}) - \bar{H_{k} ({\hat{X}}_{k / k - 1})}] \times \\ [H_{k} ({\hat{X}}_{i}^{k / k - 1}) - \bar{H_{k} ({\hat{X}}_{k / k - 1})}]^{T} \end{array},

(27)

where

\bar{{\hat{X}}_{k / k - 1}} = \frac{1}{m} \sum_{i = 1}^{m} {\hat{X}}_{i}^{k / k - 1},

(28)

\bar{H_{k} ({\hat{X}}_{k / k - 1})} = \frac{1}{m} \sum_{i = 1}^{m} H_{k} ({\hat{X}}_{i}^{k / k - 1}),

(29)

Finally, update each ensemble member of the water level variable based on the gain matrix:

{\hat{X}}_{i}^{k} = {\hat{X}}_{i}^{k / k - 1} + K_{k} [Y_{i}^{k} - H_{k} ({\hat{X}}_{i}^{k / k - 1})],

(30)

where

Y_{i}^{k} = Y^{k} + Δ Y_{i}^{k}

is the observation with disturbance,

Δ Y_{i}^{k}

is a disturbance term,

Δ Y_{i}^{k}

~N(0, R), and R is the observation error covariance. Note that the corrected water level is used as the current moment value, which is then used as the next moment value of the water level prediction. The mean value of the corrected water level is set as the best estimate of the current moment of data assimilation (optimal solution).

The correction error variance matrix is also calculated as an approximation of the error covariance matrix of the corrected values:

P_{k} = \frac{1}{m - 1} Δ {\hat{X}}_{k} {(Δ {\hat{X}}_{k})}^{T},

(31)

With recurrence in time, the above process can be cycled continuously to obtain the data assimilation results for each calculation moment. In addition, the filtering process can be ended when the correction error variance matrix is less than a certain value in order to improve the efficiency of the river network hydrodynamic model calculation; similarly, the filtering process can be started when the error value between the simulated and observed values exceeds a certain set value in order to improve the simulation accuracy of the river network hydrodynamic model.

5. Parameter Analysis

In this section, parameters, including the standard deviation

σ

of the initial sample ensemble, the observation error/noise level δ, and the size of the ensemble m, are analyzed in terms of their influence on the data assimilation performance for a simulated river network with unsteady-state flows. Note that this river network was adopted from the literature [22]. Figure 2 illustrates a looped canal river network with 14 branches connected by nodes depicted by number. This network was simulated as a convergent type, representing drainage systems in practice. Water level measurement stations were set at nodes 9 and 13, named Station 1 and Station 2. The detailed characteristics of this looped canal network can be found in the literature [22].

The analysis was conducted on a computer with an Intel Core i7-8700K CPU, 3.2 GHz CPU clock speed and 16.0 GB DDR4 RAM. The system was run on Windows 10 Enterprise, version 1909.

5.1. Observation Noise Level and Standard Deviation of the Initial Ensemble

Table 1 shows the test scheme. The results calculated based on the true value of the roughness were regarded as the precise water level values, and the observation noise (i.e., error) was added to the precise water level values to produce the observation values. The observation noise levels δ (0.0, 0.0001, 0.0005, 0.001, 0.005, 0.01, and 0.05 m) were analyzed. The standard deviation of the initial ensemble was 0.0, 0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, and 0.1 m. The average water depth value was 2.5 m; correspondingly, the relative observation noise level values were 0%, 0.004%, 0.02%, 0.04%, 0.2%,0.4%, and 2.0%. The relative standard deviation of the initial ensemble was 0%, 0.004%, 0.02%, 0.04%, 0.2%,0.4%, 2.0%, and 4.0%. A constant ensemble size of 100 was tested.

The performance of the EnKF algorithm with different parameters was assessed on the basis of the root-mean-square-error (RMSE) of the water level, which was calculated based on the simulated and precise values. When the observation noise level δ equaled 0.0, except for standard deviation = 0.0 (i.e., equivalent to no data assimilation), the data assimilation failed. This is because the gain matrix K decreased continuously during the data assimilation process, implying that the weight of using new information to correct the predicted values decreased, i.e., the tracking performance of the system decreased. It was thus necessary to use an observation noise level of more than zero to ensure a certain disturbance level and prevent data assimilation failure.

Figure 3 shows that the RMSE values changed greatly with different settings for the parameters of standard deviation, σ, and observation noise level, δ, at Stations 1 and 2. When σ = 0.0, the sample data in the initial ensemble were constant numbers, which were the initial conditions of the water level at each node. In this case, the simulation results were not affected by the observation noise or the ensemble size, and the results were equivalent to the results without data assimilation. The RMSE values decreased significantly with increasing standard deviation, particularly at the low level of observation noise. This indicates that a relatively high standard deviation and low observation noise level could lead to a good data assimilation result. For example, at Station 1, a negligible RMSE value of 0.0011 m was produced when σ = 0.1, δ = 0.0001, whereas a high RMSE value of 0.1563 m was obtained when σ = 0.0001, δ = 0.05.

5.2. Ensemble Size

As the computational resources are limited for realistic and large-scale hydraulic applications, the ensemble size should be small. We thus tested four ensemble sizes, n = 50, 100, 150, and 200. The observation noise level δ was taken to be 0.0001 m and 0.05 m, and the standard deviation σ was taken to be 0.0001 m and 0.1 m. Table 2 presents the parameters and their values.

Table 3 and Table 4 show the RMSE values with different ensemble sizes at certain observation noise levels and standard deviation values. Generally, the RMSE values decreased with increasing ensemble size for the low observation noise level (i.e., δ = 0.001, σ = 0.0001 or 0.1), whereas for large observation noise level of 0.05, either the RMSE values were constant or the calculation failed, indicating that the data assimilation process was not improved by increasing the ensemble size. As shown in Table 5, computational cost increased significantly with ensemble size, and a large ensemble size could result in an extensive cost in terms of computational resources. Thus, we determined that the ensemble sample size could be set at a medium level (i.e., 100 or 150). Based on the sensitivity analysis of the observation noise level and the standard deviation of the initial ensemble, we suggest that a relatively low observation noise level (e.g., from 0.0001 to 0.01 m) and a relatively high standard deviation (e.g., from 0.01 to 0.1 m), along with a medium ensemble sample size (e.g., from 100 to 150), should be used to produce a good data assimilation with low RMSE values.

6. Experimental Evaluation

6.1. Data Description

To demonstrate the effectiveness of the proposed algorithm with the optimized parameter values, obtained as described in the parameter analysis section, we used two sets of data from a real single open-channel river in Canada and a real river network in China.

Case 1: A Real Single Open-Channel River.

The Kootenai River originates in British Columbia, Canada, and flows through Montana, Idaho, USA. A 24-km-long section of the river was used for this study, as shown in Figure 4. There are three hydrological stations on the river: Bonners Ferry, Tribal Hatchery, and Klockmann Ranch, which are located upstream, midstream, and downstream of the river, respectively. The measured water volume flow at Tribal Hatchery was used as the boundary condition for the upstream entrance (as shown in Figure 5a), and the measured water level at Klockmann Ranch (as shown in Figure 5b) was used as the boundary condition for the downstream exit, whereas Bonners Ferry was used as the test site. The cross-sectional information and water level and flow information used in the tests were obtained from the official website of the American Geographic Society. Table 6 shows the basic information of the selected 24-km-long section of the Kootenai river.

Case 2: A Real River Network.

Located in the southeast of Zhejiang Province, China, the Le Liu Hong plain river network covers an area of 325 km². It is divided into three river network systems—Hongqiao, Lecheng, and Liushi—according to its topography and geomorphology, which are controlled by control gates and are independent of each other. Each of the three river network water systems has a water level station, Hongqiao station, Lecheng station, and Liushi station, as shown in Figure 6. The hydrodynamic model of the river network of Le Liu Hong plain includes 398 rivers, 304 nodes, and a total of 1962 cross-sections. In this study, we used the 2019 Lekima typhoon storm flood data for data assimilation testing.

6.2. Parameter Settings and Evaluation Metrics

The Box–Muller method was used to generate a random number set for the initial water level values at Bonners Ferry station in Case 1 and Hongqiao, Lecheng, and Liushi stations in Case 2. The set mean value was given as the state value at the beginning moment of data assimilation. Based on the parameter analysis and optimization described in the parameter analysis section, the ensemble size m, the standard deviation

σ

, and the observation noise level δ were taken to be 100, 0.1 m, and 0.005 m, respectively. As the water depth values were between 6 and 9 m for Case 1 and between 3 and 4.5 m for Case 2, the corresponding relative observation noise level values and the relative standard deviation values were approximately less than 5% of the water depth values.

The RMSE, calculated based on the water lever difference between the simulation and measurements from the start of the data assimilation process to the end of the water level simulation, was used as an evaluation metric. Furthermore, the algorithm calculation cost in seconds and the time for the algorithm to reach stability were used as the other two evaluation metrics. For Case 1, the simulation period covered 27 days, and the data assimilation process started on day 5 (hour 96). For Case 2, the simulation period was from 8:00 on 7 August 2019 to 8:00 on 12 August 2019 (5 days in total), assuming that data assimilation took place from 0:00 on 9 August 2019. The filtering ended when the elements on the diagonal of the water level correction error variance matrix were less than 0.001 m².

6.3. Algorithms for Comparison

Our data assimilation scheme based on the EnKF algorithm was compared to the EKF method, which is commonly used for hydraulic predictions for river networks. The corresponding observed noise level and filtering time period were consistent with those calculated using the ensemble Kalman filter for both testing cases. As both algorithms contain an observation noise level parameter, this parameter was varied, at 0.0005, 0.005, and 0.05 m, to examine whether it affected the difference in performance between the EnKF algorithm and the EKF algorithm. Note that the observation noise value of 0.05 m was so high that it was impractical in real river networks; thus, this setting was only used for research purposes.

6.4. Results and Discussion

6.4.1. Comparison of the EnKF and EKF Algorithm

Figure 7 shows the water level error values between the simulation and the measurements among different data treatment schemes for Case 1. The evaluation metrics used for evaluating data assimilation performance are tabulated in Table 7. The data assimilation started in hour 96, and after 23 h of filtering, the data assimilation end condition had been reached for both the EnKF and EKF algorithms. The data assimilation process, whether using either EnKF or EKF, produced an instant and efficient reduction in the simulation errors. The time required for both EnKF and EKF algorithms to reach stability was minimal, indicating that both algorithms could correct the prediction results instantly for this single-river case. With the end of the assimilation phase at hour 119, the filtered water level error increased slightly but still fluctuated, mainly between −0.05 m and 0.05 m, which was much lower than the water level error without data assimilation. The RMSE values decreased significantly by around two times after the data assimilation process was carried out using EnKF or the EKF algorithm. Although the EnKF algorithm produced a slightly lower RMSE, the difference was negligible for Case 1. However, the algorithm calculation cost for EKF was 16.7 s, which was 19.2% higher than that of EnKF, indicating that EKF was more time-consuming than EnKF. This result is similar to those found in the literature [23], namely, that the computational time of EnKF was 0.8 of that of EKF. This is because EKF requires the computation of the tangent linear state operator, whereas EnKF avoids this computation step.

Note that the river in Case 1 was a mountainous single-channel river with a relatively simple water flow. This enabled a significant reduction in the algorithm calculation time and thus in the difference in calculation cost between EKF and EnKF. Indeed, for Case 2, which was a complex river network, we observed greater differences in the evaluation metrics.

Figure 8 shows the simulated water level without data assimilation at the beginning period of the simulation (before hour 40) and with subsequent data assimilation using EKF and EnKF. Generally, the simulation error was substantially reduced and approximated to zero at the end of data assimilation using EKF or EnKF. This simulation error would spread out gradually over the long term after the end of data assimilation, indicating that the data assimilation effect could still be maintained for a relatively long period (up to 20 to 30 h) with negligible errors after the end of the data assimilation process.

Table 8 presents a comparison of the evaluation metric results of EKF and EnKF for different stations. The calculation time required by EnKF for this river network was 4243.6 s, which was only 45% of that required by EKF (9335.4 s). Compared to the mountainous single-channel river, as expected, the calculation time increased hugely; furthermore, EnKF exhibited a much lower calculation cost value relative to EKF, i.e., the ratio of the calculation cost between EnKF and EKF decreased greatly, from 81% to 45%, indicating that the calculation efficiency of EnKF was much higher than that of EKF for a complex river network. We also found that the RMSE values of EnKF were much lower than those of EKF, regardless of the water level measuring stations selected. Particularly, Lecheng station had an RMSE value of only 9% when using EnKF compared to EKF. Interestingly, EnKF was more stable than EKF during the data assimilation period at the three stations. The high fluctuations of EKF may reduce its convergence. The filtering time (i.e., the time taken by the algorithm to reach stability after the beginning of filtering) was also much shorter for EnKF than for EKF. For example, for the Hongqiao station, EnKF took 5 h to reach stability for accurate water level calculations, whereas EKF took 20 h.

6.4.2. Parameter Sensitivity

To verify the effectiveness of EnKF with different key parameter values, Figure 9 shows the RMSE values of EnKF compared to EKF at different observation noise levels. The EnKF algorithm exhibited much lower simulation errors at observation noise level values of 0.0005 and 0.005, regardless of the observation stations. The EKF algorithm had a slightly lower simulation error at Hongqiao station and Liushi station, which may be due to the impractically high value of observation noise in practice.

Similarly, EnKF’s calculation costs and the time required for EnKF to reach stability after the beginning of filtering at different observation noise levels were also consistently much lower than those of EKF, particularly at practical observation noise levels (0.0005 and 0.005), as shown in Table 9. The time for the EnKF to reach stability after the beginning of filtering was 27% to 81% of that for EKF, and the calculation cost for EnKF was 27% to 81% to that for EKF.

The above testing comparison between EnKF and EKF showed the great advantages of the EnKF-based data assimilation model when it was applied to the complex river network. Indeed, both EnKF and EKF have their advantages in predicting water levels. However, EnKF was more effective, especially in terms of computational effort and the time to reach stable prediction.

The computational characteristics of the Kalman filter (e.g., a short online estimation time, small storage size, and convenient error analysis) are assessed to determine its suitability for real-time correction processing of dynamic systems. For a nonlinear dynamic system of a complex river network, the EKF algorithm is often used, but there are still some problems with this approach, mainly: (1) it is difficult to calculate the prediction error variance matrix, and its accuracy cannot be guaranteed, as verified by our experiments in which data assimilation based on EKF produced much higher simulation errors than EnKF-based data assimilation. (2) It is necessary to calculate both the tangent state operator and the covariance of dynamic noise [12], which makes the algorithm structure complex and causes a large amount of calculations. The calculation cost of EKF is almost twice as high as that of EnKF. (3) The hydraulic calculation of river networks is a strongly nonlinear problem. The use of the EKF algorithm requires one to omit the terms above the first-order derivative; however, the omitted terms could precisely reflect the nonlinear effect of the strong nonlinear problem. (4) The EKF algorithm contains an assumption in the application, that is, the error obeys a normal distribution; therefore, it is not suitable to be applied to a system with a non-normal distribution of errors. (5) EKF is not capable of addressing horizontal error correlations in the model or measurement errors in large systems due to computational limitations [24].

For water level prediction in practice, the accuracy and timing of water level predictions are also important. Our experiments on the Le Liu Hong plain river network showed that the EnKF-based assimilation model was able to forecast the water level after 5 to 8 h and make predictions up to 20 to 30 h after the data assimilation process was completed. These timings are similar to those found in other research results. For example, Karri et al. [25] conducted data assimilation research on the regional waters of Singapore, revealing that EnKF is efficient in short-term forecasting of water levels and currents, and the model could make better predictions up to 24 h, with the highest forecast quality within 6 h. Although increasing the initial ensemble size could lead to increased computation costs, as shown in Table 5, the overall computational cost was still relatively low. For example, the computational cost of the EnKF-based algorithm, as tested on the simulated river network, was only 223 s when using a personal computer when the ensemble size was 200, observation noise was 0.0001, and standard deviation was 0.1.

Given the complex nature of a river network, more parameters, such as the water-level measurement station number and the river roughness coefficient, can be simultaneously corrected during the simulation time with hydraulic state variables. This further increases the forecasting accuracy and efficiency of the EnKF-based data assimilation model developed in this paper. Sensitivity to the type of river network and connection complexity could also be essential to testing the model’s application capacity in practice. In the present study, we focused on comparing the data assimilation strategy based on the hydraulic model of a river network coupled with EKF and that coupled with EnKF. Our main objective was to understand the advantages of the data assimilation method based on EnKF. Thus, the 1D hydraulic model based on the Saint-Venant equations was used to solve the state variables of river networks. Indeed, 2D models are being developed by researchers [26,27,28], and future work can be carried out to develop a data assimilation strategy based on a 2D hydraulic model.

7. Conclusions

Given the complex nature of a river network, it is challenging to perform good water level predictions with high accuracy and low computational cost when using the traditional EKF-based data assimilation approach. In this study, we developed a new data assimilation scheme based on the EnKF algorithm to optimize real-time water level simulation in river networks. The parameter analysis results showed that the initial sample ensemble size, its standard deviation, and the observation noise level could significantly impact the data assimilation performance. Optimized parameter settings, with a medium ensemble sample size (e.g., from 100 to 150), a relatively low observation noise level (e.g., from 0.0001 to 0.01 m), and a relatively high standard deviation (e.g., from 0.01 to 0.1 m), resulted in good data assimilation performance.

The experimental testing conducted on a mountainous single-channel river showed that the data assimilation method based on EnKF had a similar capacity to correct simulation errors to that of the EKF-based model, but for a river network with complex branches and connecting nodes, compared to the EKF-based model, the EnKF-based data assimilation method was able to decrease simulation errors greatly by up to 49%, to reduce calculation costs by 45%, and to reduce the filtering time by up to 43%. In addition, this performance was not impacted by the key model parameter, i.e., the observation noise level.

The experiments conducted on a real water river network showed that the EnKF-based data assimilation model forecast water levels with a high accuracy after 5 to 8 h and could make predictions up to 20 to 30 h after the data assimilation process was completed, indicating the potential application capacity of the EnKF-based assimilation model developed in this paper.

Author Contributions

Conceptualization, Y.C. and F.C.; methodology, Y.C.; validation, X.M. and W.C.; formal analysis, Y.C.; writing—original draft preparation, Y.C.; writing—review and editing, F.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by National Natural Science Foundation of China, grant number 12002310 and 51709235, General Scientific Research Projects of Education of Zhejiang Province, grant number Y201941076, and Zhejiang Water Science and Technology Research Project, grant number RB2202. The APC was funded by General Scientific Research Projects of Education of Zhejiang Province, grant number Y201941076.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data, models, and code generated or used in this study are available upon request from the corresponding author.

Acknowledgments

We thank Papergoing group for their writing support.

Conflicts of Interest

The authors declare no conflict of interest.

References

Zhang, X.; Li, F.; Zhao, Y. Impact of changes in river network structure on hydrological connectivity of watersheds. Ecol. Indic. 2023, 146, 109848. [Google Scholar]
Yuan, X.M.; Zhang, X.Y.; Tian, F.C. Research and application of an intelligent networking model for flood forecasting in the arid mountainous basins. J. Flood Risk Manag. 2020, 13, 15. [Google Scholar] [CrossRef]
Miau, S.; Hung, W.H. River Flooding Forecasting and Anomaly Detection Based on Deep Learning. IEEE Access 2020, 8, 198384–198402. [Google Scholar] [CrossRef]
Yeditha, P.K.; Kasi, V.; Rathinasamy, M.; Agarwal, A. Forecasting of extreme flood events using different satellite precipitation products and wavelet-based machine learning methods. Chaos 2020, 30, 12. [Google Scholar] [CrossRef]
Yoo, H.J.; Kim, D.H.; Kwon, H.-H.; Lee, S.O. Data driven water surface elevation forecasting model with hybrid activation function—A case study for Hangang River, South Korea. Appl. Sci. 2020, 10, 1424. [Google Scholar] [CrossRef] [Green Version]
Gupta, S.; Gupta, R.; Ojha, M.; Singh, K.P. (Eds.) A Comparative Analysis of Various Regularization Techniques to Solve Overfitting Problem in Artificial Neural Network. In Proceedings of the 4th International Conference on Recent Developments in Science, Engineering and Technology (REDSET), Gurgaon, India, 13–14 October 2017; GD Goenka Univ.: Sohna, India; Springer: Berlin, Germany, 2018. [Google Scholar]
Cunge, J. Practical Aspects of Computational River Hydraulics; Pitman Publishing Ltd.: London, UK, 1980; p. 420. [Google Scholar]
Shayannejad, M.; Ghobadi, M.; Ostad-Ali-Askari, K. Modeling of Surface Flow and Infiltration During Surface Irrigation Advance Based on Numerical Solution of Saint-Venant Equations Using Preissmann’s Scheme. Pure Appl. Geophys. 2022, 179, 1103–1113. [Google Scholar] [CrossRef]
Munier, S.; Litrico, X.; Belaud, G.; Perrin, C. Assimilation of Discharge Data into Semidistributed Catchment Models for Short-Term Flow Forecasting: Case Study of the Seine River Basin. J. Hydrol. Eng. 2015, 20, 13. [Google Scholar] [CrossRef] [Green Version]
Humpherys, J.; Redd, P.; West, J. A Fresh Look at the Kalman Filter. SIAM Rev. 2012, 54, 801–823. [Google Scholar] [CrossRef] [Green Version]
Sorensen, J.V.T.; Madsen, H.; Madsen, H. Data assimilation in hydrodynamic modelling: On the treatment of non-linearity and bias. Stoch. Environ. Res. Risk Assess. 2004, 18, 228–244. [Google Scholar] [CrossRef]
Sun, L.; Seidou, O.; Nistor, I.; Liu, K. Review of the Kalman-type hydrological data assimilation. Hydrol. Sci. J. 2016, 61, 2348–2366. [Google Scholar] [CrossRef] [Green Version]
Aanonsen, S.I.; Nœvdal, G.; Oliver, D.S.; Reynolds, A.C.; Vallès, B. The ensemble Kalman filter in reservoir engineering—A review. Spe J. 2009, 14, 393–412. [Google Scholar] [CrossRef]
Madsen, H.; Skotner, C.J. Adaptive state updating in real-time river flow forecasting—A combined filtering and error forecasting procedure. J. Hydrol. 2005, 308, 302–312. [Google Scholar] [CrossRef]
Ziliani, M.G.; Ghostine, R.; Ait-El-Fquih, B.; McCabe, M.F.; Hoteit, I. Enhanced flood forecasting through ensemble data assimilation and joint state-parameter estimation. J. Hydrol. 2019, 577, 123924. [Google Scholar] [CrossRef]
Meng, S.; Xie, X.; Liang, S.J. Assimilation of soil moisture and streamflow observations to improve flood forecasting with considering runoff routing lags. J. Hydrol. 2017, 550, 568–579. [Google Scholar] [CrossRef]
Soto, J.; Infante, S. (Eds.) Ensemble Kalman Filter and Extended Kalman Filter for State-Parameter Dual Estimation in Mixed Effects Models Defined by a Stochastic Differential Equation. In Technology, Sustainability and Educational Innovation (TSIE); Springer International Publishing: Cham, Switzerland, 2020. [Google Scholar]
Tao, X.; Wu, H. (Eds.) A Comparison of the Sequential Quadratic Programing Algorithm and Extended Kalman Filter Method in the Magnetic Particle Tracking Reconstruction. In Proceedings of the AIAA Scitech 2019 Forum, San Diego, CA, USA, 7–11 January 2019. [Google Scholar]
Zhang, Z.; Fu, K.; Sun, X.; Ren, W. Multiple target tracking based on multiple hypotheses tracking and modified ensemble Kalman filter in multi-sensor fusion. Sensors 2019, 19, 3118. [Google Scholar] [CrossRef] [Green Version]
Roth, M.; Hendeby, G.; Fritsche, C.; Gustafsson, F. The Ensemble Kalman filter: A signal processing perspective. EURASIP J. Adv. Signal Process. 2017, 2017, 56. [Google Scholar] [CrossRef] [Green Version]
Clement, L.; Thas, O. Estimating and modeling spatio-temporal correlation structures for river monitoring networks. J. Agric. Biol. Environ. Stat. 2007, 12, 161–176. [Google Scholar] [CrossRef]
Islam, A.; Raghuwanshi, N.; Singh, R.; Sen, D. Comparison of gradually varied flow computation algorithms for open-channel network. J. Irrig. Drain. Eng. 2005, 131, 457–465. [Google Scholar] [CrossRef]
El Serafy, G.; Mynett, A. Comparison of EKF and EnKF in Sobek River: Case study Maxau-Ijssel. In Hydroinformatics: (In 2 Volumes, with CD-ROM); World Scientific: Singapore, 2004; pp. 513–520. [Google Scholar]
Reichle, R.H.; Walker, J.P.; Koster, R.D.; Houser, P.R. Extended versus ensemble Kalman filtering for land data assimilation. J. Hydrometeorol. 2002, 3, 728–740. [Google Scholar] [CrossRef]
Karri, R.R.; Badwe, A.; Wang, X.; El Serafy, G.; Sumihar, J.; Babovic, V.; Gerritsen, H. Application of data assimilation for improving forecast of water levels and residual currents in Singapore regional waters. Ocean. Dyn. 2013, 63, 43–61. [Google Scholar] [CrossRef]
Samela, C.; Persiano, S.; Bagli, S.; Luzzi, V.; Mazzoli, P.; Humer, G.; Reithofer, A.; Essenfelder, A.; Amadio, M.; Mysiak, J.; et al. Safer_RAIN: A DEM-based hierarchical filling-&-Spilling algorithm for pluvial flood hazard assessment and mapping across large urban areas. Water 2020, 12, 1514. [Google Scholar]
Cueto-Felgueroso, L.; Santillán, D.; García-Palacios, J.H.; Garrote, L. Comparison between 2D Shallow-Water Simulations and Energy-Momentum Computations for Transcritical Flow Past Channel Contractions. Water 2019, 11, 1476. [Google Scholar] [CrossRef] [Green Version]
Santillán, D.; Cueto-Felgueroso, L.; Sordo-Ward, A.; Garrote, L. Influence of Erodible Beds on Shallow Water Hydrodynamics during Flood Events. Water 2020, 12, 3340. [Google Scholar] [CrossRef]

Figure 1. Flowchart of the EnKF-based data assimilation process.

Figure 2. Illustration of the simulated water river network.

Figure 3. RMSE values with various standard deviation and observation noise level values at (a) Station 1 and (b) Station 2.

Figure 4. Schematic diagram of the Kootenai river.

Figure 5. Boundary conditions for the upstream entrance: (a) water flow at Tribal Hatchery and the downstream exit, (b) water level at Klockmann Ranch.

Figure 6. Schematic diagram of the Le Liu Hong plain river network.

Figure 7. Water level simulation error values for different data treatment schemes. The black line represents the data that were not treated via data assimilation; the green line represents the data treated via EKF; the red line represents the data treated via EnKF.

Figure 8. Water level simulation errors among different data treatment schemes at (a) Hongqiao station, (b) Lecheng station, and (c) Liushi station. The solid black line represents the data that were not treated via data assimilation; the green dashed line represents the data that were treated via EnKF; the red dashed line represents the data that were treated via EKF.

Figure 9. Comparison of RMSE values of EnKF and EKF at (a) Hongqiao station, (b) Lecheng station, and (c) Liushi station.

Table 1. Combination of standard deviation and observation noise levels for parameter analysis.

Observation Noise Levels/m	Standard Deviation σ/m
Observation Noise Levels/m	0.0	0.0001	0.0005	0.001	0.005	0.01	0.05	0.1
0	√	×	×	×	×	×	×	×
0.0001	√	√	√	√	√	√	√	√
0.0005	√	√	√	√	√	√	√	√
0.001	√	√	√	√	√	√	√	√
0.005	√	√	√	√	√	√	√	√
0.01	√	√	√	√	√	√	√	√
0.05	√	√	√	√	√	√	√	√

Note: √ means the calculation was successful; × means the calculation was unsuccessful due to the failure of convergence and data assimilation.

Table 2. Combination of ensemble size, standard deviation, and observation noise level.

Ensemble Size m	δ/m and σ/m
Ensemble Size m	δ = 0.0001, σ = 0.0001	δ = 0.0001, σ = 0.1	δ = 0.05, σ = 0.0001	δ = 0.05, σ = 0.1
50	√	√	√	√
100	√	√	√	√
150	√	√	√	×
200	√	√	√	×

Note: √ means the calculation was successful; × means the calculation was unsuccessful due to the failure of convergence and data assimilation.

Table 3. RMSE values with different ensemble sizes at fixed δ and σ values for Station 1.

Ensemble Size m	δ/m and σ/m
Ensemble Size m	δ = 0.0001, σ = 0.0001	δ = 0.0001, σ = 0.1	δ = 0.05, σ = 0.0001	δ = 0.05, σ = 0.1
50	0.00877	0.00036	0.15169	0.01161
100	0.00573	0.00049	0.15169	0.00697
150	0.00304	0.00024	0.15169	/
200	0.00240	0.00025	0.15169	/

Table 4. RMSE values with different ensemble sizes at fixed δ and σ values for Station 2.

Ensemble Size m	δ/m and σ/m
Ensemble Size m	δ = 0.0001, σ = 0.0001	δ = 0.0001, σ = 0.1	δ = 0.05, σ = 0.0001	δ = 0.05, σ = 0.1
50	0.01015	0.00038	0.05223	0.01627
100	0.00643	0.00051	0.05223	0.00807
150	0.00305	0.00023	0.05223	/
200	0.00233	0.00029	0.05223	/

Table 5. Computational cost in seconds with different ensemble sizes at fixed δ and σ values.

Ensemble Size m	δ/m and σ/m
Ensemble Size m	δ = 0.0001, σ = 0.0001	δ = 0.0001, σ = 0.1	δ = 0.05, σ = 0.0001	δ = 0.05, σ = 0.1
50	43.42	44.05	44.61	43.73
100	95.19	96.34	98.21	98.77
150	151.92	153.06	153.84	/
200	219.14	222.66	222.48	/

Table 6. Basic information of the Kootenai river.

Section Number	River Length/m	River Width/m	Average Slope	Manning’s Roughness Coefficient
1	1595	265–432	0.0004	0.035
2	4634	173–319		0.030
3	10,327	188–343		0.035
4	1567	151–289		0.035
5	1543	106–197		0.035
6	4627	155–299		0.035

Table 7. Comparison of evaluation metrics for EKF and EnKF in Case 1.

Treatment Scheme	Calculation Cost (s)	RMSE (m)	Filtering Time (s)
Without data assimilation	2.8	0.0841	/
EKF	16.7	0.0276	Instant
EnKF	13.5	0.0266	Instant

Table 8. Comparison of evaluation metric results for EKF and EnKF in Case 2.

Treatment Scheme	Calculation Cost (s)	RMSE (m)			Filtering Time (h)
Treatment Scheme	Calculation Cost (s)	Hongqiao	Lecheng	Liushi	Hongqiao	Lecheng	Liushi
Without data assimilation	204.3	0.1735	0.0655	0.1246	/	/	/
EKF	9335.4	0.0113	0.0187	0.0090	20	14	20
EnKF	4243.6	0.0026	0.0016	0.0044	5	6	8
Ratio of EnKF to EKF	45%	23%	9%	49%	25%	43%	40%

Table 9. Impact of observation noise level on data assimilation results based on EKF and EnKF.

Treatment Scheme	Filtering Time (h)		Calculation Cost (s)
Treatment Scheme	Hongqiao	Lecheng	Liushi	Hongqiao	Lecheng	Liushi
	δ = 0.0005 m	δ = 0.005 m	δ = 0.05 m	δ = 0.0005 m	δ = 0.005 m	δ = 0.05 m
EKF	26	20	37	11,378	9335	15,463
EnKF	7	7	30	4248	4244	5415
Ratio of EnKF to EKF	27%	35%	81%	37%	45%	35%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Chen, Y.; Cao, F.; Meng, X.; Cheng, W. Water Level Simulation in River Network by Data Assimilation Using Ensemble Kalman Filter. Appl. Sci. 2023, 13, 3043. https://doi.org/10.3390/app13053043

AMA Style

Chen Y, Cao F, Meng X, Cheng W. Water Level Simulation in River Network by Data Assimilation Using Ensemble Kalman Filter. Applied Sciences. 2023; 13(5):3043. https://doi.org/10.3390/app13053043

Chicago/Turabian Style

Chen, Yifan, Feifeng Cao, Xiangyong Meng, and Weiping Cheng. 2023. "Water Level Simulation in River Network by Data Assimilation Using Ensemble Kalman Filter" Applied Sciences 13, no. 5: 3043. https://doi.org/10.3390/app13053043

APA Style

Chen, Y., Cao, F., Meng, X., & Cheng, W. (2023). Water Level Simulation in River Network by Data Assimilation Using Ensemble Kalman Filter. Applied Sciences, 13(5), 3043. https://doi.org/10.3390/app13053043

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Water Level Simulation in River Network by Data Assimilation Using Ensemble Kalman Filter

Abstract

Featured Application

Abstract

1. Introduction

2. One-Dimensional Hydrodynamic Model of a River

3. General Form of a River Network Hydrodynamic System

4. Application of Ensemble Kalman Filter

5. Parameter Analysis

5.1. Observation Noise Level and Standard Deviation of the Initial Ensemble

5.2. Ensemble Size

6. Experimental Evaluation

6.1. Data Description

6.2. Parameter Settings and Evaluation Metrics

6.3. Algorithms for Comparison

6.4. Results and Discussion

6.4.1. Comparison of the EnKF and EKF Algorithm

6.4.2. Parameter Sensitivity

7. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Acknowledgments

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI