1. Introduction
A watershed is a delimited land area, and its river networks serve as the primary pathways for transporting sediment, water, and other environmental fluxes [1,2,3,4]. Many physical–mathematical models have been proposed to simulate water storage and flux in watersheds [5,6]. Most of these hydrological models take the rainfall over a watershed as an input to simulate surface and subsurface runoff, which ultimately drains into a common downstream water body, such as a stream, river, lake, reservoir, or estuary [5,6].
It is worth mentioning that intense rainfall in smaller watersheds located in mountainous regions may cause flash flood (FF) events in flatter downstream areas [7]. FF events develop quickly and significantly impact human lives and infrastructure [8,9,10,11].
The temporal and spatial effects of rainfall input data on hydrological simulations have been a subject of interest since early studies on the water cycle [12]. In the past few decades, with considerable improvements in estimating rainfall from remotely sensed data, flood scenarios could be reconstructed more accurately based on the spatial–temporal patterns of storms [9]. A better understanding of hydrological scales indicated that modeling the water cycle in smaller watersheds (<500 km²) requires rainfall data with higher spatial–temporal resolution [7,13]. In this context, weather radars provide reflectivity data that are converted to rainfall rates and integrated over time to estimate the volume of water falling over the watershed, yielding a more accurate estimation of accumulated rainfall for smaller areas than satellite-based rainfall missions. On the other hand, accurate rainfall estimates from rain gauges/pluviometers require a high-density network of pluviometers covering the watershed area, which is usually unavailable.
Tracking localized storms that could trigger a flash flood event is essential in any operational flood forecasting system. For this purpose, improving flash flood predictions is fundamentally tied to the spatial and temporal representation of rainfall events and the unique geomorphological aspects of watersheds, such as size, shape, and slope [7,14]. The difficulty of representing the spatial heterogeneity of soil properties and land use in physically based hydrological models prompted the hydrological community to adopt empirical models [15,16]. The effects of these new methodologies still need to be clarified. Many natural phenomena are difficult to reproduce using mathematical modeling, since it is difficult to associate the applicable physical laws with the phenomena and to estimate the many unknown related parameters. Therefore, empirical models have been proposed in all areas to perform forecasts instead of mathematical models that perform simulations, as shown in hydrology [17,18]. Typically, in a machine learning approach, these models employ known post mortem data to train a specific algorithm and thus generate a forecast model. Neural networks are a widely exploited approach in machine learning, and many different architectures have been proposed in recent decades.
To the best of our knowledge, after a broad literature review and based on surveys such as [9,17,18,19], only three papers have been published using machine learning techniques on sub-hourly rainfall and hydrological data: [20] in Austria, [21] in Brazil, and [22] in Romania. The first two papers proposed a neural network and the third a genetic programming-based hydrological model, using rain gauge and weather radar data for a small and steep basin. However, none of them analyzed the model's performance considering different accumulated rainfall in the input layer. In addition, all of them had to use dimensionality reduction in the training process, and none of them presented a spatial analysis of the trained model.
Two key questions in neural network-based hydrological modeling for small and steep watersheds are as follows: How long must the weather radar time series be (i.e., how much earlier must it start) to provide good forecasts? How can the trained model be understood and explained spatially?
This paper presents a neural network-based short-term hydrological model using weather radar data to predict the river level in a small watershed located in the mountainous region of Rio de Janeiro (Brazil). The algorithms were trained and evaluated for two short-term predictions using observational data: 15 and 120 min. Training was performed with accumulated volumes of rainfall (from 1 to 48 h) derived from weather radar data. Each training instance is then composed of an accumulated volume of rainfall and the resulting river level.
2. Data and Method
The small watershed of the Bengalas river is located in the mountainous region of Rio de Janeiro, in the city of Nova Friburgo (Figure 1). Inadequate urban occupation of the steep hill slopes and floodplain areas makes the city highly prone to disasters triggered by extreme rainfall, such as landslides and flash floods [8]. The drainage area of this watershed is approximately 190 km², with a concentration time of less than 2 h, which is the time taken by the water to flow from the riverhead to the outlet.
The weather radar of Pico do Couto, in the city of Petropolis, is operated by the Department of Airspace Control (DECEA) and covers the Bengalas watershed within a circular range of 50 to 100 km (Figure 1). The radar data employed in this work cover the period from December 2011 to March 2013, with a temporal resolution of 15 min, providing CAPPI (Constant Altitude Plan Position Indicator) images at 3 km altitude with a spatial resolution of approximately 1 km, corresponding to a grid of 500 × 500 points. The quadrant surrounding the watershed corresponds to a mesh of 378 points, as shown in Figure 1. The reflectivity radar data (in dBZ) can be post-processed using the Marshall–Palmer relation [23] to estimate the rainfall rate (mm/h), and thus the accumulated rainfall up to 48 h.
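As an illustration, this conversion and accumulation step can be sketched as follows, assuming the standard Marshall–Palmer coefficients (a = 200, b = 1.6) and a time-ordered stack of 15 min CAPPI fields; the function and variable names are ours and not part of the radar processing chain.

```python
import numpy as np

def dbz_to_rain_rate(dbz, a=200.0, b=1.6):
    """Convert radar reflectivity (dBZ) to rain rate (mm/h) using
    the Marshall-Palmer Z-R relation Z = a * R**b."""
    z = 10.0 ** (dbz / 10.0)          # dBZ -> linear reflectivity Z
    return (z / a) ** (1.0 / b)       # invert Z = a * R**b for R

def accumulate_rainfall(dbz_series, window_hours, step_minutes=15):
    """Accumulate rainfall (mm) over the last `window_hours` from a
    time-ordered stack of CAPPI reflectivity fields (time, y, x)."""
    steps = int(window_hours * 60 / step_minutes)
    rates = dbz_to_rain_rate(dbz_series[-steps:])      # mm/h per 15 min field
    return rates.sum(axis=0) * (step_minutes / 60.0)   # mm accumulated per cell
```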
Empirical or data-oriented models may use a machine learning or a statistical algorithm, which employs known post mortem data for training and validation, thus generating a forecast model. A neural network is a machine learning algorithm inspired by the functioning of the human brain. A feedforward neural network is formed by layers containing basic processing units called neurons, with the data flowing from the input layer (first layer) to the output layer (last layer). Besides the input and output layers, there are one or more hidden layers intended to extract features from the data. Neurons in the hidden and output layers process the data through activation functions to treat the non-linearity of the problem. The most common architecture is the multilayer perceptron (MLP), a feedforward neural network with fully connected neurons between layers and nonlinear activation functions, such as the one proposed in this work. The standard MLP architecture with one hidden layer is shown in Figure 2. An MLP can approximate any function to any degree of accuracy, depending on the number of neurons in the hidden layer [24]. It is one of the most common neural networks in hydrological modeling and prediction problems.
Data flow from the input layer to the hidden layers and then to the output layer by means of neuron connections, each with a weight and a bias that are estimated throughout the training of the neural network. In each training iteration, a sample is input to the network, as the training data are divided into batches of samples. An epoch is completed when all training samples have been input to the network, and training proceeds over hundreds or thousands of epochs. Every time a batch of samples is input to the network, the corresponding estimation error is calculated by a loss function, using as reference the true value of each sample, which is usually the observed value. A backpropagation scheme adjusts the neuron weights and biases in order to minimize the error using the gradient of the loss function, which is backpropagated through the layers after every batch. Once the training phase is completed, a validation phase similar to the training phase is performed using a different set of input data, and the neural network is then expected to have good generalization ability to perform estimations/predictions on completely new data [25,26].
The proposed MLP was trained to predict the river water level at the watershed outlet, which is the lowest point of the watershed. The true/observed data for the training and validation phases were provided by the Conselheiro Paulino hydrological monitoring station, operated by the State Environmental Institute (INEA), which provides data at a 15 min resolution.
An MLP was implemented in Python using the Keras library. Training, validation, and testing were performed for 15 and 120 min predictions, thus generating two prediction models. The neural network was configured with three hidden layers of 120, 50, and 10 neurons and a single-neuron output layer. This MLP topology and its hyperparameters were chosen by experimentation, as shown in Table 1. Input data were normalized to the [0, 1] interval to accelerate the learning process. The training phase stops according to a minimum error-rate threshold, and therefore the number of epochs varies for each training run. The adopted validation scheme partitioned the input data by randomly assigning 80% of the instances to training, 10% to validation, and 10% to the neural network test. Instances correspond to 3 km CAPPI image cutouts with a 15 min resolution.
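A minimal Keras sketch of this setup is given below. The layer sizes, the [0, 1] scaling of the inputs and the 80/10/10 random split follow the text; the activation functions, the optimizer and the early-stopping callback are our assumptions, since only the layer sizes and the error-threshold stopping rule are reported.

```python
from sklearn.model_selection import train_test_split
from tensorflow import keras

def build_mlp(n_inputs):
    """MLP with the reported topology: 120, 50 and 10 hidden neurons, 1 output."""
    model = keras.Sequential([
        keras.Input(shape=(n_inputs,)),
        keras.layers.Dense(120, activation="relu"),
        keras.layers.Dense(50, activation="relu"),
        keras.layers.Dense(10, activation="relu"),
        keras.layers.Dense(1, activation="linear"),   # predicted river level
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

def train_model(X, y):
    """X: CAPPI cutouts flattened to (n_samples, n_cells), y: river level,
    both already scaled to the [0, 1] interval. 80/10/10 random split."""
    X_train, X_rest, y_train, y_rest = train_test_split(X, y, train_size=0.8)
    X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5)
    model = build_mlp(X.shape[1])
    # Stop when the validation error no longer improves (the paper uses a
    # minimum error-rate threshold; this callback is an approximation of it).
    stop = keras.callbacks.EarlyStopping(monitor="val_loss", patience=50,
                                         restore_best_weights=True)
    model.fit(X_train, y_train, validation_data=(X_val, y_val),
              epochs=5000, batch_size=32, callbacks=[stop], verbose=0)
    return model, (X_test, y_test)
```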
The experiments refer to the proposed neural network predictions of the river level at the outlet of the Bengalas river watershed (Conselheiro Paulino station). In the 15 min model, current radar data at time t are used to predict the river level at time t + 1 (15 min ahead), and in the 120 min model, at time t + 8 (120 min ahead). Both models were generated by the neural network with the same architecture, hyperparameters, and partition scheme, and their prediction performance was evaluated for accumulated rainfall volumes of 1, 2, 6, 12, 24, and 48 h.
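The pairing of inputs and targets for the two lead times can be sketched as follows (the function and variable names are illustrative):

```python
def make_supervised_pairs(rain_acc, river_level, lead_steps):
    """Pair each accumulated-rainfall field at time t with the observed
    river level `lead_steps` time steps ahead (1 step = 15 min).

    rain_acc:    array (n_times, n_cells) of accumulated rainfall per cell
    river_level: array (n_times,) of observed levels at the outlet
    lead_steps:  1 for the 15 min model, 8 for the 120 min model
    """
    X = rain_acc[:-lead_steps]
    y = river_level[lead_steps:]
    return X, y

# e.g. X15,  y15  = make_supervised_pairs(rain_acc, level, lead_steps=1)
#      X120, y120 = make_supervised_pairs(rain_acc, level, lead_steps=8)
```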
Two metrics were used to evaluate the predictive performance of the proposed neural models: the Nash–Sutcliffe efficiency index (NSE) [27,28] and the root mean square error (RMSE) [27,28].
On the one hand, in hydrology, the NSE (Nash–Sutcliffe efficiency) is a widely utilized statistical measure that assesses the accuracy of hydrological models by comparing simulated and observed data. It considers the model's estimation error relative to the mean of the observed data. On the other hand, in machine learning, the RMSE is a commonly employed metric to assess the accuracy of predictive models. It is the square root of the average squared difference between the predicted and observed values and measures the average magnitude of the errors.
The respective formulas for calculating the two metrics are given by Equations (1) and (2):

$$\mathrm{NSE} = 1 - \frac{\sum_{t=1}^{T}\left(O_t - P_t\right)^2}{\sum_{t=1}^{T}\left(O_t - \bar{O}\right)^2} \quad (1)$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{T}\sum_{t=1}^{T}\left(O_t - P_t\right)^2} \quad (2)$$

where $T$ is the total number of timestamps, $P_t$ is the predicted river level value at timestamp $t$, $O_t$ is the observed river level at timestamp $t$, and $\bar{O}$ is the observed water level mean.
The NSE compares the predictive performance of the proposed neural model with the predictive skill of the observed water level mean. This index ranges from −∞ to 1. NSE values lower than, equal to, or higher than zero indicate that the proposed neural model is, respectively, a worse, equal, or better predictor than the observed water level mean. In turn, the RMSE uses the Euclidean distance between the observed water level values and the corresponding predicted values to assess the quality of the predictions. Thus, the closer the NSE is to 1 and the closer the RMSE is to zero, the better the predictive performance of the neural model.
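In code, the two metrics of Equations (1) and (2) can be written as plain NumPy functions:

```python
import numpy as np

def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of the model error
    variance to the variance of the observations around their mean."""
    observed, predicted = np.asarray(observed), np.asarray(predicted)
    return 1.0 - np.sum((observed - predicted) ** 2) / \
                 np.sum((observed - observed.mean()) ** 2)

def rmse(observed, predicted):
    """Root mean square error between observed and predicted levels."""
    observed, predicted = np.asarray(observed), np.asarray(predicted)
    return np.sqrt(np.mean((observed - predicted) ** 2))
```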
3. Results and Discussion
Table 2 presents the prediction tests that were carried out and the corresponding prediction performance metrics, NSE and RMSE. All tests were conducted under exactly the same conditions of training–test data splitting.
The forecast water level time series only achieved successful performance (i.e., Nash–Sutcliffe model efficiency coefficient (NSE) > 0.7) when the data inputs considered at least 2 h of accumulated rainfall. This finding suggests a physical association between the temporal length of the input data and the watershed time of concentration. For extended periods of accumulated rainfall (>12 h), the framework reached considerably higher performance levels (i.e., NSE > 0.85), which may be related to the ability of the ANN to capture the subsurface response as well as past soil moisture states in the watershed. After many days of steady moderate rainfall, the higher soil moisture content is expected to decrease the surface permeability, so most of the subsequent rainfall does not infiltrate the soil. Under such moderate rainfall rates, river levels in a wet watershed can rise in a way similar to the effect of a higher rainfall rate on dry land. Therefore, operational flash flood forecast systems [29] must employ both the antecedent rainfall amount and the soil moisture state threshold in order to provide timely forecasts for early warnings, as is the case in the approach proposed here.
Figure 3 shows the observed and predicted river levels for the 15 and 120 min forecasts, both obtained by the neural network trained with a 12 h accumulated rainfall volume, between 29 December 2011 and 8 January 2012, when the highest river level was observed.
Figure 4 shows the scatter plots of the predicted and observed values of the Bengalas river level (in meters) at the watershed outlet for the 15 and 120 min forecasts, which show a good correlation, mainly for low and medium values.
For comparison, a smaller cutout of the CAPPI image, considering only the radar cell at the watershed outlet and its eight neighbors, degraded the RMSE/NSE indexes to / for the 15 min and to / for the 120 min ahead forecasts.
A set of prediction tests was performed for the 12 h accumulated rainfall case, using a 100-sample validation scheme. Data were randomly split into 100 datasets, all of them following the rule of 80% for training and cross-validation and 20% for testing. For the 15 min ahead forecast, the resulting average RMSE was 0.0403, with a standard deviation of 0.0056, and the average NSE was 0.8170, with a standard deviation of 0.0554. Using the same validation scheme for the 120 min ahead forecast, the average RMSE was 0.04031, with a standard deviation of 0.0058, and the average NSE was 0.8122, with a standard deviation of 0.0613. These results show the robustness of both models (15 and 120 min).
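A sketch of this 100-sample validation scheme is given below, reusing the build_mlp, rmse and nse functions from the earlier sketches; the internal validation fraction and the training settings are assumptions made for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

def repeated_evaluation(X, y, n_runs=100, test_size=0.2):
    """Train and test the MLP on n_runs independent random 80/20 splits,
    returning the mean and standard deviation of RMSE and NSE."""
    rmses, nses = [], []
    for seed in range(n_runs):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, random_state=seed)
        model = build_mlp(X.shape[1])                   # sketch from Section 2
        model.fit(X_tr, y_tr, validation_split=0.125,   # cross validation within the 80%
                  epochs=500, batch_size=32, verbose=0)
        y_hat = model.predict(X_te, verbose=0).ravel()
        rmses.append(rmse(y_te, y_hat))
        nses.append(nse(y_te, y_hat))
    return (np.mean(rmses), np.std(rmses)), (np.mean(nses), np.std(nses))
```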
In general, empirical hydrological models require fewer data inputs than physically based distributed hydrological models. Recent studies [16] show that machine learning approaches can outperform conventional methods for modeling the river level at the outlet when a single forecast point is suitable and desirable as the model outcome. For the watershed considered in this paper, [8] reached an NSE of 0.56 after an extensive calibration process for a 250 m gridded model, a value lower than those presented in this paper.
After the temporal analysis of the results, we investigated whether the input layer weights of a trained MLP model such as ours can provide information on the hydrological content. In our model, each entry of the MLP corresponds to a cell of the radar data and is associated with a geographic coordinate; therefore, the main objective of the weight analysis is to verify the degree of correlation between the geographic positions and the hydrological features of the watershed.
The first analysis considered the maximum weight for each input cell (Figure 5) and the smallest horizontal distance between the center point of the radar cell and the drainage network. As shown in the plots of Figure 6, no relevant correlation was found. One can note from Figure 5 that grid cells with a high maximum weight are not necessarily close to the drainage network.
A second analysis compared the minimum, maximum, average, and sum of the weight values with the HAND (height above nearest drainage) terrain model, which corresponds to a topographical normalization of the landscape that takes the digital terrain model (DTM) as input and provides as output a new normalized DTM, which can be classified according to the vertical distances to the nearest watercourses [30]. According to Figure 7, no direct relation between HAND and the input cell weights was found.
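The weight extraction behind these analyses can be sketched as follows, assuming the Keras model from Section 2; the per-cell terrain metric (horizontal distance to the drainage network or HAND value) and the use of a Pearson correlation are illustrative choices.

```python
from scipy.stats import pearsonr

def input_weight_stats(model):
    """Per-input-cell statistics of the first hidden layer's kernel.
    Each row of the kernel corresponds to one radar grid cell."""
    first = next(layer for layer in model.layers if layer.get_weights())
    W = first.get_weights()[0]            # shape: (n_cells, n_hidden_neurons)
    return {"min": W.min(axis=1), "max": W.max(axis=1),
            "mean": W.mean(axis=1), "sum": W.sum(axis=1)}

def correlate_with_terrain(model, cell_metric):
    """Pearson correlation of each weight statistic against a per-cell terrain
    metric, e.g. horizontal distance to the drainage network or HAND value."""
    return {name: pearsonr(stat, cell_metric)
            for name, stat in input_weight_stats(model).items()}
```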
4. Conclusions
A neural network-based hydrological model was presented for short-term predictions of the river level at the outlet of a watershed using weather radar data. The considered watershed is located in the mountainous region of Rio de Janeiro (Brazil), in the city of Nova Friburgo, and is subject to flash floods. In 2011, a major flood affected this city of more than 190,000 inhabitants, causing more than 900 deaths. Therefore, flood risk mitigation in this watershed is a major issue, but there is no operational warning system. In this scope, the proposed approach is a data-oriented deep learning model that performs short-term predictions. The algorithms were trained and evaluated for two short-term predictions (15 and 120 min) using accumulated volumes of rainfall (from 1 to 48 h) derived from weather radar data and the river level at the outlet.
The proposed methodology using weather radar data with a neural network to predict the river level at the outlet of a watershed shows good prediction performance in most cases, for instance, a Nash–Sutcliffe model efficiency coefficient (NSE) of for a 15 min forecast, and NSE of for a 120 min forecast.
Even in a watershed with a concentration time of less than 2 h, accumulated rainfall data covering a more extended period significantly improved the neural network performance. This finding suggests that the algorithm emulates physical patterns of the system, which may be related to antecedent soil moisture conditions.
In addition, using only the radar grid cells closest to the target degrades the prediction performance; therefore, it is convenient to consider a wider coverage of cells in order to encompass the entire watershed.
From a spatial point of view, no direct relations were found between the input cell weights and the horizontal or vertical distances to the drainage network.
Three limitations of this research are that we explored only one watershed, used a relatively short time series, and considered only rainfall and river level data in the input layer. As future work, we intend to extend the time series to allow the neural network to learn even more patterns and thus improve the prediction performance, and to compare results from different watersheds with similar properties: small size and steep slopes.
Finally, to reduce potential sources of error, in future studies we aim to incorporate complementary physical data, such as slopes, land use, and drainage directions, into the input layer of the empirical model.