This section presents the datasets and a concise analysis of them, followed by an explanation of the methodology. The data analysis focused on two time series, water depth and water discharge, gathered over multiple years from sensor stations along the river. The discussion that follows outlines the analytical procedures and frameworks applied to the hydrological dynamics under investigation. The water depth series describes variations in depth, enabling an assessment of riverbed topography changes at the sampled stations. The water discharge series, derived from additional river stations, provides insights into flow dynamics influenced by precipitation, snowmelt, and other hydrological factors. Descriptive statistics and visualization techniques were used to uncover patterns and correlations within these datasets that can support forecasts of river navigability.

The analysis is based on daily measurements from two distinct critical section stations, each representing a different segment of the river. The dataset spans from 1 January 1988 to 12 May 2022 and comprises 12,294 and 12,255 records for the respective stations. Missing values in the daily series were handled with the forward fill method to ensure data consistency. The time series of water discharge and water depth are shown in
Figure 4 and the main indicators of a descriptive analysis are reported in
Table 1.
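As an illustration of the forward fill step and of the kind of descriptive indicators mentioned above, a minimal sketch follows. The function names and the sample values are hypothetical and do not come from the study's dataset or code; the sketch only assumes that a missing daily reading is marked as `None`.

```python
from statistics import mean, stdev


def forward_fill(values):
    """Replace each None with the most recent preceding observation."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        # Entries before the first observation stay None.
        filled.append(last)
    return filled


def describe(values):
    """Basic descriptive indicators for a gap-free series."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "std": stdev(values),
    }


# Hypothetical daily water depths in cm; None marks a missing reading.
depth_cm = [210.0, None, None, 195.0, 190.0, None, 205.0]
filled = forward_fill(depth_cm)
summary = describe(filled)
```

Forward fill keeps the series on a regular daily grid, at the cost of repeating the last known value across each gap, so long gaps flatten the series locally.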
The stationarity of the time series was assessed with the Augmented Dickey–Fuller (ADF) test.
A seasonal decomposition revealed a distinct 6-month pattern in river depths (see
Figure 8). The trend indicated stable depths with a marginal increase. River depth and discharge were higher in May, June, November, and December, which points to a clear seasonal pattern in these months.
This exploratory analysis of water depth and discharge lays the groundwork for designing the machine learning models intended to predict navigability, as described in the following subsections.
3.1. Basic Probabilistic Method for Navigability Risks
A first approach to evaluating the navigability of the Po River is statistical. The historical series of daily water depth surveys, collected over more than 40 years and coupled with the discharge data available at each monitoring station, can be used to estimate the probability of navigability at several critical sections along the Po River. That probability, combined with the discharge forecasts provided by the EWA system, which is based on hydrological and hydraulic models (cf.
Section 2), provides a basic method to compute the probability of navigability for each critical section for the next 10–15 days.
Figure 9 shows examples of navigability risks for the vessel classes (based on draught) given the discharge classes. The probability was computed based on the percentage of occurrences of the event
water depth > minimum draught in the historical data. The navigability risk of a stretch is the worst navigability risk value among those obtained for its critical sections. On the one hand, this approach draws on a large amount of data and can establish a good overall statistical correlation between discharge and water depth at the critical sections.
On the other hand, this statistic does not account for morphological variations at each critical point caused by artificial (dredging) or natural (flood) actions. To reduce the uncertainty related to those factors, methods that evaluate the daily probability based on the values of the previous days should be developed.
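The frequency-based estimate described in this subsection can be sketched as follows. This is only an illustration under stated assumptions: the discharge class boundaries, the function names, and the sample records are hypothetical, and the sketch works with the probability of navigability, so the "worst" value for a stretch is taken as the minimum over its critical sections.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical discharge class boundaries in m^3/s:
# class 0 is q < 400, class 1 is 400 <= q < 800, and so on.
DISCHARGE_EDGES = [400.0, 800.0, 1200.0]


def discharge_class(q):
    """Index of the discharge class that contains q."""
    return bisect_right(DISCHARGE_EDGES, q)


def navigability_probability(history, min_draught_cm):
    """history: (discharge, depth_cm) daily pairs for one critical section.

    Returns {discharge_class: share of days with depth > min_draught_cm}.
    """
    hits, totals = defaultdict(int), defaultdict(int)
    for q, depth in history:
        c = discharge_class(q)
        totals[c] += 1
        if depth > min_draught_cm:
            hits[c] += 1
    return {c: hits[c] / totals[c] for c in totals}


def stretch_probability(section_tables, forecast_q):
    """Lowest navigability probability among the sections of a stretch."""
    c = discharge_class(forecast_q)
    return min(table[c] for table in section_tables)


# Hypothetical historical records for two critical sections of one stretch.
section_a = [(350, 150), (380, 170), (450, 200), (500, 150), (900, 260), (950, 240)]
section_b = [(350, 140), (390, 150), (420, 180), (600, 190), (1000, 250), (1100, 150)]

table_a = navigability_probability(section_a, min_draught_cm=160)
table_b = navigability_probability(section_b, min_draught_cm=160)
worst = stretch_probability([table_a, table_b], forecast_q=950)
```

In this sketch a discharge class with no historical record raises a `KeyError` at lookup time; a real implementation would have to decide how to treat such classes.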
3.2. Deep Learning Method for Water Depth Predictions
Deep learning methods are being tested to compare with, or complement, the risk results obtained with the previous method. In particular, the first objective of the workplan was to build a predictor for each critical point of each stretch of the river. Thus, for each vessel class, given a forecast of the water discharge at each point for the next 10–15 days, the navigability of a stretch is determined by the worst water depth prediction among its critical sections.
The Long Short-Term Memory (LSTM) neural network [
3] has been selected to generate the models. Indeed, it is well known that the LSTM is effective for modeling and predicting sequences, given its ability to retain information over extended periods. Other more recent methods, like Transformer models [
4], are also employed for numerical time series analysis, but as they are especially effective in handling sequential data with complex dependencies (e.g., many variables or global dependencies), they were not the first choice for these experiments. Furthermore, recent works such as [
6] have demonstrated that for river water level prediction, LSTM models trained on water depth measurements and time series alone may outperform other artificial neural network architectures that correlate weather data (e.g., temperature and humidity) with water level observations. This may be because the river’s water discharge prediction at some locations is the result of a much more complex analysis that also considers non-local hydrological and climatic aspects.
Figure 10 illustrates the problem statement of the LSTM method.
From the water depth forecast series, one can easily obtain the prediction of navigability for each type of vessel, described in
Section 2. However, the confidence level of such predictions should be carefully analyzed to avoid both false positive risks (e.g., a vessel is not allowed to move, but the navigability conditions then turn out to be safe) and false negative risks (e.g., a vessel starts navigation in unsafe conditions).
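To make the step from a depth forecast to a navigability decision concrete, the sketch below thresholds predicted and observed depths against a vessel draught and counts the four possible outcomes. Following the examples given above, it treats "not navigable" as the positive class; the function names and the sample depths are hypothetical.

```python
def is_navigable(depth_cm, draught_cm):
    """A section is navigable when the water depth exceeds the draught."""
    return depth_cm > draught_cm


def confusion_counts(predicted_depths, observed_depths, draught_cm):
    """Count outcomes with 'not navigable' as the positive class."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for pred, obs in zip(predicted_depths, observed_depths):
        pred_pos = not is_navigable(pred, draught_cm)
        obs_pos = not is_navigable(obs, draught_cm)
        if pred_pos and obs_pos:
            counts["tp"] += 1
        elif pred_pos:
            counts["fp"] += 1  # vessel stopped, conditions were safe
        elif obs_pos:
            counts["fn"] += 1  # vessel allowed, conditions were unsafe
        else:
            counts["tn"] += 1
    return counts


# Hypothetical predicted and observed depths (cm) for a 160 cm draught class.
predicted = [185, 170, 150, 158, 175]
observed = [190, 155, 145, 165, 180]
counts = confusion_counts(predicted, observed, draught_cm=160)
```

With this convention, a model that overestimates depth during a low-water period produces false negatives, which are the more dangerous error for navigation.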
The research questions posed to the ML-based forecast experiments are as follows:
[RQ1] How does the LSTM navigability predictor perform on the available real data?
[RQ2] What is the accuracy of the navigability results of the best predictor built?
[RQ3] How generalizable is a predictor to other critical sections or stretches of the river?
The TensorFlow Python implementation of LSTM was used to build the code for training and testing the models. As is common practice for this type of software development, the best LSTM configuration was determined empirically. Model development iterates on the quality of the predictions on the validation data, measured with various metrics, including accuracy (ACC score function) and F1. The general process followed is illustrated in
Figure 11.
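For reference, the two metrics named above can be computed from binary navigability labels as in the following sketch. It is a plain-Python illustration with hypothetical labels, not the scoring code used in the experiments.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=True):
    """Harmonic mean of precision and recall for the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: True means the positive class was observed/predicted.
y_true = [True, True, False, True, False, True]
y_pred = [True, False, False, True, True, True]

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
```

Accuracy can look high when one class dominates the validation data, which is one reason to track F1 alongside it.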
To answer [RQ1], this process was applied for several critical sections of the same stretch and of different stretches, selected based on the quality of the input data (e.g., completeness). Among the experiments, the setting in
Table 2 led to the most accurate model for more than one critical section.
The model is trained on the two input series so that it learns how their values correspond. The resulting model takes an input sequence of a given length from one of the two series (water discharge in our experiments) and predicts the corresponding values of the other series. A trained LSTM model could be used to predict future values of both the water discharge and the water depth series. In this study, however, the prediction of discharge is provided by the EWA system, which, as explained in
Section 2, is based on a numerical model.
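One way to prepare such training pairs is to slide a fixed-length window over the two aligned daily series, as sketched below. The window length, the function name, and the sample values are hypothetical (the actual settings are those listed in Table 2), and pairing each discharge window with the depths of the same days is only one possible reading of the setup.

```python
def make_windows(discharge, depth, window):
    """Pair each run of `window` discharge values with same-day depths."""
    if len(discharge) != len(depth):
        raise ValueError("the two series must be aligned day by day")
    inputs, targets = [], []
    for start in range(len(discharge) - window + 1):
        inputs.append(discharge[start:start + window])
        targets.append(depth[start:start + window])
    return inputs, targets


# Hypothetical aligned daily series: discharge (m^3/s) and depth (cm).
discharge = [500, 520, 480, 450, 470]
depth = [200, 205, 195, 185, 190]

inputs, targets = make_windows(discharge, depth, window=3)
```

Each `inputs[i]` would then be one input sequence for the network and `targets[i]` the depth values it is asked to reproduce.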
A preliminary answer to [RQ2] is provided with the following results. An example of the result of a critical section in the last 450 days of the dataset is shown in
Figure 12. Please note that, as specified in
Table 2, the KFold cross-validation technique was used: the dataset was divided into “k = 20” subsets, and the model was trained and evaluated 20 times, using a different subset as the validation set in each iteration. The best model was then chosen based on accuracy scores. For this critical section, the real water depth values of the last period were of particular interest because they are the lowest of the whole historical series, hence the choice to use the KFold cross-validation technique.
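The fold structure described here can be sketched with contiguous, near-equal index blocks, as below. This is an illustrative stand-in written in plain Python; whether the experiments shuffled the samples before splitting is not stated, so the sketch assumes no shuffling, and the function name is hypothetical.

```python
def kfold_indices(n_samples, k):
    """Yield (train_idx, val_idx) for k contiguous, near-equal folds."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        stop = start + size
        val_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, val_idx
        start = stop


# Record count of the first station, split into k = 20 folds.
folds = list(kfold_indices(12294, 20))
val_sizes = [len(val_idx) for _, val_idx in folds]
```

Because every sample serves as validation data exactly once, the fold that covers the final low-water period is also evaluated, which a single chronological split placed elsewhere would miss.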
Figure 12 shows the vessel draughts and the F1 and accuracy values for the 140, 160, and 180 cm classes. These lowest classes are the most critical for the confidence level of the non-navigability predictions, as the model overestimates many of the real values and yields false negatives for the last 100 days (from February to May 2022). Indeed, the considered dataset contains no comparable decrease in water depth before this period. Expert knowledge of the management of the infrastructure and/or a deeper investigation into weather and environmental aspects may help to better interpret these results.
Based on data analysis alone, better navigability risk estimates for the same vessel classes were obtained for a critical section of another stretch of the river, as shown in
Figure 13. Note that low values occur more frequently in the validation dataset for this section.
To address [RQ3], further tests are needed to find models that can adapt to different critical sections and/or stretches. A first investigation of the datasets leads to the hypothesis that critical sections of the same stretch generally feature similar shapes (values and variability). However, critical sections of different stretches may feature patterns different from those shown in
Figure 12 and
Figure 13. The full normalized series of water discharge (blue line) and water depth (brown line) are displayed in
Figure 14.
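The normalization method is not specified in the text. Assuming a min-max rescaling to the interval [0, 1], which is a common choice for plotting two series with different units on one axis, the step could look like the following sketch; the function name and sample values are hypothetical.

```python
def min_max_normalize(series):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]  # constant series: no spread to rescale
    return [(v - lo) / (hi - lo) for v in series]


# Hypothetical discharge (m^3/s) and depth (cm) samples.
discharge_norm = min_max_normalize([400, 800, 1200, 600])
depth_norm = min_max_normalize([150, 250, 200])
```

After rescaling, the shapes of the two series can be compared directly even though discharge and depth are measured in different units.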