Article

Advanced Predictive Modeling for Dam Occupancy Using Historical and Meteorological Data

1 Business School, Kocaeli University, Kocaeli 41001, Turkey
2 Business School, Sakarya University, Sakarya 54050, Turkey
3 Industrial Engineering Department, Istanbul Medeniyet University, Istanbul 34700, Turkey
* Author to whom correspondence should be addressed.
Sustainability 2024, 16(17), 7696; https://doi.org/10.3390/su16177696
Submission received: 11 July 2024 / Revised: 24 August 2024 / Accepted: 25 August 2024 / Published: 4 September 2024

Abstract

Dams significantly impact the environment, industries, residential areas, and agriculture. Efficient dam management can mitigate negative impacts and enhance benefits such as flood and drought reduction, energy efficiency, water access, and improved irrigation. This study tackles the critical issue of precisely predicting dam occupancy levels in order to contribute to sustainable water management by enabling efficient water allocation among sectors, proactive drought management, controlled flood risk mitigation, and preservation of downstream ecological integrity. Our research suggests that combining physical models of water inflow and outflow (such as evapotranspiration computed with the Penman–Monteith equation, along with parameters like water consumption, solar radiation, and rainfall) with data-driven models based on historical reservoir data is crucial for accurately predicting occupancy levels. We implemented various prediction models, including Random Forest, Extra Trees, Long Short-Term Memory, Orthogonal Matching Pursuit CV, and Lasso Lars CV. To strengthen our proposed model with robust evidence, we conducted statistical tests on the mean absolute percentage errors of the models. Consequently, we demonstrated the impact of physical model parameters on prediction performance and identified the best method for predicting dam occupancy levels by comparing it with findings from the scientific literature.

1. Introduction

Integrated water resource management (IWRM) promotes a coordinated and holistic approach to managing water resources, considering the interconnections among water, land, and related resources. Dams, by their very function, affect the environment, industries, residential areas, and agricultural lands. Efficient management of a dam can minimize its negative impacts and enhance its positive effects on all these elements. Environmental benefits of efficient management include reducing adverse effects like floods and droughts. For industries and residential areas, it improves energy efficiency and access to clean water. For agricultural lands, it provides more efficient irrigation opportunities. To achieve these benefits, IWRM must be implemented by measuring and forecasting dam levels. This study examines the optimal method for precisely predicting dam occupancy levels in the context of IWRM. We aim to enhance water and environmental sustainability by increasing the operational efficiency of dams, and to support business sustainability by proposing an efficient algorithm that uses less energy and requires less hardware than its alternatives in the scientific literature. To achieve our goal, we first propose combining physical calculations with data-driven models to predict dam occupancy levels accurately and precisely. Second, we implement AI algorithms, such as Extra Trees and deep learning, to enhance energy and hardware efficiency. This approach enables intervention strategies to be applied more precisely in scenarios such as droughts and floods, and supports the efficient preservation of downstream ecological integrity.
The occupancy rates of seven different dams in Istanbul were estimated over the past five years. The reason for estimating occupancy is that it normalizes varying water levels across different dams. The inputs for the prediction models consisted of weather data that have a high correlation with occupancy, Penman–Monteith equation parameters [1], industrial and residential water consumption, and historical reservoir data. Thus, reservoir level, rainfall, and evapotranspiration were evaluated during the prediction process. The prediction performance of models with weather data and models without weather data was compared to understand the effect of weather data on the prediction.
We utilized AI algorithms with the proposed dataset to determine the most effective method through comparative analysis. The algorithms considered in this assessment include Orthogonal Matching Pursuit CV, Lasso Lars CV, Extra Trees, Random Forest, Ridge CV, Transformed Target Regressor, and LSTM. Our objective in choosing these algorithms was to assess their performance in water level prediction compared to alternative methods that we believe may offer superior results. To conduct this evaluation from multiple perspectives, we formulated four prediction scenarios: daily, weekly, bi-weekly (15-day periods), and monthly (30-day periods). We assessed the performance of these scenarios using statistical hypothesis tests.
The structure of this paper is as follows: first, the research background on dam level prediction studies and methods for predicting water levels is presented. Next, an in-depth introduction to selected dams in Istanbul is provided. This is followed by a detailed case study, results, and discussion. Finally, this paper concludes with a summary of key findings.

2. Literature Review

This section summarizes current research at the intersection of dam reservoir challenges and water level prediction methods. It reviews the existing literature on the subject and highlights research gaps that form the foundation for this and future work.
Water is a highly valuable resource that significantly contributes to both the environment and the economy. Therefore, the levels of water resources, such as reservoir levels, directly impact total added value [2] and the net present value of current and future incomes [3]. Water resources serve multiple purposes, including agriculture, energy generation, fish farming, and drinking water [4]. Consequently, both reservoir levels and their intended uses influence how these resources are utilized. Scientific research on water resources thus focuses on areas such as safety, water level measurement technologies, integrated water resource management, and the effects on the environment and agriculture, with a particular emphasis on dams. From a safety perspective, environmental risks and structural health issues, such as dam floods, concrete displacement, and seepage, are investigated at the water level [5,6,7,8] and in terms of water temperature [9,10]. Measuring the water level enables IWRM in dams, as maintaining the water level is essential not only for ensuring the efficiency of dam operations but also for effective reservoir management and a reliable freshwater supply [11]. The measurement of water levels relies on Internet of Things (IoT) technology, which involves placing sensors in a dam [12,13], or on remote sensing using artificial intelligence (AI) with satellite imagery [11,14,15]. While these technologies are each essential for improving dam management, the next step is to predict the water level and develop intervention strategies that build on those measurements [16]. The prediction of water levels significantly contributes to IWRM [17]. In particular, the prediction of climatic factors such as drought and excessive rainfall is required to develop effective intervention strategies [18,19]. Overall, the strength of the contribution is related to the accuracy of the prediction methods used [20].
The accuracy of predictions depends on the algorithms used, the input parameters, the climate, and the behavior of the reservoir as a system.
In the context of water level prediction, inflow, outflow, historical data of the reservoir, maximum temperature, visibility, humidity, wind speed, cloud cover, and rainfall are utilized as input parameters for models [20,21,22,23,24,25]. Additionally, parameters such as evapotranspiration, solar radiation, and ambient temperature affect the water level of dams [1]. These parameters are utilized by various ML algorithms, including artificial neural networks [21,22], Long Short-Term Memory [16,24], nonlinear autoregressive models [24], multiple linear regression [22], and support vector machines [23]. Overall, the water level is predicted with a mean absolute percentage error (MAPE) between 1% and 12% [11,17,21,23], as presented in Table 1. Models provide predictions on a daily or monthly basis, with horizons of one to thirteen days and one month [17,23,24,25]. Three different input combinations are commonly used. The first combination includes only historical reservoir data [22]. The second uses only weather data [21]. The third combines both weather data and historical reservoir data [21,22]. Because dam water levels are commonly autocorrelated, historical reservoir data are an indispensable input parameter; however, predicting the future states of a dam also requires information on weather events such as rainfall variability and on weather characteristics with environmental effects such as droughts [26,27].
The reviewed studies predicted dam water levels using time series analysis. The literature reveals a lack of research on predicting dam water levels using a hybrid model that integrates a physical model of a dam’s inflow and outflow, including evapotranspiration, with data-driven models incorporating weather data. This study aims to be the first to address this gap.
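Since MAPE is the metric used to compare models throughout, a minimal sketch of how it can be computed may help. The function name and the toy occupancy values are illustrative assumptions, not data from the study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        # MAPE is undefined when an observed value is zero.
        raise ValueError("actual values must be non-zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical occupancy percentages, for illustration only.
observed = [50.0, 40.0, 80.0]
forecast = [49.0, 42.0, 80.0]
example_mape = mape(observed, forecast)
```

Note that the zero check matters for this application: a dam that approaches zero occupancy makes the percentage error unstable, so such days would need separate handling.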

3. Dams of Istanbul

We considered seven dams in Istanbul as follows: Ömerli, Darlık, Elmalı, Terkos, Alibey, Büyükçekmece, and Sazlıdere. The Ömerli Dam (in Figure 1a), the largest in Istanbul, was constructed primarily to supply the city’s drinking water. This dam, which is of the earth-fill type, has a height of 52 m from the riverbed. At the normal water level, the reservoir volume is 386.50 hm³, and the reservoir area is 23.10 km². It provides 180 hm³ of drinking and utility water annually. Darlık Dam (in Figure 1b) was built to supply drinking water, with a height of 73 m from the riverbed. At the normal water level, the reservoir volume is 107 hm³, and the reservoir area is 5.56 km². The dam provides 108 hm³ of drinking and utility water annually. Elmalı Dam (in Figure 1c) was built to supply drinking, utility, and industrial water. The dam, which is of the concrete type, has a body volume of 103,000 m³ and a height of 42.5 m from the riverbed. At the normal water level, the reservoir volume is 10 hm³, and the reservoir area is 2.80 km². It ensures the supply of 10 hm³ of drinking water annually. Terkos Dam (in Figure 1d) was constructed as a concrete fill-type dam to supply drinking water. The height of the dam from the riverbed is 8.80 m. At the normal water level, the reservoir volume is 186.80 hm³, and the reservoir area is 30.40 km².
Alibey Dam (in Figure 2a) was constructed as an earth-fill type dam to supply drinking, utility, and industrial water. The dam has a body volume of 1,930,000 m³ and a height of 30 m from the riverbed. At the normal water level, the reservoir volume is 66.80 hm³, and the reservoir area is 4.66 km². It provides 39 hm³ of drinking water annually. Büyükçekmece Dam (in Figure 2b) was constructed as an earth-fill type dam to supply drinking, utility, and industrial water. The dam has a body volume of 2,020,000 m³ and a height of 13 m from the riverbed. At the normal water level, the reservoir volume is 161.61 hm³, and the reservoir area is 43 km². The dam provides 102 hm³ of drinking and utility water annually.
Sazlıdere Dam (in Figure 2c) was constructed to obtain drinking water. The dam, which is of the rock-fill type, has a body volume of 1,880,000 m³. Its height from the riverbed is 48 m. At the normal water level, the reservoir volume is 91.60 hm³, and the reservoir area is 11.81 km². The dam provides 50 hm³ of drinking water annually.
All dams contribute to supplying drinking water, but some dams also provide water for industrial and other utility purposes. Notably, these dams lack hydroelectric plants, and therefore do not contribute to electricity production. As a result, the water from these dams is used directly by consumers, making these dams irreplaceable. Therefore, accurately predicting their water occupancy levels is crucial for effective IWRM. This enables timely intervention strategies to be applied, ensuring better resource management and preventing shortages.
According to the occupancy data shown in Figure 3, some dams are nearing zero occupancy levels, posing a significant risk of drought. Additionally, some dams are observed to be full for several months, which increases the risk of dam flooding in the event of extreme rainfall. As indicated in Figure 3, predictive prevention strategies are necessary for these dams to mitigate the risks of both drought and flooding. The plots in Figure 3 are exceptionally smooth, indicating that sharp fluctuations in the occupancy levels of the dams are rare. The smoothness of plots is beneficial, as AI methods are particularly adept at fitting smooth curves. Consequently, our investigation focuses on identifying key indicators that shed light on water filling and water loss dynamics within the dam.

4. Design of the Dataset

The model’s input data includes weather data, evapotranspiration data, daily water consumption data, and historical reservoir data. In this section, we provide a detailed explanation of each parameter within each dataset and present comprehensive calculations along with Pearson’s correlation values. We chose Pearson’s correlation because of several key advantages. First, it is widely recognized and used across various fields, which enhances the clarity and accessibility of our results. Second, Pearson’s correlation is particularly effective when the normality assumption is met, offering greater efficiency compared with non-parametric methods like Spearman’s or Kendall’s correlations. Third, it is ideal for continuous interval or ratio data, where the differences between values are meaningful.
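As a reminder of what the correlation values measure, the sketch below computes Pearson's coefficient directly from its definition. The function name and the toy series are illustrative assumptions; the study's own tooling is not specified in the text.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values: occupancy falling as a weather driver rises.
driver = [1.0, 2.0, 3.0, 4.0]
occupancy = [80.0, 70.0, 60.0, 50.0]
r_example = pearson_r(driver, occupancy)
```

A value near −1 or 1 indicates a strong linear relationship, while a value near 0 indicates little linear association.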
In the first step, we built a dataset that includes a variety of parameters related to dam occupancy levels. Historical reservoir data, water consumption data, and weather data, collected using industrial-grade sensors, were downloaded from institutional data services. Evapotranspiration ($ET_0$), however, was calculated from weather data and the geographical information of Istanbul. $ET_0$ plays a significant role in predicting water levels in dams by providing critical information about water loss. The evapotranspiration data comprise the evapotranspiration value calculated using the Penman–Monteith method (as described in Equation (1)), as well as the method’s input variables, which include solar radiation, wind speed, and pressure [1]. $ET_0$, the combined process of water evaporation from the soil and transpiration from plants, is influenced by several key factors, including solar radiation, temperature, humidity, and wind speed. $ET_0$ was calculated from the slope of the vapor pressure curve ($\Delta$), the net radiation at the crop surface ($R_n$), the soil heat flux density ($G$), the mean daily air temperature at 2 m height ($T$), the wind speed at 2 m height ($u_2$), the saturation vapor pressure ($e_s$), the actual vapor pressure ($e_a$), the saturation vapor pressure deficit ($e_s - e_a$), and the psychrometric constant ($\gamma$), as given in Equation (1). Evapotranspiration correlates more noticeably with occupancy levels than most weather parameters do.
$$ET_0 = \frac{0.408\,\Delta\,(R_n - G) + \gamma\,\dfrac{900}{T + 273}\,u_2\,(e_s - e_a)}{\Delta + \gamma\,(1 + 0.34\,u_2)} \tag{1}$$
The saturation vapor pressure was calculated based on the daily temperature using Equation (2) below. Similarly, the actual vapor pressure was determined using the dew point, as shown in Equation (3).
$$e_s = 0.6108\,\exp\!\left(\frac{17.27\,T}{T + 273.3}\right) \tag{2}$$
$$e_a = 0.6108\,\exp\!\left(\frac{17.27\,T_{dew}}{T_{dew} + 273.3}\right) \tag{3}$$
The slope of the vapor pressure curve was calculated from the saturation vapor pressure using Equation (4), as shown below.
$$\Delta = \frac{4098\,e_s}{(T + 273.3)^2} \tag{4}$$
To calculate the psychrometric constant, we first determined the atmospheric pressure ($P$) using the latitude ($Z$) of Istanbul, which is 41.0151. Equation (5) represents the atmospheric pressure. In Equation (6), we calculated the psychrometric constant using the specific heat at constant pressure ($C_p$) value of 0.001013.
$$P = 101.3\left(\frac{293 - 0.0065\,Z}{293}\right)^{5.26} \tag{5}$$
$$\gamma = \frac{C_p\,P}{1.5239} \tag{6}$$
The wind speed, net radiation, and temperature were obtained from weather data, while the soil heat flux density was assumed to be zero for the $ET_0$ calculation, as shown in Figure 4. The figure indicates that occupancy levels generally start decreasing in the spring and summer periods as $ET_0$ increases.
Secondly, we analyzed the correlations among weather data, water consumption, $ET_0$, and dam occupancy levels to identify the most influential parameters for accurate prediction. Since $ET_0$ and water consumption contribute to outflow, their relationship with dam occupancy levels is expected. Additionally, weather data has been observed to have a strong correlation with dam water levels [21], making it a crucial factor in our correlation analysis. The weather data consisted of temperature, felt temperature, humidity, dew point, cloud cover, rainfall, snow depth, and daylight duration. Table 2 shows that solar radiation, cloud cover, and daylight duration have a significantly stronger correlation with dam occupancy levels compared with the other weather data. Although water consumption and $ET_0$ data do not exhibit a meaningful correlation with dam occupancy levels, they are included in the dataset because of their relevance to outflow.
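The chain of Equations (1) to (6) can be sketched as a single function. This is a minimal illustration under stated assumptions, not the authors' implementation: the function and parameter names are hypothetical, net radiation is assumed to be in MJ m⁻² day⁻¹ and temperatures in °C, and the sample inputs are invented. The constant `TEMP_OFFSET` is kept at 273.3 as printed in Equations (2) to (4); the FAO-56 reference formulation of Penman–Monteith uses 237.3 in those positions and treats the height term of Equation (5) as elevation above sea level in metres, so both are exposed as arguments.

```python
import math

TEMP_OFFSET = 273.3  # as printed in Equations (2)-(4); FAO-56 uses 237.3
CP = 0.001013        # specific heat at constant pressure, from the text


def vapor_pressure(temp_c, offset=TEMP_OFFSET):
    """Equations (2)/(3): vapor pressure at the given temperature (kPa)."""
    return 0.6108 * math.exp(17.27 * temp_c / (temp_c + offset))


def reference_et0(temp_c, dew_point_c, net_radiation, wind_speed_2m,
                  z=41.0151, soil_heat_flux=0.0, offset=TEMP_OFFSET):
    """Equation (1), with G defaulting to zero as assumed in the text."""
    e_s = vapor_pressure(temp_c, offset)        # Equation (2)
    e_a = vapor_pressure(dew_point_c, offset)   # Equation (3)
    delta = 4098.0 * e_s / (temp_c + offset) ** 2              # Equation (4)
    pressure = 101.3 * ((293.0 - 0.0065 * z) / 293.0) ** 5.26  # Equation (5)
    gamma = CP * pressure / 1.5239                             # Equation (6)
    radiation_term = 0.408 * delta * (net_radiation - soil_heat_flux)
    aerodynamic_term = (gamma * 900.0 / (temp_c + 273.0)
                        * wind_speed_2m * (e_s - e_a))
    return (radiation_term + aerodynamic_term) / (
        delta + gamma * (1.0 + 0.34 * wind_speed_2m))


# Invented sample day: 25 °C, dew point 15 °C, Rn = 15, wind 2 m/s.
et0_as_printed = reference_et0(25.0, 15.0, 15.0, 2.0)
et0_fao_offset = reference_et0(25.0, 15.0, 15.0, 2.0, offset=237.3)
```

Passing `offset=237.3` makes it easy to check how sensitive the resulting $ET_0$ series is to that constant before feeding it to the prediction models.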
Thirdly, we analyzed daily water consumption data and historical reservoir data to identify the autocorrelation that influences occupancy rates. Daily and weekly historical reservoir data exhibited a strong correlation with dam water levels, demonstrating autocorrelation, as illustrated in Table 3. This made it feasible to predict the dam occupancy levels as a time series. To improve prediction accuracy, historical reservoir data were analyzed over periods ranging from 1 to 7 past intervals, with evaluations conducted on 1-, 7-, 15-, and 30-day bases.
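One way to turn the occupancy series into supervised samples with 1 to 7 past intervals is sketched below. The exact windowing used in the study is not spelled out, so the layout here (lags spaced by the prediction horizon) is an assumption, and the function name is hypothetical.

```python
def make_lagged_samples(series, step_days, n_lags=7):
    """Build (features, targets) where each feature row holds the values
    observed 1..n_lags intervals of `step_days` days before the target."""
    if step_days < 1 or n_lags < 1:
        raise ValueError("step_days and n_lags must be positive")
    features, targets = [], []
    first_index = step_days * n_lags  # earliest day with a full history
    for t in range(first_index, len(series)):
        row = [series[t - k * step_days] for k in range(1, n_lags + 1)]
        features.append(row)
        targets.append(series[t])
    return features, targets


# Toy series standing in for daily occupancy; weekly horizon (7 days).
toy_series = list(range(100))
weekly_X, weekly_y = make_lagged_samples(toy_series, step_days=7)
```

Calling the same function with `step_days` set to 1, 15, or 30 would give the daily, bi-weekly, and monthly variants under the same assumption.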
Lastly, based on the correlation values (ranging from −1 to 1, represented by white to dark blue) presented in Table 2 and Table 3, we generated the dataset by incorporating weather data, consumption data, evapotranspiration, and historical reservoir data. From the weather data, we selected solar radiation, dew point, daylight duration, and rainfall, as these factors directly influence outflow and inflow. Despite rainfall showing a weak correlation with occupancy rates, it significantly enhanced prediction performance. As a result, the dataset includes our $ET_0$ calculations and industrial-grade sensor data on weather and dam conditions, which are directly related to dam inflows and outflows, all sourced from institutional data services. It includes daily data spanning approximately five years (1777 days), from 2019 to 2024, with no missing entries. All inputs were confirmed to follow a normal distribution through the KS test, yielding a p-value of 0. Thus, we applied the Z-Score outlier detection method using a Z-value threshold of 3. No outliers were detected in any of the inputs or outputs. As a result, we constructed two datasets. The first dataset included $ET_0$, weather data, and consumption data, which were parameters of the physical model for inflow and outflow, along with historical data based on a data-driven modeling approach. The second dataset included only historical reservoir data relying on a data-driven approach. After constructing the dataset, we split it into 80% for training and 20% for testing during the model development process.
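The outlier screen and the 80/20 split described above can be sketched as follows. The text does not say whether the population or sample standard deviation was used, nor whether the split was chronological or shuffled; this sketch assumes the population standard deviation and a chronological split, and all names are hypothetical.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Return indices whose absolute Z-score exceeds the threshold."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # a constant series has no outliers by this rule
    return [i for i, v in enumerate(values)
            if abs((v - mean) / std) > threshold]


def chronological_split(rows, train_fraction=0.8):
    """Keep time order: the first part trains, the remainder tests."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Toy check of the outlier rule: one value far from twenty equal ones.
toy_values = [10.0] * 20 + [100.0]
flagged = zscore_outliers(toy_values)

# A series as long as the dataset described in the text (1777 days).
train_rows, test_rows = chronological_split(list(range(1777)))
```

A chronological split avoids training on days that come after the test period, which matters for autocorrelated series such as occupancy.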

5. Prediction Models

Because of the autocorrelation in dam occupancy levels, we selected algorithms commonly used for predictive modeling. The algorithms evaluated include Orthogonal Matching Pursuit CV, Lasso Lars CV, Extra Trees, Random Forest, Ridge CV, and Transformed Target Regressor, all of which are designed to mitigate overfitting. Additionally, we used LSTM, which is commonly used in studies for water level prediction. Finally, we compared the performance of the commonly used methods in water level prediction studies with alternative approaches that we believe may offer higher performance. To conduct this comparison from various perspectives, we designed four prediction scenarios, which were daily, weekly, bi-weekly (15-day periods), and monthly (30-day periods). Our approach allowed us to analyze the performance of algorithms across different prediction horizons and identify which algorithms are more suitable for intervention strategies. In the correlation analysis tables, we abbreviated the following terms: weather data (WD), consumption data (CD), evapotranspiration ($ET_0$), and historical reservoir data (HRD).

5.1. Long Short-Term Memory

Long Short-Term Memory (LSTM) networks are a type of Recurrent Neural Network (RNN) designed to capture long-term dependencies in sequential data. This capability makes them well-suited for tasks such as time series forecasting as presented in Table 4.
The average MAPE of the LSTM model varies between 0.5% and 3.5% depending on the prediction horizon, typically remaining below 1% for daily predictions. To achieve this performance, we implemented an LSTM network with the following hidden layers: the first layer has 60 nodes with a ReLU activation function, and the second layer consists of 120 nodes, also with a ReLU activation function. We used the Adam optimizer with a learning rate of 0.015, a beta of 0.9, and an epsilon of $1 \times 10^{-7}$. We used MSE as the loss function during the training process.

5.2. Orthogonal Matching Pursuit CV

Orthogonal Matching Pursuit CV (OMPCV) identifies the best features for a cross-validated estimation process using a sparse approximation algorithm. It employs an orthogonal projection basis to find the optimal matching projections of multidimensional data. We implemented OMPCV with a 5-fold cross-validation parameter. The maximum number of iterations was limited to either 10% of the total number of features or five, whichever was greater. Table 5 presents the performance of the OMPCV model for each combination of features and prediction horizons. While short-term predictions perform exceptionally well, their accuracy significantly decreases as the forecast horizon lengthens. Furthermore, weather data do not significantly affect the prediction performance of OMPCV.

5.3. Lasso Lars CV

Lasso Lars CV (LLCV) is a regression analysis method that simultaneously performs variable selection and regularization. This enhances the prediction accuracy and interpretability of the cross-validated estimation process on multidimensional data. We implemented LLCV with 5-fold cross-validation, similar to OMPCV. The maximum number of iterations was set to 500, and the maximum number of points for computing residuals in the cross-validation was 1000. The machine-precision regularization was $2.22044 \times 10^{-16}$. Table 6 presents the performance of the LLCV model for each combination of features and prediction horizons. Like OMPCV, LLCV delivers outstanding short-term predictions. However, its accuracy significantly diminishes as the forecast horizon extends. Additionally, the positive impact of weather data, consumption data, and $ET_0$ becomes more evident as the term length increases.

5.4. Random Forest

The Random Forest (RF) algorithm combines ensemble learning methods with the decision tree algorithm to create multiple decision trees, each drawn randomly from the data. The results of these trees are averaged to produce a final result, often leading to more accurate predictions and classifications. RF was implemented with mean squared error as the criterion to minimize variance, utilizing a forest of 100 trees. The minimum number of samples required to split an internal node was set to two, and at least one sample was required to form a leaf node. When searching for the best split, only one feature was considered. Table 7 presents the performance of the RF model for each combination of features and prediction horizons. RF delivers outstanding short-term predictions and maintains relatively good performance as the forecast horizon extends. Also, weather data do not significantly affect the prediction performance of RF.

5.5. Extra Trees

The Extra Trees (ET) algorithm, similar to the Random Forests algorithm, generates multiple decision trees. However, unlike Random Forests, Extra Trees uses random sampling without replacement, resulting in a unique dataset for each tree. Additionally, a specific number of features from the total set are randomly selected for each tree. The most distinctive characteristic of Extra Trees is its random selection of splitting values for features. Instead of computing a locally optimal value using criteria like Gini impurity or entropy, the algorithm randomly selects a split value. This approach enhances the diversity and reduces the correlation among the trees. ET was implemented with the same parameters as RF. Table 8 presents the performance of the ET model for each combination of features and prediction horizons. ET is the most accurate method for predicting occupancy levels across all horizons. The impact of weather data on ET’s prediction accuracy is noticeable only for daily occupancy predictions.

6. Results and Discussion

The importance of water for both the environment and its economic value makes it a resource that must be managed carefully. In this context, we evaluated how to implement the most effective IWRM in dams, which are among the most crucial water resources in the world. Our review of studies presented in the scientific literature revealed that efficient water management requires both measuring water levels and predicting these levels for more advanced planning. Accordingly, we investigated the most suitable prediction model for achieving the most effective IWRM. In this context, we explored the advantages of a hybrid model that integrates a physical model-driven approach with a data-driven approach. We examined the correlations among weather data, water consumption, evapotranspiration, historical reservoir data related to inflow and outflow, and dam occupancy levels for seven dams in Istanbul, as shown in Table 2 and Table 3. The correlation analysis indicates that historical reservoir data and dam occupancy levels have a very strong correlation, suggesting that occupancy levels should be treated as a time series. Water consumption and rainfall do not exhibit a strong correlation with dam occupancy levels. Nevertheless, because all dams supply drinking water and rainfall contributes to their water supply, daily water consumption and rainfall are included as features in the proposed models. We concluded that the prediction model should be developed using parameters that account for both the water level in the dam as well as the inflow and outflow of water [1]. Additionally, solar radiation, humidity, cloud cover, daylight duration, and evapotranspiration have a meaningful correlation with dam occupancy levels.
Based on this analysis, we propose that the most effective model for predicting dam occupancy levels should incorporate weather data, water consumption data, evapotranspiration calculations, and historical reservoir data. To validate this approach, we employed RF and LSTM, which have demonstrated successful results in the scientific literature, as well as ET, OMPCV, and LLCV to make predictions for daily, weekly, bi-weekly, and monthly intervals. To assess the impact of weather data, water consumption data, and evapotranspiration on prediction performance, we compared the prediction performance of identical models created with a dataset consisting of weather data and historical reservoir data against models created using only historical reservoir data.
As shown in Figure 5a, $ET_0$, weather data, and consumption data have a positive effect on prediction accuracy as the prediction horizon increases. Therefore, the physical model parameters for inflow and outflow have a significant positive impact on average MAPE values. As illustrated in Figure 5b, LSTM, RF, and ET models demonstrate strong performance with low MAPE for long-term predictions as well. The average MAPE of LSTM models ranges from 1% to 3.5% for the monthly basis prediction of each dam’s occupancy. This accuracy is better than levels reported for LSTM in similar scientific studies. Notably, ET consistently achieved a MAPE ranging from 0.3% to 1.4% across all intervals, demonstrating a remarkable performance. Consequently, our research contributes to the scientific literature by proposing AI algorithms such as ET, OMPCV, and LLCV for predicting dam reservoir levels. We demonstrate that ET provides more precise predictions of dam occupancy levels compared with RF and LSTM, which are commonly used in the scientific literature.
To reinforce our proposed model with more robust evidence, we conducted statistical tests on the MAPE values. This analysis aimed to identify the conditions that result in the smallest prediction errors. First, we performed the Kolmogorov–Smirnov (KS) test on the MAPE values for two different input sets. The first input set, which includes $ET_0$, weather data, consumption data, and historical reservoir data, fits the normal distribution with a p-value of $4.74 \times 10^{-33}$. The second input set, which includes only historical reservoir data, also fits the normal distribution with a p-value of $4.73 \times 10^{-33}$. Based on these results, we then performed a two-sided Z-Test to determine whether the samples were identical. The Z-Test resulted in a p-value of 0.0155. This indicates that the two samples were not identical at the 1% significance level, leading us to accept the null hypothesis. To compare the means of the samples, we performed both the Z-Test and the paired T-Test with a one-sided (less-than) alternative hypothesis. The Z-Test produced a p-value of 0.0078, while the T-Test resulted in a p-value of $2.59 \times 10^{-6}$ with 139 degrees of freedom (DF). Consequently, the null hypothesis was rejected in both tests. Therefore, it can be concluded that $ET_0$, weather data, and consumption data reduce the prediction error rate. After validating our hypothesis with statistical tests, we sought to identify the best model using the dataset “$ET_0$ + WD + CD + HRD”. Firstly, we performed an ANOVA test on the performance data of all models. The test resulted in a p-value of $7.579 \times 10^{-11}$, which led us to accept the alternative hypothesis. Since the alternative hypothesis of the ANOVA test posits that the samples are not identical, we compared their performances using a paired T-Test with a one-sided (less-than) alternative hypothesis. First, we compared LLCV and OMPCV. The performance of LLCV was superior to OMPCV, as indicated by a p-value of 0.0008 with 27 degrees of freedom.
Next, we compared LSTM and LLCV. The performance of LSTM was superior to that of LLCV, with a p-value of 5.535 × 10−5 and 27 degrees of freedom. Third, we compared RF and LSTM. The performance of RF was superior to that of LSTM, with a p-value of 6.54 × 10−5 and 27 degrees of freedom. Finally, we compared ET and RF. The performance of ET was superior to that of RF, with a p-value of 2.085 × 10−9 and 27 degrees of freedom. As a result of this comparison, the best AI method for predicting dam occupancy levels is ET. The results of all statistical tests leading to this conclusion are summarized in Table 9.
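The Z-Test p-values in Table 9 can be checked from the reported statistic alone. The sketch below, which uses only the Python standard library, converts the statistic −2.4197 into two-sided and less-than p-values and also shows how a paired T statistic and its degrees of freedom are formed. The helper names and the sample MAPE lists are illustrative assumptions. The p-value of a paired T-Test additionally needs the Student's t distribution, which the standard library does not provide (SciPy's `ttest_rel` with `alternative="less"` is one option).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def z_p_values(z):
    """Return (two-sided, less-than) p-values for a standard-normal statistic."""
    cdf = NormalDist().cdf(z)
    return 2 * min(cdf, 1 - cdf), cdf


def paired_t_statistic(first, second):
    """Return (t, degrees of freedom) for paired samples of equal length."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


# Statistic reported in Table 9 for the comparison of the two input sets.
two_sided, less = z_p_values(-2.4197)
print(f"two-sided p = {two_sided:.4f}, less-than p = {less:.4f}")

# Hypothetical paired MAPE values, for illustration only.
t_stat, dof = paired_t_statistic([1, 2, 3, 4], [2, 4, 5, 8])
print(f"t = {t_stat:.4f}, DF = {dof}")
```

A negative statistic with a small less-than p-value indicates that the first sample (here, the MAPE with the additional inputs) tends to be lower than the second.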
ET, the best model in terms of average MAPE, performs exceptionally well over the medium and long term (from weekly to monthly) using only historical reservoir data. On a daily basis, however, it achieves a lower MAPE when the physical model parameters are included. The highest MAPE for ET is observed in the monthly predictions. Figure 6a illustrates ET’s predictions for a one-month period at the Ömerli, Darlık, and Elmalı Dams, while Figure 6b shows the predictions for the Terkos, Alibey, Büyükçekmece, and Sazlıdere Dams. Both Figure 6 and the MSE values in Table 8 indicate that ET surpasses the other methods in capturing and adapting to the trend. The average MSE of the best ET model for monthly predictions, which uses only historical reservoir data, is 1.2 × 10⁻⁴ (Table 8), compared with 2.35 × 10⁻⁴ for the LSTM model (Table 4) and 3.55 × 10⁻⁴ for the RF model (Table 7).
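The three monthly averages quoted above can be reproduced by averaging the per-dam MSE values of the HRD-only monthly models in Tables 8, 4, and 7. The short sketch below does this with the standard library; the dictionary and variable names are illustrative, while the numbers are copied from those tables.

```python
from statistics import mean

# Monthly MSE for the HRD-only input set, copied from Tables 8, 4 and 7
# (dam order: Ömerli, Darlık, Elmalı, Terkos, Alibey, B.Çekmece, Sazlıdere).
monthly_mse_hrd = {
    "ET": [9.80702e-5, 0.000251898, 0.000158287, 3.62939e-5,
           8.80544e-5, 2.27595e-5, 0.00018706],
    "LSTM": [0.000504953, 0.000110456, 0.000483806, 7.7495e-5,
             9.65338e-5, 0.000113374, 0.000254904],
    "RF": [0.000413032, 0.000599909, 0.000678663, 0.00011409,
           0.000268836, 6.0931e-5, 0.000347147],
}

# Average across the seven dams for each model.
average_mse = {model: mean(values) for model, values in monthly_mse_hrd.items()}
for model, value in average_mse.items():
    print(f"{model}: {value:.3e}")
```

The averages come out at roughly 1.20 × 10⁻⁴ for ET, 2.35 × 10⁻⁴ for LSTM, and 3.55 × 10⁻⁴ for RF, which matches the figures in the text.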

7. Conclusions

IWRM is key to managing water resources that serve multiple purposes, including agriculture, energy generation, fish farming, and drinking water [4]. One of the significant contributions to the IWRM [17] field is the prediction of water levels. Predicting water levels enables intervention strategies that make IWRM more efficient and effective, but this requires precise predictions. In this context, we propose that combining physical model-based calculations with measured data on water inflow and outflow can significantly improve the accuracy of AI-based water level predictions. Physical model parameters, including water consumption data, ET₀, solar radiation, dew point, daylight duration, and rainfall, reduce the average MAPE of the methods considered. However, for ET, the most effective method, these parameters improve predictions only on a daily basis. To demonstrate this effect, we used two datasets derived from data collected from seven dams in Istanbul. The first dataset includes ET₀, WD, CD, and HRD, while the second contains only HRD. We applied the LSTM, RF, LLCV, OMPCV, and ET algorithms to these datasets to validate our hypothesis and to evaluate alternatives to the AI algorithms commonly used in IWRM studies. Finally, we developed the prediction models on occupancy levels to standardize data from dams with varying depths.
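As an illustration of what an HRD-only input set can look like, the sketch below turns one dam's occupancy series into rows of lagged features. The lags of one day and one week follow the groupings in Table 3; the function name `make_lag_rows`, the sample series, and the exact feature layout are assumptions for illustration and are not taken from the study's implementation.

```python
def make_lag_rows(series, lags=(1, 7)):
    """Pair each value with its lagged predecessors as (features, target) rows."""
    start = max(lags)  # the first index that has every requested lag available
    rows = []
    for t in range(start, len(series)):
        features = [series[t - lag] for lag in lags]
        rows.append((features, series[t]))
    return rows


# Hypothetical daily occupancy levels (fractions of capacity), for illustration only.
occupancy = [0.50, 0.51, 0.52, 0.53, 0.54, 0.55, 0.56, 0.57, 0.58]
rows = make_lag_rows(occupancy)
for features, target in rows:
    print(features, "->", target)
```

Each row can then be passed to any regressor; columns for ET₀, weather data, and consumption data would be appended to the feature list to form the larger input set.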
Following the model development phase, we validated our proposed model. We contribute to the scientific literature with our hybrid model and by examining significant input parameters for water level prediction, such as solar radiation, dew point, daylight duration, rainfall, daily water consumption, evapotranspiration, and historical reservoir data. Moreover, we compared AI algorithms that are not frequently used in the IWRM field with those that are. We found that ET outperformed commonly used algorithms such as LSTM and RF. ET also required less hardware and energy to operate and predicted the occupancy level one month in advance with an error of only about 1%. Our primary contribution to sustainability is predicting dam occupancy levels at this sensitivity with a more efficient approach than deep learning (LSTM), which provides the following benefits:
  • Ensures water is allocated optimally among various users (agriculture, industry, domestic) based on current water availability.
  • Enables proactive management of water resources during periods of drought by accurately predicting available water supplies.
  • Helps in managing dam releases to mitigate downstream flooding risks by maintaining appropriate water levels.
  • Maintains minimum ecological flows downstream to support aquatic habitats and biodiversity.
  • Increases the sustainability of businesses by requiring less hardware and less energy.
In future studies, we plan to investigate how integrating weather forecasts and hyperparameter optimization can further enhance the performance of AI models. Additionally, we aim to explore novel approaches for incorporating real-time data streams and improving the robustness of predictive models in dynamic environments.

Author Contributions

Conceptualization, A.C.B. and R.Y.; methodology, A.C.B., R.Y., M.R.C. and E.C.; data curation, M.R.C.; software, M.R.C.; validation, E.C.; visualization, E.C.; writing—original draft preparation, M.R.C. and E.C.; writing—review and editing, A.C.B. and R.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data used in this study were collected from https://data.ibb.gov.tr/ and https://www.visualcrossing.com/weather/weather-data-services/Istanbul,Turkey/ (accessed on 20 August 2024). Evapotranspiration was calculated by the authors based on these data. The corresponding author can provide the evapotranspiration data upon request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Allen, R.G.; Pereira, L.S.; Raes, D.; Smith, M. Crop Evapotranspiration—Guidelines for Computing Crop Water Requirements—FAO Irrigation and Drainage Paper 56; FAO: Rome, Italy, 1998.
  2. Foudi, S.; McCartney, M.; Markandya, A.; Pascual, U. The impact of multipurpose dams on the values of nature’s contributions to people under a water-energy-food nexus framing. Ecol. Econ. 2023, 206, 107758.
  3. Bieber, N.; Ker, J.H.; Wang, X.; Triantafyllidis, C.; van Dam, K.H.; Koppelaar, R.H.; Shah, N. Sustainable planning of the energy-water-food nexus using decision making tools. Energy Policy 2018, 113, 584–607.
  4. Jalilov, S.M.; Keskinen, M.; Varis, O.; Amer, S.; Ward, F.A. Managing the water-energy-food nexus: Gains and losses from new water development in Amu Darya River Basin. J. Hydrol. 2016, 539, 648–661.
  5. Lee, E.H. Proactive dam operation based on inflow prediction by modified long short-term memory for improving resilience. Eng. Appl. Artif. Intell. 2024, 133, 108525.
  6. Li, M.; Ren, Q.; Li, M.; Fang, X.; Xiao, L.; Li, H. A separate modeling approach to noisy displacement prediction of concrete dams via improved deep learning with frequency division. Adv. Eng. Inform. 2024, 60, 102367.
  7. Zin, M.F.M.; Kamal, F.Z.; Ismail, S.I.; Noh, K.S.S.K.M.; Kassim, A.H. Development of dam controller technology water level and alert system using Arduino UNO. Indones. J. Electr. Eng. Comput. Sci. 2023, 31, 1342–1349.
  8. Ziggah, Y.Y.; Issaka, Y.; Laari, P.B. Evaluation of different artificial intelligent methods for predicting dam piezometric water level. Model. Earth Syst. Environ. 2022, 8, 2715–2731.
  9. Tshireletso, T.; Moyo, P.; Kabani, M. Predicting the effects of climate change on water temperatures of Roode Elsberg Dam using nonparametric machine learning models. Infrastructures 2021, 6, 14.
  10. Vishwakarma, D.K.; Ali, R.; Bhat, S.A.; Elbeltagi, A.; Kushwaha, N.L.; Kumar, R.; Rajput, J.; Heddam, S.; Kuriqi, A. Pre- and post-dam river water temperature alteration prediction using advanced machine learning models. Environ. Sci. Pollut. Res. 2022, 29, 83321–83346.
  11. Ouma, Y.O.; Moalafhi, D.B.; Anderson, G.; Nkwae, B.; Odirile, P.; Parida, B.P.; Qi, J. Dam Water Level Prediction Using Vector AutoRegression, Random Forest Regression and MLP-ANN Models Based on Land-Use and Climate Factors. Sustainability 2022, 14, 14934.
  12. Ganesh, R.S.; Sasipriya, S.; Gowtham Balaji, M.; Ashok Karthi, G.; Gokul Dharan, S. An IoT-based Dam Water Level Monitoring and Alerting System. In Proceedings of the International Conference on Applied Artificial Intelligence and Computing, ICAAIC 2022, Salem, India, 9–11 May 2022.
  13. Kavitha, R.; Jayalakshmi, C.; Senthil Kumar, K. Dam Water Level Monitoring and Alerting System using IOT. Int. J. Electron. Commun. Eng. 2018, 5, 19–22.
  14. Ngebe, S.; Malunda, K.B.; du Plessis, A. Utility of geospatial techniques in estimating dam water levels: Insights from the Katrivier Dam. Water SA 2022, 48, 151–160.
  15. Li, W.; Qin, Y.; Sun, Y.; Huang, H.; Ling, F.; Tian, L.; Ding, Y. Estimating the relationship between dam water level and surface water area for the Danjiangkou Reservoir using Landsat remote sensing images. Remote Sens. Lett. 2016, 7, 121–130.
  16. Ibañez, S.C.; Dajac, C.V.G.; Liponhay, M.P.; Legara, E.F.T.; Esteban, J.M.H.; Monterola, C.P. Forecasting reservoir water levels using deep neural networks: A case study of Angat Dam in the Philippines. Water 2022, 14, 34.
  17. Ahmed, E.-S.N.; Amr, E.-S. Daily forecasting of dam water levels using machine learning. Int. J. Civ. Eng. Technol. 2019, 10, 314–323.
  18. Yu, W.; Nakakita, E.; Kim, S.; Yamaguchi, K. Improving the accuracy of flood forecasting with transpositions of ensemble NWP rainfall fields considering orographic effects. J. Hydrol. 2016, 539, 345–357.
  19. Zhang, R.; Chen, Z.Y.; Xu, L.J.; Ou, C.Q. Meteorological drought forecasting based on a statistical model with machine learning techniques in Shaanxi province, China. Sci. Total Environ. 2019, 665, 338–346.
  20. Ryu, Y.M.; Lee, E.H. Application of Neural Networks to Predict Daecheong Dam Water Levels. J. Korean Soc. Hazard Mitig. 2022, 22, 67–78.
  21. Dayal, A.; Bonthu, S.; T, V.N.; Saripalle, P.; Mohan, R. Deep learning for Multi-horizon Water level Forecasting in KRS reservoir, India. Results Eng. 2024, 21, 101828.
  22. Üneş, F.; Demirci, M.; Kişi, Ö. Prediction of Millers Ferry Dam Reservoir Level in USA Using Artificial Neural Network. Period. Polytech. Civ. Eng. 2015, 59, 309–318.
  23. Hipni, A.; El-shafie, A.; Najah, A.; Karim, O.A.; Hussain, A.; Mukhlisin, M. Daily Forecasting of Dam Water Levels: Comparing a Support Vector Machine (SVM) Model With Adaptive Neuro Fuzzy Inference System (ANFIS). Water Resour. Manag. 2013, 27, 3803–3823.
  24. Huang, S.; Xia, J.; Zeng, S.; Wang, Y.; She, D. Effect of Three Gorges Dam on Poyang Lake water level at daily scale based on machine learning. J. Geogr. Sci. 2021, 31, 1598–1614.
  25. Larrea, P.P.; Ríos, X.Z.; Parra, L.C. Application of neural network models and ANFIS for water level forecasting of the Salve Faccha Dam in the Andean zone in Northern Ecuador. Water 2021, 13, 2011.
  26. Ayanlade, A.; Radeny, M.; Morton, J.F.; Muchaba, T. Rainfall variability and drought characteristics in two agro-climatic zones: An assessment of climate change challenges in Africa. Sci. Total Environ. 2018, 630, 728–737.
  27. Fowler, H.J.; Kilsby, C.G. A weather-type approach to analysing water resource drought in the Yorkshire region from 1881 to 1998. J. Hydrol. 2002, 262, 177–192.
Figure 1. (a) Ömerli Dam, (b) Darlık Dam, (c) Elmalı Dam, and (d) Terkos Dam.
Figure 2. (a) Alibey Dam, (b) Büyükçekmece Dam, and (c) Sazlıdere Dam.
Figure 3. Dam occupancy levels for a 5-year period.
Figure 4. Evapotranspiration levels for the 5-year period.
Figure 5. (a) MAPE in dataset basis and (b) MAPE of AI methods.
Figure 6. (a) Prediction of the Ömerli, Darlık, and Elmalı Dams’ occupancies. (b) Prediction of the Terkos, Alibey, Büyükçekmece, and Sazlıdere Dams’ occupancies.
Table 1. Summary of method performance from the scientific literature.
| Method | Horizon | MAPE | Reference |
|---|---|---|---|
| Random Forest | Daily | 0.121 | [11] |
| Vector autoregressive models | Daily | 0.015 | [11] |
| Linear regression | Daily | 0.059477 | [17] |
| Long Short-Term Memory | Daily | 0.01 | [21] |
| Long Short-Term Memory | Monthly | 0.08 | [21] |
| Support vector machines | Daily | 0.0164 | [23] |
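Table 2. Correlation among evapotranspiration, water consumption data (CD), weather data, and dam occupancy levels.
Variables. ET₀: 1. Evapotranspiration. Weather data: 2. Solar Radiation; 3. Sea-Level Pressure; 4. Wind Speed; 5. Temperature; 6. Minimum Felt Temperature; 7. Humidity; 8. Dew Point; 9. Cloud Cover; 10. Rainfall; 11. Snow Depth; 12. Daylight Duration (Hours). CD: 13. Water Consumption. Dam occupancy levels: 14. Ömerli Occupancy; 15. Darlık Occupancy; 16. Elmalı Occupancy; 17. Terkos Occupancy; 18. Alibey Occupancy; 19. B.Çekmece Occupancy; 20. Sazlıdere Occupancy.
| Group | No. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ET₀ | 1 | - | 0.86 | −0.42 | −0.12 | 0.81 | 0.74 | −0.48 | 0.71 | 0.86 | −0.24 | −0.13 | 0.79 | 0.69 | 0.44 | 0.37 | 0.19 | 0.37 | 0.13 | 0.25 | 0.24 |
| Weather Data | 2 | 0.86 | - | −0.21 | −0.11 | 0.6 | 0.52 | −0.52 | 0.47 | 0.99 | −0.34 | −0.11 | 0.7 | 0.54 | 0.48 | 0.43 | 0.27 | 0.38 | 0.2 | 0.31 | 0.27 |
| Weather Data | 3 | −0.42 | −0.21 | - | 0 | −0.51 | −0.49 | 0.11 | −0.5 | −0.21 | −0.15 | 0.13 | −0.4 | −0.29 | −0.19 | −0.16 | −0.04 | −0.14 | −0.01 | −0.08 | −0.08 |
| Weather Data | 4 | −0.12 | −0.11 | 0 | - | 0 | 0.01 | 0.06 | 0.02 | −0.11 | 0.22 | 0.07 | 0.03 | 0.04 | 0.05 | 0.01 | −0.04 | −0.07 | −0.1 | −0.06 | −0.08 |
| Weather Data | 5 | 0.81 | 0.6 | −0.51 | 0 | - | 0.98 | −0.37 | 0.95 | 0.6 | −0.17 | −0.25 | 0.65 | 0.79 | 0.19 | 0.11 | −0.06 | 0.17 | −0.1 | 0.04 | 0.07 |
| Weather Data | 6 | 0.74 | 0.52 | −0.49 | 0.01 | 0.98 | - | −0.29 | 0.96 | 0.52 | −0.16 | −0.27 | 0.61 | 0.75 | 0.14 | 0.07 | −0.09 | 0.14 | −0.13 | 0.02 | 0.05 |
| Weather Data | 7 | −0.48 | −0.52 | 0.11 | 0.06 | −0.37 | −0.29 | - | −0.08 | −0.52 | 0.27 | 0.12 | −0.29 | −0.33 | −0.19 | −0.21 | −0.12 | −0.18 | −0.12 | −0.12 | −0.12 |
| Weather Data | 8 | 0.71 | 0.47 | −0.5 | 0.02 | 0.95 | 0.96 | −0.08 | - | 0.47 | −0.11 | −0.24 | 0.6 | 0.74 | 0.13 | 0.05 | −0.1 | 0.12 | −0.15 | 0.01 | 0.04 |
| Weather Data | 9 | 0.86 | 0.99 | −0.21 | −0.11 | 0.6 | 0.52 | −0.52 | 0.47 | - | −0.34 | −0.11 | 0.7 | 0.54 | 0.48 | 0.43 | 0.27 | 0.38 | 0.2 | 0.31 | 0.27 |
| Weather Data | 10 | −0.24 | −0.34 | −0.15 | 0.22 | −0.17 | −0.16 | 0.27 | −0.11 | −0.34 | - | 0.08 | −0.17 | −0.24 | −0.11 | −0.09 | −0.05 | −0.12 | −0.03 | −0.08 | −0.11 |
| Weather Data | 11 | −0.13 | −0.11 | 0.13 | 0.07 | −0.25 | −0.27 | 0.12 | −0.24 | −0.11 | 0.08 | - | −0.11 | −0.13 | −0.04 | −0.04 | 0 | −0.03 | 0.04 | 0 | −0.06 |
| Weather Data | 12 | 0.79 | 0.7 | −0.4 | 0.03 | 0.65 | 0.61 | −0.29 | 0.6 | 0.7 | −0.17 | −0.11 | - | 0.49 | 0.67 | 0.63 | 0.38 | 0.56 | 0.31 | 0.45 | 0.45 |
| CD | 13 | 0.69 | 0.54 | −0.29 | 0.04 | 0.79 | 0.75 | −0.33 | 0.74 | 0.54 | −0.24 | −0.13 | 0.49 | - | 0.12 | −0.05 | −0.17 | −0.01 | −0.21 | −0.03 | 0.02 |
| Occupancy | 14 | 0.44 | 0.48 | −0.19 | 0.05 | 0.19 | 0.14 | −0.19 | 0.13 | 0.48 | −0.11 | −0.04 | 0.67 | 0.12 | - | 0.79 | 0.71 | 0.67 | 0.44 | 0.51 | 0.41 |
| Occupancy | 15 | 0.37 | 0.43 | −0.16 | 0.01 | 0.11 | 0.07 | −0.21 | 0.05 | 0.43 | −0.09 | −0.04 | 0.63 | −0.05 | 0.79 | - | 0.74 | 0.75 | 0.64 | 0.58 | 0.51 |
| Occupancy | 16 | 0.19 | 0.27 | −0.04 | −0.04 | −0.06 | −0.09 | −0.12 | −0.1 | 0.27 | −0.05 | 0 | 0.38 | −0.17 | 0.71 | 0.74 | - | 0.91 | 0.8 | 0.83 | 0.75 |
| Occupancy | 17 | 0.37 | 0.38 | −0.14 | −0.07 | 0.17 | 0.14 | −0.18 | 0.12 | 0.38 | −0.12 | −0.03 | 0.56 | −0.01 | 0.67 | 0.75 | 0.91 | - | 0.76 | 0.88 | 0.8 |
| Occupancy | 18 | 0.13 | 0.2 | −0.01 | −0.1 | −0.1 | −0.13 | −0.12 | −0.15 | 0.2 | −0.03 | 0.04 | 0.31 | −0.21 | 0.44 | 0.64 | 0.8 | 0.76 | - | 0.84 | 0.76 |
| Occupancy | 19 | 0.25 | 0.31 | −0.08 | −0.06 | 0.04 | 0.02 | −0.12 | 0.01 | 0.31 | −0.08 | 0 | 0.45 | −0.03 | 0.51 | 0.58 | 0.83 | 0.88 | 0.84 | - | 0.89 |
| Occupancy | 20 | 0.24 | 0.27 | −0.08 | −0.08 | 0.07 | 0.05 | −0.12 | 0.04 | 0.27 | −0.11 | −0.06 | 0.45 | 0.02 | 0.41 | 0.51 | 0.75 | 0.8 | 0.76 | 0.89 | - |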
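Table 3. Correlation between historical reservoir data and dam occupancy levels.
Variables. Historical data (a week before): 1. Ömerli Previous Week; 2. Darlık Previous Week; 3. Elmalı Previous Week; 4. Terkos Previous Week; 5. Alibey Previous Week; 6. B.Çekmece Previous Week; 7. Sazlıdere Previous Week. Historical data (one day before): 8. Ömerli Previous Day; 9. Darlık Previous Day; 10. Elmalı Previous Day; 11. Terkos Previous Day; 12. Alibey Previous Day; 13. B.Çekmece Previous Day; 14. Sazlıdere Previous Day. Current day: 15. Ömerli Occupancy; 16. Darlık Occupancy; 17. Elmalı Occupancy; 18. Terkos Occupancy; 19. Alibey Occupancy; 20. B.Çekmece Occupancy; 21. Sazlıdere Occupancy.
| Group | No. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A Week Before | 1 | - | 0.796 | 0.702 | 0.663 | 0.449 | 0.517 | 0.419 | 0.990 | 0.792 | 0.673 | 0.674 | 0.420 | 0.516 | 0.418 | 0.987 | 0.791 | 0.668 | 0.676 | 0.415 | 0.515 | 0.417 |
| A Week Before | 2 | 0.796 | - | 0.728 | 0.748 | 0.635 | 0.590 | 0.520 | 0.783 | 0.991 | 0.695 | 0.759 | 0.605 | 0.591 | 0.520 | 0.780 | 0.988 | 0.690 | 0.760 | 0.599 | 0.590 | 0.519 |
| A Week Before | 3 | 0.702 | 0.728 | - | 0.854 | 0.811 | 0.802 | 0.731 | 0.713 | 0.744 | 0.986 | 0.881 | 0.796 | 0.815 | 0.751 | 0.714 | 0.746 | 0.983 | 0.885 | 0.793 | 0.816 | 0.754 |
| A Week Before | 4 | 0.663 | 0.748 | 0.854 | - | 0.711 | 0.878 | 0.803 | 0.642 | 0.727 | 0.817 | 0.994 | 0.670 | 0.862 | 0.795 | 0.637 | 0.722 | 0.811 | 0.993 | 0.663 | 0.859 | 0.792 |
| A Week Before | 5 | 0.449 | 0.635 | 0.811 | 0.711 | - | 0.811 | 0.741 | 0.463 | 0.651 | 0.812 | 0.738 | 0.991 | 0.830 | 0.769 | 0.465 | 0.653 | 0.811 | 0.742 | 0.988 | 0.833 | 0.772 |
| A Week Before | 6 | 0.517 | 0.590 | 0.802 | 0.878 | 0.811 | - | 0.893 | 0.509 | 0.577 | 0.779 | 0.882 | 0.778 | 0.995 | 0.899 | 0.506 | 0.575 | 0.775 | 0.882 | 0.772 | 0.993 | 0.899 |
| A Week Before | 7 | 0.419 | 0.520 | 0.731 | 0.803 | 0.741 | 0.893 | - | 0.409 | 0.506 | 0.699 | 0.801 | 0.698 | 0.876 | 0.995 | 0.406 | 0.503 | 0.693 | 0.800 | 0.690 | 0.872 | 0.993 |
| One Day Before | 8 | 0.990 | 0.783 | 0.713 | 0.642 | 0.463 | 0.509 | 0.409 | - | 0.796 | 0.702 | 0.662 | 0.448 | 0.516 | 0.418 | 0.999 | 0.796 | 0.698 | 0.665 | 0.444 | 0.517 | 0.418 |
| One Day Before | 9 | 0.792 | 0.991 | 0.744 | 0.727 | 0.651 | 0.577 | 0.506 | 0.796 | - | 0.727 | 0.746 | 0.634 | 0.588 | 0.517 | 0.795 | 0.999 | 0.723 | 0.749 | 0.629 | 0.588 | 0.517 |
| One Day Before | 10 | 0.673 | 0.695 | 0.986 | 0.817 | 0.812 | 0.779 | 0.699 | 0.702 | 0.727 | - | 0.853 | 0.812 | 0.802 | 0.730 | 0.705 | 0.731 | 0.999 | 0.858 | 0.810 | 0.804 | 0.734 |
| One Day Before | 11 | 0.674 | 0.759 | 0.881 | 0.994 | 0.738 | 0.882 | 0.801 | 0.662 | 0.746 | 0.853 | - | 0.706 | 0.876 | 0.801 | 0.659 | 0.743 | 0.847 | 0.999 | 0.700 | 0.874 | 0.800 |
| One Day Before | 12 | 0.420 | 0.605 | 0.796 | 0.670 | 0.991 | 0.778 | 0.698 | 0.448 | 0.634 | 0.812 | 0.706 | - | 0.808 | 0.737 | 0.451 | 0.637 | 0.813 | 0.711 | 0.999 | 0.812 | 0.742 |
| One Day Before | 13 | 0.516 | 0.591 | 0.815 | 0.862 | 0.830 | 0.995 | 0.876 | 0.516 | 0.588 | 0.802 | 0.876 | 0.808 | - | 0.892 | 0.515 | 0.586 | 0.798 | 0.878 | 0.803 | 0.999 | 0.893 |
| One Day Before | 14 | 0.418 | 0.520 | 0.751 | 0.795 | 0.769 | 0.899 | 0.995 | 0.418 | 0.517 | 0.730 | 0.801 | 0.737 | 0.892 | - | 0.416 | 0.515 | 0.725 | 0.801 | 0.731 | 0.889 | 0.999 |
| Current Day | 15 | 0.987 | 0.780 | 0.714 | 0.637 | 0.465 | 0.506 | 0.406 | 0.999 | 0.795 | 0.705 | 0.659 | 0.451 | 0.515 | 0.416 | - | 0.796 | 0.702 | 0.662 | 0.448 | 0.516 | 0.417 |
| Current Day | 16 | 0.791 | 0.988 | 0.746 | 0.722 | 0.653 | 0.575 | 0.503 | 0.796 | 0.999 | 0.731 | 0.743 | 0.637 | 0.586 | 0.515 | 0.796 | - | 0.727 | 0.746 | 0.634 | 0.587 | 0.516 |
| Current Day | 17 | 0.668 | 0.690 | 0.983 | 0.811 | 0.811 | 0.775 | 0.693 | 0.698 | 0.723 | 0.999 | 0.847 | 0.813 | 0.798 | 0.725 | 0.702 | 0.727 | - | 0.853 | 0.812 | 0.802 | 0.730 |
| Current Day | 18 | 0.676 | 0.760 | 0.885 | 0.993 | 0.742 | 0.882 | 0.800 | 0.665 | 0.749 | 0.858 | 0.999 | 0.711 | 0.878 | 0.801 | 0.662 | 0.746 | 0.853 | - | 0.705 | 0.876 | 0.800 |
| Current Day | 19 | 0.415 | 0.599 | 0.793 | 0.663 | 0.988 | 0.772 | 0.690 | 0.444 | 0.629 | 0.810 | 0.700 | 0.999 | 0.803 | 0.731 | 0.448 | 0.634 | 0.812 | 0.705 | - | 0.808 | 0.737 |
| Current Day | 20 | 0.515 | 0.590 | 0.816 | 0.859 | 0.833 | 0.993 | 0.872 | 0.517 | 0.588 | 0.804 | 0.874 | 0.812 | 0.999 | 0.889 | 0.516 | 0.587 | 0.802 | 0.876 | 0.808 | - | 0.891 |
| Current Day | 21 | 0.417 | 0.519 | 0.754 | 0.792 | 0.772 | 0.899 | 0.993 | 0.418 | 0.517 | 0.734 | 0.800 | 0.742 | 0.893 | 0.999 | 0.417 | 0.516 | 0.730 | 0.800 | 0.737 | 0.891 | - |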
Table 4. Performance of the Long Short-Term Memory model.
Columns marked (a) use the input set ET₀ + WD + CD + HRD; columns marked (b) use HRD only.
| Prediction | Dam | MAPE (a) | R² (a) | MSE (a) | RMSE (a) | MAPE (b) | R² (b) | MSE (b) | RMSE (b) |
|---|---|---|---|---|---|---|---|---|---|
| Daily | Ömerli | 0.00496996 | 0.999754 | 1.80904 × 10⁻⁵ | 0.00425328 | 0.00628446 | 0.999513 | 3.41411 × 10⁻⁵ | 0.00584303 |
| Daily | Darlık | 0.00337642 | 0.999518 | 2.45226 × 10⁻⁵ | 0.00495203 | 0.00587575 | 0.999081 | 5.18837 × 10⁻⁵ | 0.00720303 |
| Daily | Elmalı | 0.0124002 | 0.997419 | 0.000192689 | 0.0138812 | 0.00957619 | 0.993972 | 0.000402123 | 0.020053 |
| Daily | Terkos | 0.00601339 | 0.999565 | 2.17048 × 10⁻⁵ | 0.00465884 | 0.00655399 | 0.999733 | 1.46284 × 10⁻⁵ | 0.00382471 |
| Daily | Alibey | 0.010467 | 0.997787 | 9.50388 × 10⁻⁵ | 0.00974879 | 0.0098421 | 0.998924 | 4.93963 × 10⁻⁵ | 0.00702825 |
| Daily | B.Çekmece | 0.0099909 | 0.999712 | 3.09754 × 10⁻⁵ | 0.00556555 | 0.00967341 | 0.99953 | 3.67732 × 10⁻⁵ | 0.00606409 |
| Daily | Sazlıdere | 0.00846746 | 0.999419 | 1.97302 × 10⁻⁵ | 0.00444186 | 0.00656699 | 0.999691 | 9.44456 × 10⁻⁶ | 0.0030732 |
| Weekly | Ömerli | 0.014751 | 0.996468 | 0.000172075 | 0.0131177 | 0.0155757 | 0.997482 | 0.000155052 | 0.012452 |
| Weekly | Darlık | 0.0152882 | 0.995803 | 0.000207155 | 0.0143929 | 0.0101446 | 0.998517 | 8.08303 × 10⁻⁵ | 0.00899057 |
| Weekly | Elmalı | 0.0280571 | 0.990919 | 0.000700975 | 0.0264759 | 0.0245444 | 0.990563 | 0.000644626 | 0.0253895 |
| Weekly | Terkos | 0.0190461 | 0.996293 | 0.000197625 | 0.0140579 | 0.0183852 | 0.996289 | 0.000189051 | 0.0137496 |
| Weekly | Alibey | 0.0301914 | 0.995427 | 0.000202811 | 0.0142412 | 0.0290158 | 0.996204 | 0.00017287 | 0.013148 |
| Weekly | B.Çekmece | 0.0235451 | 0.997413 | 0.000150072 | 0.0122504 | 0.028585 | 0.997076 | 0.000203382 | 0.0142612 |
| Weekly | Sazlıdere | 0.0216837 | 0.996221 | 0.000112867 | 0.0106239 | 0.0374822 | 0.996799 | 0.000116965 | 0.010815 |
| 15-Day | Ömerli | 0.0171061 | 0.993161 | 0.000322059 | 0.017946 | 0.0256526 | 0.993155 | 0.000484254 | 0.0220058 |
| 15-Day | Darlık | 0.0188453 | 0.991981 | 0.000443239 | 0.0210532 | 0.0130614 | 0.998414 | 9.16712 × 10⁻⁵ | 0.00957451 |
| 15-Day | Elmalı | 0.0296606 | 0.983374 | 0.00105234 | 0.0324398 | 0.0302839 | 0.991372 | 0.000620677 | 0.0249134 |
| 15-Day | Terkos | 0.019051 | 0.996649 | 0.000168989 | 0.0129996 | 0.0249967 | 0.997302 | 0.000177895 | 0.0133377 |
| 15-Day | Alibey | 0.0348007 | 0.937953 | 0.00277279 | 0.0526573 | 0.0269181 | 0.993971 | 0.00025642 | 0.0160131 |
| 15-Day | B.Çekmece | 0.0397847 | 0.99153 | 0.000542264 | 0.0232866 | 0.0333638 | 0.994774 | 0.000308434 | 0.0175623 |
| 15-Day | Sazlıdere | 0.0261945 | 0.995932 | 0.000122505 | 0.0110682 | 0.0337267 | 0.993445 | 0.000195627 | 0.0139867 |
| Monthly | Ömerli | 0.012808 | 0.990179 | 0.000453095 | 0.021286 | 0.0224381 | 0.989099 | 0.000504953 | 0.0224712 |
| Monthly | Darlık | 0.0180395 | 0.99514 | 0.000242745 | 0.0155803 | 0.0118325 | 0.998626 | 0.000110456 | 0.0105098 |
| Monthly | Elmalı | 0.0361916 | 0.984145 | 0.00101564 | 0.0318691 | 0.0196329 | 0.992456 | 0.000483806 | 0.0219956 |
| Monthly | Terkos | 0.0188625 | 0.997037 | 0.000145501 | 0.0120624 | 0.0142386 | 0.998762 | 7.7495 × 10⁻⁵ | 0.00880313 |
| Monthly | Alibey | 0.0207037 | 0.995571 | 0.000185797 | 0.0136307 | 0.017139 | 0.997741 | 9.65338 × 10⁻⁵ | 0.00982516 |
| Monthly | B.Çekmece | 0.0267443 | 0.997734 | 0.000190004 | 0.0137842 | 0.0302374 | 0.998214 | 0.000113374 | 0.0106477 |
| Monthly | Sazlıdere | 0.0265639 | 0.994443 | 0.000177235 | 0.013313 | 0.0245632 | 0.991577 | 0.000254904 | 0.0159657 |
Table 5. Performance of the Orthogonal Matching Pursuit CV model.
Columns marked (a) use the input set ET₀ + WD + CD + HRD; columns marked (b) use HRD only.
| Prediction | Dam | MAPE (a) | R² (a) | MSE (a) | RMSE (a) | MAPE (b) | R² (b) | MSE (b) | RMSE (b) |
|---|---|---|---|---|---|---|---|---|---|
| Daily | Ömerli | 0.00312237 | 0.999746 | 1.16941 × 10⁻⁵ | 0.00341967 | 0.00305702 | 0.999428 | 2.62668 × 10⁻⁵ | 0.00512511 |
| Daily | Darlık | 0.00290634 | 0.999763 | 1.15753 × 10⁻⁵ | 0.00340225 | 0.00292748 | 0.999498 | 2.45049 × 10⁻⁵ | 0.00495024 |
| Daily | Elmalı | 0.00783492 | 0.997509 | 0.000157006 | 0.0125302 | 0.00773108 | 0.995666 | 0.000273552 | 0.0165394 |
| Daily | Terkos | 0.00585859 | 0.999649 | 1.71972 × 10⁻⁵ | 0.00414695 | 0.0049944 | 0.999741 | 1.27535 × 10⁻⁵ | 0.00357121 |
| Daily | Alibey | 0.00896704 | 0.999456 | 2.29451 × 10⁻⁵ | 0.00479011 | 0.00705791 | 0.999152 | 3.57359 × 10⁻⁵ | 0.00597795 |
| Daily | B.Çekmece | 0.00708775 | 0.999662 | 1.90968 × 10⁻⁵ | 0.00436999 | 0.00692981 | 0.999618 | 2.15249 × 10⁻⁵ | 0.00463949 |
| Daily | Sazlıdere | 0.00663554 | 0.999785 | 6.35257 × 10⁻⁶ | 0.00252043 | 0.00611186 | 0.999673 | 9.82439 × 10⁻⁶ | 0.00313439 |
| Weekly | Ömerli | 0.0264899 | 0.987316 | 0.000589272 | 0.0242749 | 0.0249001 | 0.989311 | 0.000499273 | 0.0223444 |
| Weekly | Darlık | 0.0234549 | 0.990203 | 0.000478221 | 0.0218683 | 0.0233937 | 0.99007 | 0.00048584 | 0.0220418 |
| Weekly | Elmalı | 0.0361511 | 0.979043 | 0.00132487 | 0.0363988 | 0.0438259 | 0.978049 | 0.00140182 | 0.0374409 |
| Weekly | Terkos | 0.0284242 | 0.991094 | 0.000436925 | 0.0209027 | 0.0304353 | 0.989684 | 0.000505364 | 0.0224803 |
| Weekly | Alibey | 0.0454819 | 0.987379 | 0.000530643 | 0.0230357 | 0.0468775 | 0.985498 | 0.000608809 | 0.0246741 |
| Weekly | B.Çekmece | 0.029464 | 0.993387 | 0.00037363 | 0.0193295 | 0.0357698 | 0.992313 | 0.00043693 | 0.0209029 |
| Weekly | Sazlıdere | 0.0351539 | 0.993555 | 0.000191085 | 0.0138233 | 0.0353401 | 0.993112 | 0.000204982 | 0.0143172 |
| 15-Day | Ömerli | 0.0597648 | 0.956384 | 0.00203142 | 0.0450713 | 0.0562393 | 0.959656 | 0.00188994 | 0.0434734 |
| 15-Day | Darlık | 0.0475681 | 0.970973 | 0.00143549 | 0.0378879 | 0.0488463 | 0.967032 | 0.00161382 | 0.0401724 |
| 15-Day | Elmalı | 0.0749328 | 0.930958 | 0.00436855 | 0.066095 | 0.0816497 | 0.920174 | 0.00506405 | 0.0711621 |
| 15-Day | Terkos | 0.0565462 | 0.976715 | 0.0011414 | 0.0337846 | 0.0627889 | 0.974832 | 0.00123485 | 0.0351404 |
| 15-Day | Alibey | 0.0958814 | 0.953379 | 0.00196719 | 0.044353 | 0.101744 | 0.951644 | 0.00206112 | 0.0453995 |
| 15-Day | B.Çekmece | 0.0649254 | 0.984549 | 0.000877351 | 0.0296201 | 0.0746038 | 0.982518 | 0.00100417 | 0.0316886 |
| 15-Day | Sazlıdere | 0.0863225 | 0.977034 | 0.000680156 | 0.0260798 | 0.0823396 | 0.978331 | 0.000645584 | 0.0254084 |
| Monthly | Ömerli | 0.115295 | 0.840763 | 0.00730451 | 0.0854664 | 0.117424 | 0.832343 | 0.00769196 | 0.0877038 |
| Monthly | Darlık | 0.096027 | 0.882832 | 0.00572044 | 0.0756336 | 0.101429 | 0.868235 | 0.00644469 | 0.0802788 |
| Monthly | Elmalı | 0.139233 | 0.78605 | 0.0135715 | 0.116497 | 0.152306 | 0.762178 | 0.0151442 | 0.123062 |
| Monthly | Terkos | 0.100622 | 0.927497 | 0.00355449 | 0.0596195 | 0.108371 | 0.921444 | 0.00386874 | 0.0621992 |
| Monthly | Alibey | 0.170993 | 0.834706 | 0.00696293 | 0.0834442 | 0.170672 | 0.81118 | 0.00804831 | 0.0897124 |
| Monthly | B.Çekmece | 0.152902 | 0.923799 | 0.00433834 | 0.0658661 | 0.140405 | 0.925902 | 0.0041832 | 0.0646777 |
| Monthly | Sazlıdere | 0.15938 | 0.903594 | 0.00285824 | 0.0534625 | 0.162844 | 0.893396 | 0.00317065 | 0.0563085 |
Table 6. Performance of the Lasso Lars CV model.
Columns marked (a) use the input set ET₀ + WD + CD + HRD; columns marked (b) use HRD only.
| Prediction | Dam | MAPE (a) | R² (a) | MSE (a) | RMSE (a) | MAPE (b) | R² (b) | MSE (b) | RMSE (b) |
|---|---|---|---|---|---|---|---|---|---|
| Daily | Ömerli | 0.00290696 | 0.999728 | 1.25172 × 10⁻⁵ | 0.00353797 | 0.00305458 | 0.999455 | 2.50039 × 10⁻⁵ | 0.00500039 |
| Daily | Darlık | 0.00293826 | 0.999738 | 1.27733 × 10⁻⁵ | 0.00357398 | 0.00293714 | 0.999546 | 2.22023 × 10⁻⁵ | 0.00471193 |
| Daily | Elmalı | 0.00770551 | 0.997462 | 0.00015997 | 0.0126479 | 0.00778144 | 0.995675 | 0.000272933 | 0.0165207 |
| Daily | Terkos | 0.00398731 | 0.999833 | 8.20228 × 10⁻⁶ | 0.00286396 | 0.00496261 | 0.999749 | 1.23182 × 10⁻⁵ | 0.00350973 |
| Daily | Alibey | 0.00682897 | 0.999635 | 1.54796 × 10⁻⁵ | 0.00393441 | 0.00708108 | 0.999156 | 3.55807 × 10⁻⁵ | 0.00596496 |
| Daily | B.Çekmece | 0.00596853 | 0.999819 | 1.02405 × 10⁻⁵ | 0.00320008 | 0.00690164 | 0.999618 | 2.15236 × 10⁻⁵ | 0.00463935 |
| Daily | Sazlıdere | 0.00519053 | 0.999801 | 5.93328 × 10⁻⁶ | 0.00243583 | 0.00605372 | 0.999724 | 8.29238 × 10⁻⁶ | 0.00287965 |
| Weekly | Ömerli | 0.0232363 | 0.989629 | 0.000479234 | 0.0218914 | 0.0250712 | 0.989269 | 0.000501665 | 0.0223979 |
| Weekly | Darlık | 0.0216811 | 0.992009 | 0.000389158 | 0.0197271 | 0.0233964 | 0.99009 | 0.000484886 | 0.0220201 |
| Weekly | Elmalı | 0.0360924 | 0.979585 | 0.00129005 | 0.0359173 | 0.0442911 | 0.977976 | 0.00140899 | 0.0375365 |
| Weekly | Terkos | 0.0280822 | 0.991154 | 0.00043457 | 0.0208463 | 0.030388 | 0.989652 | 0.000506874 | 0.0225139 |
| Weekly | Alibey | 0.0421883 | 0.988026 | 0.000504023 | 0.0224505 | 0.0468775 | 0.985499 | 0.000608808 | 0.024674 |
| Weekly | B.Çekmece | 0.0303865 | 0.993051 | 0.000393111 | 0.019827 | 0.03585 | 0.992356 | 0.000434515 | 0.020845 |
| Weekly | Sazlıdere | 0.0315555 | 0.993665 | 0.000187303 | 0.0136859 | 0.0353786 | 0.993109 | 0.000205063 | 0.01432 |
| 15-Day | Ömerli | 0.0533375 | 0.964082 | 0.00169918 | 0.0412211 | 0.0562392 | 0.959656 | 0.00188994 | 0.0434734 |
| 15-Day | Darlık | 0.0457542 | 0.972184 | 0.00137807 | 0.0371224 | 0.0494345 | 0.966858 | 0.00162141 | 0.0402667 |
| 15-Day | Elmalı | 0.0743163 | 0.933334 | 0.00423471 | 0.0650746 | 0.0815954 | 0.919795 | 0.00508274 | 0.0712933 |
| 15-Day | Terkos | 0.0540278 | 0.978052 | 0.00107749 | 0.0328252 | 0.0630082 | 0.974824 | 0.00123525 | 0.0351461 |
| 15-Day | Alibey | 0.0920288 | 0.959514 | 0.0017189 | 0.0414596 | 0.101746 | 0.951492 | 0.00206217 | 0.0454111 |
| 15-Day | B.Çekmece | 0.0617624 | 0.985375 | 0.000839167 | 0.0289684 | 0.0746552 | 0.982521 | 0.00100362 | 0.0316799 |
| 15-Day | Sazlıdere | 0.0781948 | 0.980241 | 0.000591595 | 0.0243227 | 0.082677 | 0.978315 | 0.000645751 | 0.0254116 |
| Monthly | Ömerli | 0.104782 | 0.86507 | 0.00619161 | 0.0786868 | 0.117424 | 0.832343 | 0.00769196 | 0.0877038 |
| Monthly | Darlık | 0.0906292 | 0.896114 | 0.00507109 | 0.0712116 | 0.101429 | 0.868235 | 0.00644469 | 0.0802788 |
| Monthly | Elmalı | 0.132347 | 0.791852 | 0.0132286 | 0.115016 | 0.153111 | 0.761777 | 0.0151531 | 0.123098 |
| Monthly | Terkos | 0.0940711 | 0.935074 | 0.0031903 | 0.0564828 | 0.108759 | 0.921323 | 0.0038717 | 0.062223 |
| Monthly | Alibey | 0.15876 | 0.842755 | 0.00664744 | 0.0815318 | 0.170872 | 0.811342 | 0.00803104 | 0.0896161 |
| Monthly | B.Çekmece | 0.115172 | 0.940945 | 0.00332585 | 0.0576702 | 0.140404 | 0.925902 | 0.0041832 | 0.0646777 |
| Monthly | Sazlıdere | 0.148663 | 0.911828 | 0.00260789 | 0.0510675 | 0.162844 | 0.893396 | 0.00317065 | 0.0563085 |
Table 7. Performance of the Random Forest model.
Columns marked (a) use the input set ET₀ + WD + CD + HRD; columns marked (b) use HRD only.
| Prediction | Dam | MAPE (a) | R² (a) | MSE (a) | RMSE (a) | MAPE (b) | R² (b) | MSE (b) | RMSE (b) |
|---|---|---|---|---|---|---|---|---|---|
| Daily | Ömerli | 0.00478378 | 0.999315 | 3.14498 × 10⁻⁵ | 0.00560801 | 0.00463194 | 0.999336 | 3.05549 × 10⁻⁵ | 0.00552765 |
| Daily | Darlık | 0.00451081 | 0.999451 | 2.67768 × 10⁻⁵ | 0.00517463 | 0.00483447 | 0.999142 | 4.17943 × 10⁻⁵ | 0.00646485 |
| Daily | Elmalı | 0.00784472 | 0.995614 | 0.00027652 | 0.0166289 | 0.00866333 | 0.995561 | 0.000279763 | 0.0167261 |
| Daily | Terkos | 0.00702186 | 0.999559 | 2.1677 × 10⁻⁵ | 0.00465586 | 0.00765764 | 0.999522 | 2.3537 × 10⁻⁵ | 0.0048515 |
| Daily | Alibey | 0.00972596 | 0.998967 | 4.3398 × 10⁻⁵ | 0.00658772 | 0.0102261 | 0.998795 | 5.05776 × 10⁻⁵ | 0.00711179 |
| Daily | B.Çekmece | 0.00893934 | 0.999442 | 3.1638 × 10⁻⁵ | 0.00562477 | 0.00946453 | 0.999326 | 3.81665 × 10⁻⁵ | 0.00617791 |
| Daily | Sazlıdere | 0.00773455 | 0.999674 | 9.58492 × 10⁻⁶ | 0.00309595 | 0.00895021 | 0.999584 | 1.22647 × 10⁻⁵ | 0.0035021 |
| Weekly | Ömerli | 0.0116603 | 0.997192 | 0.000129324 | 0.0113721 | 0.012716 | 0.996169 | 0.000176682 | 0.0132922 |
| Weekly | Darlık | 0.00935386 | 0.99813 | 9.12824 × 10⁻⁵ | 0.00955418 | 0.0107126 | 0.997725 | 0.000112508 | 0.010607 |
| Weekly | Elmalı | 0.016616 | 0.994286 | 0.000362236 | 0.0190325 | 0.0184659 | 0.993695 | 0.00039896 | 0.019974 |
| Weekly | Terkos | 0.0136593 | 0.997804 | 0.000107734 | 0.0103795 | 0.0117802 | 0.998326 | 8.20781 × 10⁻⁵ | 0.0090597 |
| Weekly | Alibey | 0.0209691 | 0.99649 | 0.000147373 | 0.0121397 | 0.0219477 | 0.996266 | 0.000157022 | 0.0125308 |
| Weekly | B.Çekmece | 0.0180959 | 0.996682 | 0.000187673 | 0.0136994 | 0.0165157 | 0.997882 | 0.000119767 | 0.0109438 |
| Weekly | Sazlıdere | 0.0140014 | 0.99839 | 4.75184 × 10⁻⁵ | 0.00689336 | 0.0143782 | 0.998286 | 5.06218 × 10⁻⁵ | 0.0071149 |
| 15-Day | Ömerli | 0.0163452 | 0.99262 | 0.000338375 | 0.018395 | 0.0154685 | 0.992315 | 0.000352475 | 0.0187743 |
| 15-Day | Darlık | 0.0139163 | 0.995521 | 0.000219014 | 0.0147991 | 0.0153309 | 0.994914 | 0.000248236 | 0.0157555 |
| 15-Day | Elmalı | 0.0141098 | 0.997065 | 0.000186092 | 0.0136416 | 0.0178281 | 0.995037 | 0.000313711 | 0.0177119 |
| 15-Day | Terkos | 0.0128188 | 0.997946 | 0.000100647 | 0.0100323 | 0.011547 | 0.998247 | 8.60822 × 10⁻⁵ | 0.00927805 |
| 15-Day | Alibey | 0.0256763 | 0.992852 | 0.000301602 | 0.0173667 | 0.0276765 | 0.993673 | 0.000265811 | 0.0163037 |
| 15-Day | B.Çekmece | 0.0175794 | 0.995509 | 0.000256572 | 0.0160179 | 0.0183209 | 0.99189 | 0.000465323 | 0.0215714 |
| 15-Day | Sazlıdere | 0.0168793 | 0.99781 | 6.45727 × 10⁻⁵ | 0.00803571 | 0.0183391 | 0.99753 | 7.27287 × 10⁻⁵ | 0.00852811 |
| Monthly | Ömerli | 0.0170516 | 0.988768 | 0.000523402 | 0.022878 | 0.0144575 | 0.991073 | 0.000413032 | 0.0203232 |
| Monthly | Darlık | 0.0195599 | 0.991975 | 0.000397903 | 0.0199475 | 0.0204259 | 0.987695 | 0.000599909 | 0.024493 |
| Monthly | Elmalı | 0.0155737 | 0.995631 | 0.000276242 | 0.0166205 | 0.0200283 | 0.989257 | 0.000678663 | 0.0260512 |
| Monthly | Terkos | 0.0163963 | 0.996502 | 0.000171737 | 0.0131048 | 0.0103605 | 0.99769 | 0.00011409 | 0.0106813 |
| Monthly | Alibey | 0.020811 | 0.994479 | 0.000233697 | 0.0152871 | 0.0198052 | 0.993651 | 0.000268836 | 0.0163962 |
| Monthly | B.Çekmece | 0.019981 | 0.997114 | 0.000168074 | 0.0129643 | 0.0139536 | 0.998958 | 6.0931 × 10⁻⁵ | 0.00780583 |
| Monthly | Sazlıdere | 0.0222951 | 0.990182 | 0.000292531 | 0.0171036 | 0.0202211 | 0.988411 | 0.000347147 | 0.0186319 |
Table 8. Performance of the Extra Trees model.
Columns marked (a) use the input set ET₀ + WD + CD + HRD; columns marked (b) use HRD only.
| Prediction | Dam | MAPE (a) | R² (a) | MSE (a) | RMSE (a) | MAPE (b) | R² (b) | MSE (b) | RMSE (b) |
|---|---|---|---|---|---|---|---|---|---|
| Daily | Ömerli | 0.00388508 | 0.999396 | 2.77199 × 10⁻⁵ | 0.00526497 | 0.00409498 | 0.999321 | 3.11635 × 10⁻⁵ | 0.00558243 |
| Daily | Darlık | 0.00316241 | 0.999506 | 2.41255 × 10⁻⁵ | 0.00491177 | 0.00419955 | 0.99908 | 4.47778 × 10⁻⁵ | 0.00669163 |
| Daily | Elmalı | 0.00787504 | 0.995719 | 0.00026978 | 0.016425 | 0.00817141 | 0.995634 | 0.000275109 | 0.0165864 |
| Daily | Terkos | 0.00539017 | 0.99952 | 2.35185 × 10⁻⁵ | 0.00484958 | 0.00689247 | 0.999458 | 2.66004 × 10⁻⁵ | 0.00515756 |
| Daily | Alibey | 0.00898673 | 0.99892 | 4.5326 × 10⁻⁵ | 0.00673246 | 0.00940576 | 0.99884 | 4.86806 × 10⁻⁵ | 0.00697715 |
| Daily | B.Çekmece | 0.00792777 | 0.99922 | 4.40525 × 10⁻⁵ | 0.00663721 | 0.00901684 | 0.999183 | 4.62988 × 10⁻⁵ | 0.00680433 |
| Daily | Sazlıdere | 0.00585024 | 0.999691 | 9.11219 × 10⁻⁶ | 0.00301864 | 0.00813567 | 0.999521 | 1.41438 × 10⁻⁵ | 0.00376083 |
| Weekly | Ömerli | 0.00847044 | 0.998648 | 6.28527 × 10⁻⁵ | 0.00792797 | 0.00840095 | 0.998775 | 5.71785 × 10⁻⁵ | 0.00756165 |
| Weekly | Darlık | 0.00837819 | 0.998271 | 8.51257 × 10⁻⁵ | 0.00922636 | 0.0085435 | 0.998717 | 6.50426 × 10⁻⁵ | 0.0080649 |
| Weekly | Elmalı | 0.011828 | 0.997304 | 0.000170395 | 0.0130535 | 0.0145746 | 0.995954 | 0.000256698 | 0.0160218 |
| Weekly | Terkos | 0.00810674 | 0.99879 | 5.91995 × 10⁻⁵ | 0.00769412 | 0.00790464 | 0.998976 | 5.0213 × 10⁻⁵ | 0.00708611 |
| Weekly | Alibey | 0.0137394 | 0.998191 | 7.60133 × 10⁻⁵ | 0.00871856 | 0.0147247 | 0.998268 | 7.28824 × 10⁻⁵ | 0.00853712 |
| Weekly | B.Çekmece | 0.0114554 | 0.998938 | 6.02968 × 10⁻⁵ | 0.0077651 | 0.0103776 | 0.99894 | 6.06342 × 10⁻⁵ | 0.00778679 |
| Weekly | Sazlıdere | 0.0100015 | 0.999083 | 2.71342 × 10⁻⁵ | 0.00520905 | 0.00966369 | 0.999046 | 2.81089 × 10⁻⁵ | 0.00530178 |
| 15-Day | Ömerli | 0.009211 | 0.997901 | 9.69083 × 10⁻⁵ | 0.0098442 | 0.00885978 | 0.996761 | 0.000148606 | 0.0121904 |
| 15-Day | Darlık | 0.0105688 | 0.997329 | 0.000134036 | 0.0115774 | 0.00965656 | 0.998596 | 7.19347 × 10⁻⁵ | 0.00848143 |
| 15-Day | Elmalı | 0.0118076 | 0.996929 | 0.000195466 | 0.0139809 | 0.0121011 | 0.996713 | 0.000207453 | 0.0144032 |
| 15-Day | Terkos | 0.00792414 | 0.999192 | 3.98185 × 10⁻⁵ | 0.00631019 | 0.00846756 | 0.998901 | 5.38046 × 10⁻⁵ | 0.00733517 |
| 15-Day | Alibey | 0.0167746 | 0.99691 | 0.000129863 | 0.0113957 | 0.0156182 | 0.997685 | 9.71879 × 10⁻⁵ | 0.00985839 |
| 15-Day | B.Çekmece | 0.00992255 | 0.999053 | 5.40958 × 10⁻⁵ | 0.00735499 | 0.0110676 | 0.997837 | 0.00012336 | 0.0111068 |
| 15-Day | Sazlıdere | 0.0105529 | 0.998642 | 4.02893 × 10⁻⁵ | 0.00634738 | 0.0110314 | 0.998534 | 4.32387 × 10⁻⁵ | 0.00657561 |
| Monthly | Ömerli | 0.0119018 | 0.995007 | 0.000232166 | 0.015237 | 0.0065634 | 0.997882 | 9.80702 × 10⁻⁵ | 0.00990304 |
| Monthly | Darlık | 0.0131245 | 0.995145 | 0.000240917 | 0.0155215 | 0.0112884 | 0.994835 | 0.000251898 | 0.0158713 |
| Monthly | Elmalı | 0.0115828 | 0.997537 | 0.000156458 | 0.0125083 | 0.0111112 | 0.997498 | 0.000158287 | 0.0125812 |
| Monthly | Terkos | 0.00872503 | 0.998963 | 5.12378 × 10⁻⁵ | 0.00715806 | 0.00647818 | 0.999267 | 3.62939 × 10⁻⁵ | 0.00602444 |
| Monthly | Alibey | 0.015773 | 0.997818 | 9.30908 × 10⁻⁵ | 0.00964836 | 0.0133573 | 0.997926 | 8.80544 × 10⁻⁵ | 0.00938373 |
| Monthly | B.Çekmece | 0.0100021 | 0.999439 | 3.2496 × 10⁻⁵ | 0.00570052 | 0.00723422 | 0.999597 | 2.27595 × 10⁻⁵ | 0.00477069 |
| Monthly | Sazlıdere | 0.0145424 | 0.996183 | 0.00011351 | 0.0106541 | 0.0137383 | 0.993712 | 0.00018706 | 0.013677 |
Table 9. Summary of statistical tests analysis.
Table 9. Summary of statistical tests analysis.
| Test | Dataset | p-Value | Statistic | Description |
|---|---|---|---|---|
| Kolmogorov–Smirnov | ET₀, WD, CD | 4.74 × 10^−33 | 0.5012 | Evaluation of the fit to the normal distribution |
| Kolmogorov–Smirnov | | 4.73 × 10^−33 | 0.5012 | |
| Z-Test (two-sided) | MAPE of the ET₀, WD, CD HRD model, MAPE of the HRD model | 0.0155 | −2.4197 | Assessment of performance equivalence |
| Z-Test (less than) | MAPE of the ET₀, WD, CD HRD model, MAPE of the HRD model | 0.0078 | −2.4197 | To validate that the additional data reduced the error rate |
| Paired T-Test (less than) | MAPE of the ET₀, WD, CD HRD model, MAPE of the HRD model | 2.59 × 10^−6 | −4.742 (DF: 139) | To validate that the additional data reduced the error rate |
| Paired T-Test (less than) | MAPE of the LLCV model and MAPE of the OMPCV model | 0.0008 | −3.464 (DF: 27) | To compare the performance of LLCV and OMPCV |
| Paired T-Test (less than) | MAPE of the LSTM model and MAPE of the LLCV model | 5.535 × 10^−5 | −4.52 (DF: 27) | To compare the performance of LSTM and LLCV |
| Paired T-Test (less than) | MAPE of the RF model and MAPE of the LSTM model | 6.54 × 10^−5 | −4.457 (DF: 27) | To compare the performance of RF and LSTM |
| Paired T-Test (less than) | MAPE of the ET model and MAPE of the RF model | 2.085 × 10^−9 | −8.493 (DF: 27) | To compare the performance of ET and RF |
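
As a rough illustration of how the Z-Test entries in Table 9 relate to each other, the sketch below recovers the two-sided and "less than" p-values from the reported statistic of −2.4197 using the standard normal distribution. It also shows how a paired t statistic and its degrees of freedom could be computed from two matched MAPE series. The function names and the MAPE values in the second part are hypothetical and chosen only for illustration; they are not the study's data, and this is not the authors' code.

```python
import math
from statistics import mean, stdev


def normal_cdf(z: float) -> float:
    """Standard normal CDF, computed with the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def z_test_p_values(z: float) -> tuple[float, float]:
    """Return (two-sided p-value, lower-tail 'less than' p-value) for a z statistic."""
    two_sided = 2.0 * normal_cdf(-abs(z))
    lower_tail = normal_cdf(z)
    return two_sided, lower_tail


def paired_t_statistic(a: list[float], b: list[float]) -> tuple[float, int]:
    """Return the paired t statistic for the differences a - b and its degrees of freedom."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# z statistic as reported in Table 9
two_sided, lower_tail = z_test_p_values(-2.4197)
print(f"two-sided p = {two_sided:.4f}, less-than p = {lower_tail:.4f}")

# Hypothetical matched MAPE values (illustrative only, not the paper's data)
mape_extended = [0.0081, 0.0092, 0.0118, 0.0079, 0.0137]
mape_baseline = [0.0090, 0.0105, 0.0121, 0.0085, 0.0160]
t_stat, dof = paired_t_statistic(mape_extended, mape_baseline)
print(f"paired t = {t_stat:.3f} (DF: {dof})")
```

A negative statistic in a "less than" test indicates that the first series has the lower mean error, which is the direction all of the one-sided rows in Table 9 test for. Turning the t statistic into a p-value requires the Student t distribution, which the Python standard library does not provide, so that step is left out here.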
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Badem, A.C.; Yılmaz, R.; Cesur, M.R.; Cesur, E. Advanced Predictive Modeling for Dam Occupancy Using Historical and Meteorological Data. Sustainability 2024, 16, 7696. https://doi.org/10.3390/su16177696
