1. Introduction
Small reservoirs play an important role in the management and utilization of water resources, through water storage [
1], power generation [
2], and flood control [
3], amongst others. Estimates suggest that globally there are currently over 800,000 small reservoirs (capacity < 10 million m³), accounting for over 95% of all reservoirs worldwide [
4]. With climate change and population growth [
5], the number of small reservoirs is expected to continue to increase, for example to support agricultural irrigation and to further help mitigate the impact of floods [
6]. However, these reservoirs may also affect wider catchments by altering river connectivity and flow [
7]. This in turn can impact sediment and nutrient transport [
8], ultimately leading to the degradation of downstream environments and ecosystems [
9].
To better manage and operate these small reservoirs both efficiently and safely, access to timely and accurate knowledge of their hydrological dynamics is essential. However, there remains a lack of in situ observations collated over large geographic scales, such as in most global datasets (e.g., GeoDAR [
10]) or satellite-based data products (e.g., ICESat-2). This can be partly attributed to confusion over reservoir ownership, insufficient funding to collect data, or various technical difficulties [
11]. Such information gaps not only prevent obtaining reservoir operation conditions and determining optimal water management strategies [
12], but also limit our wider scientific understanding of material transport, energy balance, and the resource environment within small reservoirs [
13,
14]. Therefore, there is an urgent need to improve our monitoring of hydrological parameters in small reservoirs, among which the water level is an important parameter in many contexts. In arid regions, reservoir levels directly correlate with the available water volume [
15], whereas in flood-prone areas, monitoring reservoir levels becomes essential for appropriately adjusting various functions such as power generation, water supply, and flood relief [
16].
Compared with costly and difficult-to-implement water level monitoring systems across large scales, satellite-based remote sensing technology can provide accurate and high-frequency information at much lower costs [
17,
18,
19]. Currently, the use of satellite altimetry to obtain bathymetric data for inland water bodies is popular and reliable [
20,
21,
22,
23]. For example, Ryan, et al. [
24] used ICESat-2 data to investigate water level changes in 3712 reservoirs worldwide, identifying different regional patterns of reservoir level changes based on water availability and management strategies. One issue, however, is that existing altimetry satellites (e.g., ICESat-2) take measurements with large horizontal spacings of 3.3 km [
25]. Such wide spacings mean that most of what we would consider to be small reservoirs may not be scanned or captured [
26]. As a result, the widespread application of satellite altimetry and associated retrieval methods to monitor water depth in small reservoirs remains a challenge [
27]. At the same time, the long repeat period of altimetry satellites (91 days for ICESat-2 [
28]) prevents the establishment of a more complete water level monitoring sequence for small reservoirs. Because surface area and storage vary with water depth, it is also a common and useful practice to establish area–storage–depth relationships for individual reservoirs [
29,
30]. Yigzaw, et al. [
31], for example, iteratively selected the best geometry for a given reservoir from five possible regular geometries to establish area–storage–depth relationships and form a new global reservoir bathymetry dataset. A potential issue here is that some small reservoir geometries are more cylindrical in reality. For example, unlike the Lyons reservoir (LYS) shown in
Figure 1, the surface area of the Gerle Lake reservoir (GLL) does not vary with storage volume or water level. For such reservoirs, it is not feasible to determine the water level based on changes in water surface area.
In recent years, deep learning-based algorithms have been widely used to obtain information about water bodies [
32,
33,
34] by automatically extracting features from satellite images or relevant input variables. This involves continuously adjusting the model’s internal parameters to create intricate, non-linear functional relationships between these input variables and observed water level information. For example, Yang, et al. [
35] implemented water level inversion of lakes on the Qinghai-Tibet Plateau based on common machine learning models (backpropagation, support vector machines, and random forest). Many studies leveraging deep learning models for water level inversion incorporate additional variables like precipitation, air temperature, and reservoir inlet and outlet flows, in addition to reservoir levels. The inclusion of these supplementary variables poses challenges, particularly for small reservoirs with limited monitoring capabilities. While the fusion of satellite imagery and deep learning has predominantly been employed for water body detection and classification [
36,
37], there is a recent trend among researchers to extend this combination to regression problems, specifically the estimation of water body bathymetry. Lumban-Gaol, et al. [
38] used Sentinel-2 Level 2A images to provide reflectance values, establishing their relationship to water depth. Najar, et al. [
39] used Sentinel-2 imagery with a deep learning model to retrieve coastal bathymetry. Changes in water level affect how optical signals penetrate and reflect from the water body, while water body detection in SAR imagery relies on comparing backscattered signal intensities between the liquid surface, land, and vegetation. Deep learning could effectively capture the intricate relationship between these reflection/backscattering values and water level changes, thus enabling the direct inversion of the water level from satellite images. To the best of our knowledge, this approach has not yet been applied to the study of water levels in small reservoirs, although it shows promise for the inversion of water levels in numerous small reservoirs.
In response, this study aims to develop a practical method for retrieving and studying water level dynamics in small reservoirs by combining accurate deep learning inversion algorithms with high-resolution remote sensing images. Specifically, in this paper, we will: (1) establish spatiotemporal relationships between remote sensing images and water levels using deep learning algorithms, (2) assess the impact of different data sources and sampling methods on a water level inversion model, and (3) couple attention mechanisms with the model to test for improvements to our inversions.
2. Materials
Four small reservoirs located in California, United States, were selected for this study as shown in
Figure 2. Due to its proximity to the ocean, the study area experiences higher precipitation during the winter and predominant dry conditions during the summer [
40]. In recent years, California has experienced frequent droughts, leading to enormous water stress [
41]. This has resulted in more frequent reservoir scheduling and an increased demand for water level monitoring compared to reservoirs in other regions. Similar to California, global small reservoirs are often situated in the mid to low latitudes. The selected reservoirs are distributed across different sub-catchments, with storage capacities ranging from 1.3 × 10⁶ m³ to 7.6 × 10⁶ m³ (
Table 1). These are typically regarded as small reservoirs but with a range of uses including water storage, irrigation, and power generation.
To further validate the effectiveness of our proposed framework for water level inversion in small reservoirs across diverse regions, we selected three additional reservoirs with climatic and geographic characteristics that differ significantly from those in California. These reservoirs include Mackay Creek Reservoir (MRR), characterized by a cold climate leading to freezing of the water surface in winter; Arrow Reservoir (ARR), situated in a region with a humid climate and abundant precipitation; and Costilla Reservoir (CTR), positioned at a higher elevation with a typical alpine climate. Refer to
Table 2 for details about the reservoirs.
We obtained daily water level heights for the studied reservoirs in California between 2015 and 2022 from the California Data Exchange Center (
https://cdec.water.ca.gov/dynamicapp/selectQuery (accessed on 1 August 2022)), operated by the California Department of Water Resources. The water level data between 2015 and 2022 for three additional reservoirs were sourced from the United States Geological Survey’s National Water Information System (
https://waterdata.usgs.gov/nwis (accessed on 1 December 2023)) and the Canada Water Agency (
https://wateroffice.ec.gc.ca/ (accessed on 1 December 2023)).
The remote sensing image data were from the European Space Agency’s (ESA) Sentinel-1 and Sentinel-2 satellites (
https://scihub.copernicus.eu/ (accessed on 1 August 2022)). Launched in April 2014, Sentinel-1 is an active microwave remote sensing satellite that can provide imagery regardless of weather conditions such as clouds and rain, making it a valuable complement to optical remote sensing satellites. It offers a raw resolution of 20 × 22 m [
42], via images in dual-polarisation mode (VV and VH bands). Sentinel-2 is a multispectral satellite launched in June 2015. It is a passive optical remote sensing satellite providing 13 spectral bands, including the green (B3), red (B4), and near-infrared (B8) bands, which have a high resolution of 10 m × 10 m.
Reservoir information was obtained from the Global Surface Water database (
https://global-surface-water.appspot.com/map (accessed on 1 August 2022)) with the Joint Research Center (JRC) Global Surface Water Mapping Layers used to identify the scope of the small reservoirs selected. These mapping layers provide a clear delineation of the extent and changes in water surface area and were further checked for alignment with satellite images from the Google Earth Engine platform (
https://earthengine.google.com/platform/ (accessed on 1 August 2022)), as shown in
Figure 3 for the GLL reservoir.
3. Methodology
Our methodology for processing raw data for use in the models is presented in
Figure 4. Each step is further described in the following sections.
3.1. Data Processing
The Sentinel series satellite data were acquired and processed using the Google Earth Engine (GEE) platform. Sentinel-1 GRD data were utilized, and preprocessing on the GEE platform included updating orbital metadata, eliminating GRD boundary noise, removing thermal noise, radiometric calibration, and terrain correction. The Level-1 images were filtered based on polarisation mode, retaining the VV and VH bands. For Sentinel-2, Level-2A data were employed, which are pre-processed by ESA for radiometric calibration, atmospheric correction, etc. The QA60 band was used to mark and remove clouds, anomalous images were subsequently deleted, and only the B3, B4, and B8 bands were retained. Finally, the remaining images were matched to the collected water level data to produce an aligned time series.
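The final alignment step can be illustrated with a minimal sketch. It assumes that each image is paired with the gauge reading from the same calendar day and that images without a same-day reading are dropped; the function name, dates, and levels are hypothetical and are not taken from the study.

```python
from datetime import date

def align_images_to_levels(image_dates, level_by_date):
    """Pair each image acquisition date with the gauge level of the same day.

    Images without a same-day reading are dropped (an assumption of this sketch).
    """
    return [(d, level_by_date[d]) for d in sorted(set(image_dates))
            if d in level_by_date]

# Hypothetical acquisition dates and daily water levels (m), for illustration only.
image_dates = [date(2020, 1, 6), date(2020, 1, 1), date(2020, 1, 11)]
level_by_date = {
    date(2020, 1, 1): 1210.4,
    date(2020, 1, 6): 1210.9,
    date(2020, 1, 7): 1211.0,
}
series = align_images_to_levels(image_dates, level_by_date)
```

Here the image from 11 January is discarded because no water level was recorded on that day, leaving a two-element aligned series in chronological order.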
3.2. Dataset Construction (Sampling)
The observed water level data were split into subsets, with 80% for the training set and 20% for the test set. To explore the impact of both temporal correlation and peak values in the data on the model, two sampling approaches were employed to do this split. Chronological Sampling (CS) involved setting the first 80% of the time series and corresponding remote sensing images in chronological order as the training set, with the remaining 20% of the data as the testing set. Random Sampling (RS) involved sorting the reservoir water levels in descending order, dividing them into four quartile intervals, then randomly selecting 80% of the data from each interval as the training set and the remaining data as the test set. Both datasets from the two processing methods were normalized and the raw water level was linearly transformed so that the resultant values were mapped between 0 and 1. To help reduce the uncertainty of the results, the evaluation metrics obtained after running the model 20 times were averaged and used to evaluate the model.
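The two sampling approaches and the normalization can be sketched as follows. This is an illustrative reading of the description above rather than the exact implementation: the function names, the fixed random seed, and the choice to normalize using the minimum and maximum of the full series are assumptions.

```python
import random

def chronological_split(levels, train_frac=0.8):
    """CS: the first train_frac of the time-ordered samples train, the rest test."""
    cut = int(len(levels) * train_frac)
    idx = list(range(len(levels)))
    return idx[:cut], idx[cut:]

def stratified_random_split(levels, train_frac=0.8, n_bins=4, seed=0):
    """RS: sort by level (descending), cut into n_bins intervals,
    then draw train_frac of each interval at random for training."""
    rng = random.Random(seed)
    order = sorted(range(len(levels)), key=lambda i: levels[i], reverse=True)
    train, test = [], []
    for b in range(n_bins):
        lo = b * len(order) // n_bins
        hi = (b + 1) * len(order) // n_bins
        chunk = order[lo:hi]
        rng.shuffle(chunk)
        cut = int(len(chunk) * train_frac)
        train += chunk[:cut]
        test += chunk[cut:]
    return sorted(train), sorted(test)

def min_max_normalize(levels):
    """Linearly map raw water levels onto [0, 1]."""
    lo, hi = min(levels), max(levels)
    if hi == lo:
        return [0.0 for _ in levels]
    return [(v - lo) / (hi - lo) for v in levels]

# Synthetic, steadily rising water levels (m), for illustration only.
levels = [100.0 + 0.1 * i for i in range(100)]
cs_train, cs_test = chronological_split(levels)
rs_train, rs_test = stratified_random_split(levels)
scaled = min_max_normalize(levels)
```

With a steadily rising series such as this one, the CS test set contains only the highest levels, none of which appear in training, whereas the RS test set draws equally from every quartile interval. This is one way to see why the representativeness of the training data differs between the two approaches.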
3.3. Model Architecture
After preprocessing the collected remote sensing and water level data, our constructed convolutional neural network (CNN) model could be applied to the water level inversion of the selected reservoirs. Here we select the RESNET34 model [
43] as a deep learning model for water level inversion in reservoirs, with the discussed Sentinel satellite images used as inputs and the inversed reservoir water level as the output. RESNET34 includes cross-layer connections that reduce the dimensionality of the grid and minimize network parameters and computational complexity. The unique design of residual blocks in RESNET enables training deeper networks while achieving higher accuracy.
The structure of the convolutional neural network is shown in
Figure 5. The benchmark model (
Figure 5a) comprises a multi-band remote sensing image of a single reservoir being fed into a 7 × 7 convolutional layer with 64 output channels and a stride of 2 (Conv-1D). The second convolutional group consists of a 3 × 3 max pooling layer with a stride of 2 and a padding of 1 (Max pool), followed by three residual modules (Layer1). The subsequent convolutional groups (Layer2–4) perform down-sampling and consist of four, six, and three residual modules, respectively (the backbone of each residual module is composed of two 3 × 3 convolutional layers). Finally, the water level is output through the average pooling layer and fully connected layer. The additional models in
Figure 5b,c pass the processed images through different attention modules before feeding them into the subsequent neural network. These are the Channel Attention Module (CAM;
Figure 5b) and the Channel and Spatial Attention Module (CSAM;
Figure 5c).
Some previous studies on remote sensing images using CNN have demonstrated differences in the weights of different bands of an image across various application scenarios. The utilization of attentional mechanisms allows for improved focus on these weights [
44,
45]. Given that various bands or polarization modes in remote sensing data may respond differently to changes in water level, the incorporation of these attention mechanisms into the models could be a valuable addition. The channel attention mechanism and spatial attention mechanism are introduced as additional layers preceding the RESNET34 network in the network structure. For input images from different satellite bands, global max pooling and global average pooling, based on width and height, are applied to generate the final feature map in the shared multilayer perceptron (MLP) network. The channel attention mechanism enhances the significance of channels influencing water level inversion (specifically, different bands of the same satellite) in the RESNET34 network while diminishing the importance of channels with a lesser impact on water level inversion [
46]. Simultaneously, the spatial attention mechanism compresses the feature map corresponding to each individual band through global maximum pooling and global average pooling. Weight coefficients, obtained through a convolutional layer with an activation function, are multiplied with the input feature map to generate a new feature map, ensuring the model focuses on regions of the image that exert a greater impact on water level inversion. A schematic illustrating both the channel attention mechanism and the spatial attention mechanism is presented in
Figure 6.
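The channel attention computation described above can be sketched in plain Python. The sketch assumes the common design in which the shared MLP outputs for the average-pooled and max-pooled descriptors are summed and passed through a sigmoid; the MLP weights below are hand-picked for illustration and are not learned or taken from the study.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def mlp(vec, w1, w2):
    """Shared two-layer perceptron: ReLU hidden layer, linear output."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

def channel_attention(image, w1, w2):
    """image: list of bands, each a 2-D list (height x width) of pixel values.

    Returns one weight in (0, 1) per band.
    """
    flat = [[p for row in band for p in row] for band in image]
    avg = [sum(px) / len(px) for px in flat]   # global average pooling
    mx = [max(px) for px in flat]              # global max pooling
    scores = [a + m for a, m in zip(mlp(avg, w1, w2), mlp(mx, w1, w2))]
    return [sigmoid(s) for s in scores]

def apply_weights(image, weights):
    """Rescale every pixel of each band by that band's attention weight."""
    return [[[p * w for p in row] for row in band]
            for band, w in zip(image, weights)]

# Toy 3-band (e.g., B3, B4, B8) 2 x 2 image with hypothetical MLP weights.
image = [
    [[0.1, 0.2], [0.3, 0.4]],
    [[0.5, 0.5], [0.5, 0.5]],
    [[0.9, 0.8], [0.7, 0.6]],
]
w1 = [[1.0, -1.0, 0.5]]        # 3 channels -> 1 hidden unit
w2 = [[0.2], [0.0], [1.0]]     # 1 hidden unit -> 3 channels
weights = channel_attention(image, w1, w2)
reweighted = apply_weights(image, weights)
```

In a trained network the MLP weights are learned, so that bands more sensitive to water level changes receive weights closer to 1. A spatial attention module follows the same pattern but pools across bands at each pixel and uses a convolution in place of the MLP.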
The model framework was developed using the PyTorch library [
47] and trained on an NVIDIA GeForce RTX 3090 with the hyperparameter settings shown in
Table 3.
3.4. Model Regionalization Approach
Efforts were undertaken to regionalize the model, which involved incorporating additional data from other reservoirs sharing similar characteristics within the region to estimate water levels in reservoirs with limited training data. For reservoirs with relatively short training datasets, data from one or more other similar reservoirs were integrated to augment the training dataset. To minimize the impact of variations in datum height among different reservoirs, the original water level data were converted into water depth data before being input into the model.
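The level-to-depth conversion and data pooling can be sketched as below. How the datum of each reservoir was defined is not stated here, so the sketch simply assumes a known reference elevation per reservoir; all names and numbers are hypothetical.

```python
def to_depth(levels, datum):
    """Convert water surface elevations (m) to depths above a reservoir datum (m)."""
    return [round(v - datum, 3) for v in levels]

def pool_training_sets(reservoirs):
    """Concatenate (sample_id, depth) pairs from several reservoirs."""
    pooled = []
    for name, rec in reservoirs.items():
        depths = to_depth(rec["levels"], rec["datum"])
        pooled += [(f"{name}_{i}", d) for i, d in enumerate(depths)]
    return pooled

# Hypothetical elevations and datums (m); not values from the study.
reservoirs = {
    "BHC": {"levels": [1001.5, 1002.0], "datum": 1000.0},
    "BIL": {"levels": [850.25, 851.0], "datum": 850.0},
}
pooled = pool_training_sets(reservoirs)
```

After conversion, samples from reservoirs at very different elevations share a comparable target range, which is the motivation for working with depth rather than absolute level in the pooled training set.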
3.5. Model Performance Indicators
In this study, we used three evaluation metrics commonly used for regression inversion to quantify model performance: the coefficient of determination (R2), given as Equation (1); the root-mean-squared error (RMSE), given as Equation (2); and the mean absolute error (MAE), given as Equation (3).
R2 is employed to evaluate the model’s capacity to capture the variability in the observed data. Higher values denote a better fit, with R2 = 1 indicating a perfect fit. RMSE is similar to MAE in that it provides a measure of the overall accuracy of the model, with lower values indicating better model accuracy and a value of 0 indicating an exact match between observed and simulated values. However, RMSE gives greater weight to larger errors and penalizes larger discrepancies.
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2} \quad (1)$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \quad (2)$$
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| \quad (3)$$
where $y_i$ is the observed variable; $\hat{y}_i$ is the computed variable; $n$ is the total number of observations; and $\bar{y}$ is the overall mean observed variable.
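The three metrics can be computed with a few lines of Python. This sketch assumes the form of R2 that compares the residual sum of squares with the variance of the observations about their mean, which is consistent with the variables defined above and can become negative for a poor model; the sample values are synthetic.

```python
import math

def r2(obs, sim):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - s) ** 2 for o, s in zip(obs, sim))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

def rmse(obs, sim):
    """Root-mean-squared error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def mae(obs, sim):
    """Mean absolute error."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)

# Synthetic observed and simulated water levels (m), for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.0, 2.0, 3.0, 6.0]
scores = (r2(obs, sim), rmse(obs, sim), mae(obs, sim))
```

In this example a single 2 m error yields an RMSE of 1.0 m but an MAE of only 0.5 m, illustrating how RMSE weights large discrepancies more heavily.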
4. Results
Our results demonstrating model performance are structured in six contexts: a comparison of sampling approaches (
Section 4.1), a comparison of imagery data sources (
Section 4.2), the quantified optimal combinations of sampling and imagery for each reservoir (
Section 4.3), the model performance with attention mechanisms (
Section 4.4), the model performance of regionalized approach (
Section 4.5), and finally, the validation of the model’s application in different regions (
Section 4.6).
4.1. Comparative Performance of Sampling Approaches
Here we consider the two sampling approaches: CS (chronological) and RS (random) as defined in
Section 3.2. Averaged over the results in
Table 4, the RS approach outperformed CS on all three metrics: the average
R2 values of the models using the RS and CS methods are 0.73 and 0.26, the average RMSE values are 0.65 and 0.84, and the average MAE values are 0.41 and 0.59, respectively.
Figure 7 shows that the models trained using the RS method generally exhibited better metric values than those trained using the CS method, with the margin varying depending on the data source. The performance of models trained with both sampling methods was close for the LYS, BIL, and GLL reservoirs. However, the CS method was not applicable to the BHC reservoir, where it resulted in negative R2 values. This could be because the BHC historical water level sequences were not distributed consistently between the training and test sets. Overall, the use of the RS approach in determining training/testing data ensures better model robustness compared to the CS method, mainly because the RS method improves the representativeness of the training data and allows the model to learn from the peak water level data.
In the model labels, for example, BHC-1-R denotes the Brush Creek Reservoir model using Sentinel-1 imagery and Random Sampling.
4.2. Comparative Performance of Imaging Data Sources
The use of Sentinel-1 data was not suitable for the GLL and BHC reservoirs where the maximum
R2 value of the models obtained from either sampling approach did not exceed 0.2 (
Table 4). This may be attributed to the fact that dense vegetation around these two reservoirs affects the penetration and reflection of readings from the Sentinel-1 satellite. We have excluded these in our comparison between data sources. For the remaining LYS and BIL reservoirs, when sampled using the RS method, the model using Sentinel-2 data achieved better results at both LYS and BIL. When using the CS method, the opposite is observed and the model using Sentinel-1 data achieves better results than the model using Sentinel-2 data in the inversion. One of the explanations for the two opposite conclusions is that Sentinel-1 uses active remote sensing using synthetic aperture radar (SAR), which is unaffected by cloud cover, and thus increases the image availability. This enables the model to better learn the mapping relationship between remote sensing imagery and water level data observed over a range of changes seen in a longer time series.
4.3. Optimal Combination of Data Source and Sampling Approach
We have further examined the best combination of satellite data sources and sampling approaches for each reservoir. Specifically, LYS and BHC reservoirs achieved the highest inversion accuracy using a combination of Sentinel-2 data and the RS method. At the GLL reservoir, the best results were achieved using a combination of Sentinel-2 data and the CS method, whereas the BIL reservoir predictions were the best when using the Sentinel-1 data combined with the CS approach.
Focusing on the performance of the models on time-series inversion under the optimal combinations,
Figure 8 compares the observed and predicted water levels and their associated relative errors. The relative errors of the LYS and BHC inversions are larger; both use the random sampling approach and have larger ranges of observed water levels throughout the time series. LYS had 9 days in the test set where the inversion water level error exceeded 1 m, with an average inversion error value of 0.5 m, while BHC had 13 days where the inversion error exceeded 1 m, with an average inversion error value of 0.62 m. The remaining reservoirs, GLL and BIL, demonstrated the effectiveness of using temporal sequential data to capture smaller fluctuations in water level, with both reservoirs having only 1 day in which the inversion water level error exceeded 1 m, with average inversion error values of 0.2 and 0.23 m, respectively. Models using the CS method therefore exhibit an advantage in inversing water level time series, because the CS method improves the model’s understanding of the link between satellite imagery and water level data, as mentioned above.
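The per-reservoir error summaries reported above (number of test days exceeding 1 m and the average error) can be reproduced with a short routine. The sketch assumes that the average inversion error is the mean absolute difference between predicted and observed levels; the values are synthetic.

```python
def summarize_errors(observed, predicted, threshold=1.0):
    """Return (days with absolute error above threshold, mean absolute error)."""
    errors = [abs(p - o) for o, p in zip(observed, predicted)]
    n_exceed = sum(1 for e in errors if e > threshold)
    return n_exceed, sum(errors) / len(errors)

# Synthetic test-set water levels (m), for illustration only.
observed = [10.0, 11.0, 12.0, 13.0]
predicted = [10.5, 12.5, 12.0, 13.0]
n_exceed, mean_error = summarize_errors(observed, predicted)
```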
4.4. Model Performance with the Addition of Different Attention Mechanisms
Figure 9 provides a comparison of results from the three models without the addition of attention module (
Figure 9a), with the addition of CAM (
Figure 9b), and with the addition of CSAM (
Figure 9c). Both visually and across our given performance metrics (
R2, RMSE, MAE) we see that with the addition of the attention mechanism, the model’s inversion results can be improved. The model with only the CAM added (
Figure 9b) has the highest inversion accuracy across the four reservoirs, with an average improvement of 8.6% in the
R2 value and an average reduction in both the RMSE (21.8%) and MAE (23.8%) when compared to the inversions before any attention mechanism was added. However, the performance at the BHC reservoir is still poor, as indicated by its comparatively low maximum
R2 value of 0.62 and corresponding RMSE and MAE values of 0.73 m and 0.53 m, respectively. This could be mainly attributed to the fact that the training data at the BHC reservoir were the smallest among the four reservoirs (
n = 275). Meanwhile, the significant height differences in the terrain surrounding BHC may give rise to occlusion effects, resulting in images with pronounced shadows. This poses a challenge in effectively extracting feature values through the model.
The addition of CSAM improved the inversion accuracy on three of the reservoirs compared to the model without this attention mechanism. These show an average increase of 5.9% in the R2 value, an average reduction of 12.4% in the RMSE value, and an average reduction of 19% in the MAE value, with only one reservoir (BIL) showing a slight decrease in inversion accuracy.
Overall, the model with CSAM did not significantly improve the inversion accuracy compared to the model with only CAM. This suggests that the variation in band sensitivities to reservoir level changes indeed exists, and the channel attention mechanism can enhance model performance by assigning higher weights to the more sensitive bands. The addition of the spatial attention mechanism had a negative impact on model performance, which could be attributed to the spatial attention mechanism excessively emphasizing less important features or neglecting critical ones. Alternatively, such an increase in model complexity might require a larger dataset to enhance the model’s generalization capability.
Focusing on the LYS and BHC reservoirs, where we have greater water level fluctuations and saw poorer performance in
Figure 8, we have further examined whether there is an improvement with the addition of the attention mechanism. As depicted in
Figure 10, the results are promising at LYS, where the relative error in the inversion decreases. The models with CAM and CSAM had 6 and 7 days in the test set where the inversion water level error exceeded 1 m, with average inversion error values of 0.3 m and 0.36 m, respectively. However, for the BHC reservoir, the overall inversion accuracy of the model remains poor compared to the other three reservoirs, with no pronounced improvement in results under these conditions with large changes in water levels. The models with CAM and CSAM had 13 and 15 days in the test set where the inversion water level error exceeded 1 m, with average inversion error values of 0.5 m and 0.6 m, respectively.
4.5. Model Performance of the Regionalized Approach Applied on Selected Reservoirs
To address the issue of a potentially small training dataset for the BHC reservoir, the regionalization approach was employed. The BIL reservoir, given its proximity to the BHC reservoir, similar reservoir depth, and geographical features, was chosen for data splicing with the BHC training dataset. As indicated in
Table 5, this strategy led to a slight improvement in the accuracy of BHC water level inversion.
Expanding on this, data from additional reservoirs were similarly integrated, and the water level inversion for the BHC reservoir was conducted after incorporating data from 2 to 4 reservoirs. Notably, when merging data from three reservoirs into a single training set, only BHC, BIL, and GLL, which are geographically proximate, were considered.
Figure 11 illustrates the impact of using data from different numbers of reservoirs on the inversion accuracy of BHC. It is evident that the accuracy of water level inversion decreases when the model is trained with data from 3 or 4 reservoirs. One plausible explanation for this trend is that augmenting the training dataset, while providing more data for learning, also introduces disparities in other feature information between reservoirs that cannot be adequately accounted for using the existing variables. As the dataset length increases, these disparities become more pronounced, resulting in a significant reduction in accuracy.
4.6. Further Validation of the Model for Application in Diverse Regions
To explore the applicability of our proposed framework in different regions, we conducted validation on three additional randomly selected reservoirs. Initially, the CNN benchmark model was employed, and the results, as shown in
Table 6, revealed
R2 values exceeding 0.8 when using the optimal combination of data sources and sampling methods. This suggests that the model can explain the variability of the data well and robustly invert water levels. However, the RMSE and MAE for CTR and ARR were relatively large compared to the four reservoirs in California, indicating that in certain instances a notable discrepancy exists between the inversed and observed values. This is mainly due to the smaller size of the available training sets for CTR and ARR (
n = 92,
n = 245). Additionally, constructing the training set using the RS method ensures that the model maintains good performance when the water level sequence is incomplete (e.g., CTR water level observation sequence partially missing during 2018–2020, and MCR lacking observed water levels due to winter freeze).
Building upon this, an assessment was conducted for the model with the addition of attention mechanisms. The outcomes remained consistent with those presented in
Section 4.4, detailed in
Table 7. For MCR, the model with the incorporation of CSAM displayed a decrease in performance, while other cases exhibited varying degrees of improvement. Models featuring CAM consistently outperformed those incorporating CSAM.
The validation on three additional reservoirs illustrates the validity of our proposed water level inversion framework for small reservoirs across different regions and climatic conditions; it could be further expanded to cover more small reservoirs globally.
5. Discussion
In this study, a deep learning framework based on convolutional neural networks is proposed for inversing reservoir water levels. The results show that the deep learning model with the addition of CAM exhibits the best performance followed by the model incorporating CSAM and, finally, the model without any attention mechanism.
The accuracy of our water level inversion model was compared with some large-scale studies. Chen, et al. [
48] utilized ICESat-2 to detect global reservoir dynamics, achieving
R2 values ranging from 0.60 to 0.99 and RMSE values from 0.37 m to 1.01 m for 40 random reservoirs’ water levels (essentially large reservoirs). Donchyts, et al. [
49] compared water level measurements from 768 small and medium-sized reservoirs in Spain, India, South Africa, and the United States, establishing a relationship between reservoir surface area and water level. It was shown that 67% of the reservoirs achieved
R2 values higher than 0.7. In our study, the best-performing model inversed water levels for four California reservoirs and three reservoirs in other regions, achieving
R2 values ranging from 0.62 to 0.97, RMSE values from 0.19 m to 0.77 m, and a mean
R2 value of 0.86. This underscores the accuracy achievable with our proposed framework, offering a practical solution to address the limitations of water level monitoring in small reservoirs.
In general, the use of Sentinel-2 data outperforms the use of Sentinel-1 data. Although Sentinel-1 offers the advantage of active radar sensing, which can overcome cloud cover limitations, it is susceptible to interference from terrain and vegetation. Variations in terrain relief and vegetation coverage can alter signal reflection and scattering properties, as noted in previous studies [
50,
51]. The inherent scattering effect in Sentinel-1 images introduces noise, further affecting the accuracy of water level monitoring. In contrast, Sentinel-2, although susceptible to shadowing effects in mountainous areas, offers a higher spatial resolution and multispectral bands (both near-infrared and visible) that exhibit greater sensitivity to water transparency and bottom features. This allows effective capture of water reflectance and spectral characteristics.
Models trained using the RS approach demonstrate better performance compared to those trained using the CS approach. A possible reason is that, following the preprocessing of the remote sensing data, the data no longer exhibit a strong temporal correlation due to the dynamic nature of the reservoir operational rules. Consequently, the overall temporal characteristics of the data become less evident, making it challenging for deep learning models to capture them. By randomly splitting the training and test sets, the model effectively learns the image features associated with high and low water levels, improving its generalization capability and mitigating the occurrence of overfitting.
This study selected four typical small reservoirs within California to construct water level inversion models and further verified the portability in new and diverse geographical locations. The impact of the training set size on the model’s accuracy is a critical consideration. Therefore, different sizes of training sets should also be tested to quantify the model inversion accuracy as input data availability changes.
Furthermore, the model frameworks presented currently only learn to inverse water levels at the specific individual reservoirs and cannot be easily transferred to simultaneously inverse the water levels of a large number of unknown small reservoirs. The basic regionalization method employed yielded mediocre results. A more sophisticated regional model could improve the accuracy of water level inversion and has the potential to be useful in inversing water levels in small reservoirs by assimilating more general rules from satellite images where observed water level data are limited. This may require further changes in the architecture of the model to accommodate regionalized inversions and any additional considerations of more generic feature information about reservoir water levels, such as spatial characteristics and morphological differences of reservoirs.