1. Introduction
Freshwater resources are renewable yet finite, and they are vital for life. Due to the increasing risks posed by climate change, surface freshwater available in the form of lakes, rivers, reservoirs, snow, and glaciers is becoming significantly threatened [
1].
Freshwater only accounts for about 3% of all the water found on earth. However, most of it is not readily available, as around 69% of freshwater is found in the form of ice in glaciers and polar ice caps, and 30% is found in the form of groundwater. Despite its importance to all life forms, freshwater is unequally distributed, and water volumes are not constant due to unequal volumes of water replenishment and water depletion. Freshwater is replenished through direct rainfall, whereas its consumption is mostly the sum of evaporation, ground seepage, outlet flow, and anthropogenic activities, such as irrigation. Therefore, for proper management of freshwater from lakes, rivers, and reservoirs, the monitoring of water volumes and water levels is crucial.
The most accurate way to monitor water levels and, by extension, water volumes is through the use of water level readings from operational networks that monitor water volume changes by recording the elevation of surface water levels through time. Nonetheless, the cost of installation and maintenance of such stations has led to the decline in their numbers worldwide [
2]. Currently, the vast majority of lakes remain ungauged, especially those located in hard-to-access areas or in developing countries [
3]. In this context, there is a need to develop alternate methods for the global monitoring of inland freshwater bodies, such as with remote sensing data.
In the last four decades, 18 satellite missions were deployed using different radar altimeters to monitor continental or ocean water levels, and 7 are currently in operation (i.e., SARAL, Jason-2, Cryosat-2, HY-2A, Sentinel-3, Jason-3, and Jason-CS/Sentinel 6). Water level estimation with radar altimetry is obtained through dedicated algorithms, known as retrackers, that are developed to accurately derive the radar range (i.e., the distance between the satellite and the surface) over the different kinds of surfaces on Earth (e.g., ocean, continental ice sheets, or sea ice [
4]). Many studies assessed the accuracy of the different radar altimetry missions over different lakes worldwide. For example, the study by Shu et al. [
3], which assessed the performance of 11 major radar altimetry missions that have flown since the 1990s over 12 lakes, found a mean RMSE after bias removal as low as 0.04 m (range between 0.03 and 0.05 m) over the five Great Lakes (Superior, Huron, Erie, Ontario, and Michigan) using the ice sheet retracker from the Sentinel-3 mission, and as high as 0.27 m (range between 0.25 and 0.31 m) using the sea ice retracker from ERS-1. In the study of Schwatke et al. [
5], the performances of ENVISAT and SARAL over the Great Lakes were evaluated, and their findings indicated that both missions achieved very low RMSE which ranged from 0.02 to 0.06 m. Finally, the study of Frappart et al. [
6], which assessed the performance of eight radar altimeter missions over ten of the largest Swiss lakes, found similar performances for the Sentinel-3A and Sentinel-3B missions operating in the synthetic aperture radar acquisition mode and the open-loop tracking mode [
7]. An unbiased RMSE lower than 0.07 m was obtained over the studied lakes with an almost constant bias of −0.17 ± 0.04 m.
In addition to radar altimetry, light detection and ranging (LiDAR) sensors have also been assessed for water level retrieval. Currently, two spaceborne LiDAR sensors are in operation: the Advanced Topographic Laser Altimeter System (ATLAS) onboard the second-generation Ice, Cloud, and Land Elevation Satellite (ICESat-2), and the Global Ecosystem Dynamics Investigation (GEDI) onboard the International Space Station (ISS). Although both are LiDAR-based sensors, the ATLAS and GEDI instruments differ in their mode of operation. On the one hand, ATLAS is equipped with a single 532 nm wavelength laser (and one backup laser) that emits six beams arranged into three pairs (~3 km spacing between pairs and 90 m spacing within a pair), with a footprint diameter of ~17 m and a footprint spacing of 0.7 m [
8]. On the other hand, GEDI is a full waveform system with three operating 1064 nm lasers, where the output of one of the full-power lasers is split in two (coverage lasers), and beam dithering units (BDU) that rapidly deflect light by 1.5 mrads (~600 m) are used in order to produce eight tracks of data. These tracks represent on the ground a series of 25 m-wide footprints spaced 60 m along the track and 600 m across [
9].
Several studies assessed the accuracy of ICESat-2 and GEDI in their ability to estimate water levels. Regarding ICESat-2, the study of Frappart et al. [
6] showed that, over ten of the largest Swiss lakes, the accuracy of ATLAS was better than 0.06 m (RMSE), with an almost constant bias of 0.42 ± 0.03 m. The results presented in the study of Zhang et al. [
10] showed that ICESat-2 observations were highly correlated with gauge measurements over Tibetan Plateau lakes, with an RMSE of 0.10 m. In the study of Yuan et al. [
11], the relative altimetric error of ICESat-2 over reservoirs in China with areas greater than 10 km² was within 0.06 m, with a standard deviation (SD) better than 0.02 m. The study of Ryan et al. [
12], using the first 12 months of ICESat-2 acquisitions, showed that ICESat-2 provided good accuracy overall (±0.14 m) for the 3712 global reservoirs that were studied, with surface areas ranging from 1 to 10,000 km². In contrast, GEDI elevation estimates were found to be less accurate than both ICESat-2 elevation estimates and those from any of the radar altimeter missions. Indeed, for the first version of GEDI data, Frappart et al. [
6], in their performance analysis of different radar and LiDAR altimetry missions, found that GEDI was the worst performing sensor, with an RMSE on the estimation of water levels ranging from ~0.20 to ~0.50 m and a bias ranging from ~−0.15 to ~0.20 m. In the study of Xiang et al. [
13], which assessed the performance of ICESat-2, ICESat-1, and GEDI in retrieving water levels over the five Great Lakes and the lower Mississippi River, similar findings were obtained (i.e., GEDI was the least accurate). Indeed, using GEDI data, an RMSE of 0.28 m (biases = −0.10 ± 0.23 m) was obtained for water level estimates over the Great Lakes, and an RMSE of 0.40 m (biases = −0.24 ± 0.24 m) for river water levels. In contrast, using ICESat-2 data, RMSEs of 0.06 m (biases = −0.01 ± 0.05 m) and 0.12 m (biases = −0.08 ± 0.07 m) were obtained for water level estimates over the five Great Lakes and the lower Mississippi River, respectively. Finally, in the study of Fayad et al. [
14], the comparison between the second version of GEDI elevation estimates and gauge station readings over Lake Geneva showed an estimation bias of 0.63 ± 0.24 m.
Although GEDI’s altimetric accuracy is lower than that of almost all existing altimeter missions (either radar or LiDAR), in its 27 months of operation (April 2019 until August 2021), GEDI has acquired more than 36 billion shots globally. Therefore, coupled with its small footprint size, GEDI data could prove to be a valuable additional source of information for the monitoring of inland water bodies over the duration of GEDI’s lifespan, regardless of their surface area. Nonetheless, for GEDI to be beneficial for the estimation of water levels, the sources of error affecting the elevation estimates need to be accounted for. In fact, the accuracy of GEDI elevations is subject to numerous factors that introduce errors in the estimated elevations by modifying the physical shape of the waveforms. These factors can be classed into three main groups: (1) instrumental, (2) atmospheric, and (3) water surface state factors. Instrumental factors include, but are not limited to, the viewing angle (VA) [
15], the acquiring laser or beam (i.e., coverage or full-power lasers in the case of GEDI) [
14], the amplitude of the echoed water surface return [
14], the width of the echoed water surface return [
14], and the signal to noise ratio (SNR) [
14]. Atmospheric factors mostly include the type and composition of clouds at the acquisition time [
16,
17]. Finally, water surface state factors can include air and water temperatures, atmospheric pressure, humidity, wind and gust speeds, wind–waves information (i.e., height, direction, and period), and swell–waves information (i.e., height, direction, and period) [
18]. Therefore, our objectives in this study are to (1) assess the accuracy of GEDI water level estimates over the five Great Lakes; (2) propose a series of models that estimate the errors on the acquired GEDI waveforms as a function of the previously mentioned factors, in order to correct GEDI elevation estimates; (3) assess the influence of each group of factors on the uncertainty of the acquired GEDI elevations; and (4) assess the generalizability of our approach.
This paper is organized into six sections. A description of the studied lakes and datasets is given in Section 2. The methodology is described in Section 3. The results are given in Section 4, followed by a discussion in Section 5. Finally, the main conclusions are presented in the last section.
3. Methodology
In order to improve the altimetric capabilities of GEDI acquisitions, we propose a series of models that estimate the difference between GEDI and in situ elevations by means of a random forest regressor (RF). These models take a given set of factors (e.g., instrumental, atmospheric, water surface state) as input for each GEDI acquisition, and produce the estimated difference between GEDI and in situ elevations as output (i.e., the error). Adding the estimated error to the original GEDI elevation estimates thus produces a corrected GEDI elevation estimate.
The proposed methodology can be separated into three parts. First, an assessment of GEDI elevation accuracy was made over the five Great Lakes for the period from April 2019 through October 2020. Second, several error budget models were proposed in order to reduce the inaccuracy (in terms of both bias and random errors) of GEDI elevation estimates by accounting for one or more of the three sources of error (namely instrumental, atmospheric, and water surface state). Third, the proposed models were assessed based on their ability to improve the accuracy of GEDI elevations over the spatial domain (training a correction model over a given lake and validating it over another lake) and the temporal domain (training a correction model over a given year and applying it over another), and as a function of the available input factors (i.e., instrumental, atmospheric, and water surface state).
The choice to use RF was motivated by the fact that RF is able to mix quantitative and categorical variables, offers a measure of importance of the variables, and is known to be very performant [
23]. Compared with linear models, RF has the advantage of also being able to model nonlinear relationships between the response variable and the explanatory variables. For this study, the number of trees in the RF was set to 500, with the tree depth set to the square root of the number of available factors.
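The correction scheme described above can be sketched as follows. This is a minimal illustration, assuming scikit-learn's RandomForestRegressor as a stand-in (the software actually used is not stated here). The synthetic factors, the lake level, the sign convention (error = in situ minus GEDI, so that adding the predicted error corrects the estimate), and the mapping of "tree depth" to the max_depth parameter are all assumptions made for the example.

```python
import math

import numpy as np
from sklearn.ensemble import RandomForestRegressor


def fit_error_model(factors, gedi_elev, in_situ_elev, seed=0):
    """Fit a random forest that predicts the GEDI elevation error.

    Assumed sign convention: error = in situ - GEDI, so that
    corrected = GEDI + predicted error.
    """
    n_factors = factors.shape[1]
    model = RandomForestRegressor(
        n_estimators=500,  # number of trees stated in the text
        max_depth=max(1, round(math.sqrt(n_factors))),  # assumed reading of "tree depth"
        random_state=seed,
    )
    model.fit(factors, in_situ_elev - gedi_elev)
    return model


def correct_elevations(model, factors, gedi_elev):
    """Add the predicted error to the original GEDI elevations."""
    return gedi_elev + model.predict(factors)


# Synthetic stand-in data: 9 hypothetical factors, error driven by the first one.
rng = np.random.default_rng(0)
factors = rng.normal(size=(400, 9))
in_situ = np.full(400, 183.0)  # hypothetical lake level in metres
gedi = in_situ + 0.5 * factors[:, 0] + rng.normal(scale=0.05, size=400)

# Train on the first 300 shots, correct the remaining 100.
model = fit_error_model(factors[:300], gedi[:300], in_situ[:300])
corrected = correct_elevations(model, factors[300:], gedi[300:])

rmse_before = float(np.sqrt(np.mean((gedi[300:] - in_situ[300:]) ** 2)))
rmse_after = float(np.sqrt(np.mean((corrected - in_situ[300:]) ** 2)))
```

On this synthetic data the corrected RMSE is lower than the uncorrected one, which only shows that the mechanics work, not how the approach performs on real GEDI shots.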
3.1. Experimental Settings and Models Validation
3.1.1. Exploring Temporal Dependencies
In order to assess how our proposed model, which estimates the errors of GEDI’s elevation estimates, generalizes to future GEDI acquisitions for a given lake, model validation should be performed on data that are temporally far from the training data. Therefore, our first evaluation of the random forest regression model that uses all of the factors (instrumental I, atmospheric C, and water surface state W; Table 2) as predictors was based on splitting the dataset into two parts: the first comprised acquisitions made in 2019, and the second comprised acquisitions made in 2020 (Table 3). The model was trained on the 2019 dataset and validated on the 2020 dataset, and the process was then repeated in reverse, with the model trained on the 2020 dataset and validated on the 2019 dataset. This procedure was carried out for each lake independently. The choice to test the models on a lake-by-lake basis was due to the unequal number of GEDI acquisitions over the five lakes; for example, ~44% of acquisitions were taken over Lake Superior, while only ~7% of shots were acquired over each of Lake Erie and Lake Ontario. A single model tested on the combined data from all the lakes would therefore produce results biased towards the lake with the highest GEDI shot count, and intrinsic differences between the GEDI shots across the lakes might be lost.
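The year-based, lake-by-lake split can be sketched as below. This is an illustrative helper only; the record layout (dictionaries with 'lake' and 'year' keys) and the function name are hypothetical.

```python
from collections import defaultdict


def temporal_splits(shots):
    """Yield (lake, train_year, test_year, train, test) for every lake.

    Each shot is a dict with the hypothetical keys 'lake' and 'year'.
    Both directions (2019 -> 2020 and 2020 -> 2019) are produced.
    """
    by_lake_year = defaultdict(list)
    for shot in shots:
        by_lake_year[(shot["lake"], shot["year"])].append(shot)

    lakes = sorted({lake for lake, _ in by_lake_year})
    for lake in lakes:
        for train_year, test_year in ((2019, 2020), (2020, 2019)):
            train = by_lake_year.get((lake, train_year), [])
            test = by_lake_year.get((lake, test_year), [])
            if train and test:  # skip lakes lacking data in either year
                yield lake, train_year, test_year, train, test


# Toy records standing in for GEDI shots.
shots = [
    {"lake": "Erie", "year": 2019, "shot_id": 1},
    {"lake": "Erie", "year": 2020, "shot_id": 2},
    {"lake": "Superior", "year": 2019, "shot_id": 3},
    {"lake": "Superior", "year": 2019, "shot_id": 4},
    {"lake": "Superior", "year": 2020, "shot_id": 5},
]
splits = list(temporal_splits(shots))
```

Keeping the lakes separate in the generator mirrors the reasoning above: a pooled split would be dominated by the lake with the most shots.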
3.1.2. Exploring GEDI Elevation Error Budget
The models proposed in Section 3.1.1 represent a best-case scenario, in which a large number of factors describing the instrumental, atmospheric, and water surface state conditions are available for each GEDI footprint. Nonetheless, such a large and rich dataset is hard to obtain for lakes globally. Therefore, it is essential to quantify the loss of accuracy incurred by omitting certain factors from the model proposed in Section 3.1.1 and repeating the same training/validation test. In this study, as stated previously, the factors were organized into three main categories. The first category comprises instrumental factors (I, Table 2), which are available for all acquired GEDI shots. The second category comprises factors related to atmospheric conditions and cloud composition at acquisition time (C, Table 2). These variables, while harder to obtain than I, can be acquired from Earth-orbiting satellites. Finally, water surface state factors (W, Table 2) are scarce and are only available for heavily monitored water bodies, such as those presented in this study. Three variants of the random forest regression model were therefore tested. The first used only instrumental factors, the second used instrumental and atmospheric factors, and the third used instrumental and water surface state factors. As in Section 3.1.1, model training/validation was based on the year of acquisition (i.e., model trained on data acquired in 2019 and validated on data acquired in 2020, and vice versa). Table 3 summarizes the models that were tested in this study.
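One way to organize the model variants is as combinations of factor groups, as in the sketch below. The column names and variant labels are hypothetical placeholders chosen for illustration; the actual factor lists are those of Table 2.

```python
# Hypothetical column names for each factor group (see Table 2 for the real lists).
FACTOR_GROUPS = {
    "I": ["viewing_angle", "beam_type", "amplitude", "width", "snr"],
    "C": ["cloud_type", "cloud_height"],
    "W": ["wind_speed", "wave_height", "wave_period"],
}

# Each variant is defined by the factor groups it is allowed to use.
# The labels are illustrative, not the names used in the paper.
MODEL_VARIANTS = {
    "instrumental": ("I",),
    "instrumental+atmospheric": ("I", "C"),
    "instrumental+water_surface": ("I", "W"),
    "full": ("I", "C", "W"),
}


def factors_for(variant):
    """Return the list of predictor columns used by a model variant."""
    columns = []
    for group in MODEL_VARIANTS[variant]:
        columns.extend(FACTOR_GROUPS[group])
    return columns


def select_factors(shot, variant):
    """Extract the predictor values of one shot (a dict) for a given variant."""
    return [shot[name] for name in factors_for(variant)]
```

Because the instrumental group appears in every variant, the comparison between variants isolates what the atmospheric and water surface state groups add.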
3.1.3. Exploring Geographical Location Effect
To evaluate whether the proposed model could be transferred from one study region to another, a random-forest-based model using all the available factors (I, C, and W, Table 2) as predictors was independently trained on a given lake and validated on the remaining four (Table 3). Moreover, to provide a reference against which the spatially transferred models can be assessed, an additional random forest model was trained and validated within each of the five lakes using fivefold cross-validation (CV). For the fivefold cross-validation, two constraints were imposed: (1) footprints belonging to the same track were assigned exclusively to one of the data partitions (training or test) in order to avoid possible spatial bias in the evaluation procedure; (2) the tracks were chosen randomly from either the 2019 or 2020 datasets in order to reduce any potential errors due to temporal dependencies.
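The two cross-validation constraints can be illustrated with a track-grouped split, sketched below. The function name, the track identifiers, and the round-robin assignment of shuffled tracks to folds are assumptions for this example, not the procedure actually implemented.

```python
import random


def track_kfold(track_ids, n_splits=5, seed=0):
    """Yield (train_idx, test_idx) with all shots of a track in one partition.

    track_ids holds one track identifier per shot. Tracks are shuffled
    before being assigned to folds, so each fold mixes tracks regardless
    of their acquisition year.
    """
    unique_tracks = sorted(set(track_ids))
    if len(unique_tracks) < n_splits:
        raise ValueError("need at least as many tracks as folds")
    rng = random.Random(seed)
    rng.shuffle(unique_tracks)
    fold_of_track = {t: i % n_splits for i, t in enumerate(unique_tracks)}
    for fold in range(n_splits):
        test_idx = [i for i, t in enumerate(track_ids) if fold_of_track[t] == fold]
        train_idx = [i for i, t in enumerate(track_ids) if fold_of_track[t] != fold]
        yield train_idx, test_idx


# Ten hypothetical tracks with three shots each.
track_ids = [f"T{n // 3}" for n in range(30)]
folds = list(track_kfold(track_ids))
```

Splitting on tracks rather than on individual shots prevents neighbouring footprints of one overpass from appearing on both sides of the split.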
3.2. Models Performance Evaluation
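Model performance was assessed using the mean difference (bias) between GEDI-derived elevation estimates (either original or corrected after application of the models detailed in Section 3.1) and in situ elevations, the unbiased root mean squared error (ubRMSE), the root mean squared error (RMSE), and the coefficient of determination (R²). ubRMSE, RMSE, and R² are defined as follows:
\[ \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2} \]
\[ \mathrm{ubRMSE} = \sqrt{\mathrm{RMSE}^2 - \mathrm{bias}^2}, \qquad \mathrm{bias} = \frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right) \]
\[ R^2 = 1 - \frac{\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2} \]
where \(y_i\) is the observed value (in situ), \(\hat{y}_i\) is the predicted value (GEDI), \(\bar{y}\) is the mean of the observed values, and \(N\) is the sample size.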
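The four evaluation metrics can be computed as in the following sketch. It assumes the usual definitions: bias as the mean of predicted minus observed, ubRMSE as the RMSE remaining after the bias is removed, and R² as one minus the ratio of the residual to the total sum of squares. The function name and the toy elevations are hypothetical.

```python
import math


def evaluate(predicted, observed):
    """Return bias, RMSE, ubRMSE and R2 for paired elevation estimates."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [p - o for p, o in zip(predicted, observed)]
    bias = sum(diffs) / n
    ss_res = sum(d * d for d in diffs)
    rmse = math.sqrt(ss_res / n)
    # Guard against tiny negative values caused by floating-point rounding.
    ubrmse = math.sqrt(max(rmse ** 2 - bias ** 2, 0.0))
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"bias": bias, "rmse": rmse, "ubrmse": ubrmse, "r2": r2}


# Toy example: predictions offset from the observations by a constant 0.5 m.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]
metrics = evaluate(predicted, observed)
```

In the toy example the error is a pure offset, so the RMSE equals the bias and the ubRMSE vanishes, which illustrates why the two are reported separately.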
5. Discussion
The results presented in this study show a severe limitation of the altimetric capabilities of uncorrected GEDI data for the estimation of water levels. Indeed, from the 12 million GEDI shots acquired over a span of 18 months (April 2019 through October 2020), the assessed uncertainties of GEDI elevation estimates are three times higher than those of the worst performing radar retracker, namely the ERS-1 sea ice retracker, in the study of Shu et al. [
3]. For example, with the ERS-1 sea ice retracker, the RMSE on the altimetry-derived water level estimates ranged from 0.25 m (Lake Superior) to 0.31 m (Lake Huron), with a bias ranging from 0.53 m (Lake Superior) to 0.78 m (Lake Superior). In contrast, with uncorrected GEDI estimates, the RMSE on the water level estimates ranged from 0.57 m (Lake Erie) to 0.68 m (Lake Ontario), with a bias that ranged from 0.43 m (Lake Erie) to 0.61 m (Lake Ontario). On the other hand, the uncertainties obtained with GEDI were consistent with other studies, such as the study of Xiang et al. [
13] over the five Great Lakes, or the studies of Fayad et al. [
14,
18] and the study of Frappart et al. [
6] over several lakes in France and Switzerland. Therefore, uncorrected GEDI elevation estimates are not recommended for the retrieval of water surface levels.
The proposed correction models, which take instrumental, atmospheric, and water surface state variables into account in order to correct GEDI elevation estimates, appear to greatly reduce the uncertainties on these estimates. Using the full model (based on instrumental, atmospheric, and water surface state factors), fivefold cross-validated results showed an error (RMSE) on the estimation of water levels that ranged between 0.10 m and 0.14 m for Lakes Erie, Huron, Michigan, and Superior, and reached 0.28 m for Lake Ontario. Moreover, the bias was mostly eliminated, ranging from −0.06 m (Lake Erie) to 0.01 m (Lake Superior). The proposed full model also generalizes well across the temporal and spatial domains. The temporal validation of the full model (training of the model on a given year and validation on another year) shows errors (RMSE) on the retrieved water levels that ranged from 0.14 m (Lake Michigan) to 0.29 m (Lake Ontario), with a bias ranging from −0.07 m (Lake Erie) to 0.03 m (Lake Ontario).
Although the tested full model showed an increase in accuracy of at least 92%, with more than 74% of the variance explained, the remaining unexplained variance could be due to two factors. First, although a large number of explanatory variables were used for the modelling of GEDI elevation errors, the complex interaction between LiDAR, the atmosphere, and the water surface means that not all the factors affecting the precision of LiDAR altimetry could be accounted for. Second, the explanatory variables used in this study were not error-free, because they were not direct measurements but resulted from models. For example, the accuracy on significant wave heights was around 10%, with a bias of less than 5%, or ~30 cm [
24]. These two factors might explain some of the remaining uncertainty on the corrected GEDI elevation estimates.
Regarding the spatial validation (training of the model on a given lake and validation on the other lakes), the uncertainty of GEDI’s corrected water level estimates differed based on the training and validation sites. Data over Lakes Michigan, Superior, and Huron were able to retrieve water levels well, with a mean RMSE ranging from 0.17 m (model trained with Michigan data) to 0.19 m (model trained with either Ontario or Superior data). Lakes Erie and Ontario showed lower generalization capabilities, with a mean RMSE of 0.26 m (model trained with either Erie or Ontario data). Moreover, Lake Ontario was the lake over which model generalization (i.e., correcting water level estimates using a model trained over other lakes) was the least accurate, since the RMSE on water level estimates over Lake Ontario ranged from 0.21 m (model trained using Superior or Michigan data) to 0.24 m (model trained using Erie data).
The generalization capabilities of the proposed full model, either spatially or temporally, seem to be greatly affected by the number of GEDI acquisitions on which the model was trained. Indeed, the three most generalizable models were those trained over Lake Superior (~4.84 M shots over 337 dates, Figure 13), Lake Huron (~2.25 M shots over 230 dates, Figure 13), and Lake Michigan (~2.16 M shots over 249 dates, Figure 13), whereas the models trained over Lakes Ontario and Erie showed less accurate results due to a smaller amount of training data (~0.88 M shots each, over 113 dates for Lake Erie and 116 for Lake Ontario, Figure 13). This is supported by comparing the Jensen–Shannon divergence (JSD) between the original distribution of GEDI elevation errors over Lake Huron and the distributions of modeled GEDI elevation errors obtained from models trained over Lakes Erie, Ontario, Michigan, and Superior (Figure 14): the modeled distribution with the least divergence was obtained from the model trained over Lake Michigan (JSD of 0.11), followed by the model trained over Lake Superior (JSD of 0.16). The distributions of GEDI elevation errors obtained from the models trained over Lakes Erie and Ontario had the highest divergence from the original distribution, with JSDs of 0.29 and 0.37, respectively.
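A Jensen–Shannon divergence between two error distributions can be computed from binned histograms, as in the sketch below. The base-2 logarithm (which bounds the JSD between 0 and 1), the binning helper, and the bin edges are assumptions, since the binning and logarithm base used for the reported values are not given here.

```python
import math


def histogram(values, edges):
    """Normalised histogram of values over bins defined by sorted edges."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            # The last bin is closed on the right so the maximum edge is counted.
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    total = sum(counts)
    if total == 0:
        raise ValueError("no value falls inside the bin edges")
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Symmetric divergence between two discrete distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical error samples (metres) binned on a common grid.
edges = [-4.0, -2.0, 0.0, 2.0, 4.0]
original = histogram([-3.0, -1.0, 0.5, 0.7, 1.0, 3.0], edges)
modeled = histogram([0.2, 0.5, 0.7, 1.0, 1.5, 1.9], edges)
jsd = jensen_shannon_divergence(original, modeled)
```

A modeled distribution confined to a narrower range than the original, as in this toy case, yields a non-zero divergence even when its central values agree.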
Although model generalizability seems to be correlated with the number of available acquisitions, since a larger training set increases the probability that the model is trained with shots acquired under different conditions, another factor that can influence the accuracy of a generalized model is the range of the distribution of GEDI elevation errors used for training. A model trained over a given lake can only estimate the differences between GEDI and in situ water levels (i.e., errors) within the range of its training distribution, leading to larger uncertainties when the original differences between GEDI and in situ elevations fall outside of this range. The effect of this factor can be seen in the distribution of modeled GEDI elevation errors in
Figure 14. For example, the range of the distribution of GEDI elevation errors over Lake Huron extends from −3.1 m to 3.9 m (
Figure 13), while a model trained using Lake Erie data can only estimate GEDI elevation errors from −0.4 m to 2.6 m (
Figure 14a). This range (−0.4–2.6 m) corresponds to the range of the original distribution of errors over Lake Erie (
Figure 13). Similarly, a model trained over Lake Michigan can only estimate errors within the range −3.0–2.6 m (
Figure 14b), and −2.3–2.7 m for Lake Superior (
Figure 14c). Conversely, a model trained over Lake Ontario, given its original distribution of GEDI elevation errors that extends from −4.0 to 4.0 m (
Figure 13), can estimate errors within the full range of errors over Lake Huron (
Figure 14d). This could also explain, for example, why the models trained over Lakes Huron, Ontario, Michigan, and Superior presented lower uncertainties in the estimation of the differences between GEDI and in situ elevations over Lake Erie, while a model trained over Lake Erie was less accurate when validated over the other four lakes.
The tested full model, while accurate both spatially and temporally given the set of variables used, is practically infeasible over many other sites, as information on waves and other water surface variables is hard to acquire. Indeed, the wave variables were obtained from NOAA’s Great Lakes wave model, which is based on a third-generation wave model and whose data are only available over a select number of lakes. Moreover, a significant number of lakes do not have significant waves. Therefore, from an operational standpoint, a model for the correction of GEDI elevation estimates that uses instrumental and atmospheric variables without water surface information is easier to train and deploy than a full model. In this study, the random forest models using instrumental variables only, and instrumental and atmospheric variables, showed high accuracy in the correction of GEDI’s elevation estimates. While the model trained using only instrumental variables was the least accurate, it still improved the original GEDI elevation estimates by a factor of two. Regarding the model using instrumental and atmospheric variables, only a small degradation in performance was observed in comparison with the full model. Moreover, in this study, we used atmospheric data from the GOES-R satellites, which only cover the American continent; globally, atmospheric data collected from other satellites can be used, such as the Meteosat Second Generation (MSG) series of satellites, which cover Europe, Africa, and the Indian Ocean, and Himawari 8, which covers the Mid-Pacific. Quantitatively, to measure the effect of each variable on the accuracy of the model, a variable importance test was carried out. Variable importance is based on the mean decrease in the mean squared error (MSE) and is measured as follows: first, the MSE of the full model (the model using all the variables from all the lakes), MSE_f, is calculated; next, the variable whose importance is to be calculated is permuted over N iterations, and at each iteration, the model error (MSE_v) is calculated; finally, the variable importance is the difference between MSE_f and the average of the MSE_v values. The variable importance test showed that the groups of variables with the highest effect (percentage increase in the mean squared error of the regressions, %IncMSE) are the instrumental and atmospheric variables, with a %IncMSE of 46.2% and 45.8%, respectively. The %IncMSE of the water surface state variables was 26.8%. Moreover, the %IncMSE of individual variables shows that the viewing angle (VA) was the most important variable, with an importance at least two times higher than that of the remaining input variables. Following VA is the cloud height (CTH), followed by the period of wind-generated waves. The cloud height and the height of wind-generated waves have almost the same importance. The variable importance of all the variables can be found in Figure A11.
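The permutation procedure described above can be sketched as follows. This is a generic illustration with a hypothetical fitted model passed in as a plain `predict` callable; it reports the mean increase in MSE after permutation, and it does not reproduce the exact %IncMSE normalization used for the reported values.

```python
import random


def mse(predict, rows, targets):
    """Mean squared error of a prediction function over a dataset."""
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(predict, rows, targets, column, n_iter=10, seed=0):
    """Mean increase in MSE after shuffling one input column n_iter times."""
    rng = random.Random(seed)
    mse_full = mse(predict, rows, targets)  # MSE_f: error with intact inputs
    permuted_mses = []
    for _ in range(n_iter):
        shuffled = [r[column] for r in rows]
        rng.shuffle(shuffled)
        permuted_rows = [
            r[:column] + [v] + r[column + 1:] for r, v in zip(rows, shuffled)
        ]
        permuted_mses.append(mse(predict, permuted_rows, targets))  # MSE_v
    return sum(permuted_mses) / n_iter - mse_full


def toy_model(row):
    """Hypothetical fitted model: the output depends only on column 0."""
    return 2.0 * row[0]


rows = [[float(i), float(i % 3)] for i in range(20)]
targets = [toy_model(r) for r in rows]

importance_used = permutation_importance(toy_model, rows, targets, column=0)
importance_unused = permutation_importance(toy_model, rows, targets, column=1)
```

Shuffling a column the model relies on degrades the predictions, whereas shuffling an ignored column leaves the error unchanged, which is the contrast that the importance ranking exploits.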
Finally, the proposed models in this article were developed to decrease the inaccuracy of GEDI elevation estimates at the footprint scale. As such, uncertainties originating from in situ measurements and the impact of autocorrelation between successive GEDI shots were not taken into account. In addition, over a given lake on a given date, a high number of measurements are acquired along each track (a mean of 2000 GEDI footprints per track in this study), whereas only a few sampling points over a lake’s surface could prove sufficient. Therefore, going forward, an aggregate of corrected GEDI elevation estimates from each track could be calculated to further decrease the inaccuracy of these estimates. If elevation estimates are independent in space and time, aggregation could simply consist of averaging the corrected GEDI elevation estimates within each track; in that case, elevation errors could be quantified using the standard deviation of the mean. Alternatively, in the presence of autocorrelation between successive GEDI shots, autocorrelation must be modeled and estimated using techniques such as block-kriging, with uncertainty at the lake scale being computed accordingly. The relevance of such an approach was developed in the study of Abdallah et al. [
25].
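For the independent-error case mentioned above, a per-track aggregation can be sketched as below. The input layout (pairs of track identifier and corrected elevation) and the sample values are hypothetical, and the sketch deliberately ignores autocorrelation, which would call for a geostatistical treatment instead.

```python
import math
import statistics
from collections import defaultdict


def aggregate_by_track(shots):
    """Mean corrected elevation and standard error of the mean per track.

    Assumes independent errors within a track. Each shot is a
    (track_id, corrected_elevation) pair.
    """
    by_track = defaultdict(list)
    for track_id, elevation in shots:
        by_track[track_id].append(elevation)

    summary = {}
    for track_id, elevations in by_track.items():
        mean = statistics.fmean(elevations)
        if len(elevations) > 1:
            sem = statistics.stdev(elevations) / math.sqrt(len(elevations))
        else:
            sem = float("nan")  # undefined for a single footprint
        summary[track_id] = (mean, sem)
    return summary


# Hypothetical corrected elevations (metres) along two tracks.
shots = [("A", 183.0), ("A", 183.2), ("A", 183.4), ("B", 174.1)]
summary = aggregate_by_track(shots)
```

Under the independence assumption the standard error shrinks with the square root of the number of footprints, which is why a track with thousands of shots could yield a much tighter estimate than any single footprint.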
6. Conclusions
Our analysis of the accuracy of the original GEDI water level estimates over the five Great Lakes, which showed uncertainties (RMSE) ranging from 0.54 to 0.68 m, confirms that uncorrected GEDI elevation estimates are not accurate enough to serve as an additional source of information for the retrieval and monitoring of inland water levels. Therefore, modeling of the uncertainties to correct these elevation estimates is required in order to benefit from the extensive dataset acquired by GEDI over the Earth’s surface.
In this study, random-forest-based models were used to estimate the differences between the original GEDI elevation estimates and in situ measurements (i.e., errors). The proposed models used a combination of instrumental, atmospheric, and water surface state factors as inputs, as these factors could affect the accuracy of GEDI acquisitions. The first proposed random forest regression model, which uses all of these factors, showed high accuracy in the estimation of GEDI’s errors, thus greatly improving GEDI elevation estimates. Indeed, temporal validation of this model, trained on data from a given year and validated on data from another year, showed an RMSE on the corrected elevation estimates ranging between 0.14 and 0.24 m (R² ranging between 0.70 and 0.92). Concerning the spatial validation (model trained over one lake and validated over the remaining four), the results varied depending on the lake used for training. For instance, the most accurate spatially validated model showed an RMSE that ranged between 0.16 and 0.21 m (R² ranging between 0.86 and 0.93), while the least accurate model showed an RMSE that ranged between 0.16 and 0.29 m (R² ranging between 0.63 and 0.85).
From an operational standpoint, the proposed full model is hard to train and deploy because it uses water surface state factors that are only available for a select few lakes worldwide. Therefore, three additional models with a reduced number of input factors were tested. The first model used instrumental factors only, as they are always available; the second model used atmospheric factors in addition to instrumental factors; and the third model used water surface state factors in addition to instrumental factors. The results showed that, using only the instrumental factors, the original GEDI elevation estimates could be improved by a factor of two. The two remaining models showed similar correction capabilities, with one being slightly more accurate than the other. Moreover, only a small degradation in the correction capabilities was observed with the model based on instrumental and atmospheric factors (RMSE ranged between 0.19 and 0.35 m, R² ranged between 0.69 and 0.89) in comparison with the full model. Therefore, even in the absence of water surface state variables, our proposed methodology can be used to greatly reduce the uncertainty of GEDI elevation estimates globally.