Article

Robustness of Actual Evapotranspiration Predicted by Random Forest Model Integrating Remote Sensing and Meteorological Information: Case of Watermelon (Citrullus lanatus, (Thunb.) Matsum. & Nakai, 1916)

by Simone Pietro Garofalo 1,*, Francesca Ardito 2, Nicola Sanitate 1, Gabriele De Carolis 1, Sergio Ruggieri 1, Vincenzo Giannico 2, Gianfranco Rana 1 and Rossana Monica Ferrara 1,*
1 Research Center for Agriculture and Environment, Council for Agricultural Research and Economics, via Celso Ulpiani, 5, 70126 Bari, Italy
2 Department of Soil, Plant and Food Science, University of Bari Aldo Moro, 70126 Bari, Italy
* Authors to whom correspondence should be addressed.
Water 2025, 17(3), 323; https://doi.org/10.3390/w17030323
Submission received: 15 December 2024 / Revised: 15 January 2025 / Accepted: 22 January 2025 / Published: 23 January 2025

Abstract

Water scarcity, exacerbated by climate change and increasing agricultural water demands, highlights the necessity for efficient irrigation management. This study focused on estimating actual evapotranspiration (ETa) in watermelons under semi-arid Mediterranean conditions by integrating high-resolution satellite imagery and agro-meteorological data. Field experiments were conducted in Rutigliano, southern Italy, over a 2.80 ha area. ETa was measured with the eddy covariance (EC) technique and predicted using machine learning models. Multispectral reflectance data from Planet SuperDove satellites and local meteorological records were used as predictors. Partial least squares, the generalized linear model, and three machine learning algorithms (Random Forest, Elastic Net, and Support Vector Machine) were evaluated. Random Forest yielded the highest predictive accuracy, with an average R² of 0.74, RMSE of 0.577 mm, and MBE of 0.03 mm. Model interpretability was assessed through permutation importance and SHAP, which identified the near-infrared and red spectral bands, average daily temperature, and relative humidity as key predictors. This integrated approach could provide a scalable, precise method for watermelon ETa estimation, supporting data-driven irrigation management and improving water use efficiency in Mediterranean horticultural systems.

1. Introduction

Water scarcity, exacerbated by climate change and demands for limited freshwater resources, has highlighted the importance of efficient irrigation management in modern agriculture [1]. In regions characterized by hot, dry summers and high evaporative demand, such as the Mediterranean basin, water management strategies are essential to support sustainable agricultural production [2,3]. In this context, accurately quantifying crop water consumption is a key step in optimizing irrigation scheduling and improving overall water use efficiency [4]. The evapotranspiration (ET) rate represents the amount of water transferred from soil and plant surfaces to the atmosphere [5]. Traditional methods for measuring ET include lysimeters and eddy covariance (EC). Lysimeters measure the water lost through evapotranspiration by monitoring changes in soil moisture within a defined volume of soil; they provide direct measurements of ET and are particularly suitable for small-scale studies [6,7]. However, lysimeters can be expensive to install and maintain, and their limited spatial representativeness may not capture the variability of large fields and agricultural contexts. The eddy covariance technique measures the vertical turbulent fluxes of water vapor (eddies) above the canopy to estimate ET; this method provides continuous, real-time data over larger areas [8,9,10,11]. Nonetheless, EC systems require complex instrumentation and specialized technical personnel, and they may be too expensive for widespread use by farmers at the field scale [12]. In practical field applications, a common approach is to calculate the reference evapotranspiration (ETo) with equations such as the Penman–Monteith equation and then adjust ETo with crop-specific coefficients (Kc) [13,14,15]. The Penman–Monteith equation, recommended by the Food and Agriculture Organization, integrates meteorological variables to provide a standardized measure of ETo for a hypothetical reference crop. Kc values are then applied to adjust ETo for crop-related traits (species, growth stage, management practices) and local conditions [13]. However, this method, although widely used, is highly inaccurate, mainly under arid and semi-arid conditions [16].
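The crop-coefficient approach described above reduces to one multiplication per day, ETc = Kc × ETo. The following is a minimal Python sketch of that arithmetic; the function name and all numeric values are hypothetical and are not taken from this study.

```python
def crop_et(eto_mm: float, kc: float) -> float:
    """Crop evapotranspiration (mm/day) from reference ET and a crop coefficient."""
    if eto_mm < 0 or kc < 0:
        raise ValueError("ETo and Kc must be non-negative")
    return kc * eto_mm


# Hypothetical values for illustration only (not from the study).
eto_series = [5.0, 6.0, 7.0]      # reference ET, mm/day
kc_by_stage = [0.5, 1.0, 0.75]    # assumed Kc for three growth stages

etc_series = [crop_et(eto, kc) for eto, kc in zip(eto_series, kc_by_stage)]
```

Because Kc is a tabulated or assumed value, any mismatch between the assumed and the actual crop condition carries over directly into the estimate.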
Satellite methods have emerged as powerful alternatives for estimating ET and monitoring vegetation dynamics and surface energy balances, especially over large areas. Many of these approaches incorporate thermal infrared (TIR) bands to estimate land surface temperature (LST), a key parameter in models such as the Surface Energy Balance Algorithm for Land (SEBAL) and Mapping Evapotranspiration at High Resolution with Internalized Calibration (METRIC) [17,18]. These models calculate ET by partitioning the energy available at the land surface into sensible and latent heat fluxes, using the LST derived from the TIR data to estimate the sensible heat flux and then deriving the latent heat flux as a residual [19,20]. Although TIR-based methods provide useful information on ET by capturing temperature-related changes in evapotranspiration, they are often limited by the spatial and/or temporal resolution of the available satellite platforms [21,22]. For instance, the Landsat program and the Moderate Resolution Imaging Spectroradiometer (MODIS) aboard NASA’s Terra and Aqua satellites provide TIR imagery with a spatial resolution of 100 m and 1 km, respectively [23,24]. Recent advances in remote sensing may offer alternatives to these methods. Spectral data from higher-resolution satellite platforms with short revisit times (e.g., Sentinel, Planet) make it possible to monitor vegetation dynamics on multiple spatial and temporal scales, detecting key indicators of crop status, canopy structure, and leaf area development [25,26,27,28,29]. In addition, integrating multispectral remote sensing observations with machine learning algorithms could improve ET monitoring [30]. Among these platforms, Sentinel-2 is a useful data source, as it offers advantages such as free accessibility and wide spectral coverage. However, compared to satellite platforms with daily acquisition (Planet), its 5-day acquisition frequency, combined with potential cloud cover, may limit the number of usable observations for crops with short growth cycles, such as watermelon, ultimately reducing the dataset available for model training.
Machine learning is a field of artificial intelligence that focuses on developing algorithms able to learn from data to make predictions without being explicitly programmed; by identifying patterns within the data, machine learning models can perform several tasks, including classification, regression, and clustering [31]. Machine learning is becoming a powerful tool in modern agriculture, improving several aspects of crop management by analyzing data from different sources (e.g., data from proximal sensors, images from satellites and drones) to monitor plant physiological and nutritional status or plant pests, enabling timely intervention by farmers [32,33,34,35,36]. Moreover, machine learning methods lend themselves well to automation, which makes them suitable for efficient precision agriculture applications [37].
This study aims to develop a predictive model for estimating ETa in watermelon crops by integrating high-resolution, multispectral satellite imagery from the Planet constellation with local meteorological data under semi-arid conditions (southern Italy). We chose Planet data to take advantage of their high temporal resolution (daily acquisition), which allows for a sufficient number of observations despite cloudy days. Several models and machine learning algorithms were compared, including Elastic Net, the generalized linear model, partial least squares, Random Forest, and Support Vector Machine, to identify the most robust one for predicting watermelon ETa.

2. Materials and Methods

2.1. Crop Management and Field Data

The trial was conducted in 2023 at the experimental farm of the Council for Agricultural Research and Economics in Rutigliano, southern Italy (41°01′ N, 17°01′ E, 147 m a.s.l.). Watermelon (Citrullus lanatus (Thunb.) Matsum. & Nakai, 1916), cultivar Lion King, mulched with a biodegradable film (model PC 100 d8, BASF 67056 Ludwigshafen, Germany, 1 m width), was grown over an area of 2.80 ha, with a density of 3200 plants ha−1 (spacing: 2.70 m × 1.00 m) (Figure 1). From transplanting (9 June), irrigation was performed three times per week through drip lines with a flow rate of 2.10 L h−1 and drippers spaced at 0.60 m; irrigation ended on 24 August. The marketable fruits were harvested between 28 and 31 August 2023. After harvesting, on 25 September 2023, the fresh plant residues, unharvested fruits, and the mulching film were chopped by a tractor shredder and plowed on 2 and 13 October 2023. During the growing season, the watermelon leaf area index (LAI, m2 m−2) was determined with a Licor 3100 (Li-COR Inc., Lincoln, NE, USA).
The area’s climate is typically Mediterranean, with hot and dry summers and mildly cold winters, and is classified as Csa according to the Köppen and Geiger classification [38]. Most rainfall occurs in autumn and winter (over the past thirty years, the average annual rainfall has been 622 mm); the average annual temperature is 16.5 °C, and the hottest and coldest months are August and January, respectively [38]. The soil is classified as “Lithic Rhodoxeralf” and features a clay texture, stable structure, shallow profile (0.60–1.10 m), and rapid drainage due to an underlying cracked limestone parent material. Meteorological data were collected using an automatic weather station equipped with standard sensors installed within the experimental farm.

2.2. Eddy Covariance Measurements

The eddy covariance (EC) method was employed for measuring the hourly fluxes of water vapor (H2O), corresponding to actual evapotranspiration (ETa, mm) [39]. It is considered the most direct and least error-prone approach for long-term monitoring of H2O fluxes at the field scale [8,40,41]. The EC method is based on the covariance between the instantaneous values of the vertical velocity component and the scalar concentration (H2O), measured by a three-dimensional sonic anemometer (uSonic 3 Scientific, Metek GmbH, 25337 Elmshorn, Germany) and a fast response open-path infrared gas analyzer (LI-7500, Li-COR Inc., Lincoln, NE, USA), respectively. On 25 May 2023, the EC flux tower was established in the center of the field at 1.5 m above the soil surface, reaching a maximum of 1.75 m, following the growth of the crop. Micrometeorological data were recorded every 0.1 s (10 Hz) by the MeteoFlux software (Servizi Territorio, S.n.c., Cinisello Balsamo, Italy) (https://meteorflux.org/). The EddyPro® software (v7.0.9, LiCor, 68504 Lincoln, NE, USA) [42] was used for calculating the ETa fluxes. All details can be found in Ferrara et al. [43].
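The core of the EC calculation described above is the covariance between fluctuations of vertical wind speed and of the scalar concentration over an averaging period. The sketch below illustrates only that step in plain Python, under stated assumptions: the sample values are hypothetical, and the corrections that flux-processing software typically applies to raw data are deliberately omitted.

```python
from statistics import fmean


def eddy_covariance(w, c):
    """Mean product of the fluctuations of w (m/s) and c (e.g., g/m^3).

    Fluctuations are deviations from the averaging-period mean, so the
    result is the (population) covariance of the two series.
    """
    if len(w) != len(c) or not w:
        raise ValueError("w and c must be non-empty and of equal length")
    w_mean, c_mean = fmean(w), fmean(c)
    products = [(wi - w_mean) * (ci - c_mean) for wi, ci in zip(w, c)]
    return fmean(products)


# Hypothetical high-frequency samples, for illustration only.
w_samples = [0.2, -0.1, 0.3, -0.4]   # vertical wind speed
c_samples = [10.5, 9.8, 10.9, 9.2]   # water vapor concentration

flux = eddy_covariance(w_samples, c_samples)
```

A positive value indicates that upward-moving air carries more water vapor than downward-moving air, that is, a net flux away from the surface.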

2.3. Satellite Images

In this work, high-resolution spectral images acquired by the SuperDove satellites were used (Planet Images©, 2023). SuperDove satellites are part of the PlanetScope constellation operated by Planet Labs PBC; they acquire images with a 3 m spatial resolution in eight spectral bands spanning the visible, red-edge, and near-infrared regions [44]. The images were downloaded from the online tool Planet Explorer as orthorectified and atmospherically corrected TIFF files. Only cloud-free images were downloaded for the investigated period (from the transplanting of the watermelon on 9 June to the end of the growing cycle on 3 October), for a total of 68 images. For each spectral band, the mean reflectance of the watermelon field was calculated by averaging the reflectance values of the pixels within the field (Figure 1B). Images were processed in the R environment (RStudio IDE, version 2024.04.2+764 for Windows) using the packages “sf” [45] and “raster” [46].
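The per-band field average described above amounts to a masked mean over the pixels that fall inside the field boundary. The authors performed this step in R with "sf" and "raster"; the plain-Python sketch below only illustrates the idea, with a hypothetical 2 × 3 pixel band and a hypothetical field mask.

```python
def field_mean_reflectance(band, mask):
    """Mean reflectance of the pixels whose mask value is True."""
    values = [
        pixel
        for band_row, mask_row in zip(band, mask)
        for pixel, inside in zip(band_row, mask_row)
        if inside
    ]
    if not values:
        raise ValueError("mask selects no pixels")
    return sum(values) / len(values)


# Hypothetical reflectance grid and field mask (True = pixel inside the field).
nir_band = [
    [0.10, 0.30, 0.50],
    [0.20, 0.40, 0.60],
]
field_mask = [
    [False, True, True],
    [False, True, False],
]

nir_mean = field_mean_reflectance(nir_band, field_mask)
```

Repeating the call for each of the eight bands yields one row of spectral predictors per image date.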

2.4. Data Processing and Machine Learning Analyses

In the pre-processing phase, the daily ETa data for the entire watermelon growth cycle, obtained as the integral of the positive hourly EC values, were filtered to retain only those days on which cloud-free SuperDove images were available (0% cloudiness for the field area at the time of image acquisition). Consequently, the ETa measurements of days without corresponding cloud-free images were excluded from the dataset. The coefficient of variation (CV) was calculated before and after data processing as the ratio of the standard deviation to the mean. For the remaining data, the mean reflectance values for each spectral band of the SuperDove images were matched to the corresponding daily ETa measurements. The resulting dataset was used to develop a machine learning model to predict the daily ETa of the watermelon (target) based on the following predictors: (i) reflectance values of the SuperDove spectral bands (see Table 1), (ii) the average daily temperature, and (iii) the average daily RH; the last two variables were acquired by a standard agro-meteorological station installed in a reference grass field close to the experimental plot.
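The pre-processing steps above (summing positive hourly values into daily ETa, keeping only days with a cloud-free image, and comparing the CV before and after filtering) can be sketched as follows. All values and dates are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import fmean, stdev


def daily_eta(hourly_mm):
    """Daily ETa (mm) as the sum of the positive hourly EC values."""
    return sum(v for v in hourly_mm if v > 0)


def coefficient_of_variation(values):
    """CV = standard deviation / mean (sample standard deviation assumed)."""
    return stdev(values) / fmean(values)


# Hypothetical hourly ETa values (mm), keyed by date; not data from the study.
hourly_by_day = {
    "2023-07-01": [-0.1, 0.0, 0.5, 1.0, 0.5],
    "2023-07-02": [0.2, 0.3, -0.05, 0.5],
    "2023-07-03": [1.0, 2.0, -0.2],
}
# Assumed set of dates with a cloud-free image over the field.
cloud_free_days = {"2023-07-01", "2023-07-03"}

daily = {day: daily_eta(vals) for day, vals in hourly_by_day.items()}
subset = {day: eta for day, eta in daily.items() if day in cloud_free_days}

cv_full = coefficient_of_variation(list(daily.values()))
cv_subset = coefficient_of_variation(list(subset.values()))
```

Comparing `cv_full` with `cv_subset` gives a quick check of how much of the seasonal variability survives the cloud filter.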
For the prediction of watermelon ETa, partial least squares (PLS), the generalized linear model (GLM), and three machine learning algorithms (Elastic Net (ENet), Random Forest (RF), and Support Vector Machine (SVM)) were evaluated and compared. Deep learning approaches (e.g., CNN) were excluded since they require a large amount of data in the training phase [47,48]. ENet is a regularized regression technique that combines the L1 (Lasso) and L2 (Ridge) penalties to improve predictive accuracy; by integrating both penalties, ENet can handle datasets with highly correlated predictors [49]. The GLM extends traditional linear regression by allowing the response variable to follow various distributions. It consists of three components: a random component specifying the distribution of the response variable, a linear predictor formed by a linear combination of explanatory variables, and a link function that relates the mean of the response variable to the linear predictor [50]. PLS models the relationship between predictors and target by projecting them onto a new space of latent variables chosen to maximize the covariance between the predictors and the target. By extracting the latent variables that capture the most relevant information, PLS provides a robust framework for predictive modeling and data analysis [51]. RF builds multiple decision trees on random subsets of the data; the final prediction is obtained by averaging the outputs of all individual trees, which enhances predictive accuracy and reduces overfitting [52]. The versatility of RF in handling complex datasets has also been demonstrated in downscaling studies. For example, Agarwal et al. [53] applied RF to downscale GRACE-derived groundwater storage data to a resolution of 5 km in California’s Central Valley. This highlights the potential of RF in modeling non-linear relationships and improving the applicability of satellite data in environmental and resource management at the local scale. SVM was originally formulated to find the hyperplane that separates data points into distinct classes with the maximum margin; for a continuous target such as ETa, its regression variant instead fits a function that penalizes only deviations larger than a tolerance margin. Using kernel functions, SVM can handle both linear and non-linear relationships in the data [54]. All the analyses were performed in the Python environment (v. 3.11.5) and Spyder IDE (Spyder©, v. 5.4.3 for Windows) using the Scikit-Learn library (v. 1.5.2).
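For orientation, the Elastic Net objective combining the L1 and L2 penalties can be written as below. This is the parameterization documented for Scikit-Learn's ElasticNet estimator, with the intercept omitted; the symbols are assumptions introduced here (n samples, design matrix X, target y, coefficients w, overall penalty strength α, and L1/L2 mixing ratio ρ), and the ranges actually tuned are those listed in Table 2.

```latex
\min_{w}\; \frac{1}{2n}\,\lVert y - Xw \rVert_2^2
\;+\; \alpha\,\rho\,\lVert w \rVert_1
\;+\; \frac{\alpha\,(1-\rho)}{2}\,\lVert w \rVert_2^2
```

Setting ρ = 1 recovers the Lasso penalty and ρ = 0 the Ridge penalty, which is why the mixing ratio is a natural hyperparameter to tune when predictors are strongly correlated.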
To enhance the performance of the models, it is essential to fine-tune their hyperparameters. In this study, we utilized the Optuna library (v. 4.0.0) for automatic hyperparameter optimization. Optuna explores the parameter space to identify the best performing setups. For each machine learning model, 100 optimization trials were conducted, utilizing Bayesian optimization with the Tree-structured Parzen Estimator (TPE) sampler [55]. The specific hyperparameters fine-tuned for each algorithm are reported in Table 2.
To evaluate and compare the performance of the models, we performed a five-fold cross-validation repeated three times. This method involves dividing the dataset into five subsets, training the model on four subsets (80% of the dataset) and validating it on the remaining one (20% of the dataset). The process is performed five times, each time with a different subset serving as the validation set, ensuring that all data points are used for validation in the process. This approach is widely used for estimating the generalizability of the models [56]. Repeating the entire procedure three times with different partitions of the data enhances the reliability of performance estimates by reducing variability due to specific data partitions [57].
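The resampling scheme described above can be sketched in plain Python as a generator of train/validation index lists. This is an illustrative stand-in, not the authors' code; the function name and seed are hypothetical, and using 68 samples assumes that every cloud-free image yielded one usable observation.

```python
import random


def repeated_kfold(n_samples, n_splits=5, n_repeats=3, seed=0):
    """Yield (train, validation) index lists for repeated k-fold CV."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a new random partition for each repeat
        folds = [indices[i::n_splits] for i in range(n_splits)]
        for k in range(n_splits):
            validation = sorted(folds[k])
            train = sorted(
                idx
                for j, fold in enumerate(folds)
                if j != k
                for idx in fold
            )
            yield train, validation


# 5 folds x 3 repeats = 15 train/validation splits.
splits = list(repeated_kfold(68))
```

Within each repeat every sample is used for validation exactly once, so the 15 fold-level scores can be averaged and their spread reported as in Table 3.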
Model performance was compared by calculating the coefficient of determination (R²), the root mean squared error (RMSE), and the mean bias error (MBE) on the validation folds of the cross-validation.
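The three metrics can be computed as in the sketch below. The sign convention for MBE (predicted minus observed, so that positive values indicate overestimation) is an assumption, as the text does not state it; the example values are hypothetical.

```python
from math import sqrt


def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean squared error, in the units of the target (mm)."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mbe(observed, predicted):
    """Mean bias error as mean(predicted - observed); assumed convention."""
    n = len(observed)
    return sum(p - o for o, p in zip(observed, predicted)) / n


# Hypothetical daily ETa values (mm), for illustration only.
eta_observed = [1.0, 2.0, 3.0, 4.0]
eta_predicted = [1.5, 2.0, 2.5, 4.5]

scores = {
    "r2": r2_score(eta_observed, eta_predicted),
    "rmse": rmse(eta_observed, eta_predicted),
    "mbe": mbe(eta_observed, eta_predicted),
}
```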
The best performing model was applied to model the daily trend of watermelon ETa. The mean values of predicted and observed ETa were compared through Student’s t-test with a significance level of α = 0.05.

Machine Learning Explainability

Machine learning interpretability is increasingly recognized as a key aspect of building and using machine learning systems [58]. Three methods were used to explain model predictions in this study: permutation importance, SHapley Additive exPlanations (SHAP), and Local Interpretable Model-agnostic Explanations (LIME). Permutation importance quantifies how much a model’s predictions depend on each feature. A model is first trained on the original dataset to establish a baseline performance; the values of one feature are then randomly shuffled, breaking its relationship with the target. The model is then evaluated on the shuffled dataset, and the importance of the feature is defined by the change in performance after permutation [59]. SHAP, on the other hand, draws on the Shapley value from cooperative game theory, which evaluates a player’s contribution by considering all possible coalitions [60]. In machine learning, SHAP calculates the influence of each feature on a model’s predictions, offering clear and detailed insights into how individual features contribute to the outcomes [61]. LIME was introduced as a complementary, local interpretability method to explain the prediction for a specific instance. LIME approximates the model locally by creating perturbed samples around the instance to be explained and fitting an interpretable surrogate model (e.g., linear regression), which highlights the most influential features and their values for the prediction of that instance; in this study, LIME was applied to the instance for which the model made its most accurate prediction [62].
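The permutation procedure described above can be sketched in plain Python. This is an illustrative version under stated assumptions: importance is measured as the mean increase in mean squared error after shuffling, and the model is a hypothetical fixed function standing in for a fitted estimator; the data are synthetic.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in MSE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and y
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = mse(y, [predict(row) for row in X_perm])
            increases.append(permuted - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical 'fitted' model: strong on feature 0, weak on 1, ignores 2."""
    return 2.0 * row[0] + 0.5 * row[1]


# Synthetic predictors and target, for illustration only.
X = [[float(i % 7), float((3 * i) % 5), float((5 * i) % 11)] for i in range(40)]
y = [toy_model(row) for row in X]

importances = permutation_importance(toy_model, X, y)
```

Shuffling a feature the model ignores leaves the error unchanged, so its importance is zero, while the feature with the largest effect on the prediction shows the largest error increase.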

3. Results

3.1. Weather Conditions

During the watermelon growing season, the highest average temperature was recorded on 27 July (34.1 °C), while the lowest was on 24 September (18.4 °C). The seasonal average temperature was 23.8 °C. The highest maximum temperature was recorded on 24 July (41.8 °C), while the lowest minimum temperature was recorded on 11 October (12.8 °C). RH fluctuated during the season, with the lowest values in the second half of July (22 and 25 July: 33.2% and 38.2%, respectively) (Figure 2a). The driest months were July and August; the cumulated rainfall up to the incorporation of the crop residues was 94.2 mm. The cumulated irrigation water applied was 582.8 mm (Figure 2b).

3.2. Measured Actual Evapotranspiration and Predictors

The full time series of the daily measured watermelon ETa is shown in Figure 3, together with the subset of ETa values from clear days with available satellite data; this subset captures most of the ETa variability (full time series CV = 0.509; subset time series CV = 0.455).
The ETa trend follows the typical evapotranspiration pattern of an irrigated crop under the Mediterranean climate, starting with relatively low values in June (except for 14 June 2023, ETa = 4.6 mm). A progressive increase in ETa was observed until late July, with a peak on 25 July 2023 (5.5 mm). From the end of August, after the fruit harvest, ETa reflects only soil evaporation and shows a steady decline (except for 21 September 2023 and 26 September 2023, ETa = 3.0 mm and 2.5 mm, respectively), reaching the lowest value (0.6 mm) on 10 October 2023.
Figure 4 shows the correlation matrix of the variables included in the analyses. The Nir spectral band had the strongest positive correlation with ETa (r = 0.74), and the daily average temperature was also positively correlated with ETa (r = 0.57). Other variables showed negative correlations with ETa, e.g., the red and yellow spectral bands (r = −0.71 and r = −0.56, respectively). The coastal blue, blue, green, and green_I spectral bands were highly correlated with each other (r > 0.80). For instance, the correlation between coastal blue and blue was 0.89, and between green_I and green it was 0.95, indicating multicollinearity among the predictor variables. The Nir and red spectral bands were negatively correlated (r = −0.86), as were daily average temperature and daily RH (r = −0.79). In the Supplementary Materials (Table S1), we report the descriptive statistics of the variables included in the analyses.
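A correlation screen of this kind can be reproduced with a few lines of Python. The sketch below computes Pearson's r and lists predictor pairs whose absolute correlation exceeds 0.80, the threshold mentioned above; the band values are hypothetical and are not the study data.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical field-mean reflectance series, for illustration only.
bands = {
    "blue": [0.04, 0.05, 0.06, 0.07, 0.08],
    "green": [0.06, 0.07, 0.09, 0.10, 0.12],
    "nir": [0.45, 0.30, 0.42, 0.28, 0.40],
}

# Pairs of predictors that would be flagged as strongly collinear.
high_pairs = [
    (a, b)
    for a, b in combinations(bands, 2)
    if abs(pearson_r(bands[a], bands[b])) > 0.80
]
```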

3.3. Models’ Regression Performance

The results of the machine learning analysis (Table 3) indicate marked differences in the performance of the evaluated algorithms in predicting the watermelon ETa. In the Supplementary Materials (Figure S1), we show the scatterplots of the field-measured and predicted ETa for each model. RF was the best performing algorithm, with an average R² of 0.747 (±0.076), an RMSE of 0.577 mm (±0.106), and a low MBE of 0.034 mm (±0.145). The scatterplot for RF shows a strong alignment of field-measured and predicted values, particularly in the middle range of ETa values, with only slight deviations for the extreme values. ENet was the second-best algorithm, with an R² of 0.684 (±0.142) and an RMSE of 0.666 mm (±0.693). However, the slightly higher RMSE and greater variability compared to RF suggest that ENet was less effective in modeling the inherent non-linear relationships within the data. The scatterplot of ENet predictions reflects this, showing greater dispersion, particularly at higher ETa values, which the model tends to underestimate. PLS performed similarly to ENet, with an R² of 0.631 (±0.197) and an RMSE of 0.673 mm (±0.139). The scatterplot for PLS shows a moderate fit of the predictions to the observed values, with some spread, especially at high ETa values. The GLM performed more weakly than ENet and PLS, with an R² of 0.614 (±0.188) and an RMSE of 0.693 mm (±0.148). The scatterplot for GLM predictions highlights these limitations, showing a marked spread at lower ETa values. SVM, in contrast, performed poorly in predicting ETa: the model achieved a mean R² of 0.257 (±0.232) and had the highest RMSE. The scatterplot for SVM shows the weak fit of the predictions.

3.4. Predicted Watermelon Actual Evapotranspiration and Model Explainability

The RF model, the best performing of those compared, was applied to predict the daily watermelon ETa. The resulting predictions are shown in Figure 5, together with the observed ETa values and the trend of the field-measured LAI. From the beginning of the growing season in mid-June until the peak evapotranspiration period at the end of July, the RF model closely reflects the observed increases in ETa, corresponding to the vegetative and reproductive growth phases of the crop. However, slight deviations are evident on specific dates, corresponding to the peaks. In the second part of the growing cycle, the ETa predicted by the RF model follows the decrease in the observed ETa fairly well until the end of September. Table 4 reports the cumulated and average daily ETa for the observed and predicted ETa; no statistically significant difference was detected between the observed and predicted daily ETa (p = 0.64). In the Supplementary Materials (Figure S2), we report the boxplots showing the distribution of the observed and predicted ETa.
The analysis of permutation importance, SHAP, LIME, and partial dependence plots (PDPs) for the RF model provided insights into the contributions and impacts of the individual predictors in the prediction of watermelon ETa. According to the permutation-based feature importance, the Nir spectral band was the most important, with an average decrease in accuracy close to 0.4 when permuted. The red spectral band, daily average temperature, and yellow spectral band followed in importance (Figure 6). The SHAP summary graph quantifies the direction and magnitude of each feature’s contribution to the model’s predictions. Nir had the highest SHAP values, confirming the results of the feature importance analysis. Higher Nir values consistently contributed positively to ETa, while lower values had a negative impact on the model output. The red band shows an inverse relationship, with higher reflectance values reducing ETa predictions. The daily average temperature contributed positively to the prediction, with higher temperatures associated with higher ETa values. RH had a predominantly negative impact, reducing the predicted ETa as humidity increased. In the SHAP plot, the yellow reflectance also has a negative impact on ETa predictions: higher values of yellow reflectance are associated with lower SHAP values, indicating that increased yellow reflectance reduces the predicted ETa (Figure 7). The LIME analysis provides localized insights by focusing on the most accurate prediction and identifying the specific feature ranges contributing to it. Nir reflectance values between 0.33 and 0.43 and average temperature values above 26.66 °C had the greatest importance in the accuracy of the RF model. Similarly, the red, coastal blue, and RH features showed positive but smaller contributions (Figure 8). PDPs show how each variable influences the expected ETa values, averaged over the observed values of the other variables. The plot for the Nir spectral band shows a strong increase in ETa when Nir reflectance increases above 0.35. In contrast, the red spectral band shows an inverse relationship, with a strong decrease in ETa when the red reflectance rises above 0.11. The daily average temperature had a positive and almost linear relationship with ETa, particularly above 26 °C. As for RH, the graph shows a gradual decrease in ETa as relative humidity increases. The other spectral bands, such as coastal blue, green, and blue, show flatter relationships with ETa, indicating their limited predictive power compared to the other predictors. In the Supplementary Materials, we report the PDPs illustrating the relationships between the predictors and watermelon actual evapotranspiration (Figure S3).
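A one-dimensional partial dependence curve of the kind discussed above can be computed by fixing one feature at each grid value for every row, predicting, and averaging the predictions. The following is a minimal sketch under stated assumptions: the model is a hypothetical fixed function rather than the fitted RF, and the data and grid are synthetic.

```python
def partial_dependence(predict, X, feature, grid):
    """Average prediction when `feature` is set to each grid value in turn."""
    curve = []
    for value in grid:
        predictions = [
            predict(row[:feature] + [value] + row[feature + 1:])
            for row in X
        ]
        curve.append(sum(predictions) / len(predictions))
    return curve


def toy_model(row):
    """Hypothetical stand-in for a fitted model with two predictors."""
    return 2.0 * row[0] + row[1]


# Synthetic predictor rows, for illustration only.
X = [[0.0, 1.0], [1.0, 3.0], [2.0, 5.0]]
grid = [0.0, 1.0, 2.0]

pd_curve = partial_dependence(toy_model, X, feature=0, grid=grid)
```

Because the other features keep their observed values while the selected one is varied, the curve reflects the average response of the model rather than the response at a single fixed setting.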

4. Discussion

In this study, the application of the RF algorithm to predict daily ETa in watermelon using Planet satellite imagery and meteorological information achieved the best result among the compared models. The performance of the models varied markedly, reflecting their methodological differences and suitability for modeling complex environmental processes [63]. The RF algorithm’s capacity to handle multicollinearity among predictors is a significant advantage [64]; in our dataset, several spectral bands were highly correlated, which can be a challenge for multicollinearity-sensitive algorithms. RF addresses this problem by selecting random subsets of features during the construction of each tree, thus reducing the dominance of highly correlated variables and improving the robustness of the model [65,66]. These findings support integrating machine learning techniques with remote sensing data for accurate estimation of ETa in horticultural crops (see, for example, the review by Ferreira et al. [67]). Interpretability is a key aspect of machine learning applications in environmental and agricultural sciences [68]. The permutation importance analysis found that the Nir spectral band was the most influential predictor, causing a marked decrease in model accuracy when permuted. This result is in line with the known sensitivity of the Nir band to vegetation biomass and water content, which are directly related to ETa [69,70,71]. The SHAP analysis provided a detailed overview of how each predictor affected the model results. High values of Nir reflectance contributed positively to ETa predictions; in contrast, higher reflectance values in the red band negatively affected ETa predictions, probably indicating reduced chlorophyll content or plant stress, resulting in lower transpiration rates [72,73]. A high average daily temperature also had a positive effect on ETa, consistent with the physical processes that regulate evapotranspiration, where higher temperatures increase the vapor pressure deficit and drive transpiration [74]. Yellow band reflectance also negatively affected ETa predictions. High yellow band reflectance may be associated with stress conditions or senescence of vegetation, resulting in reduced chlorophyll content and altered leaf pigment composition [75]. This may result in reduced photosynthetic activity and transpiration rates. SHAP values indicate that higher values of yellow reflectance contribute to lower ETa predictions, consistent with this physiological understanding. Relative humidity showed a negative relationship with ETa in the RF model: higher RH values corresponded to lower predicted ETa, as indicated by the negative SHAP values. This inverse relationship is physically explicable, as high humidity reduces the vapor pressure deficit between the leaf surfaces and the surrounding air, thus decreasing the transpiration rate [76]. SHAP analysis confirmed that days with lower RH contributed positively to ETa predictions.
The presented results are the first related to the estimation of watermelon ETa in the Mediterranean environment by integrating high-resolution, high-frequency satellite data with agro-meteorological information. The use of satellite imagery has become increasingly important in precision agriculture, offering a noninvasive and scalable approach to monitor crop water use and status [28,35]. Traditional methods, such as lysimeters, while accurate, are limited by high costs and spatial limitations. The estimation of ET through remote sensing has generally been based on physical models involving thermal infrared (TIR) data. Furthermore, in addition to TIR data, crop-specific information is essential for an accurate estimation of ETa using models such as SEBAL and METRIC. This study focused on predicting ETa by training the models on data obtained by the eddy covariance technique, which is considered a micrometeorological standard approach for measuring fluxes of long-lived and inert gases [41] in the atmosphere and a reliable direct method to measure ETa [77,78]. Moreover, in the methodology proposed in this paper, the RF model was trained using agro-meteorological parameters and satellite band reflectance directly as predictors; the main advantage consists of the high frequency of Planet image acquisition, which can enable time-efficient monitoring of the crop ETa. This approach enables ETa to be estimated without relying on TIR data, overcoming the operational limitations of TIR-based models. In addition, the RF-based approach could offer further advantages besides overcoming the requirements of TIR data. By learning directly from different predictors, such as agro-meteorological parameters and spectral reflectance, it can capture complex, non-linear relationships that might be simplified too much when using traditional physical models. 
This flexibility reduces the need for rigid assumptions about surface properties or weather conditions and enables the model to adapt to different environmental and management contexts. Furthermore, enhancing the RF model would mainly require new input data rather than recalibrating multiple physical parameters, making this data-driven strategy easier to implement when frequent, high-resolution satellite imagery is available.
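Because a data-driven model of this kind is not parameterized by physical quantities, its behaviour has to be inspected after fitting, for example with the permutation importance used in this study. The sketch below shows the general idea in plain Python under stated assumptions: `predict` stands in for any fitted model (a toy linear function here, not the study's Random Forest), the feature names are hypothetical, and importance is measured as the increase in RMSE after shuffling one predictor.

```python
import random
import statistics
from typing import Callable, Dict, List, Sequence, Tuple

Row = Dict[str, float]


def rmse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Root mean square error between observed and predicted values."""
    return (sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs)) ** 0.5


def permutation_importance(
    predict: Callable[[Row], float],
    rows: List[Row],
    y: Sequence[float],
    n_repeats: int = 10,
    seed: int = 0,
) -> Dict[str, Tuple[float, float]]:
    """Mean and standard deviation of the RMSE increase per shuffled feature.

    n_repeats must be at least 2 so that a standard deviation can be computed.
    """
    rng = random.Random(seed)
    baseline = rmse(y, [predict(r) for r in rows])
    result = {}
    for name in rows[0]:
        increases = []
        for _ in range(n_repeats):
            # Shuffle one column only, breaking its link with the target.
            shuffled = [r[name] for r in rows]
            rng.shuffle(shuffled)
            permuted = [{**r, name: v} for r, v in zip(rows, shuffled)]
            increases.append(rmse(y, [predict(r) for r in permuted]) - baseline)
        result[name] = (statistics.mean(increases), statistics.stdev(increases))
    return result


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical predictors: NIR reflectance, mean temperature, and pure noise.
    rows = [
        {"nir": rng.uniform(0.2, 0.5), "t_mean": rng.uniform(18, 30), "noise": rng.random()}
        for _ in range(68)
    ]

    def toy_model(r: Row) -> float:
        # Stand-in for a fitted model; the "noise" feature is deliberately unused.
        return 0.5 + 4.0 * r["nir"] + 0.05 * r["t_mean"]

    y = [toy_model(r) for r in rows]
    for name, (mean_inc, sd_inc) in permutation_importance(toy_model, rows, y).items():
        print(f"{name}: {mean_inc:.3f} (sd {sd_inc:.3f})")
```

A predictor the model does not use yields zero importance, while shuffling an informative predictor degrades the error; the spread across repetitions corresponds to the kind of error bar shown in a permutation importance plot.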
Through the method outlined in this work, growers and technicians could fine-tune irrigation schedules to match crop water demands, thereby improving water use efficiency and sustaining yields under increasingly variable weather conditions. The potential for remotely monitoring ETa at a high temporal frequency and at the field scale is not only useful for traditional full irrigation approaches but could also be particularly advantageous for the application of water-saving practices, such as deficit irrigation [79,80,81,82,83]. Since these strategies apply less water than total crop requirements, relying on remotely predicted ETa rather than Kc-approach calculations may reduce some sources of uncertainty, such as assumptions that are not valid in all agricultural environments or the need for parameters that are difficult to measure consistently [16,84,85,86]. Nevertheless, some important limitations should be highlighted: the requirement for sufficiently frequent and cloud-free satellite imagery, the need for reliable ground-based measurements for calibration and validation, and the complexities associated with scaling this approach beyond individual test fields. Another key factor is the cost of satellite data: while some platforms such as Sentinel provide imagery for free, dependence on high-resolution Planet imagery, which carries an associated cost, may limit wider adoption and accessibility in resource-limited regions or on smaller farms. Furthermore, since this study was limited to a single field and a single growing season, the results could be influenced by the site-specific agro-environmental conditions. To increase the generalizability and robustness of the proposed modeling framework, future research should incorporate data from multiple fields extending to different agricultural contexts and to multiple growing seasons.
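As a purely illustrative example of how predicted ETa might feed an irrigation decision, the sketch below computes the depth of water to apply over a scheduling interval under full or deficit irrigation. It is a deliberately simplified water balance that is not taken from the study: the function name, the replacement fraction, and the input values are hypothetical, and soil water storage, runoff, deep percolation, and irrigation efficiency are all ignored.

```python
from typing import Sequence


def irrigation_depth_mm(
    eta_mm_per_day: Sequence[float],
    rainfall_mm_per_day: Sequence[float],
    replacement_fraction: float = 1.0,
) -> float:
    """Irrigation depth (mm) needed to replace a fraction of predicted ETa.

    replacement_fraction = 1.0 corresponds to full irrigation; values below 1.0
    represent a deficit irrigation strategy. Rainfall is assumed fully effective,
    which is a simplification.
    """
    demand = replacement_fraction * sum(eta_mm_per_day)
    supply = sum(rainfall_mm_per_day)
    # No irrigation is needed when rainfall already covers the target demand.
    return max(0.0, demand - supply)


if __name__ == "__main__":
    predicted_eta = [3.0, 3.5, 4.0]   # hypothetical daily ETa predictions (mm/day)
    rainfall = [0.0, 2.0, 0.0]        # hypothetical daily rainfall (mm/day)
    full = irrigation_depth_mm(predicted_eta, rainfall, 1.0)
    deficit = irrigation_depth_mm(predicted_eta, rainfall, 0.8)
    print(f"Full irrigation: {full:.1f} mm; 80% deficit irrigation: {deficit:.1f} mm")
```

In practice, a soil water balance or a field-validated scheduling rule would replace this arithmetic; the point is only that daily ETa predictions can be accumulated directly, without an intermediate crop coefficient.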

5. Conclusions

This study demonstrated the effectiveness of integrating high-resolution satellite images with meteorological data to accurately estimate daily actual evapotranspiration in watermelon crops under Mediterranean conditions. Direct measurements of evapotranspiration were performed using the eddy covariance technique, recognized for its accuracy in quantifying fluxes in agricultural environments. Among the models evaluated, the Random Forest algorithm showed the best performance. These results highlight the potential of combining remote sensing and machine learning techniques for accurate and non-invasive monitoring of water use in crops, offering valuable insights for optimizing irrigation practices and improving water management in agricultural systems. Future research should focus on integrating data from multiple fields, different crops, and growing seasons in various agro-environmental settings to improve the generalizability of the model.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/w17030323/s1, Figure S1: Scatterplot of the predicted and observed daily actual evapotranspiration (ETa) per each model optimized with Optuna. Figure S2: Boxplots showing the distribution of the observed and predicted watermelon actual evapotranspiration (ETa). The red and black lines within the boxes represent the mean and median values, respectively. Figure S3: Partial dependence plots illustrating the relationships between the predictors and watermelon actual evapotranspiration as modeled by the Random Forest algorithm. Table S1: Descriptive statistics of the target (ETa = actual evapotranspiration, after pre-processing) and the predictors included in the machine learning analyses (s.d. = standard deviation; AT = average temperature; RH = relative humidity) for all available data (n = 68).

Author Contributions

Conceptualization, S.P.G.; Data curation, S.P.G., N.S., G.D.C., S.R., G.R. and R.M.F.; Formal analysis, S.P.G., G.R. and R.M.F.; Investigation, S.P.G., S.R., G.R. and R.M.F.; Methodology, S.P.G., V.G., G.R. and R.M.F.; Resources, G.R. and R.M.F.; Software, S.P.G., G.R. and R.M.F.; Supervision, G.R. and R.M.F.; Validation, G.R. and R.M.F.; Visualization, S.P.G. and G.R.; Writing—original draft, S.P.G.; Writing—review and editing, F.A., V.G., G.R. and R.M.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the AgriDigit-Agromodelli project (DM n. 36502 20/12/2018), funded by the Italian Ministry of Agricultural, Food and Forestry Policies and Tourism (MIPAAFT).

Data Availability Statement

The original contributions presented in this study are included in the article/Supplementary Materials. Further inquiries can be directed to the corresponding authors.

Acknowledgments

The authors would like to express their gratitude to Planet Labs PBC for providing the satellite imagery used in this research through Planet’s Education and Research Program.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Alkhalidi, A.; Assaf, M.N.; Alkaylani, H.; Halaweh, G.; Salcedo, F.P. Integrated innovative technique to assess and priorities risks associated with drought: Impacts, measures/strategies, and actions, global study. Int. J. Disaster Risk Reduct. 2023, 94, 103800.
  2. Katerji, N.; Rana, G.; Ferrara, R.M. Actual evapotranspiration for a reference crop within measured and future changing climate periods in the Mediterranean region. Theor. Appl. Climatol. 2017, 129, 923–938.
  3. Maldera, F.; Garofalo, S.P.; Camposeo, S. Ecophysiological Recovery of Micropropagated Olive Cultivars: Field Research in an Irrigated Super-High-Density Orchard. Agronomy 2024, 14, 1560.
  4. Katerji, N.; Mastrorilli, M.; Rana, G. Water use efficiency of crops cultivated in Mediterranean region: Review and analysis. Eur. J. Agron. 2008, 28, 493–507.
  5. Rana, G.; Katerji, N. Measurement and estimation of actual evapotranspiration in the field under Mediterranean climate: A review. Eur. J. Agron. 2000, 13, 125–153.
  6. Kandra, B.; Tall, A.; Gomboš, M.; Pavelková, D. Quantification of Evapotranspiration by Calculations and Measurements Using a Lysimeter. Water 2023, 15, 373.
  7. Shi, W.; Zhang, X.; Xue, X.; Feng, F.; Zheng, W.; Chen, L. Analyzing Evapotranspiration in Greenhouses: A Lysimeter-Based Calculation and Evaluation Approach. Agronomy 2023, 13, 3059.
  8. Aubinet, M.; Vesala, T.; Papale, D. (Eds.) Eddy Covariance: A Practical Guide to Measurement and Data Analysis; Springer Science & Business Media: Dordrecht, The Netherlands, 2012.
  9. Baldocchi, D.D. How eddy covariance flux measurements have contributed to our understanding of Global Change Biology. Glob. Change Biol. 2020, 26, 242–260.
  10. Fong, B.N.; Reba, M.L.; Teague, T.G.; Runkle, B.R.; Suvocarev, K. Eddy covariance measurements of carbon dioxide and water fluxes in US mid-south cotton production. Agric. Ecosyst. Environ. 2020, 292, 106813.
  11. Anapalli, S.S.; Pinnamaneni, S.R.; Reddy, K.N.; Singh, G. Eddy covariance quantification of corn water use and yield responses to irrigations on farm-scale fields. Agron. J. 2022, 114, 2445–2457.
  12. Rafi, Z.; Merlin, O.; Le Dantec, V.; Khabba, S.; Mordelet, P.; Er-Raki, S.; Ferrer, F. Partitioning evapotranspiration of a drip-irrigated wheat crop: Inter-comparing eddy covariance-, sap flow-, lysimeter-and FAO-based methods. Agric. For. Meteorol. 2019, 265, 310–326.
  13. Allen, R.G.; Pereira, L.S.; Raes, D.; Smith, M. Crop Evapotranspiration—Guidelines for Computing Crop Water Requirements—FAO Irrigation and Drainage Paper 56. 1998. Available online: https://www.fao.org/4/x0490e/x0490e00.htm (accessed on 14 September 2018).
  14. Ndulue, E.; Ranjan, R.S. Performance of the FAO Penman-Monteith equation under limiting conditions and fourteen reference evapotranspiration models in southern Manitoba. Theor. Appl. Climatol. 2021, 143, 1285–1298.
  15. Meraz-Maldonado, N.; Flores-Magdaleno, H. Maize Evapotranspiration Estimation Using Penman-Monteith Equation and Modeling the Bulk Canopy Resistance. Water 2019, 11, 2650.
  16. Katerji, N.; Rana, G. FAO-56 methodology for determining water requirement of irrigated crops: Critical examination of the concepts, alternative proposals and validation in Mediterranean region. Theor. Appl. Climatol. 2013, 116, 515–536.
  17. Bastiaanssen, W.G.M.; Noordman, E.J.M.; Pelgrum, H.; Davids, G.; Thoreson, B.P.; Allen, R.G. SEBAL model with remotely sensed data to improve water-resources management under actual field conditions. J. Irrig. Drain. Eng. 2005, 131, 85–93.
  18. Allen, R.G.; Tasumi, M.; Trezza, R. Satellite-based energy balance for mapping evapotranspiration with internalized calibration (METRIC)—Model. J. Irrig. Drain. Eng. 2007, 133, 380–394.
  19. Derardja, B.; Khadra, R.; Abdelmoneim, A.A.A.; El-Shirbeny, M.A.; Valsamidis, T.; De Pasquale, V.; Deflorio, A.M.; Volden, E. Advancements in Remote Sensing for Evapotranspiration Estimation: A Comprehensive Review of Temperature-Based Models. Remote Sens. 2024, 16, 1927.
  20. Hendrickx, J.M.; Hong, S.H. Mapping sensible and latent heat fluxes in arid areas using optical imagery. In Targets and Backgrounds XI: Characterization and Representation; SPIE: Bellingham, WA, USA, 2005; Volume 5811, pp. 138–146.
  21. Bhattarai, N.; Wagle, P. Recent Advances in Remote Sensing of Evapotranspiration. Remote Sens. 2021, 13, 4260.
  22. García-Santos, V.; Sánchez, J.M.; Cuxart, J. Evapotranspiration Acquired with Remote Sensing Thermal-Based Algorithms: A State-of-the-Art Review. Remote Sens. 2022, 14, 3440.
  23. Available online: https://www.usgs.gov/faqs/what-are-band-designations-landsat-satellites (accessed on 3 December 2024).
  24. Available online: https://lpdaac.usgs.gov/data/get-started-data/collection-overview/missions/modis-overview/#modis-temporal-and-spatial-resolution (accessed on 3 December 2024).
  25. Garofalo, S.P.; Modugno, A.F.; De Carolis, G.; Sanitate, N.; Negash Tesemma, M.; Scarascia-Mugnozza, G.; Tekle Tegegne, Y.; Campi, P. Explainable Artificial Intelligence to Predict the Water Status of Cotton (Gossypium hirsutum L., 1763) from Sentinel-2 Images in the Mediterranean Area. Plants 2024, 13, 3325.
  26. Bossung, C.; Schlerf, M.; Machwitz, M. Estimation of canopy nitrogen content in winter wheat from Sentinel-2 images for operational agricultural monitoring. Precis. Agric. 2022, 23, 2229–2252.
  27. Segarra, J.; Buchaillot, M.L.; Araus, J.L.; Kefauver, S.C. Remote Sensing for Precision Agriculture: Sentinel-2 Improved Features and Applications. Agronomy 2020, 10, 641.
  28. Helman, D.; Bahat, I.; Netzer, Y.; Ben-Gal, A.; Alchanatis, V.; Peeters, A.; Cohen, Y. Using Time Series of High-Resolution Planet Satellite Images to Monitor Grapevine Stem Water Potential in Commercial Vineyards. Remote Sens. 2018, 10, 1615.
  29. Breunig, F.M.; Galvão, L.S.; Dalagnol, R.; Dauve, C.E.; Parraga, A.; Santi, A.L.; Chen, S. Delineation of management zones in agricultural fields using cover–crop biomass estimates from PlanetScope data. Int. J. Appl. Earth Obs. Geoinf. 2020, 85, 102004.
  30. Costa, T.S.; Filgueiras, R.; dos Santos, R.A.; Cunha, F.F.d. Actual evapotranspiration by machine learning and remote sensing without the thermal spectrum. PLoS ONE 2023, 18, e0285535.
  31. Sharma, A.; Jain, A.; Gupta, P.; Chowdary, V. Machine learning applications for precision agriculture: A comprehensive review. IEEE Access 2021, 9, 4843–4873.
  32. Araújo, S.O.; Peres, R.S.; Ramalho, J.C.; Lidon, F.; Barata, J. Machine Learning Applications in Agriculture: Current Trends, Challenges, and Future Perspectives. Agronomy 2023, 13, 2976.
  33. Garofalo, S.P.; Giannico, V.; Lorente, B.; García, A.J.; Vivaldi, G.A.; Thameur, A.; Salcedo, F.P. Predicting carob tree physiological parameters under different irrigation systems using Random Forest and Planet satellite images. Front. Plant Sci. 2024, 15, 1302435.
  34. Liakos, K.G.; Busato, P.; Moshou, D.; Pearson, S.; Bochtis, D. Machine Learning in Agriculture: A Review. Sensors 2018, 18, 2674.
  35. Campi, P.; Modugno, A.F.; De Carolis, G.; Pedrero Salcedo, F.; Lorente, B.; Garofalo, S.P. A Machine Learning Approach to Monitor the Physiological and Water Status of an Irrigated Peach Orchard under Semi-Arid Conditions by Using Multispectral Satellite Data. Water 2024, 16, 2224.
  36. Barbedo, J.G.A. Detecting and Classifying Pests in Crops Using Proximal Images and Machine Learning: A Review. AI 2020, 1, 312–328.
  37. Patrício, D.I.; Rieder, R. Computer vision and artificial intelligence in precision agriculture for grain crops: A systematic review. Comput. Electron. Agric. 2018, 153, 69–81.
  38. Available online: https://en.climate-data.org/europa/italia/puglia/rutigliano-14028/ (accessed on 3 December 2024).
  39. Lee, X.; Massman, W.; Law, B.E. Handbook of micrometeorology: A guide for surface flux measurement and analysis. In Atmospheric and Oceanographic Sciences Library; Springer: Dordrecht, The Netherlands, 2004.
  40. Foken, T.; Aubinet, M.; Leuning, R. The eddy covariance method. In Eddy Covariance: A Practical Guide to Measurement and Data Analysis; Aubinet, M., Vesala, T., Papale, D., Eds.; Springer: Dordrecht, The Netherlands, 2012; pp. 1–19.
  41. Aubinet, M.; Grelle, A.; Ibrom, A.; Rannik, Ü.; Moncrieff, J.; Foken, T.; Kowalski, A.S.; Martin, P.H.; Berbigier, P.; Bernhofer, C.; et al. Estimates of the annual net carbon and water exchange of forests: The EUROFLUX methodology. Adv. Ecol. Res. 2000, 30, 113–175.
  42. Eddy Covariance Software—EddyPro, FluxSuite, and Tovi|LI-COR Environmental. Available online: https://www.licor.com/products/eddy-covariance/eddypro (accessed on 3 December 2024).
  43. Ferrara, R.M.; Azzolini, A.; Ciurlia, A.; De Carolis, G.; Mastrangelo, M.; Minorenti, V.; Montaghi, A.; Piarulli, M.; Ruggieri, S.; Vitti, C.; et al. Carbon and Water Balances in a Watermelon Crop Mulched with Biodegradable Films in Mediterranean Conditions at Extended Growth Season Scale. Atmosphere 2024, 15, 945.
  44. Available online: https://assets.planet.com/docs/Planet_PSScene_Imagery_Product_Spec_letter_screen.pdf (accessed on 3 December 2024).
  45. Pebesma, E.; Bivand, R. Spatial Data Science: With Applications in R; Chapman and Hall/CRC: London, UK, 2023.
  46. Hijmans, R.J.; Van Etten, J.; Mattiuzzi, M.; Sumner, M.; Greenberg, J.A.; Lamigueiro, O.P.; Shortridge, A. Raster Package in R. Version. 2013. Available online: https://cran.r-project.org/package=raster (accessed on 3 December 2024).
  47. Barbedo, J.G.A. Impact of dataset size and variety on the effectiveness of deep learning and transfer learning for plant disease classification. Comput. Electron. Agric. 2018, 153, 46–53.
  48. Arafa, D.A.; Moustafa, H.E.D.; Ali, H.A.; Ali-Eldin, A.M.; Saraya, S.F. A deep learning framework for early diagnosis of Alzheimer’s disease on MRI images. Multimed. Tools Appl. 2024, 83, 3767–3799.
  49. Zou, H.; Hastie, T. Regularization and Variable Selection Via the Elastic Net. J. R. Stat. Soc. Ser. B Stat. Methodol. 2005, 67, 301–320.
  50. Haberman, S.; Renshaw, A.E. Generalized Linear Models and Actuarial Science. J. R. Stat. Soc. Ser. D (Stat.) 1996, 45, 407–436.
  51. Guebel, D.V.; Torres, N.V. Partial Least-Squares Regression (PLSR). In Encyclopedia of Systems Biology; Dubitzky, W., Wolkenhauer, O., Cho, K.H., Yokota, H., Eds.; Springer: New York, NY, USA, 2013.
  52. Khan, M.A.; Memon, S.A.; Farooq, F.; Javed, M.F.; Aslam, F.; Alyousef, R. Compressive Strength of Fly-Ash-Based Geopolymer Concrete by Gene Expression Programming and Random Forest. Adv. Civ. Eng. 2021, 2021, 6618407.
  53. Agarwal, V.; Akyilmaz, O.; Shum, C.K.; Feng, W.; Yang, T.Y.; Forootan, E.; Uz, M. Machine learning based downscaling of GRACE-estimated groundwater in Central Valley, California. Sci. Total Environ. 2023, 865, 161138.
  54. Guido, R.; Ferrisi, S.; Lofaro, D.; Conforti, D. An Overview on the Advancements of Support Vector Machine Models in Healthcare Applications: A Review. Information 2024, 15, 235.
  55. Akiba, T.; Sano, S.; Yanase, T.; Ohta, T.; Koyama, M. Optuna: A Next-generation Hyperparameter Optimization Framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD ’19), Anchorage, AK, USA, 4–8 August 2019; Association for Computing Machinery: New York, NY, USA, 2019; pp. 2623–2631.
  56. Oyedele, O. Determining the optimal number of folds to use in a K-fold cross-validation: A neural network classification experiment. Res. Math. 2023, 10, 2201015.
  57. Allgaier, J.; Pryss, R. Cross-Validation Visualized: A Narrative Guide to Advanced Methods. Mach. Learn. Knowl. Extr. 2024, 6, 1378–1388.
  58. Miller, T. Explanation in artificial intelligence: Insights from the social sciences. Artif. Intell. 2019, 267, 1–38.
  59. Strobl, C.; Boulesteix, A.L.; Kneib, T.; Augustin, T.; Zeileis, A. Conditional variable importance for random forests. BMC Bioinform. 2008, 9, 307.
  60. Shapley, L.S. 17. A Value for n-Person Games. In Contributions to the Theory of Games (AM-28); Princeton University Press: Princeton, NJ, USA, 1953; pp. 307–318.
  61. Loi, C.L.; Wu, C.C.; Liang, Y.C. Prediction of Tropical Cyclogenesis Based on Machine Learning Methods and Its SHAP Interpretation. J. Adv. Model. Earth Syst. 2024, 16, e2023MS003637.
  62. Akkem, Y.; Biswas, S.K.; Varanasi, A. Streamlit-based enhancing crop recommendation systems with advanced explainable artificial intelligence for smart farming. Neural Comput. Appl. 2024, 36, 20011–20025.
  63. Vergni, L.; Todisco, F. A Random Forest Machine Learning Approach for the Identification and Quantification of Erosive Events. Water 2023, 15, 2225.
  64. Harrison, J.W.; Lucius, M.A.; Farrell, J.L.; Eichler, L.W.; Relyea, R.A. Prediction of stream nitrogen and phosphorus concentrations from high-frequency sensors using Random Forests Regression. Sci. Total Environ. 2021, 763, 143005.
  65. Lee, H.; Wang, J.; Leblon, B. Using Linear Regression, Random Forests, and Support Vector Machine with Unmanned Aerial Vehicle Multispectral Images to Predict Canopy Nitrogen Weight in Corn. Remote Sens. 2020, 12, 2071.
  66. Available online: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html#sklearn.ensemble.RandomForestRegressor (accessed on 5 December 2024).
  67. Ferreira, C.S.S.; Soares, P.R.; Guilherme, R.; Vitali, G.; Boulet, A.; Harrison, M.T.; Malamiri, H.; Duarte, A.C.; Kalantari, Z.; Ferreira, A.J.D. Sustainable Water Management in Horticulture: Problems, Premises, and Promises. Horticulturae 2024, 10, 951.
  68. Novielli, P.; Magarelli, M.; Romano, D.; de Trizio, L.; Di Bitonto, P.; Monaco, A.; Amoroso, N.; Stellacci, A.M.; Zoani, C.; Bellotti, R.; et al. Climate Change and Soil Health: Explainable Artificial Intelligence Reveals Microbiome Response to Warming. Mach. Learn. Knowl. Extr. 2024, 6, 1564–1578.
  69. Johnson, J.B. Rapid Prediction of Leaf Water Content in Eucalypt Leaves Using a Handheld NIRS Instrument. Eng 2023, 4, 1198–1209.
  70. Galleguillos, M.; Ulloa-Pino, J.; Munoz-Toro, N.; Perez-Quezada, J.F. Actual evapotranspiration and its relation with floristic composition and topographical features in an arid watershed. In Proceedings of the 2014 IEEE Geoscience and Remote Sensing Symposium, Quebec City, QC, Canada, 13–18 July 2014; pp. 3494–3497.
  71. Abbasi, N.; Nouri, H.; Didan, K.; Barreto-Muñoz, A.; Chavoshi Borujeni, S.; Opp, C.; Nagler, P.; Thenkabail, P.S.; Siebert, S. Mapping Vegetation Index-Derived Actual Evapotranspiration across Croplands Using the Google Earth Engine Platform. Remote Sens. 2023, 15, 1017.
  72. Li, H.; Lascano, R.J.; Barnes, E.M.; Booker, J.; Wilson, L.T.; Bronson, K.F.; Segarra, E. Multispectral Reflectance of Cotton Related to Plant Growth, Soil Water and Texture, and Site Elevation. Agron. J. 2001, 93, 1327–1337.
  73. Zahir, S.A.D.M.; Jamlos, M.F.; Omar, A.F.; Jamlos, M.A.; Mamat, R.; Muncan, J.; Tsenkova, R. Review–Plant nutritional status analysis employing the visible and near-infrared spectroscopy spectral sensor. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2024, 304, 123273.
  74. Koehler, T.; Wankmüller, F.J.P.; Sadok, W.; Carminati, A. Transpiration response to soil drying versus increasing vapor pressure deficit in crops: Physical and physiological mechanisms and key plant traits. J. Exp. Bot. 2023, 74, 4789–4807.
  75. Adams, M.L.; Philpot, W.D.; Norvell, W.A. Yellowness index: An application of spectral second derivatives to estimate chlorosis of leaves in stressed vegetation. Int. J. Remote Sens. 1999, 20, 3663–3675.
  76. Gao, Y. Influencing factors for transpiration rate: A numerical simulation of an individual leaf system. Therm. Sci. Eng. Prog. 2022, 27, 101110.
  77. Ha, W.; Kolb, T.E.; Springer, A.E.; Dore, S.; O’Donnell, F.C.; Martinez Morales, R.; Koch, G.W. Evapotranspiration comparisons between eddy covariance measurements and meteorological and remote-sensing-based models in disturbed ponderosa pine forests. Ecohydrology 2015, 8, 1335–1350.
  78. Gao, G.; Zhang, X.; Yu, T.; Liu, B. Comparison of three evapotranspiration models with eddy covariance measurements for a Populus euphratica Oliv. forest in an arid region of northwestern China. J. Arid. Land 2016, 8, 146–156.
  79. Attia, A.; El-Hendawy, S.; Al-Suhaibani, N.; Alotaibi, M.; Tahir, M.U.; Kamal, K.Y. Evaluating deficit irrigation scheduling strategies to improve yield and water productivity of maize in arid environment using simulation. Agric. Water Manag. 2021, 249, 106812.
  80. Wang, Z.; Yu, S.; Zhang, H.; Lei, L.; Liang, C.; Chen, L.; Li, X. Deficit mulched drip irrigation improves yield, quality, and water use efficiency of watermelon in a desert oasis region. Agric. Water Manag. 2023, 277, 108103.
  81. Yavuz, D.; Seymen, M.; Yavuz, N.; Çoklar, H.; Ercan, M. Effects of water stress applied at various phenological stages on yield, quality, and water use efficiency of melon. Agric. Water Manag. 2021, 246, 106673.
  82. Pereira, L.S.; Paredes, P.; Jovanovic, N. Soil water balance models for determining crop water and irrigation requirements and irrigation scheduling focusing on the FAO56 method and the dual Kc approach. Agric. Water Manag. 2020, 241, 106357.
  83. Wu, R.; Liu, Y.; Xing, X. Evaluation of evapotranspiration deficit index for agricultural drought monitoring in North China. J. Hydrol. 2021, 596, 126057.
  84. Chen, L.-H.; Chen, J.; Chen, C. Effect of Environmental Measurement Uncertainty on Prediction of Evapotranspiration. Atmosphere 2018, 9, 400.
  85. Mata-González, R.; McLendon, T.; Martin, D.W. The Inappropriate Use of Crop Transpiration Coefficients (Kc) to Estimate Evapotranspiration in Arid Ecosystems: A Review. Arid. Land Res. Manag. 2005, 19, 285–295.
  86. Subedi, A.; Chávez, J.L. Crop evapotranspiration (ET) estimation models: A review and discussion of the applicability and limitations of ET methods. J. Agric. Sci. 2015, 7, 50.
Figure 1. (A) Rutigliano, southern Italy (OpenStreetMap contributors); (B) aerial view of the experimental field where watermelon was cultivated in 2023.
Figure 2. (a) Trend of mean temperature and relative humidity during the growing season; (b) daily rainfall and irrigation water applied during the growing season.
Figure 3. Comparison between the complete set of daily watermelon actual evapotranspiration data collected throughout the study season by eddy covariance method (black circles) and the subset corresponding to dates with available satellite imagery used in the analyses (red triangles).
Figure 4. Heatmap of the correlation matrix among the dataset variables. The values within the cells represent Pearson correlation coefficients. ETa = actual evapotranspiration after pre-processing, T_mean = mean air temperature, RH = air relative humidity.
Figure 5. Trend of field-measured daily watermelon evapotranspiration (observed ETa) and of watermelon ETa predicted by the Random Forest-based model, together with the trend of watermelon leaf area index (LAI, m² m⁻²).
Figure 6. Permutation-based feature importance for the Random Forest model predicting watermelon actual evapotranspiration. Error bars indicate the standard deviation of feature importance across permutations.
Figure 7. SHAP summary plot showing the impact of each feature on the Random Forest model’s predictions for watermelon actual evapotranspiration.
Figure 8. LIME explanation for the most accurate prediction. The bar plot highlights the contribution of specific feature ranges to the Random Forest prediction, showing their positive impact on model accuracy.
Table 1. The eight SuperDove spectral bands considered in the analyses with their central wavelength (nm) and range (full width at half maximum, fwhm).
| Band | Central wavelength, nm (FWHM) |
| --- | --- |
| Coastal Blue | 443 (20) |
| Blue | 490 (50) |
| Green I | 531 (36) |
| Green | 565 (36) |
| Yellow | 610 (20) |
| Red | 665 (31) |
| Red Edge | 705 (15) |
| NIR | 865 (40) |
Table 2. Hyperparameters fine-tuned through Optuna per each machine learning algorithm used in this study.
| Algorithm | Parameters |
| --- | --- |
| ENet | L1 ratio, alpha |
| GLM | link function |
| PLS | n_components |
| RF | n_estimators, max_depth, min_samples_leaf, max_features, min_samples_split |
| SVR | C, epsilon, gamma |
Table 3. Performance of the compared models across 3 repetitions of the 5-fold cross-validation (mean and standard deviation).
| Algorithm | R² | RMSE (mm) | MBE (mm) |
| --- | --- | --- | --- |
| ENet | 0.684 (±0.142) | 0.666 (±0.693) | 0.005 (±0.137) |
| GLM | 0.614 (±0.188) | 0.693 (±0.148) | 0.051 (±0.096) |
| PLS | 0.631 (±0.197) | 0.673 (±0.139) | 0.044 (±0.101) |
| RF | 0.747 (±0.076) | 0.577 (±0.106) | 0.034 (±0.145) |
| SVR | 0.257 (±0.232) | 0.984 (±0.170) | −0.034 (±0.250) |
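For readers who wish to reproduce the kind of summary reported in Table 3, the sketch below shows how R², RMSE, and MBE can be computed and how the splits of a 3-times-repeated 5-fold cross-validation can be generated in plain Python. It rests on assumptions not stated in the table: MBE is taken as the mean of predicted minus observed values, the standard deviation is computed across all 15 folds, and a trivial "training mean" predictor on synthetic data stands in for the fitted models.

```python
import random
import statistics
from typing import Iterator, List, Sequence, Tuple


def r2_score(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def rmse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Root mean square error."""
    return (sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs)) ** 0.5


def mbe(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Mean bias error, assumed here as mean(predicted - observed)."""
    return sum(p - o for o, p in zip(obs, pred)) / len(obs)


def repeated_kfold(
    n: int, k: int = 5, repeats: int = 3, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for `repeats` reshuffled k-fold splits."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        for fold in range(k):
            test = idx[fold::k]          # every k-th shuffled index
            held_out = set(test)
            train = [i for i in idx if i not in held_out]
            yield train, test


if __name__ == "__main__":
    rng = random.Random(7)
    # Synthetic daily ETa values (mm), for illustration only; n = 68 as in Table S1.
    eta = [rng.uniform(0.5, 5.0) for _ in range(68)]
    fold_rmse, fold_mbe = [], []
    for train, test in repeated_kfold(len(eta)):
        train_mean = sum(eta[i] for i in train) / len(train)  # baseline "model"
        obs = [eta[i] for i in test]
        pred = [train_mean] * len(test)
        fold_rmse.append(rmse(obs, pred))
        fold_mbe.append(mbe(obs, pred))
    print(f"RMSE: {statistics.mean(fold_rmse):.3f} (sd {statistics.stdev(fold_rmse):.3f})")
    print(f"MBE:  {statistics.mean(fold_mbe):.3f} (sd {statistics.stdev(fold_mbe):.3f})")
```

Replacing the baseline predictor with a fitted regressor gives per-fold scores that can be summarized in the same way; with a small dataset, the spread across folds is often as informative as the mean.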
Table 4. Average observed and Random Forest-predicted watermelon evapotranspiration (ETa); the standard deviation is reported within the parentheses. Cumulated value of observed and Random Forest-predicted watermelon ETa. Observed ETa refers to the subset used for the machine learning analyses.
| | Observed ETa | Predicted ETa |
| --- | --- | --- |
| Average (mm day⁻¹) | 2.6 (±1.2) | 2.6 (±1.0) |
| Cumulated (mm) | 179 | 180 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Garofalo, S.P.; Ardito, F.; Sanitate, N.; De Carolis, G.; Ruggieri, S.; Giannico, V.; Rana, G.; Ferrara, R.M. Robustness of Actual Evapotranspiration Predicted by Random Forest Model Integrating Remote Sensing and Meteorological Information: Case of Watermelon (Citrullus lanatus, (Thunb.) Matsum. & Nakai, 1916). Water 2025, 17, 323. https://doi.org/10.3390/w17030323
