To assess the efficacy of different remote sensing approaches for estimating spatially distributed ET, three different models were implemented: the Satellite-Based Energy Balance for Mapping Evapotranspiration with Internalized Calibration (METRIC [
32]), the Two-Source Energy Balance (TSEB [
16]), and Evapotranspiration Mapping with Vegetation Index ([
17,
18]). Summaries of and deviations from their standards are described below; complete and authoritative details for METRIC and TSEB can be found in the source citations. Implementation by the authors of METRIC and TSEB has been reported previously [
10], while implementation and comparisons with VISW are new. To provide context for model estimates, three standardized, non-remote sensing ET estimates were included: Penman–Monteith reference evapotranspiration, ET0 [
33,
34]; FAO-56 ET ([
15] and referenced below in
Section 2.3.3); and USDA-SW ET [
35], crop specific consumptive water use estimates derived from gravimetric sampling at 1-foot intervals over multiple seasons within the Salt River valley of Central Arizona. The models obtain ET using distinctly different approaches:
2.3.1. METRIC
ET estimation with METRIC is based on the assumption of surface energy balance:
where ET is represented in its energy flux form, latent heat (
), and equal to net radiation (
), minus soil heat flux (
G) and sensible heat flux (
H). Units for all components are W/m
; sign convention is positive
for incoming radiation, and positive
G,
H,
for fluxes away from the soil/canopy surface.
METRIC estimates
using satellite-based reflectance and thermal infrared data [
32,
36]:
where the radiation components—all estimated using remotely sensed observations—are downwelling shortwave radiation (
), surface albedo (
), downwelling longwave radiation (
), upwelling longwave radiation (
), and land surface emissivity (
). METRIC estimates of
G were derived from a SEBAL empirical relationship with surface albedo and NDVI [
37]:
where
is surface temperature in Celsius. The CAIDD study obtained
and
G using Equations (
4) and (
5) and parameters as described in [
32].
The remaining term on the right-hand side of Equation (
3),
H, is solved using METRIC’s distinctive feature of representing vertical temperature gradients between the surface soil/vegetation canopy and the overlying atmosphere in terms of horizontal temperature gradients. The usual resistance equation for
H is described:
where
is density of moist air,
is specific heat of the moist air,
is the air temperature gradient between the surface and overlying air, and
is a resistance term representing the effectiveness of heat transport between the surface and overlying air. The crux of remotely sensed surface energy balance models is how they resolve
and
. In some models, such as TSEB (described below),
is determined from radiometric surface temperature and ambient air temperature. The
term is estimated from models of surface roughness, wind speed, and atmospheric stability. Following the SEBAL method, METRIC solves for these terms indirectly by selection of so-called ‘cold’ and ‘hot’ pixels that are presumed to represent reference
H end-members. Knowing
H for the extreme conditions allows estimation of
H fluxes at all other pixels. These reference pixels are respectively denoted
and
, and are used together to compute the coefficients needed to convert LST observations into an apparent
:
where the parameters
a and
b are solved used the selected hot and cold pixels:
and
are temperature gradients derived from solutions of Equation (
3) using reference conditions and observations. This approach simplifies and potentially increases accuracy in ET computations because biases due to errors in land surface temperature (LST) estimation are canceled by the hot/cold differencing. Thus, a reference cold pixel is assumed to represent ET at a standardized rate,
, a rate typically corresponding to either a short (0.12 m) or tall (0.5 m) crop ET as defined in [
23]; the additional 5% above reference ET accommodates conditions with evaporation at the soil surface. The surface energy balance equation for cold pixels becomes:
where
is the energy flux equivalent of 1.05
. At the ‘hot’ pixel, METRIC allows residual evaporation based on antecedent surface moisture, and follows the pattern in Equation (
10). For this study, however, antecedent moisture was considered unknown and difficult to accurately model. Thus, the traditional SEBAL approach was used, where
was assumed to be zero:
Having solved
H,
, and
G, Equation (
3) could then be solved, resulting in an estimate of instantaneous
flux. Options to transform this estimate into daily ET include the use of diurnal incoming solar radiation [
38] or use of a reference quantity, for example by assuming a constant evaporative fraction [
39], i.e., presumption that the ratio
is constant over most of the day. These options have been reviewed and discussed in [
40]. The METRIC approach, and in this study, also used for TSEB, is to compute a reference ET fraction (
):
where
in instantaneous ET (mm/s) computed:
where
is latent heat of vaporization of water (J/kg) and
is density of liquid water (kg/m
). Daily ET (
) is then computed by multiplying
by the cumulative
:
A significant obstacle for wide deployment of METRIC is the stated need for trained experts knowledgeable in surface energy balance. This requirement not only limits the number of practitioners, but also introduces subjectivity and non-reproducibility of ET results [
41]. This study assessed the need by implementing an alternative objective and reproducible approach based on the hypothesis that quantile selection of hot and cold pixels can return meaningful ET reference locations. The implementation is similar to that previously described in [
41] who demonstrated the approach over almonds in a California study. Given a 30 m resolution TM scene, choose as cold reference pixels all those within 95–99% Normalized Difference Vegetation Index (NDVI) quantiles and also with 1–10% LST quantiles as illustrated in
Figure 2. While the quantile selection could be fully automated, subjectivity when choosing quantiles was retained in order to balance the need for large (>1000) sample sizes while minimizing outliers. Next, for each selected cold pixel, neighboring pixels within a fixed search radius were scanned for maximal LST. The radius was set to 300 m, a value to approximate the typical CAIDD field size × 2 to improve the probability of finding bare soil pixels. Third, using all previously selected cold/hot value pairs, median hot and cold pixel values were computed and used to estimate the
a and
b coefficients in Equations (
8) and (
9).
2.3.2. TSEB
The TSEB model is a thermal-based remote sensing ET approach that is distinguished by its partitioning of surface fluxes into separate transpiration and evaporation components. This partitioning allows for differing transport resistances between soil, canopy, and atmosphere and accommodates use of radiometric instead of aerodynamic temperature. The separate components are obtained by remote sensing of fractional cover, where canopy and soil fractions are respectively
f and
. In this study,
f was obtained from NDVI observations following [
42,
43]:
where
is the NDVI value corresponding to full canopy cover, and
the value for bare soil.
is a leaf angle distribution function parameter, ranging between 0.8 for planophile to >1.4 for erectophile canopy geometry. For the wheat and alfalfa plots,
was set to 1.4, and, for cotton, 1.0 was used. TSEB modeling of the canopy requires estimation of leaf area index (LAI) to obtain net radiation values using [
42]:
where
is a leaf orientation parameter. Due to uncertain calibration of Equation (
16), this study used
= 0.67, which models randomly oriented leaves. For row crops, LAI estimates need to be reduced to adjust clumped vegetation at sub-pixels scales [
44,
45,
46]. Lacking details for cotton plots at CAIDD, a nominal clumping factor of 0.85 was used. No clumping factor was applied to wheat or alfalfa plots.
Having solved
f and
, the implementation of TSEB used weather data from the AZMET Coolidge station (with the exception for air temperature as noted below) and computed soil and canopy
values. In an iterative solution—required to solve atmospheric stability equations—the soil heat flux (
) was computed:
where
is the soil-air temperature difference, and a single resistance term (used by METRIC) is replaced by two: one for the canopy-air transport, another for soil-air transport. A third resistance term (
) to regulate fluxes between soil and canopy is added to create a series resistance formulation (see [
16] for details). Thus, soil evaporative heat flux (
) is computed by residual within the soil fraction:
For the canopy flux fraction, TSEB models transpiration at close to potential rates as estimated by the Priestley–Taylor parameter
:
where
is the fraction of vegetation that is green (assumed to be 1.0 in this study), and
is net radiation at the canopy. Complementary to the soil flux fraction, TSEB estimates canopy heat flux (
) by residual:
when solutions of TSEB lead to negative
values, an indicator of condensation, the
term is adjusted until
equals or exceeds zero. Having obtained component latent fluxes, the total surface
is:
To obtain ET,
fluxes from the soil and canopy sources are summed and then converted as noted in Equation (
13).
For the CAIDD study, the series network TSEB was implemented using methods in [
16] with two exceptions. The first was to adopt median cold pixels (as computed for METRIC) for the near surface air temperature. This approach has been used in other contexts [
47] and helps accommodate the fact that the nearest air temperature observations lay 50 km away from CAIDD and might have been significantly different. The second amendment was to set plant heights for wheat, cotton, and alfalfa to nominal heights as indicated in
Table 3 and surface roughness over bare soil of 0.02 m. These simplifications were imposed because heights were unknown but approximations are needed to run TSEB.
2.3.3. VISW
The Vegetation Index for the South Western US (VISW) model is an empirical ET estimation approach based on experimental data collected in Arizona [
17,
18,
48,
49,
50], where wheat, cotton, and alfalfa crops were grown under controlled irrigation regimes and frequently monitored with proximal sensing of NDVI. The approach aims to use remotely sensed vegetation indices as a proxy for the crop coefficient
, thereby replacing standardized values with real-time observations. Use of empirically constrained vegetation indices to derive crop coefficients is a concept pursued previously by several researchers, including [
14,
51,
52,
53,
54,
55]. The original concept for a crop coefficient is a single scaling factor:
where
is potential evapotranspiration. In the FAO-56 [
15] approach, the crop coefficient is re-partitioned into transpiration and evaporation components:
where
, a basal crop coefficient, represents transpiration, while
, represents surface evaporation component.
The simplicity and efficacy for using crop coefficients to model, forecast, and manage crop water requirements have been noted for many years, notably documented in [
15,
51,
56]. VISW in this study did not include soil evaporation (
) because doing so would require knowing irrigation dates at plot scales, information unavailable. Consequences for the simplification depends upon irrigation practices, crop type, crop stage, and time of year. The greatest impact could be low total ET estimates of 10–15% on weekly time steps (Hunsaker personal communication). However, the underestimation is less during winter or where crop canopy is full. Examples of
coefficients are illustrated for wheat, cotton, and a hypothetical alfalfa crop with six cuttings:
A chief shortcoming of the crop coefficient methodology is its need for local calibration [
15,
56], which has led to suggestions by [
53,
54,
57,
58] that remote sensing observation of vegetation, via an index such as NDVI, could greatly alleviate this constraint. Adoption of VI-based ET estimation is becoming well-known and to a limited extent, commercialized or facilitated by governmental agencies at local (Crop Circle (Holland Scientific, Lincoln, NE); Greenseeker (Trimble, Sunnyvale, CA)) and regional scales (CIMIS California TOPS Satellite Irrigation Management Support (SIMS), [
6],
https://ecocast.arc.nasa.gov/simsi). Vegetation indices are inherently slow reactors to changing surface conditions such drought or delayed irrigation and hence are not ideal for monitoring, forecasting or managing crop water use under water deficits. Thus, while future water shortfalls may change operations, we believe few farms in the US Southwest practice deficit irrigation. This means that the VI approaches calibrated under standard conditions could be a viable water management tool.
In this study,
is derived from the tall-crop (0.5 m) standardized ET equation as presented in [
23]:
where the notation is changed for convenience in this manuscript from ‘
’ in [
23] to the potential ET notation ‘
’. The new terms are
for the psychrometric ‘constant’,
for saturation vapor pressure,
for ambient vapor pressure,
for 2-m wind speed, and numerator/denominator constants
and
accommodating different computational time steps and crop heights (citation
Table 1 is partially reproduced here as
Table 4). Computation of
in Equation (
24) is done by following the procedures in [
23] for hourly time steps.
To provide a basis for comparison of ET results between models, the FAO-56 parameters for wheat, cotton, and alfalfa (
Table 5) were defined based on documentation and common planting date practices in Central Arizona. The FAO-56 methodology is illustrated for wheat and cotton in
Figure 3, but not for alfalfa: it is a multi-year crop with multiple and difficult-to-predict cutting times.
Transformation of vegetation indices to
is done empirically based on replicated field experiments of different crops and reflectance measurements with accurately calibrated radiometers or imagers. Due to variations in sensor spatial resolution, spectral band placement, sensor calibration, atmospheric clarity, and soil moisture conditions, raw NDVI values were not consistent between experiments without a normalization procedure. For the calibrations used in this study, normalization re-scales original NDVI values to lie between 0 and 1:
where TM5 NDVI values were re-scaled to new values (
) that lie between empirically chosen minimum (
) and maximum (
) limits, roughly defined as bare-soil and full-cover.
For the three crops considered in this study, the transformation equations used are polynomials as shown in
Figure 4.
For wheat, a cubic equation [
18,
50] is used to accommodate significantly increased ET while NDVI approaches saturation:
For cotton, a quadratic was adequate for experimental results [
17]:
For alfalfa, a quadratic was also used (Hunsaker, personal communication, based on experiments reported in [
48]):
NDVI normalization is needed to ensure that the full range of crop coefficients can be represented while viewing a range of different background soil colors and moisture [
18]. To avoid biasing results based on local conditions, this study adopted a quantile selection approach, where all plot-mean NDVI values were aggregated by crop type—Wheat, Cotton, Alfalfa—and their resulting distributions were modeled with Beta functions. The Beta function is specified by two shape parameters, by convention identified as
and
. Because the Beta distribution domain spans 0–1, NDVI values were transformed from a nominal range found for the NDVI 2008 CAIDD data set, 0.2 to 0.85, to 0.0 to 1.0. This transformation step simplifies the function fitting process. Using empirical histograms for each crop, Beta shape parameters were determined in multiple steps. Beta shape parameters were computed by first determining sample means and variances, and then estimating
and
terms via the method of moments (Equation (
29)):
These estimates initialized their final determination with the maximum likelihood method, implemented via the R ‘fitdistr’ function found in the MASS package. Quantiles at the 10% and 90% probability levels were then computed and transformed back to source NDVI equivalents. Selection of these levels was subjective but not arbitrarily chosen: probabilities spanning a larger range of 5% and 95% were found to produce anomalously low crop coefficient values for wheat and alfalfa, while probabilities spanning a lesser range fail to reproduce representative crop coefficients for sparse cover. The normalization procedure is illustrated for wheat crops (
Figure 5). Using instances with early January planting (green lines,
Figure 5A), an empirical histogram was generated (
Figure 5B), bins were re-scaled to lie between 0 and 1, a Beta distribution function was fit, quantiles at 10% and 90% probabilities identified, then indices transformed back to original NDVI space by inverting Equation (
25). After the inversion, the recomputed quantiles are respectively the values for
and
. Resulting NDVI normalization parameters are shown in
Table 6.