Combined Drought Index Using High-Resolution Hydrological Models and Explainable Artificial Intelligence Techniques in Türkiye

Başakın, Eyyup Ensar; Stoy, Paul C.; Demirel, Mehmet Cüneyd; Ozdogan, Mutlu; Otkin, Jason A.

doi:10.3390/rs16203799

Open AccessArticle

Combined Drought Index Using High-Resolution Hydrological Models and Explainable Artificial Intelligence Techniques in Türkiye

by

Eyyup Ensar Başakın

^1,2,

Paul C. Stoy

^2,*

,

Mehmet Cüneyd Demirel

¹

,

Mutlu Ozdogan

³ and

Jason A. Otkin

⁴

¹

Hydraulics Division, Civil Engineering Department, Istanbul Technical University, Istanbul 34469, Türkiye

²

Department of Biological Systems Engineering, University of Wisconsin-Madison, Madison, WI 53706, USA

³

Department of Forest and Wildlife Ecology & Nelson Institute for Environmental Studies, University of Wisconsin-Madison, Madison, WI 53706, USA

⁴

Space Science and Engineering Center, University of Wisconsin-Madison, Madison, WI 53706, USA

^*

Author to whom correspondence should be addressed.

Remote Sens. 2024, 16(20), 3799; https://doi.org/10.3390/rs16203799

Submission received: 21 August 2024 / Revised: 8 October 2024 / Accepted: 10 October 2024 / Published: 12 October 2024

(This article belongs to the Special Issue Recent Advances in Drought Risk Assessment, Monitoring, and Forecasting II)

Download

Browse Figures

Versions Notes

Abstract

:

We developed a combined drought index to better monitor agricultural drought events. To develop the index, different combinations of the temperature condition index, precipitation condition index, vegetation condition index, soil moisture condition index, gross primary productivity, and normalized difference water index were used to obtain a single drought severity index. To obtain more effective results, a mesoscale hydrologic model was used to obtain soil moisture values. The SHapley Additive exPlanations (SHAP) algorithm was used to calculate the weights for the combined index. To provide input to the SHAP model, crop yield was predicted using a machine learning model, with the training set yielding a correlation coefficient (R) of 0.8, while the test set values were calculated to be 0.68. The representativeness of the new index in drought situations was compared with established indices, including the Standardized Precipitation-Evapotranspiration Index (SPEI) and the Self-Calibrated Palmer Drought Severity Index (scPDSI). The index showed the highest correlation with an R-value of 0.82, followed by the SPEI with 0.7 and scPDSI with 0.48. This study contributes a different perspective for effective detection of agricultural drought events. The integration of an increased volume of data from remote sensing systems with technological advances could facilitate the development of significantly more efficient agricultural drought monitoring systems.

Keywords:

combined drought indices; SHAP; XGBoost; crop yield; agricultural drought

1. Introduction

Drought is a natural disaster characterized by insufficient water resources in a given region to meet demand, typically associated with periods of low precipitation [1,2]. In addition to the widely accepted simple definition, drought can also be described as a complex process involving trends of soil moisture depletion, reduced streamflow, and declining groundwater levels, often accompanied by increasing temperatures [3]. This situation leads to negative impacts on ecosystems, agriculture, water resources, and human life [4]. Drought can be classified in several ways: meteorological drought as a result of periods of below-average precipitation [5]; hydrological drought, which occurs due to reduced surface water flows [6]; agricultural drought, caused by insufficient soil moisture to support normal crop development [7]; and socioeconomic drought, which affects human lives and leads to economic disruption [8]. These are conventional classifications, but today the terms flash drought, characterized by rapid onset and assessed on a 5-day basis [9], and ecological drought, characterized by reduced water availability leading to reduced vegetation activity (with consequent disruption of ecosystem services) [10,11], are also commonly used. Each of these types (except socioeconomic) is defined and assessed based on specific hydroclimatic variables [12], while socioeconomic drought is related to local water supply and monitors water demand through socioeconomic systems [13]. As a natural disaster with increasing costs each year, drought has a significant and widespread impact on many economic activities and people at any given time [14,15].

Due to its complex and nonlinear nature, the mathematical characterization of drought events remains a substantial challenge. Dynamic drought processes are often simplified by defining drought indices, which are simplified formulae derived by analyzing long-term trends of variables like precipitation, evaporation, and temperature using statistical methods [16]. Drought indices, which can characterize drought in terms of duration, severity, and intensity [17], are indispensable in drought assessment as they facilitate drought risk analysis and monitoring by comparing current and future drought characteristics with past events [18]. Some of the drought indices commonly used in the literature include the Standardized Precipitation Index (SPI) [19], Standardized Precipitation Evapotranspiration Index (SPEI) [20], Palmer Drought Severity Index (PDSI) [21], Soil Moisture Anomaly (SMA) [22], Evapotranspiration Deficit Index (ETDI) [23], Streamflow Drought Index (SDI) [24], Enhanced Vegetation Index (EVI) [25], Normalized Difference Vegetation Index (NDVI), Drought Reconnaissance Index (DRI) [26], Evaporative Demand Drought Index (EDDI) [27], and Effective Drought Index (EDI) [28], all of which were designed with different drought impacts in mind. These indices can represent a single class of drought, while some can represent multiple classes of drought. For instance, SPI is commonly used to define meteorological drought, SDI for hydrological drought, SMA for agricultural drought, and PDSI for both meteorological and agricultural drought [29]. These indices may be effective for describing particular drought features, but were not necessarily designed to incorporate all drought impacts.

Drought indices can be classified as single, multivariate, or combined based on the number of parameters they use. While single drought indices are easier to compute, their success in reflecting drought conditions may be less than that of multivariate and combined indices [30]. Multivariate indices are derived by combining different variables. For example, the SPEI uses both precipitation and evapotranspiration values, while the PDSI includes precipitation, temperature, and water-holding capacity of soil. Combined drought indices (CDIs) are formed by merging several single drought indices to highlight the multidimensional characteristics and multifaceted effects of drought [31], and, therefore, may be of particular use for characterizing key drought impacts that single and multivariate models may miss.

With the advancement of technology, there has been a significant increase in the number of parameters measured at Earth’s surface. These measurements can be made by both ground-based stations and remote sensing systems. In contrast to ground-based measurements, remote sensing products (attempt to) provide continuous measurements in both time and space, despite varying spatial, temporal, and spectral resolutions. The data obtained from remote sensing systems, such as land surface temperature, precipitation, soil moisture, and vegetation reflectance characteristics, have introduced numerous drought indices into the literature. Some of the most commonly used indices include the Temperature Condition Index (TCI) [32], the Vegetation Condition Index (VCI) [32], the Normalized Difference Water Index (NDWI) [33], and the Soil Moisture Condition Index (SMCI) [34], all of which measure different physical phenomena. CDIs, derived from the combination of these and/or other indices, play a critical role in the detection of agricultural drought events [35]. One of the earliest examples of such combined indices is the Vegetation Health Index (VHI), which combines the TCI and VCI [36]. Danodia et al. [37] conducted a study in India using TCI, VCI, PCI, and SMCI values to calculate CDI values and found significant positive relationships between CDI values and crop yield. Karimi et al. [38] used TCI, PCI, Soil Water Index, and VCI inputs to calculate CDI for their drought analysis in Iran and reported that the resulting CDI values better represented the 2008 and 2014 droughts than SPI. In a study conducted in Niger, Houmma et al. [35] constructed a CDI index using Normalized Rainfall Efficiency Index, PCI, TCI, and smoothed vegetation index values. They reported that the index had a high correlation with VHI and PDSI, thus providing important information for agricultural drought monitoring. Li et al. [30] proposed a new CDI index using PCI, TCI, and Evapotranspiration Condition Index values as inputs. They found that the new index had a higher correlation with SPI-1, a one-month SPI, than the individual indices. In addition, the proposed index showed a negative correlation with drought affected/disaster area and net primary productivity.

In reviewing the literature on CDI studies, there are a limited number of studies using gross primary productivity (GPP) and NDWI despite opportunities for integrating carbon and other water-based metrics into drought indices. In Wei et al. [39], GPP was not used as an input, but, rather, to validate the derived CDI. NDWI is sensitive to the water content of the vegetation canopy [40], and numerous studies have shown that NDWI is effective in monitoring agricultural drought [41,42]; for example, Delbart et al. [43] found that NDWI was more responsive than NDVI to crop responses to drought. Similarly, in another study, Olmos-Trujillo et al. [44] found that NDWI was more sensitive to agricultural drought than NDVI in semiarid regions. The GPP, as a critical component of terrestrial ecosystem productivity and surface carbon cycling, represents photosynthesis at the ecosystem level [45]. Recent studies have highlighted the importance of the relationship between drought and GPP [46,47,48]. The study by Pendergrass et al. [49] found that changes in GPP should also be considered when examining the causes of regional meteorological and agricultural droughts.

Most methods for creating a CDI derive a linear equation, with the optimal calculation of the equation’s coefficients being one of the main challenges. To address this issue, Kogan [36] calculated the Vegetation Health Index by taking the arithmetic mean of TCI and VCI, giving equal importance to both inputs. In their study, Hao et al. [50] optimized the CDI equation weights through trial and error by assigning different weights to the TCI, SMCI, and PCI indices. This trial-and-error method requires evaluating the CDI values against a target. Zhang et al. [51] used seven different weighting scenarios in a trial-and-error approach, with the PDSI, Z-index, and SPI as target variables to determine the optimal scenario. In studies conducted without a target variable, researchers often utilize methods such as principal component analysis (PCA) [52,53], entropy [54,55], and analytic hierarchy process (AHP) [56,57]. In CDI calculations, PCA captures the common variance of the single indices and creates principal components for analysis, assigning weights to each single index. These weights are determined by each input’s contribution to the principal components, with higher variance components receiving more weight. CDIs based on PCA assume linearity among random variables, which can lead to information loss post-analysis [58]. This linearity assumption can be problematic in drought studies involving complex hydrological and meteorological systems with nonlinear characteristics [59]. AHP methods, on the other hand, rely on expert opinions to weigh criteria. Some disadvantages of this method include high computational requirements even for small problems, a subjective nature that can depend on emotional judgments, and increased time and effort required for additional pairwise comparisons [60].

A review of the CDI literature reveals that very few studies have used machine learning (ML) methods to calculate the weights of the equation. Of those that have employed ML, Houmma et al. [61] used the random forest (RF) method to determine the weights of CDI inputs. However, metrics used in the RF model, such as Gini impurity and permutation importance, can be biased, especially for features with high cardinality (many categories). This can inflate the importance of certain features, thereby leading to an inaccurate assessment of their true contribution to the model [62].

Recently, methods have been developed to explain nonlinear relationships and work alongside ML techniques to evaluate the power of inputs in predicting the output, taking into account their intricate relationships. One of the most frequently used methods that can adapt to almost any type of ML model is the SHapley Additive exPlanations (SHAP) method. SHAP uses Shapley values from game theory, which are used to fairly determine each player’s (in this case, the features in the model) contribution to a game (the model’s prediction). A feature’s contribution to the model is calculated as the average of its marginal contribution across all possible combinations. While random forest’s feature importance ranks the general contributions across the entire model, it does not provide local (individual prediction level) explanations. In contrast, SHAP provides individual explanations for each prediction, allowing for a more detailed and fair understanding of the model’s decisions [63].

The study aims to answer the following questions:

Does adding NDWI and GPP values improve CDI calculations?
How successful is the developed CDI in detecting agricultural drought?
What are the relationships between CDI, formed by different combinations of single drought index indicators, and crop yield?
What is the nonlinear behavior of each input on the output?

To find answers to the questions listed above, this study aims to define a new CDI that will reflect winter wheat yield values. In addition to the commonly used TCI, PCI, VCI, and SMCI, the NDWI and GPP are included to calculate CDI values. To demonstrate the effectiveness of including the CDI, the correlation coefficients between CDI and winter wheat yield are calculated. This is intended to more accurately detect agricultural drought events. Furthermore, high-resolution hydrological model outputs are used to obtain SMCI, thus employing soil moisture values derived with the Mesoscale Hydrological Model (mHM) instead of directly using satellite-derived soil moisture values, which were not available across the study period. The SHAP method is used to determine the equation weights for the CDI, and its effectiveness is evaluated by comparing it with PCA and empirical method results. By using SHAP analysis to assess the nonlinear behavior of each input variable (TCI, PCI, VCI, SMCI, NDWI, and GPP) on the CDI, we provide a deeper understanding of how each individual index contributes to the overall CDI and its impact on crop yield, revealing complex interactions and the relative importance of each input variable. Furthermore, the developed index is compared against the widely used SPEI and scPDSI to quantify differences and lend further insight into drought indices for agricultural yield estimation.

2. Materials

2.1. Study Area

This study focuses on four of the most critical areas for Türkiye’s food security, namely, Aksaray, Karaman, Konya, and Niğde, which are characterized by fertile soils and cover an area of 54,908 km² [64]. These four regions are part of Türkiye’s driest interior zone, with an average annual rainfall of 380 mm. The elevation of the study area ranges from 284 to 3583 m, with agricultural activities primarily conducted in the lower northern sections [65]. These provinces mainly practice rainfed agriculture for wheat. According to the annual reports of the Turkish Statistical Institute (TUIK), approximately 70% of the region is engaged in rainfed agriculture [66]. There are 49 administrative districts in these provinces; we excluded districts with minimal agricultural land as indicated by land cover maps in the present study. The list of districts included in the study and their statistical information is presented in Table S1 and Figure 1b. The districts with limited or no cropland land areas are excluded from the study area and labeled as “Unstudied” in Figure 1b. From the elevation map, it is obvious that arable land is very limited due to topographical conditions.

2.2. Data Description

The precipitation values used to calculate PCI were obtained from the Climate Hazards Group InfraRed Precipitation with Station (CHIRPS) database [67]. CHIRPS is a widely used precipitation dataset for climate data analysis and research. It contains global precipitation estimates from 1981 to the present, available on daily, 10-day, and monthly bases. The dataset is produced by combining two main data sources: satellite-based estimates and ground station data. CHIRPS uses precipitation data provided by the U.S. National Oceanic and Atmospheric Administration Climate Prediction Center and enhances satellite estimates by incorporating precipitation data from thousands of ground stations worldwide [68]. Due to the discrete nature of precipitation data, there can be challenges in obtaining precipitation from remote sensing systems, which necessitates research on their correlation with ground observations. Studies conducted specifically for Türkiye have shown that monthly CHIRPS data have a high correlation with ground observations [69]. In addition, CHIRPS data used in drought studies specific to Türkiye have been verified to produce satisfactory results by Aksu et al. [70] and Orieschnig and Cavus [71].

Land surface temperature (LST) and NDVI were obtained from the Moderate Resolution Imaging Spectroradiometer (MODIS) database for the calculation of TCI and VCI. In addition, NDWI was also similarly obtained from MODIS. Launched in 1999 on the Terra satellite and in 2002 on the Aqua satellite, both satellites scan the entire Earth’s surface every 1 to 2 days, collecting data in 36 spectral bands. The instrument’s detectors collect data at three different spatial resolutions: 250 m, 500 m, and 1000 m. MODIS bands 31 (centered at 11,030 µm) and 32 (12,020 µm) are thermal infrared bands used in LST calculations [72]. NDVI, an indicator widely used to monitor vegetation health and density, is obtained using MODIS Band 1 (red: 645 nm) and Band 2 (NIR: 858 nm). Finally, NDWI values are obtained using MODIS Band 2 (NIR: 858 nm) and Band 5 (SWIR: 1240 nm).

In previous studies, researchers have used soil moisture (SM) data from various databases, including the Global Land Data Assimilation System (GLDAS) [73], Advanced Microwave Scanning Radiometer for EOS [74], Famine Early Warning Systems Network Land Data Assimilation System [75], ERA5-Land [76], and Soil Moisture and Ocean Salinity [77]. In this study, SM values were obtained using the mHM, a physically-based high-resolution distributed hydrologic model used to simulate hydrological processes in watersheds at different scales. The mHM divides watersheds into small cells and performs calculations for each cell separately, allowing for more realistic results by accounting for spatial heterogeneity. The model is based on physical principles and simulates hydrological processes including evaporation, transpiration, runoff, and groundwater flow [78]. While the model is designed to predict surface runoff using inputs such as precipitation, potential ET, and mean air temperature, it also outputs SM at various depths, including 0–50 cm. In this study, these SM values were used to calculate the SMCI.

Gross primary productivity (GPP) values were obtained from the Penman–Monteith–Leuning Evapotranspiration V2 (PMLV2) dataset [79]. The Penman–Monteith–Leuning (PML) model was first developed by Leuning et al. [80], who introduced a biophysical model for surface conductance based on the Penman–Monteith (PM) equation [81]. Since then, two improved versions of the PML model have been developed: PML-V1 and PML-V2. PML-V2, as described by Gan et al. [82], uses a biophysical canopy conductance model to integrate evapotranspiration (ET) with GPP. To simplify the model, only key parameters of physiological importance for stomatal response and assimilation are retained. In contrast to PML-V1, PML-V2 considers the effects of increasing leaf area index (LAI), increasing CO₂, and climate forcing on GPP [83].

Temporal and spatial resolutions, dates, and data sources are given in Table 1. In addition, the wheat yields used in the study were obtained from the TUIK database. The yields are in t/ha and were obtained at the district level. The land cover/land use map of the Global Land Analysis and Discovery (GLAD) team with a resolution of 30 m was used to identify the cultivated areas [84].

3. Methods

3.1. Calculations of Drought Indices

3.1.1. The Precipitation Condition Index

The PCI was calculated by comparing current precipitation to long-term maximum and minimum precipitation. The precipitation used to calculate the PCI was obtained from the CHIRPS dataset. The equation used to calculate the PCI is shown below.

P C I_{i} = \frac{C H R I P S_{i, j} - C H R I P S_{i, j, m i n}}{C H R I P S_{i, j, m a x} - C H R I P S_{i, j, m i n}}

(1)

In this context, i represents the pixel, and j represents the month. CHRIPS_i,j,max denotes the maximum precipitation for the i-th pixel in the j-th month, while CHRIPS_i,j,min denotes the minimum precipitation for the i-th pixel in the j-th month. This index ranges from 0 to 1, with values close to 0 (1) indicating dry (wet) periods.

3.1.2. The Temperature Condition Index

The TCI is calculated using LST values obtained from the MOD11A1 dataset. Similar to the calculation method of the PCI, the TCI ranges from 0 to 1, but higher TCI values are associated with drought conditions, while lower values indicate nondrought periods. The TCI, like the PCI, ranges between 0 and 1.

T C I_{i} = \frac{L S T_{i, j, m a x} - L S T_{i, j}}{L S T_{i, j, m a x} - L S T_{i, j, m i n}}

(2)

3.1.3. The Vegetation Condition Index

The VCI was proposed by Kogan [32]. This index indicates how close the current month’s NDVI value is to the minimum NDVI calculated from long-term records. It is developed by scaling each location’s and pixel’s VCI values from 0 to 1:

V C I_{i} = \frac{N D V I_{i, j} - N D V I_{i, j, m i n}}{N D V I_{i, j, m a x} - N D V I_{i, j, m i n}}

(3)

High VCI is associated with wet periods, whereas low VCI is associated with drought periods.

3.1.4. The Soil Moisture Condition Index

The calculation steps of the SMCI are the same as those of the PCI. SMCI values also range from 0 to 1, where low (high) SMCI indicates drought (wet) periods. The steps for calculating SMCI are provided below.

S M C I_{i} = \frac{S M_{i, j} - S M_{i, j, m i n}}{S M_{i, j, m a x} - S M_{i, j, m i n}}

(4)

The SM values produced by the mHM are used to calculate the SMCI. Technical information about the mHM is provided in Section 3.2.

3.1.5. The Normalized Difference Water Index

NDWI is strongly associated with plant water content [85]. Shortwave infrared (SWIR) reflectance captures changes in both spongy mesophyll composition and plant water content. In contrast, NIR (near-infrared) reflectance is determined by the internal structure of the leaf and its dry matter content rather than its water content. The combination of NIR and SWIR is designed to improve the accuracy of SM determination by eliminating variations due to internal leaf composition and leaf dry matter quality.

N D W I_{i} = \frac{N I R_{i} - S W I R_{i}}{N I R_{i} + S W I R_{i}}

(5)

NDWI values range between +1 and −1. An NDWI value close to +1 indicates higher plant water content, while a value close to −1 indicates the absence of plant water content. In this study, the NDWI was normalized using the same min–max normalization applied to the other indices, ensuring they fall within the same range (0–1). NDWI values close to 1 are associated with wet periods, whereas values close to 0 are associated with dry periods.

3.1.6. The Gross Primary Productivity

The GPP data used in this study consist of data obtained using the PMLv2 method [78]. The PMLv2 model is a biophysical model designed to estimate evapotranspiration (ET) and GPP by coupling water and carbon processes. It improves on earlier versions by incorporating stomatal conductance and the effects of carbon dioxide on both water loss and photosynthesis, allowing for a more accurate simulation of land–atmosphere interactions. PMLv2 uses remote sensing data, such as MODIS-derived LAI and albedo, and is driven by meteorological inputs, including air temperature, humidity, wind speed, and solar radiation. Detailed information regarding PMLv2 is provided in the Appendix A.

3.2. Mesoscale Hydrological Model

The Mesoscale Hydrological Model (mHM) is a physically-based, distributed hydrological model designed to understand and model the components of the water cycle, particularly for assessing water resources over large areas and extended periods [86]. The mHM utilizes a distributed approach to model spatially variable hydrological processes using meteorological (e.g., precipitation, temperature, evaporation) and soil data to compute hydrological processes for each grid cell [87]. The importance of SM modeling is the rainfall–runoff module, which simulates processes such as precipitation falling on the soil surface, infiltration, surface runoff, and river flow [86]. The soil moisture module calculates the distribution and changes in moisture within the soil profile. In this study, SMCI was calculated using the SM values generated by the mHM. Precipitation, temperature, and PET were obtained from ERA-5 for use in the mHM. SM values were produced at a spatial resolution of 1 km for each grid cell within the study area. Figure S1 provides an insight into the working principles of the mHM. During the development of the mHM, the calibration was carried out using measurements from 10 streamflow stations within the study area, which includes four provinces: Aksaray, Karaman, Konya, and Nigde. The calibration set achieved a R of 0.78 and an RMSE of 3.2 m³/s between the simulated and observed values. For the validation set, an R of 0.77 and an RMSE of 3.3 m³/s were obtained.

3.3. Crop Yield Detrending

In this study, the weights of the CDI values were optimized using crop yield as the target variable. When comparing drought indices with yield, it is essential to remove the trend component from the original series. This trend is often associated with technological advancements, soil improvements, and fertilization effects in agriculture [88]. Common methods for detrending include polynomial fitting (linear or cubic) [89], moving average [90], and locally weighted regression model (LOWESS) [91,92]. Due to certain assumptions associated with linear trends [93], the difficulties in using cubic splines with small datasets (<60) and their tendency to remove components other than the trend [94], this study opted for the nonparametric LOWESS method [94]. A standardization process was carried out to compare the detrended series with standardized series such as SPEI and scPDSI, and yield anomalies were obtained. The standardization process is given in Equation (6).

d e t r e n d y i e l d_{s t d} = \frac{d e t r e n d_{i} - \bar{d e t r e n d}}{σ_{d e t r e n d}}

(6)

where detrend_i represents the detrended yield in the i-th month,

\bar{d e t r e n d}

represents the temporal average of the detrended yield in a district, and

σ_{d e t r e n d}

represents the standard deviation of the detrended yield in a district.

3.4. Benchmarking Methods

3.4.1. Standardized Precipitation Evapotranspiration Index

The SPEI was developed by Vicente-Serrano et al. [20] to assess drought by considering the climatic balance between precipitation and potential evapotranspiration (PET). Unlike the SPI, the SPEI considers temperature in addition to precipitation. Similar to SPI, SPEI can be calculated on different time scales, with negative values indicating drought periods. Due to the sensitivity of vegetation to evapotranspiration, the SPEI is considered to have a relative advantage in being closely associated with drought events. After calculating the climatic water balance over different accumulation periods, the data are fitted to a parametric distribution and the probabilities are converted to a percentile of the standard normal distribution to obtain the SPEI values [95]. Typically, the three-parameter log-logistic distribution is used as the reference distribution. The parameters are estimated using L-moments, following the approach proposed by Beguería et al. [96]. The SPEI used in this study was obtained from the global dataset produced by Beguería [97], which has a spatial resolution of 0.5 degrees and a monthly temporal resolution. This dataset uses precipitation and temperature data produced by the Climate Research Unit to perform SPEI calculations. To assess the effectiveness of CDI in this study, 1-, 3-, 6-, and 9-month SPEI values were obtained, represented by the abbreviations SPEI_1, SPEI_3, SPEI_6, and SPEI_9.

3.4.2. Self-Calibrated Palmer Drought Severity Index

The Palmer Drought Severity Index (PDSI) [21] characterizes the balance of moisture deficit and surplus based on the difference between actual precipitation and climatically appropriate precipitation. Although this index is primarily a meteorological drought index, it is often used to monitor agricultural drought events [98]. The parameters of the original PDSI were obtained based on observations from the Midwestern United States, which limits its applicability to other regions. To address this issue, Wells et al. [99] proposed the self-calibrating PDSI (scPDSI) by incorporating weighting and persistence factors to improve spatial comparability. The core idea of scPDSI is to determine the weighting and persistence factors based on locally observed hydrometeorological variables. The empirical parameters in the analysis procedure can be self-calibrated, thereby improving the spatiotemporal comparability of drought monitoring results using scPDSI. In this study, scPDSI values with a spatial resolution of 0.5 degrees were obtained from van der Schrier et al. [100].

3.5. eXtreme Gradient Boosting

EXtreme Gradient Boosting (XGBoost) is a powerful ML algorithm used for classification and regression problems, created by sequentially boosting decision trees [101]. Gradient boosting attempts to minimize errors by updating model parameters along the gradient of a loss function. XGBoost improves on this by using a second-order Taylor series expansion to use the second derivatives of the loss function for more precise optimization. Thus, the model is optimized faster and more accurately compared to traditional gradient boosting methods that rely solely on first derivative information. In particular, it uses L1 (lasso) and L2 (ridge) regularization terms to control model overfitting [102]. Regularization helps maintain control over tree depth and number of trees, leading to more general results on complex datasets. XGBoost also has a built-in algorithm that handles missing data and automatically processes missing values. While most gradient boosting algorithms work sequentially, XGBoost processes different parts of the dataset in parallel, speeding up the tree-building process [103]. This parallel processing capability allows XGBoost to work effectively on large datasets.

3.6. SHapley Additive exPlanations

SHAP is an explanation method based on game theory, used to better understand the decisions of machine learning models [104]. SHAP determines feature importances by calculating each feature’s contribution to the prediction. This becomes particularly valuable when trying to explain the internal working mechanism of the model. SHAP is based on a concept called Shapley values [105] from cooperative game theory to fairly distribute the contributions of players (features) to the game (prediction model) [106]. This method calculates the marginal contribution of each feature to the model’s prediction and does so for all possible feature combinations [107]. Unlike traditional feature importance methods (permutation importance and Gini importance etc.), SHAP accounts for interactions between features when measuring each feature’s contribution. This enhances the accuracy and reliability of SHAP, as features in most real-world data often interact with one another. Calculating SHAP values typically has high computational costs, but these costs can be minimized through optimization techniques like the TreeSHAP algorithm proposed by Lundberg and Lee [108]. In this study, the TreeSHAP method, which is particularly effective for decision trees and tree-based ensemble models like XGBoost, was also used.

3.7. Principal Component Analysis

PCA is a statistical technique used to better understand and analyze multivariate data. The primary goal of PCA is to explain the largest variance in a dataset by creating new variables, called principal components, which are linear combinations of the original variables [109]. PCA begins by calculating the covariance matrix of the variables in the dataset, which shows the relationships and shared variance between them. Using this covariance matrix, eigenvalues and eigenvectors are computed. Eigenvalues represent the variance of the principal components, while eigenvectors determine the direction of these components. The eigenvector with the largest eigenvalue defines the first principal component and explains the most variance in the dataset. The eigenvector with the second-largest eigenvalue defines the second principal component, explaining less variance and being orthogonal to the first component. This process continues until as many principal components as the number of variables are created. Since the principal components are linear combinations of the original variables, understanding the relationship between these components can help reveal structural features and hidden patterns in the dataset. However, PCA has some limitations. For instance, it only considers linear relationships and may require data normalization [110]. Additionally, interpreting the principal components can be challenging because they are complex combinations of the original variables.

3.8. Model Evaluation Metrics

In this study, statistical metrics were used to evaluate the performance of ML models and to analyze the relationship between drought indices and crop yield. The metrics employed include Spearman’s correlation coefficient (R), root mean square error (RMSE), and Akaike information criterion (AIC). Due to the generally non-normal distribution of the data and the limited number of yield values available at the district level (20 samples, excluding ML models), Spearman’s correlation coefficient was preferred over the Pearson correlation coefficient. Spearman’s correlation coefficient is nonparametric, robust to outliers, and capable of revealing nonlinear relationships [111]. Additionally, RMSE and AIC were used alongside the correlation coefficient to assess the performance of the ML models. The equations for RMSE, AIC, and Spearman’s correlation coefficient are provided below.

R M S E = \sqrt{\sum_{i = 1}^{n} \frac{{(y_{i} - {\hat{y}}_{i})}^{2}}{n}}

(7)

R = 1 - \frac{6 \sum d_{i}^{2}}{n (n^{2} - 1)}

(8)

A I C = n \times l n (\frac{S S_{e}}{n}) + 2 K

(9)

where

d_{i}

is the difference between the two ranks of each observation, n is the number of observations,

y_{i}

is the actual yield,

{\hat{y}}_{i}

is estimated yield,

S S_{e}

is sum square of errors, and K is number of parameters. Its value lies within the range [−1, 1], where R < 0 indicates a negative correlation, R > 0 indicates a positive correlation, and R = 0 indicates no correlation. The larger the absolute value of r is, the stronger the correlation between two variables. The RMSE ranges from 0 to infinity, with models that achieve RMSE closer to 0 being considered superior. Lower AIC values indicate that the model is more useful.

3.9. Drought Classification

Threshold values are commonly employed to categorize the severity of climatic events, such as droughts and excessive moisture. These classifications are crucial in assessing the extent to which regional climate conditions deviate from expected patterns. Such insights are instrumental in shaping long-term water management strategies and informing climate policies. The use of these classifications is essential for comparing climatic anomalies across different temporal scales and geographic regions. Given that the drought severity levels of CDI and SPEI are derived from similar statistical processes and have comparable lower and upper bounds, a widely used classification in the literature was adopted [112]. In view of the considerable quantitative range encompassed by the severity values of scPDSI, a representation comprising a greater number of classes was selected to facilitate a more comprehensive analysis [113,114]. The classifications for CDI, SPEI, and scPDSI are provided in Table 2.

4. Model Development

In this study, machine learning methods and the capabilities of the SHAP method were utilized to develop a new drought index by combining individual indices into a CDI. The data used for calculating single indices within the study area were obtained from remote sensing systems and subsequently underwent normalization processes for the calculation of each index. The normalized images were cropped to cover the study area. The input data utilized to obtain the CDI were rescaled to have a temporal resolution of monthly and a spatial resolution of 1 × 1 km. This was achieved by averaging the daily data on a monthly basis. The bilinear algorithm was applied for spatial rescaling, transforming all cropped images to a target resolution of 1 × 1 km, and they were subsequently masked to include crop areas based on crop land information obtained from GLAD. After the masking process, spatial averages were taken for each month from October to June, considering the district boundaries within the study area. Thus, a single value representing each month was obtained for each district. Using these values, the weights of the CDI equation were calculated using three different methods. Of these, PCA and empirical methods do not require a target variable, but for the SHAP method, crop yield values measured in each district were used as the target variable. To provide inputs for the SHAP algorithm, prediction models were created using the XGBoost method. The inputs of the XGBoost model were TCI, VCI, SMCI, PCI, GPP, and NDWI values, while the target variable was the detrended crop yield values. Since the GPP input has been available since 2000, the period between 2001 and 2020 was selected for modeling. During the ML model creation phase, all possible scenarios for the six inputs (63 different combinations) were tested. During the modeling phase, a random splitting strategy was embraced, with 75% of the data were used for training and 25% for testing. The districts selected for the test set were Aksehir, Altinekin, Altunhisar, Beysehir, Bor, Camardi, Cihanbeyli, Ilgin, Karapinar, and Selcuklu. In the next step, the calculation of mean SHAP values was carried out using the results of the model with high prediction accuracy. Using the SHAP feature importance values obtained, the weights of the CDI equation were determined, thereby obtaining accurate weights using a method that calculates the marginal contributions of all variables in a nonlinear manner. The correlation between the CDI equations obtained by three different methods and crop yield was examined to measure the success of the new index in representing agricultural drought. Additionally, a comparative analysis with SPEI and scPDSI values, which are also used in agricultural drought studies in the literature, was conducted. Furthermore, the similarities and differences between CDI, SPEI, and scPDSI were examined. The analysis steps are shared in Figure 2.

The hyperparameters of the XGBoost algorithm used in the study were optimized using the grid search method. Additionally, early stopping was applied to overcome the overfitting problem. The search space information and optimum values required for parameter optimization of the XGBoost model are provided in Table S2.

In this study, the empirical method, used to represent conventional approaches, is the simplest method. The CDI equation obtained by assigning equal importance to all inputs is as follows. Thus, the weights of all inputs for this method were set at 0.1666 (i.e., 1/6th).

5. Results and Discussion

5.1. Machine Learning Results

For the model where all inputs were used, the training R value was found to be 0.82, and the test R value was 0.68 (the prediction model results for all combinations are provided in Table S3). The training R values obtained by reducing the inputs one by one were all found to be lower than the initial model. Since the SHAP values’ feature importance reflects the results obtained from the training set, the weights were calculated using the results of the model with the highest R value. In this study, to compare with PCA and empirical methods, the model using all inputs was selected, and PCA and empirical method calculations were performed on this model. Thus, a comprehensive analysis was achieved by using TCI, PCI, SMCI, VCI, NDWI, and GPP values, resulting in an extensive evaluation of factors potentially affecting drought for CDI calculations.

5.2. Weight Obtained by SHAP

A feature importance plot was generated using the SHAP method after obtaining the prediction results from the XGBoost model (Figure S2). In generating these plots, the original feature importance values were used to recalculate relative values on a scale of 100 to determine the impact of each input on the output. NDWI was the most influential variable for yield, contributing 41%. It was followed by TCI (19%), PCI (15%), GPP (11%), VCI (8%), and SMCI (6%). GPP, SMCI, and VCI had lower importance. The CDI equation calculated using the weights obtained from the SHAP method are as follows:

C D I = 0.41 \times N D W I + 0.19 \times T C I + 0.15 \times P C I + 0.08 \times V C I + 0.11 \times G P P + 0.06 \times S M C I

(10)

5.3. Weight Obtained by PCA

For a fair comparison with SHAP, PCA weights were obtained using the data in the training set, and the weights derived from this dataset were validated using the test set (Table S4). When examining CDI studies, information from the first component or the first two components is commonly used [115]. The contribution ratios of the principal components following the PCA analysis for the districts are shown in Table S5. The first two components accounted for more than 60% of the variance.

When examining the weights obtained from the PCA analysis, the input with the highest weight is VCI at 0.23, followed by TCI at 0.19. The third highest weight is GPP at 0.18, while the fourth is NDWI at 0.17. The lowest weight is attributed to PCI, with an average of 0.10. Compared to SHAP, only the TCI input shares the same ranking, while the weight rankings for all other inputs are significantly different.

C D I = 0.16 \times N D W I + 0.19 \times T C I + 0.10 \times P C I + 0.23 \times V C I + 0.18 \times G P P + 0.10 \times S M C I

(11)

To analyze the relationship between the inputs after the PCA analysis, PCA loadings plots were created. These plots allow us to see which inputs are closely related to each other in each district and their contribution to explaining the variance. The axes represent the components that explain the variance in the data, with the direction of the vectors indicating the relationship between the variables and the components, and their length reflecting their contribution. The angle between vectors signifies the correlation between variables; small angles indicate a positive relationship, while large angles suggest a weak association. Vectors pointing in the same direction imply that the variables exhibit similar behavior. The districts with the highest explanation ratio in the first two principal components were selected to create Figure 3. Among the values of the four districts in the graph, VCI provides the highest contribution to both the first (PC1) and second (PC2) components and TCI significantly contribute to PC1 and PC2. PCI and SMCI values have relatively low contributions. Additionally, a strong positive relationship can be observed between VCI and GPP in all regions, while a negative correlation can be observed between VCI and TCI and between NDWI and GPP, as these inputs intersect each other perpendicularly.

5.4. Crop Yield Correlation Results Obtained by Three Different Methods

The relationship between crop yield and CDI equations calculated using three different methods was examined. The results of the correlation analysis for all districts are shared in Table 3. In 7 out of the 40 districts where the study was conducted, correlations were not statistically significant for all methods. Among all the significant models, the SHAP model provided the best results with an average correlation of 0.56. Following the SHAP model, the highest average correlation was 0.47, obtained using the empirical method. Another significant finding is that the correlation results between the PCA and empirical methods are quite similar.

The correlation values between the drought indices used in this study and crop yield are presented in Table S6. When evaluating all 40 districts, the highest correlation was obtained using SHAP in 33 districts, PCA in 1 district, the empirical method in 3 districts, SPEI in 3 districts, and scPDSI in 0 districts. In the comparison among the SPEI indices themselves, SPEI_3 exhibited the highest average correlation at 0.39, while SPEI_9 had the lowest average correlation (0.26). Among all the methods, scPDSI showed the lowest average correlation (0.22). A box plot illustrating the correlation values for the districts in the training and test sets is presented in Figure 4. This figure, which serves as a visual representation of Table S6, allows for the easy identification of the distribution, minimum and maximum values, and median values of the correlation results for each method. The highest median values in the training and test sets belong to CDI_(SHAP), while CDI_(Empirical) and CDI_(PCA) show a very similar distribution. It is observed that SPEI_1, SPEI_3, and SPEI_6 produce similar results in the training set, but the differences become apparent in the districts within the test set. The spatial distribution of the correlation across the 40 districts in the study area is shown in Figure 5. SHAP demonstrates high correlations in a large number of districts, followed closely by PCA. Higher correlations between the drought indices and crop yield are observed in the northern regions of the study area, while a decrease in correlations is observed in the southern regions, which are at a higher elevation. Topographic variations in mountainous regions have the potential to influence precipitation distribution and the conditions under which wheat can be cultivated. Mountainous regions may give rise to orographic effects, which can result in greater (or lower) precipitation accumulation. Conversely, lower elevations may exhibit a more homogeneous distribution of rainfall. This may result in a weakening of the correlation between the drought index and yield in the southern and mountainous regions.

To assess the temporal relationship between the drought indices and crop yields for the districts in the test set, time series of the indices and crop yield are presented in Figure 6. The standardized crop yields, along with CDI and SPEI indices which belong to the same classification, are placed on the left axis, while the scPDSI values are placed on the right axis for visual comparison. To avoid clutter, only the SPEI value with the highest correlation at the district level was used, rather than all SPEI indices. Overall, a good alignment between CDI and SPEI with crop yield was observed. In contrast, scPDSI failed to capture crop yield trends in certain years, a conclusion that can be quickly drawn from the visual inspection. In all districts, the relationship between all indices and crop yield reached its peak during the wet year of 2015. Between 2004 and 2009, scPDSI deviated from the crop yield trends, showing a smooth movement without significant fluctuations, whereas CDI and SPEI responded promptly to crop yield anomalies.

5.5. Spatiotemporal Distribution of Drought Indices

In this study, several maps were prepared to evaluate the spatial distribution of drought indices. Drought maps for the crop development periods in 2016, which was consistently represented as a drought year across all indices, are presented in Figure 7. In the first month of the development period, all indices indicate a wet month; however, according to the CDI, this is predominantly observed in the southeastern region, while SPEI and scPDSI show wetter conditions in the northern regions. Additionally, it is observed that as the SPEI month index increases, there is an increase in wet areas. In November 2016, CDI indicates widespread drought, while SPEI_6 and SPEI_9 show increasing wetness in the north. December also shows high drought according to CDI, SPEI_1, SPEI_3, SPEI_6, and scPDSI. In January, CDI suggests near-normal to slightly wet conditions, but SPEI indices are inconsistent, with some showing drought and others indicating wet areas in the north. February sees CDI and scPDSI aligning, showing drought in the west. By March, CDI notes drought in the east, with mixed conditions across SPEI indices. April is dominated by drought across all indices. In May, CDI shows wet conditions near the borders and drought in the central area of the study region, while SPEI transitions from wet to drought. June ends with normal to mild drought, with scPDSI highlighting widespread drought for most of the period.

The variability of the CDI over the study period is illustrated in Figure 8. The maps were generated using the average values for the nine months of the growing season, from October to June. The years 2001, 2005, 2008, 2014, 2016, and 2020 were mild to moderate drought years, while 2011 and 2015 were moderately to severely wet years. Drought years generally show variability along the north–south axis.

In the maps produced with the same resolution as CDI (1 km), a resolution of 5 km was chosen for mapping SPEI and scPDSI values to avoid issues related to scaling. Additionally, in Figure 9, which analyzes changes over the years, SPEI_3 values were used because they generally align better with crop yield. Similar to CDI, SPEI_3 was classified as mild drought across most of the study area in 2001, 2005, 2008, and 2016. The years 2011 and 2015 were classified as slightly wet and moderately wet, respectively. Unlike CDI values, which were generally classified as near-normal and mild drought in 2004, 2007, and 2020, SPEI_3 remained in the near-normal class across the entire region during these years. However, in 2014 and 2019, CDI and SPEI_3 were observed to belong to similar drought classes in similar areas.

When examining Figure 10, which illustrates the distribution of scPDSI values across the study area, 2001 is mildly to moderately drought-affected, similar to CDI and SPEI_3. However, when analyzing the spatial distribution, drought is more pronounced in the western and southern regions according to scPDSI and predominantly observed in the northern regions in CDI and SPEI_3. In 2002, a drought occurred in the northeastern region, which was not captured by CDI and SPEI_3. One noteworthy feature of the scPDSI values is the variation of drought along the northeast–southwest axis during consecutive years, particularly in 2007, 2008, and 2009. While 2012 could be classified as wet according to CDI and SPEI_3, scPDSI placed it in the very wet category. In contrast to CDI and SPEI_3, 2017 is generally depicted as drier in scPDSI.

5.6. Long-Term Evolution of Drought Indices

In this study, time series were generated to analyze the long-term drought values obtained through the CDI. As shown in Figure 11, the CDI and SPEI time series are presented on the left y-axis, while the scPDSI time series are depicted on the right y-axis. The reason for using two different vertical axes is that the scPDSI varies between −5 and +5, which might pose challenges in the multi evaluation of it with CDI and SPEI, which fluctuate between −2 and +2. One of the most significant findings is the similar behavior exhibited by the CDI and SPEI. The fluctuations in scPDSI values are less pronounced compared to the other two indices. SPEI, in terms of its composition, utilizes precipitation and ET calculations, which are largely based on temperature. This similarity in inputs results in SPEI displaying a similar trend to CDI. However, SPEI focuses only on PET, which is insufficient to represent surface water processes fully [116], and lacks input that provides information about vegetation conditions, leading to significant divergences in some cases. For instance, in the Altinekin district, SPEI values indicate a mild drought and fewer fluctuations compared to CDI values (Figure 11).

The scPDSI generally exhibits a similar pattern to CDI and SPEI, but significant differences are observed in certain years. For example, scPDSI indicates the presence of a prolonged drought from 2003 to 2009, with drought values remaining constant over several months. A similar pattern is observed between 2016 and 2020. In the districts of Aksehir, Beysehir, and Ilgin, a long-term drought with consistent values is present. One of the reasons for this could be that scPDSI values are calculated using the drought values from the previous month, which contributes to the persistence of similar values over extended periods.

5.7. In-Depth Analysis of Feature Importance and Dependencies

In this study, the SHAP algorithm’s feature importance values were used in the process of optimizing the weights for the CDI equation. This method not only calculates the importance of features but also visually represents their impact on the output through graphs. The SHAP summary plot in Figure 12 describes how individual indices affect crop yield and demonstrates their nonlinear responses.

Extremely low NDWI has a very strong negative impact, while higher values have a relatively lower positive impact. The second most important index, TCI, also shows that higher values lead to increased crop yield, while medium and low values have a decreasing effect on crop yield. The PCI slightly increases crop yield at average values, while very low values decrease it. GPP and VCI have a similar effect; higher values lead to an increase in crop yield, while lower values have a minimal negative effect. As for SMCI, very high values cause a slightly positive effect, while medium values have a slightly negative effect. However, SMCI turned out to be the individual index with both the smallest impact size and the narrowest range of effects.

Research examining the relationship between NDWI values and crop yield has also confirmed that NDWI values significantly impact crop yield [117]. There is no consensus on which input is the most important in these studies, as individual indices can vary greatly regionally and may have different levels of influence depending on the climatic classification of the area.

One of the SHAP method’s outputs is the dependence plots, which analyze the nonlinear effects of each input on the output. Figure 13 presents dependence plots for the six indices used in predicting crop yield. A detailed examination reveals that normalized GPP values between 0.2 and 0.3 significantly reduce crop yield, while values above 0.6 have a positive or stable effect. The NDWI plot shows a nearly monotonic positive relationship, with significant yield increases above 0.55. PCI values exhibit a decreasing effect between 0.2 and 0.35, followed by a fluctuating increase above 0.6. SMCI shows a neutral effect up to 0.75, with positive effects emerging afterward. TCI values demonstrate a negative effect up to 0.5, followed by an increase, while VCI shows a negative effect up to 0.4, turning slightly positive above 0.45. These analyses provide a comprehensive view of both the weights and nonlinear effects of the indices on crop yield.

5.8. Future Directions, Recommendations, and Practical Applications

Although the CDI developed in this study was created using innovative algorithms, it has certain limitations. The winter wheat yield values used in the study are limited to 20 years. Developing models with more extensive data is encouraged, noting that remote sensing time series are often limited in duration. Additionally, it has not been investigated whether the remote sensing data used in the study area require any additional corrections; the observations used are derived from established global algorithms. Researchers are strongly advised to explore this aspect and incorporate the need for corrections between the remote sensing data and ground observations in their study areas. In addition to improving temporal resolution, enhancements in spatial resolution could help with yield modeling in individual fields. The present study’s XGBoost hyperparameters optimization was challenging, and grid search algorithm addressed this challenge. Nevertheless, future research should investigate the potential of advanced heuristic methods. Given that XGBoost was employed for the construction of predictive models, TreeSHAP was utilized. In the context of larger datasets and broader regions, it is possible that deep learning models and the KernelSHAP method may yield superior results.

The model inputs include remote sensing datasets containing information on temperature, precipitation, soil moisture, vegetation condition, plant water status, and primary productivity. However, the study did not use evapotranspiration (ET) values. Future studies could benefit from incorporating the Evaporative Stress Index [118,119], which represents the ratio of potential ET to actual ET. Moreover, variables representing global ocean–atmosphere interactions, such as El Niño and La Niña, could further improve CDI calculations if they include independent information from the indices explored here.

Our study focused on one of the largest grain-producing regions in Türkiye. The primary motivation for this work was the fact that the area predominantly practices rain-fed agriculture, as crops grown in rain-fed areas tend to respond more sensitively to meteorological variables. In regions where irrigation data and the genetic characteristics of the crops used are available, it is recommended that these factors be included in the study to develop even more advanced indices.

Just as drought affects all sectors, the sector where its direct impacts are often most clearly seen is agriculture [120]. Among the most important stakeholders in drought studies are companies operating in the agricultural sector, local producers, and traders involved in the sale and marketing of agricultural products [121]. We hope that this study will contribute to the development of agricultural drought monitoring systems at the local level to enable decision-makers to address complex drought monitoring processes in greater detail and accelerate effective decision making.

In the USA, a country known for its advanced drought monitoring systems, CDIs are being developed using similar applications [122]. Various CDIs can be generated using different combinations of inputs. Countries like Türkiye, where agricultural products constitute a significant share of the gross national product, could improve decision-making processes by utilizing CDI-like indices. Since all the inputs used in creating CDI indices are obtained from free sources and are continuously updated, this study is valuable for the development of online drought monitoring systems that provide local-scale information to end-users.

6. Conclusions

In this study, the newly developed combined drought index, designed to more effectively monitor agricultural drought events, may offer more effective usage when used in conjunction with existing indices. The use of SHAP algorithm-determined weights and a mesoscale hydrologic model enhanced the reliability and accuracy of this new index. Notably, the index exhibited a higher correlation with wheat yields than widely used indices like SPEI and scPDSI, marking a significant contribution to agricultural drought monitoring. Furthermore, it is anticipated that the integration of increasing volumes of data from remote sensing systems with technological advancements will further enhance the efficiency of agricultural drought monitoring systems in the future. We anticipate that CDIs enhanced using ML techniques will continue to improve agricultural drought monitoring.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/rs16203799/s1, Figure S1. The mHM workflow. Figure S2. SHAP feature importance graph. Table S1. The mean [minimum: maximum] winter wheat yield (t/ha) per district in the study area from 2001 to 2020. Table S2. XGBoost hyperparameter range and optimum values. Table S3. XGBoost model performance evaluation results. Table S4. CDI weights obtained from PCA of the training set. Table S5. Cumulative variance explanation values of the principal components. Table S6. Correlation test results between drought indices and crop yield.

Author Contributions

Conceptualization, E.E.B.; formal analysis, E.E.B.; investigation, E.E.B.; methodology, E.E.B.; software, E.E.B.; supervision, M.C.D.; validation, E.E.B.; visualization, E.E.B.; writing—original draft, E.E.B. and P.C.S.; writing—review and editing, P.C.S., M.C.D., M.O. and J.A.O. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Scientific and Technological Research Council of Türkiye (TÜBİTAK), BIDEB2214-A program, grant number 1059B142201613.

Data Availability Statement

We also appreciate the open source data of GPP products (PMLV2, https://doi.org/10.6084/m9.figshare.14185739.v4 (accessed on 15 April 2024); land cover (GLAD, https://storage.googleapis.com/earthenginepartners-hansen/GLCLU2000-2020/v2/download.html (accessed on 15 April 2024); rainfall (CHIRPS, https://data.chc.ucsb.edu/products/CHIRPS-2.0/global_monthly/ (accessed on 15 April 2024); NDWI (https://developers.google.com/earth-engine/datasets/catalog/MODIS_MOD09GA_006_NDWI (accessed on 15 April 2024); LST (https://modis.gsfc.nasa.gov/data/dataprod/mod11.php (accessed on 15 April 2024) and NDVI (https://lpdaac.usgs.gov/products/mod13a3v006/ (accessed on 15 April 2024)).

Acknowledgments

The authors would like to thank the Scientific and Technological Research Council of Türkiye (TÜBİTAK) for their support in funding doctoral research. We would like to thank the General Directorate of State Hydraulic Works, Türkiye, for providing streamflow data. The research presented in this paper was produced as part of the first author’s Ph.D. thesis at Istanbul Technical University. Thanks to Sadegh Ranjbar for their scientific valuable comments during the preparation of this work.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

In the PML-V2 model, GPP is calculated based on photosynthesis (

A_{g s}

), constrained by the surface vapor pressure deficit (D).

G P P = A_{g s} f_{D}

(A1)

A_{g s} = \frac{P_{1} C_{a}}{k (P_{2} + P_{4})} {k L A I + l n \frac{P_{2} + P_{3} + P_{4}}{P_{2} + P_{3} \exp (k L A I) + P_{4}}}

(A2)

P_{1} = A_{m} β I_{0} η, P_{2} = A_{m} β I_{0}, P_{3} = A_{m} η C_{a}, P_{4} = β I_{0} η C_{a}

(A3)

A_{m} ≅ 0.5 \times V_{m}

(A4)

V_{m} = \frac{V_{m, 25} e x p [a (T - 25)]}{1 + \exp [b (T - 41)]}

(A5)

f_{D} = {\begin{matrix} 1 D < D_{m i n} \\ \frac{(D_{m a x} - D)}{D_{m a x} - D_{m i n}} D_{m i n} < D < D_{m a x} \\ 0 D > D_{m a x} \end{matrix}

(A6)

where f_D is the D constraint function, D_min is the threshold below which there is no vapor pressure constraint, D_max is the threshold above which there is no assimilation, D₀ is the reference vapor pressure deficit at stomatal conductance reduction, I₀ is the photosynthetically active radiation estimated from downwelling shortwave radiation, Ca is the atmospheric CO₂ concentration, V_m is the maximum catalytic capacity of Rubisco per unit leaf area at 25 °C, β is the initial photochemical efficiency, η is the initial value of the slope of CO₂ response curve, m is the stomatal conductance coefficient, P₁ represents the term calculated using the maximum photosynthetic capacity (A_m), β, I₀, and the η. P₂ includes the same components except for the efficiency factor, reflecting the direct effect of light on photosynthesis. P₃ defines the interaction between the maximum photosynthetic capacity and C_a along with efficiency. P₄ represents the combined effect of light intensity, CO₂, and efficiency.

References

Başakın, E.E.; Ekmekcioğlu, Ö.; Özger, M. Drought Prediction Using Hybrid Soft-Computing Methods for Semi-Arid Region. Model. Earth Syst. Environ. 2021, 7, 2363–2371. [Google Scholar] [CrossRef]
Mishra, A.K.; Singh, V.P. A Review of Drought Concepts. J. Hydrol. 2010, 391, 202–216. [Google Scholar] [CrossRef]
Spinoni, J.; Naumann, G.; Vogt, J.V. Pan-European Seasonal Trends and Recent Changes of Drought Frequency and Severity. Glob. Planet. Change 2017, 148, 113–130. [Google Scholar] [CrossRef]
Bhaga, T.D.; Dube, T.; Shekede, M.D.; Shoko, C. Impacts of Climate Variability and Drought on Surface Water Resources in Sub-Saharan Africa Using Remote Sensing: A Review. Remote Sens. 2020, 12, 4184. [Google Scholar] [CrossRef]
Kumari, P.; Rehana, S.; Singh, S.K.; Inayathulla, M. Development of a New Agro-Meteorological Drought Index (SPAEI-Agro) in a Data-Scarce Region. Hydrol. Sci. J. 2023, 68, 1301–1322. [Google Scholar] [CrossRef]
Hasan, H.H.; Mohd Razali, S.F.; Muhammad, N.S.; Ahmad, A. Research Trends of Hydrological Drought: A Systematic Review. Water 2019, 11, 2252. [Google Scholar] [CrossRef]
Wu, X.; Xu, H.; He, H.; Wu, Z.; Lu, G.; Liao, T. Agricultural Drought Monitoring Using an Enhanced Soil Water Deficit Index Derived from Remote Sensing and Model Data Merging. Remote Sens. 2024, 16, 2156. [Google Scholar] [CrossRef]
Pathak, A.A.; Dodamani, B.M. Connection between Meteorological and Groundwater Drought with Copula-Based Bivariate Frequency Analysis. J. Hydrol. Eng. 2021, 26, 05021015. [Google Scholar] [CrossRef]
Otkin, J.A.; Svoboda, M.; Hunt, E.D.; Ford, T.W.; Anderson, M.C.; Hain, C.; Basara, J.B. Flash Droughts: A Review and Assessment of the Challenges Imposed by Rapid-Onset Droughts in the United States. Bull. Am. Meteorol. Soc. 2018, 99, 911–919. [Google Scholar] [CrossRef]
Raposo, V.D.M.B.; Costa, V.A.F.; Rodrigues, A.F. A Review of Recent Developments on Drought Characterization, Propagation, and Influential Factors. Sci. Total Environ. 2023, 898, 165550. [Google Scholar] [CrossRef]
Wang, Y.; Liu, S.; Huang, S.; Zhou, Z.; Shi, H. Bivariate Assessment of Socioeconomic Drought Events Based on an Improved Socioeconomic Drought Index. J. Hydrol. 2023, 623, 129878. [Google Scholar] [CrossRef]
Raza, A.; Hussain, I.; Ali, Z.; Faisal, M.; Elashkar, E.E.; Shoukry, A.M.; Al-Deek, F.F.; Gani, S. A Seasonally Blended and Regionally Integrated Drought Index Using Bayesian Network Theory. Meteorol. Appl. 2021, 28, e1992. [Google Scholar] [CrossRef]
Tu, X.; Wu, H.; Singh, V.P.; Chen, X.; Lin, K.; Xie, Y. Multivariate Design of Socioeconomic Drought and Impact of Water Reservoirs. J. Hydrol. 2018, 566, 192–204. [Google Scholar] [CrossRef]
Li, L.; Peng, Q.; Wang, M.; Cao, Y.; Gu, X.; Cai, H. Quantitative Analysis of Vegetation Drought Propagation Process and Uncertainty in the Yellow River Basin. Agric. Water Manag. 2024, 295, 108775. [Google Scholar] [CrossRef]
Naumann, G.; Cammalleri, C.; Mentaschi, L.; Feyen, L. Increased Economic Drought Impacts in Europe with Anthropogenic Warming. Nat. Clim. Change 2021, 11, 485–491. [Google Scholar] [CrossRef]
Zhang, M.; Fernández-Torres, M.-Á.; Camps-Valls, G. Domain Knowledge-Driven Variational Recurrent Networks for Drought Monitoring. Remote Sens. Environ. 2024, 311, 114252. [Google Scholar] [CrossRef]
Cavus, Y.; Stahl, K.; Aksoy, H. Drought Intensity–Duration–Frequency Curves Based on Deficit in Precipitation and Streamflow for Water Resources Management. Hydrol. Earth Syst. Sci. 2023, 27, 3427–3445. [Google Scholar] [CrossRef]
Han, J.; Singh, V.P. A Review of Widely Used Drought Indices and the Challenges of Drought Assessment under Climate Change. Environ. Monit. Assess. 2023, 195, 1438. [Google Scholar] [CrossRef]
McKee, T.B.; Doesken, N.J.; Kleist, J. The Relationship of Drought Frequency and Duration to Time Scales; American Meteorological Society: Anaheim, CA, USA, 1993; pp. 179–184. [Google Scholar]
Vicente-Serrano, S.M.; Beguería, S.; López-Moreno, J.I. A Multiscalar Drought Index Sensitive to Global Warming: The Standardized Precipitation Evapotranspiration Index. J. Clim. 2010, 23, 1696–1718. [Google Scholar] [CrossRef]
Palmer, W.C. Meteorological Drought; Office of Climatology Research Paper 45; Weather Bureau: Washington, DC, USA, 1965; p. 58. [Google Scholar]
Bergman, K.H.; Sabol, P.; Miskus, D. Experimental Indices for Monitoring Global Drought Conditions; United States Department of Commerce: Cambridge, MA, USA, 1988. [Google Scholar]
Narasimhan, B.; Srinivasan, R. Development and Evaluation of Soil Moisture Deficit Index (SMDI) and Evapotranspiration Deficit Index (ETDI) for Agricultural Drought Monitoring. Agric. For. Meteorol. 2005, 133, 69–88. [Google Scholar] [CrossRef]
Nalbantis, I.; Tsakiris, G. Assessment of Hydrological Drought Revisited. Water Resour. Manag. 2009, 23, 881–897. [Google Scholar] [CrossRef]
Huete, A.; Didan, K.; Miura, T.; Rodriguez, E.P.; Gao, X.; Ferreira, L.G. Overview of the Radiometric and Biophysical Performance of the MODIS Vegetation Indices. Remote Sens. Environ. 2002, 83, 195–213. [Google Scholar] [CrossRef]
Tsakiris, G.; Vangelis, H. Establishing a Drought Index Incorporating Evapotranspiration. Eur. Water 2005, 9, 3–11. [Google Scholar]
Hobbins, M.T.; Wood, A.; McEvoy, D.J.; Huntington, J.L.; Morton, C.; Anderson, M.; Hain, C. The Evaporative Demand Drought Index. Part I: Linking Drought Evolution to Variations in Evaporative Demand. J. Hydrometeorol. 2016, 17, 1745–1761. [Google Scholar] [CrossRef]
Byun, H.-R.; Wilhite, D.A. Objective Quantification of Drought Severity and Duration. J. Clim. 1999, 12, 2747–2756. [Google Scholar] [CrossRef]
Gunda, T.; Hornberger, G.M.; Gilligan, J.M. Spatiotemporal Patterns of Agricultural Drought in Sri Lanka: 1881–2010. Int. J. Climatol. 2016, 36, 563–575. [Google Scholar] [CrossRef]
Li, J.; Li, Y.; Yin, L.; Zhao, Q. A Novel Composite Drought Index Combining Precipitation, Temperature and Evapotranspiration Used for Drought Monitoring in the Huang-Huai-Hai Plain. Agric. Water Manag. 2024, 291, 108626. [Google Scholar] [CrossRef]
AghaKouchak, A.; Farahmand, A.; Melton, F.S.; Teixeira, J.; Anderson, M.C.; Wardlow, B.D.; Hain, C.R. Remote Sensing of Drought: Progress, Challenges and Opportunities. Rev. Geophys. 2015, 53, 452–480. [Google Scholar] [CrossRef]
Kogan, F.N. Application of Vegetation Index and Brightness Temperature for Drought Detection. Adv. Space Res. 1995, 15, 91–100. [Google Scholar] [CrossRef]
Gao, B. NDWI—A Normalized Difference Water Index for Remote Sensing of Vegetation Liquid Water from Space. Remote Sens. Environ. 1996, 58, 257–266. [Google Scholar] [CrossRef]
Zhang, A.; Jia, G. Monitoring Meteorological Drought in Semiarid Regions Using Multi-Sensor Microwave Remote Sensing Data. Remote Sens. Environ. 2013, 134, 12–23. [Google Scholar] [CrossRef]
Hanadé Houmma, I.; Gadal, S.; El Mansouri, L.; Garba, M.; Gbetkom, P.G.; Mamane Barkawi, M.B.; Hadria, R. A New Multivariate Agricultural Drought Composite Index Based on Random Forest Algorithm and Remote Sensing Data Developed for Sahelian Agrosystems. Geomat. Nat. Hazards Risk 2023, 14, 2223384. [Google Scholar] [CrossRef]
Kogan, F.N. Operational Space Technology for Global Vegetation Assessment. Bull. Am. Meteorol. Soc. 2001, 82, 1949–1964. [Google Scholar] [CrossRef]
Danodia, A.; Kushwaha, A.; Patel, N.R. Remote Sensing-Derived Combined Index for Agricultural Drought Assessment of Rabi Pulse Crops in Bundelkhand Region, India. Environ. Dev. Sustain. 2021, 23, 15432–15449. [Google Scholar] [CrossRef]
Karimi, M.; Shahedi, K.; Raziei, T.; Miryaghoubzadeh, M. Meteorological and Agricultural Drought Monitoring in Southwest of Iran Using a Remote Sensing-Based Combined Drought Index. Stoch. Environ. Res. Risk Assess. 2022, 36, 3707–3724. [Google Scholar] [CrossRef]
Wei, W.; Yan, P.; Zhou, L.; Zhang, H.; Xie, B.; Zhou, J. A Comprehensive Drought Index Based on Spatial Principal Component Analysis and Its Application in Northern China. Environ. Monit. Assess. 2024, 196, 193. [Google Scholar] [CrossRef]
Shashikant, V.; Mohamed Shariff, A.R.; Wayayok, A.; Kamal, M.R.; Lee, Y.P.; Takeuchi, W. Utilizing TVDI and NDWI to Classify Severity of Agricultural Drought in Chuping, Malaysia. Agronomy 2021, 11, 1243. [Google Scholar] [CrossRef]
Zhang, Y.; Wang, P.; Chen, Y.; Yang, J.; Wu, D.; Ma, Y.; Huo, Z.; Liu, S. Daily Dynamic Thresholds of Different Agricultural Drought Grades for Summer Maize Based on the Vegetation Water Index. J. Hydrol. 2023, 625, 130070. [Google Scholar] [CrossRef]
Patil, P.P.; Jagtap, M.P.; Khatri, N.; Madan, H.; Vadduri, A.A.; Patodia, T. Exploration and Advancement of NDDI Leveraging NDVI and NDWI in Indian Semi-Arid Regions: A Remote Sensing-Based Study. Case Stud. Chem. Environ. Eng. 2024, 9, 100573. [Google Scholar] [CrossRef]
Delbart, N.; Kergoat, L.; Le Toan, T.; Lhermitte, J.; Picard, G. Determination of Phenological Dates in Boreal Regions Using Normalized Difference Water Index. Remote Sens. Environ. 2005, 97, 26–38. [Google Scholar] [CrossRef]
Olmos-Trujillo, E.; González-Trinidad, J.; Júnez-Ferreira, H.; Pacheco-Guerrero, A.; Bautista-Capetillo, C.; Avila-Sandoval, C.; Galván-Tejada, E. Spatio-Temporal Response of Vegetation Indices to Rainfall and Temperature in A Semiarid Region. Sustainability 2020, 12, 1939. [Google Scholar] [CrossRef]
Başakın, E.E.; Stoy, P.C.; Demirel, M.C.; Pham, Q.B. Spatiotemporal Variability of Gross Primary Productivity in Türkiye: A Multi-Source and Multi-Method Assessment. Remote Sens. 2024, 16, 1994. [Google Scholar] [CrossRef]
Yang, P.; Zhai, X.; Huang, H.; Zhang, Y.; Zhu, Y.; Shi, X.; Zhou, L.; Fu, C. Association and Driving Factors of Meteorological Drought and Agricultural Drought in Ningxia, Northwest China. Atmos. Res. 2023, 289, 106753. [Google Scholar] [CrossRef]
Wei, X.; He, W.; Zhou, Y.; Cheng, N.; Xiao, J.; Bi, W.; Liu, Y.; Sun, S.; Ju, W. Increased Sensitivity of Global Vegetation Productivity to Drought Over the Recent Three Decades. JGR Atmos. 2023, 128, e2022JD037504. [Google Scholar] [CrossRef]
Zhang, R.; E, X.; Ma, Z.; An, Y.; Bao, Q.; Wu, Z.; Wu, L.; Sun, Z. Drought Sensitivity and Vulnerability of Rubber Plantation GPP—Insights from Flux Site-Based Simulation. Land 2024, 13, 745. [Google Scholar] [CrossRef]
Pendergrass, A.G.; Meehl, G.A.; Pulwarty, R.; Hobbins, M.; Hoell, A.; AghaKouchak, A.; Bonfils, C.J.W.; Gallant, A.J.E.; Hoerling, M.; Hoffmann, D.; et al. Flash Droughts Present a New Challenge for Subseasonal-to-Seasonal Prediction. Nat. Clim. Change 2020, 10, 191–199. [Google Scholar] [CrossRef]
Hao, C.; Zhang, J.; Yao, F. Combination of Multi-Sensor Remote Sensing Data for Drought Monitoring over Southwest China. Int. J. Appl. Earth Obs. Geoinf. 2015, 35, 270–283. [Google Scholar] [CrossRef]
Zhang, L.; Jiao, W.; Zhang, H.; Huang, C.; Tong, Q. Studying Drought Phenomena in the Continental United States in 2011 and 2012 Using Various Drought Indices. Remote Sens. Environ. 2017, 190, 96–106. [Google Scholar] [CrossRef]
Parthsarthi Pandya, N.K.; Gontia, H.V. Parmar Development of PCA-Based Composite Drought Index for Agricultural Drought Assessment Using Remote-Sensing. J. Agrometeorol. 2022, 24, 384–392. [Google Scholar] [CrossRef]
Arab, S.T.; Noguchi, R.; Ahamed, T. Yield Loss Assessment of Grapes Using Composite Drought Index Derived from Landsat OLI and TIRS Datasets. Remote Sens. Appl. Soc. Environ. 2022, 26, 100727. [Google Scholar] [CrossRef]
Tang, X.; Feng, Y.; Gao, C.; Lei, Z.; Chen, S.; Wang, R.; Jin, Y.; Tong, X. Entropy-Weight-Based Spatiotemporal Drought Assessment Using MODIS Products and Sentinel-1A Images in Urumqi, China. Nat. Hazards 2023, 119, 387–408. [Google Scholar] [CrossRef]
Waseem, M.; Ajmal, M.; Kim, T.-W. Development of a New Composite Drought Index for Multivariate Drought Assessment. J. Hydrol. 2015, 527, 30–37. [Google Scholar] [CrossRef]
Cai, S.; Zuo, D.; Wang, H.; Xu, Z.; Wang, G.; Yang, H. Assessment of Agricultural Drought Based on Multi-Source Remote Sensing Data in a Major Grain Producing Area of Northwest China. Agric. Water Manag. 2023, 278, 108142. [Google Scholar] [CrossRef]
Shi, X.; Ding, H.; Wu, M.; Shi, M.; Chen, F.; Li, Y.; Yang, Y. A Comprehensive Drought Monitoring Method Integrating Multi-Source Data. PeerJ 2022, 10, e13560. [Google Scholar] [CrossRef]
Suo, N.; Xu, C.; Cao, L.; Song, L.; Lei, X. A Copula-Based Parametric Composite Drought Index for Drought Monitoring and Applicability in Arid Central Asia. CATENA 2024, 235, 107624. [Google Scholar] [CrossRef]
Kanthavel, P.; Saxena, C.K.; Singh, R.K. Integrated Drought Index Based on Vine Copula Modelling. Int. J. Climatol. 2022, 42, 9510–9529. [Google Scholar] [CrossRef]
Salvia, A.L.; Brandli, L.L.; Leal Filho, W.; Locatelli Kalil, R.M. An Analysis of the Applications of Analytic Hierarchy Process (AHP) for Selection of Energy Efficiency Practices in Public Lighting in a Sample of Brazilian Cities. Energy Policy 2019, 132, 854–864. [Google Scholar] [CrossRef]
Hanadé Houmma, I.; El Mansouri, L.; Hadria, R.; Emran, A.; Chehbouni, A. Retrospective Analysis and Version Improvement of the Satellite-Based Drought Composite Index. A semi-arid Tensift-Morocco application. Geocarto Int. 2022, 37, 3069–3090. [Google Scholar] [CrossRef]
Wallace, M.L.; Mentch, L.; Wheeler, B.J.; Tapia, A.L.; Richards, M.; Zhou, S.; Yi, L.; Redline, S.; Buysse, D.J. Use and Misuse of Random Forest Variable Importance Metrics in Medicine: Demonstrations through Incident Stroke Prediction. BMC Med. Res. Methodol. 2023, 23, 144. [Google Scholar] [CrossRef]
He, Z.; Yang, Y.; Fang, R.; Zhou, S.; Zhao, W.; Bai, Y.; Li, J.; Wang, B. Integration of Shapley Additive Explanations with Random Forest Model for Quantitative Precipitation Estimation of Mesoscale Convective Systems. Front. Environ. Sci. 2023, 10, 1057081. [Google Scholar] [CrossRef]
Ergün, E.; Demirel, M.C. On the Use of Distributed Hydrologic Model for Filling Large Gaps at Different Parts of the Streamflow Data. Eng. Sci. Technol. Int. J. 2023, 37, 101321. [Google Scholar] [CrossRef]
Koycegiz, C.; Sen, O.L.; Buyukyildiz, M. An Analysis of Terrestrial Water Storage Changes of a Karstic, Endorheic Basin in Central Anatolia, Turkey. Ecohydrol. Hydrobiol. 2023, 23, 688–702. [Google Scholar] [CrossRef]
Kececi, H.; Cali, B.; Basar, S. KONYA Tarim Istatistikleri; Konya İl Tarım ve Orman Müdürlüğü: Konya, Turkey, 2021. [Google Scholar]
Climate Hazards Center. CHIRPSv2.0Climate Hazards Center InfraRed Precipitation with Station (CHIRPSv2.0); Climate Hazards Center - UC Santa Barbara: Santa Barbara, CA, USA, 2015. [Google Scholar]
Funk, C.; Peterson, P.; Landsfeld, M.; Pedreros, D.; Verdin, J.; Shukla, S.; Husak, G.; Rowland, J.; Harrison, L.; Hoell, A.; et al. The Climate Hazards Infrared Precipitation with Stations—A New Environmental Record for Monitoring Extremes. Sci Data 2015, 2, 150066. [Google Scholar] [CrossRef]
Aksu, H.; Akgül, M.A. Performance Evaluation of CHIRPS Satellite Precipitation Estimates over Turkey. Theor. Appl. Clim. 2020, 142, 71–84. [Google Scholar] [CrossRef]
Aksu, H.; Cavus, Y.; Aksoy, H.; Akgul, M.A.; Turker, S.; Eris, E. Spatiotemporal Analysis of Drought by CHIRPS Precipitation Estimates. Theor. Appl. Clim. 2022, 148, 517–529. [Google Scholar] [CrossRef]
Orieschnig, C.; Cavus, Y. Spatial Characterization of Drought through CHIRPS and a Station-Based Dataset in the Eastern Mediterranean. Proc. IAHS 2024, 385, 79–84. [Google Scholar] [CrossRef]
Kim, K.Y.; Haagenson, R.; Kansara, P.; Rajaram, H.; Lakshmi, V. Augmenting Daily MODIS LST with AIRS Surface Temperature Retrievals to Estimate Ground Temperature and Permafrost Extent in High Mountain Asia. Remote Sens. Environ. 2024, 305, 114075. [Google Scholar] [CrossRef]
Yin, G.; Zhang, H. A New Integrated Index for Drought Stress Monitoring Based on Decomposed Vegetation Response Factors. J. Hydrol. 2023, 618, 129252. [Google Scholar] [CrossRef]
Jiao, W.; Wang, L.; Novick, K.A.; Chang, Q. A New Station-Enabled Multi-Sensor Integrated Index for Drought Monitoring. J. Hydrol. 2019, 574, 169–180. [Google Scholar] [CrossRef]
Baig, M.H.A.; Abid, M.; Khan, M.R.; Jiao, W.; Amin, M.; Adnan, S. Assessing Meteorological and Agricultural Drought in Chitral Kabul River Basin Using Multiple Drought Indices. Remote Sens. 2020, 12, 1417. [Google Scholar] [CrossRef]
Rahman, K.U.; Ejaz, N.; Shang, S.; Balkhair, K.S.; Alghamdi, K.M.; Zaman, K.; Khan, M.A.; Hussain, A. A Robust Integrated Agricultural Drought Index under Climate and Land Use Variations at the Local Scale in Pakistan. Agric. Water Manag. 2024, 295, 108748. [Google Scholar] [CrossRef]
Wang, Z.; Wang, Z.; Xiong, J.; He, W.; Yong, Z.; Wang, X. Responses of the Remote Sensing Drought Index with Soil Information to Meteorological and Agricultural Droughts in Southeastern Tibet. Remote Sens. 2022, 14, 6125. [Google Scholar] [CrossRef]
Demirci, U.; Demirel, M.C. Effect of Dynamic PET Scaling with LAI and Aspect on the Spatial Performance of a Distributed Hydrologic Model. Agronomy 2023, 13, 534. [Google Scholar] [CrossRef]
Zhang, X.; Zhang, Y.; Kong, D. Global Monthly GPP, ET, Ec, Es and Ei Simulated by PML-V2 with AVHRR Data at a 0.05 Degree Resolution over 1982–2014. 2023. [Google Scholar]
Leuning, R.; Zhang, Y.Q.; Rajaud, A.; Cleugh, H.; Tu, K. A Simple Surface Conductance Model to Estimate Regional Evaporation Using MODIS Leaf Area Index and the Penman-Monteith Equation. Water Resour. Res. 2008, 44, 2007WR006562. [Google Scholar] [CrossRef]
Pei, Y.; Dong, J.; Zhang, Y.; Yang, J.; Zhang, Y.; Jiang, C.; Xiao, X. Performance of Four State-of-the-Art GPP Products (VPM, MOD17, BESS and PML) for Grasslands in Drought Years. Ecol. Inform. 2020, 56, 101052. [Google Scholar] [CrossRef]
Gan, R.; Zhang, Y.; Shi, H.; Yang, Y.; Eamus, D.; Cheng, L.; Chiew, F.H.S.; Yu, Q. Use of Satellite Leaf Area Index Estimating Evapotranspiration and Gross Assimilation for Australian Ecosystems. Ecohydrology 2018, 11, e1974. [Google Scholar] [CrossRef]
Naeem, S.; Zhang, Y.; Zhang, X.; Rehman, A.U.; Tang, Z.; Xu, Z.; Li, C.; Azeem, T. Recent Change in Ecosystem Water Use Efficiency in China Mainly Dominated by Vegetation Greening and Increased CO₂. Remote Sens. Environ. 2023, 298, 113811. [Google Scholar] [CrossRef]
Potapov, P.; Hansen, M.C.; Pickens, A.; Hernandez-Serna, A.; Tyukavina, A.; Turubanova, S.; Zalles, V.; Li, X.; Khan, A.; Stolle, F.; et al. The Global 2000–2020 Land Cover and Land Use Change Dataset Derived from the Landsat Archive: First Results. Front. Remote Sens. 2022, 3, 856903. [Google Scholar] [CrossRef]
Tapas, M.R.; Kumar, U.; Mogili, S.; Jayakumar, K.V. Development of Multivariate Integrated Drought Monitoring Index (MIDMI) for Warangal Region of Telangana, India. J. Water Clim. Change 2022, 13, 1612–1630. [Google Scholar] [CrossRef]
Samaniego, L.; Kumar, R.; Attinger, S. Multiscale Parameter Regionalization of a Grid-based Hydrologic Model at the Mesoscale. Water Resour. Res. 2010, 46, 2008WR007327. [Google Scholar] [CrossRef]
Kumar, R.; Samaniego, L.; Attinger, S. Implications of Distributed Hydrologic Model Parameterization on Water Fluxes at Multiple Scales and Locations. Water Resour. Res. 2013, 49, 360–379. [Google Scholar] [CrossRef]
Araneda-Cabrera, R.J.; Bermúdez, M.; Puertas, J. Assessment of the Performance of Drought Indices for Explaining Crop Yield Variability at the National Scale: Methodological Framework and Application to Mozambique. Agric. Water Manag. 2021, 246, 106692. [Google Scholar] [CrossRef]
Zipper, S.C.; Qiu, J.; Kucharik, C.J. Drought Effects on US Maize and Soybean Production: Spatiotemporal Patterns and Historical Changes. Environ. Res. Lett. 2016, 11, 094021. [Google Scholar] [CrossRef]
Chen, Y.; Huang, J.; Song, X.; Gao, P.; Wan, S.; Shi, L.; Wang, X. Spatiotemporal Characteristics of Winter Wheat Waterlogging in the Middle and Lower Reaches of the Yangtze River, China. Adv. Meteorol. 2018, 2018, 3542103. [Google Scholar] [CrossRef]
Lischeid, G.; Webber, H.; Sommer, M.; Nendel, C.; Ewert, F. Machine Learning in Crop Yield Modelling: A Powerful Tool, but No Surrogate for Science. Agric. For. Meteorol. 2022, 312, 108698. [Google Scholar] [CrossRef]
Liu, W.; Li, Z.; Li, Y.; Ye, T.; Chen, S.; Liu, Y. Heterogeneous Impacts of Excessive Wetness on Maize Yields in China: Evidence from Statistical Yields and Process-Based Crop Models. Agric. For. Meteorol. 2022, 327, 109205. [Google Scholar] [CrossRef]
Başakın, E.E. Comments on “Groundwater Quality Modeling Using a Novel Hybrid Data-Intelligence Model Based on Gray Wolf Optimization Algorithm and Multi-Layer Perceptron Artificial Neural Network: A Case Study in Asadabad Plain, Hamedan, Iran” Cheraghi, Mehrdad et al. (10.1007/S11356-021–16300-4). Environ. Sci. Pollut. Res. 2022, 29, 41869–41871. [Google Scholar] [CrossRef]
Lu, J.; Carbone, G.J.; Gao, P. Detrending Crop Yield Data for Spatial Visualization of Drought Impacts in the United States, 1895–2014. Agric. For. Meteorol. 2017, 237–238, 196–208. [Google Scholar] [CrossRef]
Laimighofer, J.; Laaha, G. How Standard Are Standardized Drought Indices? Uncertainty Components for the SPI & SPEI Case. J. Hydrol. 2022, 613, 128385. [Google Scholar] [CrossRef]
Beguería, S.; Vicente-Serrano, S.M.; Reig, F.; Latorre, B. Standardized Precipitation Evapotranspiration Index (SPEI) Revisited: Parameter Fitting, Evapotranspiration Models, Tools, Datasets and Drought Monitoring. Int. J. Climatol. 2014, 34, 3001–3023. [Google Scholar] [CrossRef]
Beguería, S. Sbegueria/SPEIbase: Version 2.7; DIGITAL.CSIC: Madrid, Spain, 2022. [Google Scholar]
Shen, Z.; Zhang, Q.; Singh, V.P.; Sun, P.; Song, C.; Yu, H. Agricultural Drought Monitoring across Inner Mongolia, China: Model Development, Spatiotemporal Patterns and Impacts. J. Hydrol. 2019, 571, 793–804. [Google Scholar] [CrossRef]
Wells, N.; Goddard, S.; Hayes, M.J. A Self-Calibrating Palmer Drought Severity Index. J. Clim. 2004, 17, 2335–2351. [Google Scholar] [CrossRef]
Van Der Schrier, G.; Barichivich, J.; Briffa, K.R.; Jones, P.D. A scPDSI-based Global Data Set of Dry and Wet Spells for 1901–2009. JGR Atmos. 2013, 118, 4025–4048. [Google Scholar] [CrossRef]
Zhang, Y.; Bu, J.; Zuo, X.; Yu, K.; Wang, Q.; Huang, W. Vegetation Water Content Retrieval from Spaceborne GNSS-R and Multi-Source Remote Sensing Data Using Ensemble Machine Learning Methods. Remote Sens. 2024, 16, 2793. [Google Scholar] [CrossRef]
Gupta, A.; Gowda, S.; Tiwari, A.; Gupta, A.K. XGBoost-SHAP Framework for Asphalt Pavement Condition Evaluation. Constr. Build. Mater. 2024, 426, 136182. [Google Scholar] [CrossRef]
Bushenkova, A.; Soares, P.M.M.; Johannsen, F.; Lima, D.C.A. Towards an Improved Representation of the Urban Heat Island Effect: A Multi-Scale Application of XGBoost for Madrid. Urban Clim. 2024, 55, 101982. [Google Scholar] [CrossRef]
Mandal, J.; Chatterjee, C.; Das, S. An Explainable Machine Learning Technique to Forecast Lightning Density over North-Eastern India. J. Atmos. Sol. Terr. Phys. 2024, 259, 106255. [Google Scholar] [CrossRef]
Shapley, L.S. A Value for N-Person Games. In Contributions to the Theory of Games (AM-28); Kuhn, H.W., Tucker, A.W., Eds.; Princeton University Press: Princeton, NJ, USA, 1953; Volume 2, pp. 307–318. ISBN 978-1-4008-8197-0. [Google Scholar]
Ekmekcioğlu, Ö.; Koc, K.; Özger, M.; Işık, Z. Exploring the Additional Value of Class Imbalance Distributions on Interpretable Flash Flood Susceptibility Prediction in the Black Warrior River Basin, Alabama, United States. J. Hydrol. 2022, 610, 127877. [Google Scholar] [CrossRef]
Koc, K. Role of Shapley Additive Explanations and Resampling Algorithms for Contract Failure Prediction of Public–Private Partnership Projects. J. Manag. Eng. 2023, 39, 04023031. [Google Scholar] [CrossRef]
Lundberg, S.M.; Erion, G.; Chen, H.; DeGrave, A.; Prutkin, J.M.; Nair, B.; Katz, R.; Himmelfarb, J.; Bansal, N.; Lee, S.-I. From Local Explanations to Global Understanding with Explainable AI for Trees. Nat. Mach. Intell. 2020, 2, 56–67. [Google Scholar] [CrossRef]
Ma, Y.; Wang, J.; Xiong, J.; Sun, M.; Wang, J. Risk Assessment for Cropland Abandonment in Mountainous Area Based on AHP and PCA—Take Yunnan Province in China as an Example. Ecol. Indic. 2024, 158, 111287. [Google Scholar] [CrossRef]
Jolliffe, I.T.; Cadima, J. Principal Component Analysis: A Review and Recent Developments. Phil. Trans. R. Soc. A 2016, 374, 20150202. [Google Scholar] [CrossRef] [PubMed]
Zhao, Q.; Zhang, X.; Li, C.; Xu, Y.; Fei, J. Compound Ecological Drought Assessment of China Using a Copula-Based Drought Index. Ecol. Indic. 2024, 164, 112141. [Google Scholar] [CrossRef]
Nigatu, Z.M.; Fan, D.; You, W.; Melesse, A.M. Hydroclimatic Extremes Evaluation Using GRACE/GRACE-FO and Multidecadal Climatic Variables over the Nile River Basin. Remote Sens. 2021, 13, 651. [Google Scholar] [CrossRef]
Gavrilov, M.B.; Radaković, M.G.; Sipos, G.; Mezősi, G.; Gavrilov, G.; Lukić, T.; Basarin, B.; Benyhe, B.; Fiala, K.; Kozák, P.; et al. Aridity in the Central and Southern Pannonian Basin. Atmosphere 2020, 11, 1269. [Google Scholar] [CrossRef]
Raphael, M.W.; Benedict, M.M.; James, M.R. Analysis of Spatial and Temporal Drought Variability in a Tropical River Basin Using Palmer Drought Severity Index (PDSI). Int. J. Water Res. Environ. Eng. 2017, 9, 178–190. [Google Scholar] [CrossRef]
Chang, W.; Li, W.; Ma, H.; Wang, D.; Bandala, E.R.; Yu, Y.; Rodrigo-Comino, J. An Integrated Approach for Shaping Drought Characteristics at the Watershed Scale. J. Hydrol. 2022, 604, 127248. [Google Scholar] [CrossRef]
Abrar Faiz, M.; Zhang, Y.; Tian, X.; Tian, J.; Zhang, X.; Ma, N.; Aryal, S. Drought Index Revisited to Assess Its Response to Vegetation in Different Agro-Climatic Zones. J. Hydrol. 2022, 614, 128543. [Google Scholar] [CrossRef]
Tuvdendorj, B.; Wu, B.; Zeng, H.; Batdelger, G.; Nanzad, L. Determination of Appropriate Remote Sensing Indices for Spring Wheat Yield Estimation in Mongolia. Remote Sens. 2019, 11, 2568. [Google Scholar] [CrossRef]
Anderson, M.C.; Norman, J.M.; Mecikalski, J.R.; Otkin, J.A.; Kustas, W.P. A Climatological Study of Evapotranspiration and Moisture Stress across the Continental United States Based on Thermal Remote Sensing: 1. Model Formulation. J. Geophys. Res. 2007, 112, 2006JD007506. [Google Scholar] [CrossRef]
Anderson, M.C.; Norman, J.M.; Mecikalski, J.R.; Otkin, J.A.; Kustas, W.P. A Climatological Study of Evapotranspiration and Moisture Stress across the Continental United States Based on Thermal Remote Sensing: 2. Surface Moisture Climatology. J. Geophys. Res. 2007, 112, 2006JD007507. [Google Scholar] [CrossRef]
Sharma, A. Nexus of Drought, Relief Finances, and Economic Growth: Evidence from Indian States. Nat. Hazards Rev. 2024, 25, 04024033. [Google Scholar] [CrossRef]
Orimoloye, I.R. Agricultural Drought and Its Potential Impacts: Enabling Decision-Support for Food Security in Vulnerable Regions. Front. Sustain. Food Syst. 2022, 6, 838824. [Google Scholar] [CrossRef]
Svoboda, M.; LeComte, D.; Hayes, M.; Heim, R.; Gleason, K.; Angel, J.; Rippey, B.; Tinker, R.; Palecki, M.; Stooksbury, D.; et al. The drought monitor. Bull. Am. Meteor. Soc. 2002, 83, 1181–1190. [Google Scholar] [CrossRef]

Figure 1. (a) The study area in Türkiye, (b) districts in the study area, (c) a digital elevation map, and (d) cropland areas.

Figure 2. Study flowchart.

Figure 3. Loading plots of PCA analysis results represent the four selected provinces within the study area, with the selected districts for Aksaray, Karaman, Konya, and Niğde being Gülagaç, Ayrancı, Derbent, and Çiftlik, respectively.

Figure 4. Box plots of correlation coefficients between drought indices and crop yield: (a) train set, (b) test set.

Figure 5. Spatial distribution of correlation results between drought indices and crop yield.

Figure 6. Annual time series of drought indices and crop yield of the test districts.

Figure 7. CDI, SPEI, and scPDSI maps for all months of the winter wheat growing season during the 2016 dry year.

Figure 8. The CDI drought map for winter wheat growing seasons in the study region.

Figure 9. SPEI drought map for the winter wheat growing season across the study area.

Figure 10. The same as Figure 9 but for scPDSI.

Figure 11. Long-term time series of CDI, SPEI, and scPDSI for the districts in the test set.

Figure 12. SHAP summary plot calculated from predictions obtained using the XGBoost method.

Figure 13. Dependences plots obtained after SHAP analysis of single indices used to calculate CDI.

Table 1. Descriptive information of the data.

Data	Spatial Resolution	Temporal Resolution	Time Range	Sources	Product Name
Rainfall	0.25 degree	Monthly	1 January 2000–30 December 2021	CHIRPS	CHIRPS-2.0
LST	1000 m	Daily	1 February 2000–30 December 2021	MODIS	MOD11A1
NDVI	1000 m	Monthly	1 February 2000–30 December 2021	MODIS	MOD13A3
NDWI	0.05 degree	Daily	24 February 2000–30 December 2021	MODIS	MOD09GA v061
GPP	0.05 degree	Monthly	1 March 2000–30 December 2021	MODIS and GLDAS	PML_V2 0.1.7
Land Cover	30 m	-	2020	GLAD
Crop Yield	District Based	Annual	2001–2020	TUIK
Soil Moisture	500 m	Monthly	1 January 2000–30 December 2021	mHM

Table 2. Classification thresholds for CDI, SPEI, and scPDSI.

CDI and SPEI	Class	scPDSI	Class
$\leq - 2$	Extreme Drought	$\leq - 4$	Extreme Drought
−1.5 to −1.99	Severe Drought	−3 to −3.99	Severe Drought
−1 to −1.49	Moderate Drought	−2 to −2.99	Moderate Drought
−0.5 to −0.99	Mild Drought	−1 to −1.99	Mild Drought
−0.49 to −0.49	Near Normal	−0.5 to −0.99	Incipient Drought
0.5–0.99	Slightly Wet	0.49 to −0.49	Near Normal
1–1.49	Moderately Wet	0.5–0.99	Incipient Wet
1.5–1.99	Severely Wet	1–1.99	Slightly Wet
$\geq 2$	Extremely Wet	2–2.99	Moderately Wet
		3–3.99	Very Wet
		$\geq 4$	Extremely Wet

Table 3. Correlation results for different CDI weight calculation methods.

District	SHAP	PCA	Empirical	District	SHAP	PCA	Empirical
Agacoren	0.008	−0.039	−0.011	Guneysinir	0.182	0.120	0.080
Akoren	0.439 *	0.417 *	0.438 *	Guzelyurt	0.340	−0.056	0.036
Aksaray_Merkez	0.681 *	0.636 *	0.644 *	Huyuk	0.689 *	0.723 *	0.708 *
Aksehir	0.583 *	0.319	0.389 *	Ilgin	0.794 *	0.759 *	0.705 *
Altinekin	0.547 *	0.353	0.347	Kadinhani	0.829 *	0.752 *	0.740 *
Altunhisar	0.738 *	0.719 *	0.716 *	Karaman_Merkez	0.729 *	0.549 *	0.556 *
Ayranci	0.269	0.008	0.008	Karapinar	0.635 *	0.453 *	0.454 *
Beysehir	0.623 *	0.591 *	0.614 *	Karatay	0.571 *	0.445 *	0.468 *
Bor	0.644 *	0.574 *	0.565 *	Kulu	0.331	0.402	0.430 *
Camardi	0.504 *	0.320	0.305	Meram	0.456 *	0.365	0.385
Celtik	0.701 *	0.487 *	0.493 *	Nigde_Merkez	0.547 *	0.438 *	0.441 *
Ciftlik	0.408	0.475 *	0.483 *	Ortakoy	0.565 *	0.444 *	0.516 *
Cihanbeyli	0.762 *	0.689 *	0.692 *	Sarayonu	0.377	0.311	0.287
Cumra	0.558 *	0.498 *	0.480 *	Sarıyahsi	0.600 *	0.580 *	0.538 *
Derbent	0.573 *	0.304	0.367	Selcuklu	0.665 *	0.603 *	0.576 *
Doganhisar	0.520 *	0.477 *	0.417 *	Seydisehir	0.002	−0.110	−0.114
Emirgazi	0.800 *	0.701 *	0.713 *	Tuzlukcu	0.558 *	0.505 *	0.550 *
Eregli	0.254	0.155	0.167	Ulukisla	0.672 *	0.592 *	0.618 *
Eskil	0.505 *	0.268	0.332	Yalihuyuk	0.128	0.075	0.180
Gulagac	0.735 *	0.600 *	0.662 *	Yunak	0.641 *	0.639 *	0.627 *

In bold are test set districts; * p < 0.05.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Başakın, E.E.; Stoy, P.C.; Demirel, M.C.; Ozdogan, M.; Otkin, J.A. Combined Drought Index Using High-Resolution Hydrological Models and Explainable Artificial Intelligence Techniques in Türkiye. Remote Sens. 2024, 16, 3799. https://doi.org/10.3390/rs16203799

AMA Style

Başakın EE, Stoy PC, Demirel MC, Ozdogan M, Otkin JA. Combined Drought Index Using High-Resolution Hydrological Models and Explainable Artificial Intelligence Techniques in Türkiye. Remote Sensing. 2024; 16(20):3799. https://doi.org/10.3390/rs16203799

Chicago/Turabian Style

Başakın, Eyyup Ensar, Paul C. Stoy, Mehmet Cüneyd Demirel, Mutlu Ozdogan, and Jason A. Otkin. 2024. "Combined Drought Index Using High-Resolution Hydrological Models and Explainable Artificial Intelligence Techniques in Türkiye" Remote Sensing 16, no. 20: 3799. https://doi.org/10.3390/rs16203799

APA Style

Başakın, E. E., Stoy, P. C., Demirel, M. C., Ozdogan, M., & Otkin, J. A. (2024). Combined Drought Index Using High-Resolution Hydrological Models and Explainable Artificial Intelligence Techniques in Türkiye. Remote Sensing, 16(20), 3799. https://doi.org/10.3390/rs16203799

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Combined Drought Index Using High-Resolution Hydrological Models and Explainable Artificial Intelligence Techniques in Türkiye

Abstract

1. Introduction

2. Materials

2.1. Study Area

2.2. Data Description

3. Methods

3.1. Calculations of Drought Indices

3.1.1. The Precipitation Condition Index

3.1.2. The Temperature Condition Index

3.1.3. The Vegetation Condition Index

3.1.4. The Soil Moisture Condition Index

3.1.5. The Normalized Difference Water Index

3.1.6. The Gross Primary Productivity

3.2. Mesoscale Hydrological Model

3.3. Crop Yield Detrending

3.4. Benchmarking Methods

3.4.1. Standardized Precipitation Evapotranspiration Index

3.4.2. Self-Calibrated Palmer Drought Severity Index

3.5. eXtreme Gradient Boosting

3.6. SHapley Additive exPlanations

3.7. Principal Component Analysis

3.8. Model Evaluation Metrics

3.9. Drought Classification

4. Model Development

5. Results and Discussion

5.1. Machine Learning Results

5.2. Weight Obtained by SHAP

5.3. Weight Obtained by PCA

5.4. Crop Yield Correlation Results Obtained by Three Different Methods

5.5. Spatiotemporal Distribution of Drought Indices

5.6. Long-Term Evolution of Drought Indices

5.7. In-Depth Analysis of Feature Importance and Dependencies

5.8. Future Directions, Recommendations, and Practical Applications

6. Conclusions

Supplementary Materials

Author Contributions

Funding

Data Availability Statement

Acknowledgments

Conflicts of Interest

Appendix A

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI