1. Introduction
Global climate change has increased the frequency of extreme weather events, which in turn strongly influence geological disasters. China is among the countries most severely affected by extreme climate events and disasters; it experiences a wide variety of disasters of high intensity and frequency that cause severe damage [1,2]. Research from the China Meteorological Administration indicates that, in the foreseeable future, the frequency and extent of regional or widespread high temperatures, heavy rainfall, and droughts will increase significantly, escalating disaster risk [3]. Natural rock masses usually contain many cracks, fissures, and faults. Under dynamic loads, fractures initiate within these complex structures and, as they propagate and coalesce, ultimately lead to instability and failure of the rock mass and hence to geological disasters. External factors such as extreme climatic conditions are significant drivers of engineering geological disasters. For example, high temperatures accelerate the transport of water molecules and the evaporation of moisture, which reduces soil stability and increases the risk of geological disasters. In recent years, the impact of extreme climate events on geological disasters has become increasingly significant [4,5,6,7,8]. Owing to its distinctive geographical and climatic conditions, the Yellow River Basin, and Shanxi Province in particular, has become a high-risk area for geological disasters. During the National Day holiday in October 2021, Shanxi Province experienced several consecutive days of heavy rainfall that caused severe geological disasters in many areas, including landslides and the collapse of cave dwellings and ancient buildings. This event set a rainfall record for the same period in history and posed a serious threat to the lives and property of local residents [9]. The intensification of extreme climate events has increased the frequency and intensity of geological disasters, highlighting the urgency of strengthening geological disaster prevention and extreme climate response measures. Studying the impact of extreme climate on geological disasters in Shanxi and proposing effective disaster prevention and mitigation strategies are therefore of considerable practical importance [10,11].
In recent years, research on the interaction between climate change and geological disasters has mainly focused on certain countries and regions in the Northern Hemisphere, with two-thirds of all studies originating from Western European countries. Statistical analysis of the correlation between climatic factors and the frequency of geological disasters is an important means of studying the response of geological disasters to climate change [12]. Huang Mingkui et al. [13] conducted a statistical analysis of geological disasters and meteorological data in Chongqing since 1950 and found that the frequency of geological disasters is well coupled with periods of climatic warming and cooling. Cai Xia et al. [14] studied the trend of spring ground temperature and the correlation between ground temperature, air temperature, and geological disasters such as collapses and landslides in Shanxi Province, using daily average air temperature and ground temperature data from 107 national stations from 1981 to 2018. They found a significant positive correlation between the ground temperature at 0 cm and the frequency of collapses and landslides. Fu Jianfei et al. [15] analyzed the trend of climate change in Liaoning Province and identified a warming and drying trend, which may increase the frequency and intensity of geological disasters in the region. Lv Wenxi et al. [16] analyzed the spatial and temporal distribution patterns and triggering factors of loess geological disasters in Yulin and Yan’an over 22 years, based on geological disaster data from Shaanxi Province from 2001 to 2022. They found that geological disasters in the loess area of northern Shaanxi are closely related to rainfall, especially heavy or continuous rainfall. These studies indicate that climatic factors, especially temperature and precipitation, are closely related to the frequency of geological disasters. More specific indicator analyses often focus on the relationship between rainfall amount and slope instability, which remains a challenging problem for geological disaster researchers. The estimation of rainfall thresholds has been widely documented and is considered an important part of landslide early warning systems, with key indicators including rainfall intensity, duration, daily rainfall, and multi-day continuous rainfall [17,18,19,20,21,22,23,24,25,26,27].
A review of current research on the interaction between climate change and geological disasters shows that temperature and rainfall, as major factors influencing geological disasters, yield different statistical models and thresholds in different regions. The selection of temperature and precipitation indicators has often been too narrow. The Expert Team on Climate Change Detection and Indices (ETCCDI) recommends 27 extreme climate indices derived from temperature and precipitation, but existing studies have not fully considered the diversity of these indices [28,29,30]. Different extreme climate indicators reflect different aspects of the impact of climate change on geological disasters. Short-term effect indicators represent heavy rainfall events (such as Rx1day, Rx5day, SDII, R10mm, and R20mm), which can cause sudden geological disasters such as debris flows and collapses. Long-term effect indicators, such as sustained high or low temperatures (SU25/SU35, TR10/20, FD0, FD-10, ID0, and CSDI) or prolonged wet or dry conditions (CWD and CDD), can alter soil and vegetation conditions and thereby affect long-term geological stability. Severe effect indicators, such as extreme temperature and precipitation events (TXx, TNx, TXn, TNn, R95p, and R99p), can lead to severe geological disasters, especially in areas prone to frequent extreme climate events. Only by comprehensively analyzing the correlation between these multidimensional climate indicators and the density of geological disasters can the impact of climate change on geological disasters be better revealed.
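To make the short-term and long-term precipitation indicators named above concrete, the following is a minimal sketch of how several of them can be computed from one year of daily precipitation totals. It assumes the usual ETCCDI convention that a wet day has at least 1 mm of precipitation; the function names and the ten-day sample series are hypothetical and are not taken from the study's data.

```python
from typing import Dict, List

WET_DAY_MM = 1.0  # assumed ETCCDI convention: a wet day has precipitation >= 1 mm


def longest_run(flags: List[bool]) -> int:
    """Length of the longest run of consecutive True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


def precipitation_indices(daily_mm: List[float]) -> Dict[str, float]:
    """Compute a few ETCCDI-style precipitation indices from daily totals (mm)."""
    wet = [p for p in daily_mm if p >= WET_DAY_MM]
    # Sums over every window of five consecutive days.
    five_day_sums = [sum(daily_mm[i:i + 5]) for i in range(max(len(daily_mm) - 4, 1))]
    return {
        "Rx1day": max(daily_mm),                        # wettest single day
        "Rx5day": max(five_day_sums),                   # wettest five-day spell
        "SDII": sum(wet) / len(wet) if wet else 0.0,    # mean intensity on wet days
        "R10mm": sum(p >= 10.0 for p in daily_mm),      # days with >= 10 mm
        "R20mm": sum(p >= 20.0 for p in daily_mm),      # days with >= 20 mm
        "CDD": longest_run([p < WET_DAY_MM for p in daily_mm]),   # longest dry spell
        "CWD": longest_run([p >= WET_DAY_MM for p in daily_mm]),  # longest wet spell
    }


# Hypothetical ten-day series, for illustration only.
sample = [0.0, 0.0, 12.5, 30.0, 4.0, 0.0, 0.0, 0.0, 1.2, 22.0]
indices = precipitation_indices(sample)
print(indices)
```

In this sample, Rx1day and Rx5day capture the single burst of rain on days three to five, whereas CDD and CWD describe the length of dry and wet spells, which is the distinction between short-term and long-term indicators drawn in the text.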
The formation and evolution of geological disasters result from the combined effects of various factors, including the natural geographical environment and human activities. Based on the disaster-forming conditions, development characteristics, regional features, data availability, and factor independence tests for the study area, this paper selects natural geographical environment data comprising elevation, terrain relief, and stratigraphic rock formation data. Using daily precipitation and temperature data from 27 meteorological stations in Shanxi Province from 1975 to 2020 and following the ETCCDI indicator system, 32 extreme climate indices were calculated for the 46-year period. By overlaying natural geographical environment and geological disaster data and using correlation coefficients, grey relational degrees, and other indicators, we analyzed the response of geological disasters to extreme climate events and established a geological disaster risk model under extreme climate scenarios.
2. Materials and Methods
2.1. Overview of the Study Area
Shanxi Province is located in the North China region, with geographic coordinates approximately between 34°34′ and 40°43′ N latitude and 110°14′ and 114°33′ E longitude. The province features a complex and diverse topography, mainly composed of mountains, hills, basins, and plains. In terms of soil types, Shanxi primarily has brown soil, loess, and alluvial soil. The stratigraphy of Shanxi is intricate, with rich geological structures that include sedimentary, volcanic, and metamorphic rocks from the Paleozoic, Mesozoic, and Cenozoic eras, widely exposed and providing favorable conditions for the development of mineral resources.
Shanxi experiences a temperate continental climate, characterized by distinct seasons: cold and dry winters, hot and rainy summers, with precipitation concentrated in the summer months, often leading to heavy rain and flooding. The spring and autumn seasons are mild but with less rainfall. The climate and topography of Shanxi are closely related, with extreme weather events such as heavy rain, drought, and cold waves significantly impacting the frequency and distribution of geological disasters. Extreme weather events in different topographical regions may trigger geological disasters such as landslides, mudslides, and ground subsidence, posing significant challenges to the local ecological environment and economic development. Therefore, studying the impact of extreme weather events on the frequency and distribution of geological disasters in Shanxi is crucial for disaster prevention, mitigation, and sustainable regional development.
2.2. Data Sources
Among the many available DEM products, such as GTOPO30, SRTM, and TanDEM-X [31], this study uses the Copernicus Digital Elevation Model (Copernicus DEM: European Space Agency, Paris, France). The Copernicus DEM, released by the European Space Agency (ESA), offers selectable resolutions of 10 m, 30 m, and 90 m and provides extensive coverage, which makes it well suited to our study area. This study used the 30 m resolution data, available at: https://scihub.copernicus.eu/ (accessed on 1 March 2024). Terrain ruggedness was calculated from the DEM using ArcGIS. The geological environment data for the loess plateau stratigraphic rock groups came from the “China Geological Map (1:250,000)”. Meteorological data were obtained from the China Meteorological Data Network and comprise daily precipitation and temperature series from 27 meteorological stations in Shanxi Province from 1975 to 2020. Geological disaster data included over 250,000 geological disaster points nationwide, covering collapses, subsidence, landslides, mudslides, ground fissures, ground settlement, and unstable slopes, sourced from the China Geological Survey and the China Geological Environment Monitoring Institute.
2.3. Research Methods
The study primarily comprised two modules: database construction and multifactor analysis. The research steps and methods were as follows:
- (1)
The RClimDex model was used to calculate 31 extreme climate indices (the core extreme climate indices recommended by the World Meteorological Organization’s Commission for Climatology). Time series of each index were computed for every meteorological station in Shanxi Province, forming an extreme climate dataset.
- (2)
Based on ArcGIS Pro (version 3.0.0; Esri Inc., Redlands, CA, USA) and the geological disaster distribution data for Shanxi Province, spatial statistical analysis was used to calculate a Z-score and a p-value for each geological disaster point. The Z-score measures the degree of local clustering of geological disaster points, and the p-value, derived from the Z-score using the standard normal distribution, gives the significance level of the result. The confidence level indicates the reliability of each statistical result. Finally, significant hotspot and coldspot areas within Shanxi Province were identified, revealing the spatial distribution characteristics and significance of geological disasters.
- (3)
Using the Kernel Density Estimation (KDE) method, the spatial distribution of the geological disaster points was analyzed in GIS software to generate a density distribution map. The density at each location was estimated as
$$\hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right)$$
where
$\hat{f}(x)$ is the estimated density value at location $x$;
$n$ is the total number of geological hazard points;
$h$ is the bandwidth (smoothing parameter);
$K$ is the kernel function (commonly used kernel functions include the Gaussian kernel and the uniform kernel);
$x_i$ is the location of the $i$th geological hazard point.
This process yielded the spatial density distribution of geological disasters.
- (4)
By overlaying the extreme climate characteristic indicators of each station with the natural geography and geological disaster layers, a database of extreme climate indicators, natural geographic factors, and geological disaster density distribution was obtained. The correlation and significance between geological disaster density and each factor were then calculated.
- (5)
A grey relational matrix was constructed and a cluster analysis was performed based on the grey relational degree to identify potential patterns between extreme climate indices and geological disasters. Here is a detailed explanation of the elements that motivated our choice and the steps involved in our analysis:
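Analysis Method
- 1.
Standardization of data:
We standardized the original data on extreme climate indices ($X_i$) and the geological disaster data ($Y$) to ensure comparability. The standardized sequences were:
$$X_i' = \left(x_i'(1), x_i'(2), \ldots, x_i'(n)\right), \qquad Y' = \left(y'(1), y'(2), \ldots, y'(n)\right)$$
- 2.
Calculation of the grey absolute relational degree:
We calculated the grey absolute relational degree $\xi_i(k)$ using the following formula:
$$\xi_i(k) = \frac{\min_i \min_k \Delta_i(k) + \rho \max_i \max_k \Delta_i(k)}{\Delta_i(k) + \rho \max_i \max_k \Delta_i(k)}, \qquad \Delta_i(k) = \left| y'(k) - x_i'(k) \right|$$
where $\rho$ is the resolution coefficient, typically set to 0.5.
- 3.
Calculation of the grey relational degree:
The grey relational degree $\gamma_i$ was calculated as:
$$\gamma_i = \frac{1}{n} \sum_{k=1}^{n} \xi_i(k)$$
- 4.
Construction of the grey relational matrix:
The elements of the grey relational matrix $R$ are the grey relational degrees between different indices. The matrix was structured as follows:
$$R = \begin{pmatrix} \gamma_{11} & \cdots & \gamma_{1m} \\ \vdots & \ddots & \vdots \\ \gamma_{m1} & \cdots & \gamma_{mm} \end{pmatrix}$$
where $\gamma_{ij}$ is the grey relational degree between sequences $i$ and $j$, and $m$ is the number of sequences.
- 5.
Clustering analysis:
Based on the grey relational matrix $R$, we applied appropriate clustering algorithms (e.g., hierarchical clustering, K-means) to identify potential patterns between extreme climate indices and geological disasters.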
Motivation for Clustering Intervals
The choice of clustering intervals was influenced by the following elements:
Normalization: ensuring that data from different sources and scales were comparable.
Resolution coefficient $\rho$: setting $\rho$ to 0.5 balanced the influence of the maximum and minimum differences.
Data characteristics: the nature of the extreme climate indices and geological disaster data informed the selection of clustering intervals to accurately capture patterns.
Clustering algorithm requirements: different clustering algorithms (e.g., hierarchical clustering, K-means) may have specific requirements or perform optimally with certain interval settings.
By following these steps, we aimed to accurately identify and analyze the relationship between extreme climate events and geological disasters.
- (6)
Based on the results of the grey relational degree analysis, meteorological indicators were selected to construct various models, including linear regression, polynomial regression, generalized additive models (GAM), random forests, support vector machines (SVM), neural networks, and Lasso and Ridge regression models. The fitting performance of these models was then evaluated.
The overall workflow of the study is shown in Figure 1.
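Step (2) above derives a p-value from each Z-score using the standard normal distribution. A minimal sketch of that conversion is given below. The two-sided test and the 90%, 95%, and 99% confidence bins are assumptions chosen for illustration, and the function names are hypothetical; the sketch does not reproduce the ArcGIS Pro tool itself.

```python
import math


def two_sided_p_value(z: float) -> float:
    """Two-sided p-value of a Z-score under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def classify_hotspot(z: float) -> str:
    """Label a location by the sign of z and by assumed 90/95/99% confidence bins."""
    p = two_sided_p_value(z)
    if p >= 0.10:
        return "not significant"
    level = "99%" if p < 0.01 else "95%" if p < 0.05 else "90%"
    kind = "hotspot" if z > 0 else "coldspot"
    return f"{kind} ({level} confidence)"


# Hypothetical Z-scores, for illustration only.
for z in (2.7, 1.8, -2.1, 0.5):
    print(z, round(two_sided_p_value(z), 4), classify_hotspot(z))
```

A positive Z-score with a small p-value marks a cluster of high values (hotspot), and a negative Z-score with a small p-value marks a cluster of low values (coldspot).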
3. Results
3.1. Characteristics of Changes in Extreme Climate Indices
Based on the trend and significance analysis of 31 extreme climate indices from 27 stations in Shanxi Province (Figure 2), 16 indices show a significant overall trend (significant at more than 14 of the 27 stations). In particular, TMAXmean is significant at 100% of stations, and TX10P, TX90P, GSL, and ID25 are significant at 96%, 96%, 93%, and 93% of stations, respectively. The direction of change of each index is highly consistent across Shanxi Province. Eight indices, including CDD (100%), CSDI (96%), and FD-10 (96%), generally show a decreasing trend, whereas 21 indices, including GSL (100%), PRCPTOT (74%), and R10mm (74%), generally show an increasing trend. The trends in CWD and R20mm are not uniform across stations.
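The station-level trend and significance figures above can be illustrated with a short sketch. The text does not state which trend test was used, so the Mann-Kendall test with Sen's slope is assumed here purely for illustration (without tie correction); the function names, the 0.05 significance level, and the two station series are hypothetical.

```python
import math
from statistics import median
from typing import Dict, List, Tuple


def mann_kendall(series: List[float]) -> Tuple[float, float]:
    """Return (z, two-sided p) of the Mann-Kendall trend test, no tie correction."""
    n = len(series)
    # S counts increasing pairs minus decreasing pairs.
    s = sum(
        (series[j] > series[i]) - (series[j] < series[i])
        for i in range(n - 1)
        for j in range(i + 1, n)
    )
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return z, math.erfc(abs(z) / math.sqrt(2.0))


def sen_slope(series: List[float]) -> float:
    """Median of all pairwise slopes (index units per time step)."""
    n = len(series)
    return median(
        (series[j] - series[i]) / (j - i) for i in range(n - 1) for j in range(i + 1, n)
    )


def share_significant(stations: Dict[str, List[float]], alpha: float = 0.05) -> float:
    """Fraction of stations whose trend is significant at level alpha."""
    flags = [mann_kendall(values)[1] < alpha for values in stations.values()]
    return sum(flags) / len(flags)


# Hypothetical annual index values at two stations, for illustration only.
stations = {
    "A": [0.5 * year for year in range(10)],                       # steady increase
    "B": [1.0, 3.0, 2.0, 4.0, 3.0, 2.0, 4.0, 1.0, 3.0, 2.0],     # no clear trend
}
print(sen_slope(stations["A"]), share_significant(stations))
```

Under these assumptions, an index would count as showing a significant overall trend when the share returned by `share_significant` exceeds one half of the stations, which mirrors the "more than 14 of 27 stations" criterion in the text.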
The extreme climate indicators in Shanxi Province are divided into two main categories: temperature and precipitation. From the definition of each indicator, it can be determined whether the indicator represents a positive or negative change in temperature or precipitation, as shown in Table 1. The characteristics of extreme climatic change in Shanxi Province derived from the extreme climate indices are illustrated in Figure 3. For precipitation, 62.55% of the indicators with a positive effect show an increasing trend, but only 37.45% of these positive effect indicators reach a significant level. This indicates that, although extreme precipitation events in Shanxi Province show an increasing trend, the overall change is not statistically significant. Conversely, all (100%) of the indicators with a negative effect on precipitation show a decreasing trend, and 77.78% of them reach a significant level. Taken together, the positive and negative effect indicators of precipitation show that the increasing trend of positive effects has begun to emerge but has not yet reached a significant level. If this trend continues, extreme precipitation events may show more significant increases in the future. The significant decrease in negative effects also suggests that the overall precipitation pattern may be adjusting, and that the frequency and intensity of extreme precipitation events may increase further. For temperature, 96.3% of the indicators with a positive effect show an increasing trend, and 73.6% of these positive effect indicators reach a significant level. This means that extreme temperature events representing high temperatures in Shanxi Province generally show a significant increasing trend. In contrast, 77.8% of the indicators with a negative effect on temperature show a decreasing trend, and 55.97% of these negative effect indicators reach a significant level. This indicates that extreme temperature events representing low temperatures in Shanxi Province generally show a significant decreasing trend. The positive and negative effect indicators of temperature together show that temperature in Shanxi Province is increasing significantly, and that the hazards of extreme high-temperature events will require attention in the future.
Overall, Shanxi Province may experience more extreme high-temperature and extreme precipitation events in the future. Extreme high temperatures will accelerate soil drying and rock weathering, while extreme precipitation could lead to soil saturation and mountain instability, thereby increasing the risk of geological disasters. This can trigger a series of geological hazards such as landslides, debris flows, and collapses. It is imperative to pay close attention to the potential impacts and dangers of these extreme climatic changes on geological disasters.
3.2. Spatial Distribution Characteristics of Geological Disasters
The spatial distribution of approximately 11,000 geological disaster points in Shanxi Province, together with the cold- and hotspots of geological disasters and the kernel density analysis results, is shown in Figure 4. Hotspot areas at the 90% confidence level cover approximately 10,000 km², with a kernel density range of 556.51–4053.65 and a number density range of 0.16–1.32. These significant hotspot areas account for about 6.42% of the province’s total land area and are clustered and unevenly distributed. They are mainly located along the central north–south axis of Shanxi Province and can be divided into seven areas, numbered I–VII from north to south and west to east (Figure 4), with areas of approximately 0.158 × 10⁴ km², 0.14 × 10⁴ km², 0.29 × 10⁴ km², 0.041 × 10⁴ km², 0.134 × 10⁴ km², 0.112 × 10⁴ km², and 0.107 × 10⁴ km², respectively. The kernel density ranges are 684–2593, 903–2977, 641–2884, 1220–2090, 882–3042, 894–2439, and 773–4053, with average kernel densities of 1981.6, 2028, 2068.3, 1738.7, 2005.6, 1830.5, and 2381.1, respectively. Area III has the largest area and a relatively high kernel density, while Area VII has the highest density of geological disasters, with a number density exceeding 1.0. Geological disaster monitoring and prevention should therefore be strengthened in these key areas, particularly in Area III, which combines a large hotspot area with high density, and in Area VII, which has the highest density of geological disasters.
The significant hotspot areas of geological disasters in Shanxi Province were extracted, divided into pixel units, and overlaid with the geological strata, DEM, and topographic relief maps of the province to obtain the geographical overview of the hotspots shown in Figure 5. In terms of elevation (Figure 5a), most significant hotspots (57.92%) fall in mid-elevation areas (500–1500 m), although this share is lower than the proportion of mid-elevation land in Shanxi Province (67.6%). Relative to land area, hotspots are therefore proportionally more common in low-elevation regions. In terms of topographic relief (Figure 5b), significant hotspots in hilly plateaus and low mountains account for 96.66% of the total, far exceeding the area proportion of these two topographic types in Shanxi Province. Hotspots are thus concentrated in areas with a topographic relief of 30–200 m (63.91%) and 200–500 m (32.75%). Regarding landform types (Figure 5c), hotspots are most concentrated in mid-elevation hills and plateaus (35.3%). From the perspective of geological conditions (Figure 5d), Q (Quaternary) and Pz (Paleozoic) strata contain 85.42% of the geological disaster hotspots, while these two strata account for 40.63% and 29.78% of the area of Shanxi Province, respectively. This indicates that geological disaster hotspots are more likely to occur in areas with Q strata.
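The comparisons above between a category's share of hotspots and its share of land area can be summarized as a simple ratio, where a value above 1 means that hotspots are over-represented relative to area. The sketch below uses only the percentages quoted in the text; the function name and the ratio itself are introduced here for illustration and are not part of the original analysis.

```python
def representation_ratio(hotspot_share_pct: float, area_share_pct: float) -> float:
    """Ratio of a category's hotspot share to its land-area share (both in percent)."""
    return hotspot_share_pct / area_share_pct


# Mid-elevation band (500-1500 m): 57.92% of hotspots on 67.6% of the land area.
mid_elevation = representation_ratio(57.92, 67.6)

# Q and Pz strata combined: 85.42% of hotspots on 40.63% + 29.78% of the land area.
q_and_pz = representation_ratio(85.42, 40.63 + 29.78)

print(round(mid_elevation, 2), round(q_and_pz, 2))  # prints 0.86 1.21
```

On these figures the mid-elevation band is slightly under-represented, which is consistent with the statement that, relative to area, more hotspots lie outside it, while the Q and Pz strata together are over-represented by roughly one fifth.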
The kernel density distribution of significant geological disaster hotspot areas in Shanxi Province is shown in Figure 6. The kernel density of geological disasters does not differ significantly across elevation classes or geological strata (Figure 5a,d). In terms of topographic relief, however, the density is higher in hilly plateaus and low mountain areas, with an average value of 2062.45 (Figure 5b). Regarding landforms, the density is higher in low-elevation hilly plateaus, with an average value of 2126.89 (Figure 5c). Compared with the distribution of significant hotspot areas, low-elevation hilly plateaus are not the landform type with the highest proportion of hotspot area, but they have the highest geological disaster density. Low-elevation hilly plateaus are therefore high-density areas with more geological disasters per unit area, and they should be the focus of geological disaster prevention and mitigation efforts.
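The kernel density values discussed in this section come from a kernel density estimate over the disaster points. The sketch below shows the idea for point locations in a plane, assuming a Gaussian kernel and a two-dimensional normalizing factor of 1/(n h²). The kernel, bandwidth, and coordinates are hypothetical, and the GIS software used in the study may apply a different kernel and scaling, so the absolute values are not comparable with those in the figures.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def gaussian_kde_2d(location: Point, hazards: List[Point], bandwidth: float) -> float:
    """Kernel density estimate at `location` from hazard points (Gaussian kernel)."""
    n = len(hazards)
    total = 0.0
    for hx, hy in hazards:
        # Squared distance to the hazard point, scaled by the bandwidth.
        u_sq = ((location[0] - hx) ** 2 + (location[1] - hy) ** 2) / bandwidth ** 2
        # Standard bivariate Gaussian kernel evaluated at the scaled distance.
        total += math.exp(-0.5 * u_sq) / (2.0 * math.pi)
    return total / (n * bandwidth ** 2)


# Hypothetical hazard points (km): a tight cluster plus one isolated point.
hazards = [(0.0, 0.0), (0.5, 0.2), (0.3, -0.4), (10.0, 10.0)]
near_cluster = gaussian_kde_2d((0.2, 0.0), hazards, bandwidth=1.0)
far_away = gaussian_kde_2d((5.0, 5.0), hazards, bandwidth=1.0)
print(near_cluster, far_away)
```

The estimate is high near the cluster and close to zero between the cluster and the isolated point, which is how dense hotspot cores stand out in a density map. A larger bandwidth would smooth the surface and merge nearby clusters.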
3.3. Single-Factor Correlation Analysis
The Z-score from the geological disaster hotspot analysis in Shanxi Province, kernel density values, DEM, relief degree, and 32 meteorological indicator values from the meteorological station within the corresponding Thiessen polygon were overlaid to construct a database of influencing factors for geological disaster density. First, the correlation between the Z-score and the influencing factors was analyzed, as shown in Figure 7. A total of 21 indicators are significantly correlated with the Z-score; the coefficients listed below are all weak (|r| ≤ 0.30). Temperature-positive indicators (temperature_positive): SU25.Slope (0.14), SU35.Slope (0.30), TN90P.Slope (0.08), TNx.Slope (0.03), TR20.Slope (0.24), and TX90P.Slope (0.07). These indicators are positively correlated with geological disaster hotspots. Temperature-negative indicators (temperature_negative): CSDI.Slope (−0.15), FD-10.Slope (0.05), ID0.Slope (0.09), ID25.Slope (−0.17), TNn.Slope (−0.04), and TXn.Slope (−0.04). Most of these indicators are negatively correlated with hotspots (FD-10.Slope and ID0.Slope are weakly positive), suggesting that increases in them may be associated with a reduction in hotspots. Precipitation indicators (precipitation_positive): PRCPTOT.Slope (−0.17), R10mm.Slope (−0.25), R20mm.Slope (−0.14), R99p.Slope (0.08), and RX5day.Slope (0.17). Indicators of total precipitation and of the number of heavy-rain days (PRCPTOT, R10mm, and R20mm) are negatively correlated with hotspots, whereas indicators of extreme rainfall intensity (R99p and RX5day) are positively correlated. Over the long term, higher total precipitation and more frequent heavy-rain days are therefore associated with fewer geological disaster hotspots, whereas extreme rainfall events (i.e., intense rainfall over a short period) are associated with more hotspots.
The correlation between kernel density and the influencing factors is shown in Figure 8. A total of 21 indicators are significantly correlated with kernel density; the coefficients listed below are again weak (|r| ≤ 0.29). Temperature-positive indicators (temperature_positive): SU25.Slope (0.13), SU35.Slope (0.29), TN90P.Slope (0.08), TR20.Slope (0.26), and TX10P.Slope (0.06). These indicators are positively correlated with kernel density. Temperature-negative indicators (temperature_negative): CSDI.Slope (−0.16), FD-10.Slope (0.05), ID0.Slope (0.12), ID25.Slope (−0.15), TMAXmean.Slope (−0.06), TMINmean.Slope (−0.04), and WSDI.Slope (−0.06). Most of these indicators are negatively correlated with kernel density (FD-10.Slope and ID0.Slope are weakly positive), suggesting that increases in them may be associated with a reduction in geological disaster hotspots. Precipitation-positive indicators (precipitation_positive): R95p.Slope (0.05), R99p.Slope (0.10), RX1day.Slope (0.04), and RX5day.Slope (0.17). These indicators are positively correlated with kernel density, indicating that extreme rainfall events (i.e., intense rainfall over a short period) are associated with an increase in geological disaster hotspots. Precipitation-negative indicators (precipitation_negative): PRCPTOT.Slope (−0.15), R10mm.Slope (−0.26), and R20mm.Slope (−0.17). These indicators are negatively correlated with kernel density, suggesting that higher total precipitation and more frequent heavy-rain days are associated with a reduction in geological disaster hotspots.
In summary, by analyzing the correlation between the Z-score and kernel density of geological disaster hotspots in Shanxi Province and meteorological indicators, it is found that high-temperature-related climate indicators (such as SU25.Slope, SU35.Slope, TN90P.Slope, etc.) show a significant positive correlation with geological disaster hotspots overall. Conversely, temperature-negative indicators (such as CSDI.Slope, ID25.Slope, TXn.Slope, etc.) generally have a negative correlation with geological disaster hotspots, suggesting that an increase in these indicators may be associated with a decrease in geological disaster hotspots. Regarding precipitation, precipitation-positive indicators (such as R99p.Slope and RX5day.Slope) indicate that extreme rainfall events (i.e., intense rainfall over a short period) are related to an increase in geological disaster hotspots. On the other hand, precipitation-negative indicators (such as PRCPTOT.Slope, R10mm.Slope, and R20mm.Slope) show that higher total precipitation and frequent heavy rainfall events are instead associated with a reduction in geological disaster hotspots.
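Several of the coefficients reported above are small (for example 0.03 to 0.08) yet are described as significant. The sketch below illustrates why this is possible when many pixel units are compared: it computes a Pearson correlation and an approximate two-sided p-value using Fisher's z-transform. The source does not state which significance test was used, so this test, the function names, and the sample sizes are assumptions for illustration only.

```python
import math
from typing import List


def pearson_r(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_p_value(r: float, n: int) -> float:
    """Approximate two-sided p-value for zero correlation (requires |r| < 1, n > 3)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2.0))


# The same weak correlation tested with two hypothetical sample sizes.
p_small_sample = fisher_p_value(0.03, 1_000)
p_large_sample = fisher_p_value(0.03, 10_000)
print(round(p_small_sample, 3), round(p_large_sample, 4))

# A perfectly linear toy example gives r = 1.
print(pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

With a few thousand samples a coefficient of 0.03 is not distinguishable from zero, but with tens of thousands it is. Statistical significance here therefore says little about effect size, and the weak coefficients should be read as modest associations.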
3.4. Multi-Factor Grey Relational Analysis
Based on the constructed database of geological disaster density influencing factors, the grey relational degrees between the Z-score and the other influencing factors, and between kernel density and the other influencing factors, were analyzed, as shown in Figure 9. For the Z-score, the factors with high relational degrees are TXn.Slope (0.83), TXx.Slope (0.82), TMAXmean.Slope (0.81), GSL.Slope (0.81), and TX90P.Slope (0.81). These factors indicate that extreme temperatures (both extremely low and extremely high) and changes in the growing season are closely associated with the occurrence of geological disasters. For kernel density, the factors with high relational degrees are GSL (0.80), TXn.Slope (0.83), TX90P (0.80), TXn (0.82), and TXx (0.81), which further indicates that the occurrence of geological disasters is closely related to extreme temperature events (such as extreme high and low temperatures) and the growing season.
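The grey relational degrees reported above can be illustrated with a minimal sketch of a standard grey relational analysis with resolution coefficient 0.5. Min-max scaling is assumed as the standardization step because the source does not specify one, and the function names and sample sequences are hypothetical.

```python
from typing import Dict, List


def min_max(seq: List[float]) -> List[float]:
    """Scale a sequence to [0, 1]; assumes the sequence is not constant."""
    lo, hi = min(seq), max(seq)
    return [(value - lo) / (hi - lo) for value in seq]


def grey_relational_degrees(
    reference: List[float], factors: Dict[str, List[float]], rho: float = 0.5
) -> Dict[str, float]:
    """Grey relational degree of each factor sequence with the reference sequence."""
    ref = min_max(reference)
    # Absolute differences between the reference and each standardized factor.
    deltas = {
        name: [abs(r - v) for r, v in zip(ref, min_max(seq))]
        for name, seq in factors.items()
    }
    d_min = min(min(d) for d in deltas.values())
    d_max = max(max(d) for d in deltas.values())
    if d_max == 0.0:  # every factor coincides with the reference
        return {name: 1.0 for name in deltas}
    # Average the pointwise relational coefficients over the sequence.
    return {
        name: sum((d_min + rho * d_max) / (dk + rho * d_max) for dk in d) / len(d)
        for name, d in deltas.items()
    }


# Hypothetical sequences: one factor tracks the reference, one runs opposite to it.
reference = [1.0, 2.0, 3.0, 4.0]
factors = {"tracks": [2.0, 4.0, 6.0, 8.0], "opposes": [4.0, 3.0, 2.0, 1.0]}
degrees = grey_relational_degrees(reference, factors)
print(degrees)
```

Because each pointwise coefficient is at least ρ/(1 + ρ), which is 1/3 for ρ = 0.5, grey relational degrees are compressed into a narrow upper range. Values of 0.80 to 0.83 are therefore more informative as a ranking of factors than as absolute measures of strength.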
3.5. Geological Disaster Risk Model
Based on the results of the relational degree analysis, six extreme climate factors related to geological disaster density were selected to construct Z-score and kernel density models using linear regression, polynomial regression, GAM (Generalized Additive Models), random forest, SVM (Support Vector Machine), and neural networks, as well as Lasso and Ridge regression models. The simulation results of these models are shown in Figure 10. In terms of the Mean Absolute Error (MAE), the random forest (rf) model performed best, with an average MAE of 1.15, followed closely by the support vector machine (svm) at 1.18. The generalized additive model (gam) and polynomial regression (poly) had average MAEs of 1.32 and 1.35, respectively, and linear regression (lm) had an average MAE of 1.43. The neural network (nn) had the highest average MAE, 1.47, indicating the largest prediction error. In terms of the Root Mean Square Error (RMSE), the random forest (rf) model again performed best, with an average RMSE of 1.57, followed by the support vector machine (svm) at 1.71, which slightly outperformed the generalized additive model (gam) at 1.76 and polynomial regression (poly) at 1.79. Linear regression (lm) and the neural network (nn) had the highest average RMSEs, 1.91 and 1.93, respectively. The R² value reflects the model’s explanatory power. On this metric the random forest (rf) model also performed best, with an average R² of 0.42, meaning that it explains about 42% of the variance. The support vector machine (svm) had an average R² of 0.33, followed by the generalized additive model (gam) at 0.27 and polynomial regression (poly) at 0.24. The neural network (nn) and linear regression (lm) had the lowest average R² values, 0.16 and 0.14, respectively, indicating the weakest explanatory power.
The same model types (linear regression, polynomial regression, GAM, random forest, SVM, neural networks, and Lasso and Ridge regression) were applied to predict kernel density from terrain and climate data, with the following results. In terms of MAE, the random forest (rf) model performed best, with an average MAE of 1.15, followed by the support vector machine (svm) at 1.18, the generalized additive model (gam) at 1.32, polynomial regression (poly) at 1.35, and linear regression (lm) at 1.43; the neural network (nn) had the highest average MAE, 1.47. In terms of RMSE, the random forest (rf) again performed best, with an average RMSE of 1.57, followed by the support vector machine (svm) at 1.71, the generalized additive model (gam) at 1.76, and polynomial regression (poly) at 1.79; linear regression (lm) and the neural network (nn) had the highest average RMSEs, 1.91 and 1.93, respectively. In terms of R², the random forest (rf) model had the highest average value, 0.42, followed by the support vector machine (svm) at 0.33, the generalized additive model (gam) at 0.27, and polynomial regression (poly) at 0.24; the neural network (nn) and linear regression (lm) had the lowest average R² values, 0.16 and 0.14, respectively.
In summary, the random forest (rf) model outperformed all other models on every evaluation metric, making it the best predictive model in this study. Polynomial regression (poly) and the generalized additive model (gam) also performed reasonably well on most metrics and could serve as alternative models. Linear regression (lm), the support vector machine (svm), and the neural network (nn) performed relatively poorly, especially the neural network, which had the largest average MAE and RMSE of all models.
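For readers who wish to reproduce the comparison, the three evaluation metrics can be sketched as follows. This is a minimal illustration, not the code used in this study: the function names and the sample values are hypothetical, and R² is taken here as the coefficient of determination (1 minus the ratio of residual to total sum of squares), which is an assumption, since some toolkits report the squared correlation instead.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average size of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error: penalizes large errors more than MAE."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination (assumed definition, see lead-in)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical density values for illustration only (not data from this study).
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

print("MAE :", mae(observed, predicted))
print("RMSE:", rmse(observed, predicted))
print("R2  :", r2(observed, predicted))
```

Because RMSE squares each error before averaging, a model can rank differently on MAE and RMSE when it makes a few large errors, which is why both are reported alongside R².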
4. Discussion
Significant trends were observed in multiple extreme climate indices in Shanxi Province. In particular, the TMAXmean index showed significant changes at 100% of the stations, indicating that extreme high-temperature events are widespread in Shanxi Province [32]. Additionally, 62.55% of the indicators for the positive effect of precipitation exhibited an upward trend, but only 37.45% reached significant levels, suggesting that extreme precipitation events may show a more significant increasing trend in the future [33]. Meanwhile, all indicators for the negative effect of precipitation exhibited a downward trend, and 77.78% reached significant levels. This means that although most current changes in the positive effect of precipitation have not reached significance, the frequency and intensity of extreme precipitation events may further increase over time.
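The percentages above are shares of index series whose trend is statistically significant. This passage does not state which trend test was used; as an illustration only, the sketch below assumes the Mann-Kendall test, a common choice for climate index series, and computes the share of stations with a significant trend. The station names, the series, and the 0.05 significance level are all hypothetical.

```python
import math


def mann_kendall(series):
    """Return (S, Z, two-sided p) for the Mann-Kendall trend test.

    Simplified sketch: the variance omits the correction for tied values.
    """
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of each pairwise difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return s, z, p


def share_significant(station_series, alpha=0.05):
    """Fraction of stations whose trend is significant at level alpha."""
    hits = sum(1 for values in station_series.values()
               if mann_kendall(values)[2] < alpha)
    return hits / len(station_series)


# Hypothetical annual index values for two stations (not data from this study).
stations = {
    "station_A": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],  # steadily increasing
    "station_B": [1, 2, 1, 2, 1, 2, 1, 2, 1, 2],   # no clear trend
}

print("Share of stations with a significant trend:", share_significant(stations))
```

The sign of S gives the trend direction, so the same routine can separate upward from downward trends before counting how many are significant.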
The results of this study indicate that the spatial distribution of geological disasters is clearly clustered, mainly in the central region of Shanxi Province, which can be divided into seven sub-regions. The third sub-region has the largest geological hazard development area, 0.290 × 10⁴ km², and the seventh sub-region has the highest hazard density, with an average kernel density (nuclear density) of 2381.1. Geological disasters are relatively frequent in these areas, suggesting that monitoring and preventive measures should be strengthened, especially where the kernel density is high. From a geographical perspective, geological disasters are more likely to occur in mid-altitude regions (500–1500 m), hilly terraces and low mountain regions (terrain undulation 30–500 m), and the Quaternary (Q) and Paleozoic (Pz) strata. Although low-altitude hills and terraces do not account for the highest proportion of hotspot areas among the geomorphological types, their geological disaster density is the greatest, indicating that geological disasters occur more frequently per unit area in these regions.
According to the univariate correlation analysis between geological disaster density and extreme climate, temperature and precipitation of different types and intensities affect geological disaster hotspots through different mechanisms. High temperatures are associated with an increase in geological disaster hotspots. Rising temperatures not only increase atmospheric precipitation but also enhance evaporation rates. The impact of rising evaporation rates on slope stability is twofold: on the one hand, accelerated evaporation of moisture from the slope speeds up the dissipation of pore water pressure, thereby improving slope stability; on the other hand, repeated wetting and drying cycles make the soil more prone to cracking, which facilitates water infiltration during rainfall and increases the likelihood of landslides [34,35,36]. Sustained and moderate precipitation (total precipitation and frequent heavy rainfall events) may help stabilize the environment and reduce the occurrence of geological disasters. However, extreme heavy rain over a short period can cause large amounts of water to infiltrate the surface rapidly, exceeding the absorption capacity of soil and vegetation and thereby causing geological disasters such as landslides and mudslides. This aligns with the general practice of using both rainfall intensity and total rainfall in meteorological warning indicators for geological disasters [37], and it further confirms that geological disaster risk management should consider not only total precipitation but also extreme rainfall events over short periods. The grey relational analysis between geological disaster density and meteorological factors supported the high correlation between extreme temperature events and geological disaster development. However, current research mainly focuses on rainfall thresholds, and relatively few studies address thresholds for extreme high-temperature events that cause geological disasters. Additionally, changes in crop growing seasons may also affect soil stability, especially when vegetation cover changes.
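The grey relational analysis mentioned above can be illustrated with a minimal sketch of Deng's grey relational grade. The details are assumptions rather than the settings used in this study: sequences are normalized by their mean, the distinguishing coefficient is set to 0.5, and the factor names and values are hypothetical.

```python
def grey_relational_grades(reference, factors, rho=0.5):
    """Grey relational grade of each factor sequence against the reference.

    reference: list of values (e.g. disaster density per sub-region).
    factors:   dict mapping a factor name to a list of the same length.
    rho:       distinguishing coefficient (0.5 is assumed here).
    """
    def mean_normalize(seq):
        mean = sum(seq) / len(seq)
        return [value / mean for value in seq]

    ref = mean_normalize(reference)
    # Absolute differences between the reference and each factor sequence.
    deltas = {
        name: [abs(r - v) for r, v in zip(ref, mean_normalize(seq))]
        for name, seq in factors.items()
    }
    all_deltas = [d for ds in deltas.values() for d in ds]
    d_min, d_max = min(all_deltas), max(all_deltas)
    if d_max == 0:
        # Every factor coincides with the reference after normalization.
        return {name: 1.0 for name in factors}

    grades = {}
    for name, ds in deltas.items():
        coefficients = [(d_min + rho * d_max) / (d + rho * d_max) for d in ds]
        grades[name] = sum(coefficients) / len(coefficients)
    return grades


# Hypothetical values for illustration only (not data from this study).
density = [1.0, 2.0, 3.0, 4.0]
climate = {
    "hot_days": [10, 21, 29, 40],  # varies almost in step with density
    "wet_days": [40, 30, 20, 10],  # varies in the opposite direction
}

for name, grade in grey_relational_grades(density, climate).items():
    print(name, round(grade, 3))
```

A grade closer to 1 means the factor sequence follows the shape of the reference sequence more closely, so factors can be ranked by grade to identify those most closely associated with disaster density.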
In the multi-factor grey relational analysis and the comparison of regression and machine learning models for geological disaster risk, the RF model performed best, highlighting its advantages in handling complex nonlinear relationships and high-dimensional data. Polynomial regression and generalized additive models can serve as alternatives, providing relatively reliable predictions in certain contexts. Conversely, linear regression, support vector machines, and neural networks performed relatively poorly in this study, especially the neural network, which shows its limitations with the current dataset.