1. Introduction
The physical environment associated with heat in urban areas is known as the urban thermal environment (UTE) and can affect human thermal comfort and physical conditions [
1,
2]. During the rapid urbanization process, the changing urban form, population migration and uncontrolled land exploitation cause alterations in the surface energy balance, which in turn leads to the continuous deterioration of the UTE [
3,
4]. Statistics show that the human-induced global land surface temperature has increased by 1.07 °C from 1850–1900 to 2010–2019 [
5]. As a result, many ecological and social problems have arisen, such as ecosystem destruction, increased energy consumption and threats to human health [
6,
7,
8]. The UTE should therefore be an important research object for sustainable urban development [
9].
Studies increasingly recognize that investigating the spatial and temporal characteristics of the UTE and its driving mechanisms is a prerequisite for mitigating its deterioration [
10,
11]. Although station observations, model simulations and mobile measurements are the main methods to obtain the local thermal environment, these methods hardly reflect the spatial and temporal characteristics of the UTE on a macroscale [
12]. Studying urban scales, however, helps to understand the unique surface structure of a city [
13]. Recently, the popularity of satellite remote sensing has facilitated access to the acquisition of geospatial information related to land cover, especially in large-scale and long-term monitoring [
14]. It has become common to quantify the spatial and temporal characteristics of the UTE and the mechanisms driving it using land surface temperature (LST) obtained from satellite thermal infrared remote sensing [
11].
LST is a complex defined by multiple factors [
12,
15], such as human activities and landscape composition and configuration. Researchers often describe the biophysical characteristics of the land using landscape composition [
16]. Land covered by buildings usually increases LST, while the opposite is true for land covered by vegetation and water bodies [
17]. Studies have demonstrated that the role of landscape composition in LST can be quantified using the normalized difference building index (NDBI), fractional vegetation cover (FVC) and modified normalized difference water index (MNDWI) [
18,
19].
Human beings tend to release large amounts of anthropogenic heat through intensive and continuous activities on the ground, resulting in a rise in LST [
20]. Population density and nighttime light data are considered suitable for analyzing the contribution of human activities to LST [
21,
22]. Moreover, the combination of landscape composition and human activities can jointly facilitate the spatial arrangement of the entire landscape, i.e., landscape configuration [
23]. The efficiency of energy exchange and surface energy fluxes between landscape patches can be influenced by their spatial configuration, which in turn leads to differences in the distribution of LST [
15].
Most studies have used landscape pattern indices to examine the role of the spatial configuration of landscape patches, including area factors, shape factors, aggregation factors and diversity factors [
15,
24]. It is important to note that the landscape pattern indices are multidimensional and usually include landscape-level indices and class-level indices in studies [
25]. Landscape-level indices are applied to reflect the spatial configuration of multi-class mosaics of landscape patches, but it is difficult to show the morphological characteristics of each landscape patch [
26].
Class-level indices can provide more detailed information on the spatial pattern of each type of landscape patch than landscape-level indices [
26]. The stratified nature of ecological processes in the landscape may influence LST [
15]. As a result, a growing number of researchers have investigated the impact of landscape pattern indices that combine both the landscape level and the class level on LST. For example, by studying the relationship between the spatial configuration of heat source and heat sink landscapes and LST in Zhengzhou over an 18-year period, Zhao et al. [
27] found that both landscape-level and class-level landscape pattern indices have a significant impact on LST and can inform urban planning based on the results of the different levels. Furthermore, there is a consensus among researchers that it is more important to optimize the landscape composition than to optimize the landscape configuration at different levels when reducing LST [
15,
27]. However, the difference in the importance of landscape pattern indices at the landscape level and class level in reducing LST has not received much attention. They represent different spatial hierarchies of landscape configuration and may contribute differently to LST [
28]. By comparing the differences in the contributions of landscape-level and class-level landscape pattern indices to LST, it is possible to clarify whether the importance of landscape configuration in reducing LST varies from level to level and thus to inform the adjustment of urban planning strategies [
28,
29].
In fact, quantifying the contribution of drivers is challenging because their effects on LST may be the result of a nonlinear interaction process [
30]. For instance, by studying the effect of urban landscape structure on LST, Zhou et al. [
31] found interactions among some landscape indices; i.e., changes in some landscape indices will indirectly enhance or diminish the effects of other indices on LST. Hu et al. [
32] highlighted the existence of thresholds for the effects of the Normalized Difference Vegetation Index (NDVI) and NDBI on LST through empirical studies. Common analytical methods for exploring the effects of drivers mainly include correlation analysis and regression analysis [
33,
34,
35]. However, these methods are only applicable to explore linear relationships between variables, and most of them have difficulty detecting interactions between independent variables, making it difficult to capture the marginal effects of variables on LST. Meanwhile, the highly complex variability and spatial heterogeneity of the UTE require deeper insight to accurately characterize the relationship between LST and its drivers [
30]. As a statistical method that has emerged in recent years, a geographical detector can quantify the impacts of drivers and their interactions on spatial heterogeneity [
36]. The detector does not follow the linearity assumptions of traditional statistical methods, and it can avoid the effects of collinearity between variables [
37]. This approach has become common in attribution analysis in a number of fields, such as social sciences, environmental sciences and public health [
38,
39,
40]. Therefore, this study conducted a multi-level attribution analysis for LST using the geographical detector.
In addition to the lack of quantification of the interactions and nonlinear effects of drivers on LST, there are limitations in research on the driving mechanisms of LST at the time scale. Much of the research on the driving mechanisms of LST focuses on a single stage of urbanization [
20,
41] while ignoring the differences in the driving mechanisms of LST across different urbanization stages. Driving mechanisms may show variable results across different stages of urbanization, so the neglect of such variability may not support the development of urban planning strategies to reduce LST in differentiated UTE contexts [
42].
The rapid urbanization of major Chinese cities over the last twenty years has led to the transformation of land cover, which in turn has resulted in the continuing deterioration of the UTE [
43]. As one of China’s fifteen sub-provincial cities, Shenyang’s urbanization rate was over 20% higher than China’s average urbanization rate in 2020. Meanwhile, Shenyang’s status as an important industrial base in China has allowed it to undergo a phase of rapid urbanization and led to increased UTE deterioration [
44]. Like many cities, Shenyang has faced the threat of extreme weather conditions in recent years [
45]. In the summer of 2018, the highest temperature in Shenyang reached 38.4 °C, surpassing previous statistically extreme values. Therefore, Shenyang can be used as a representative for the study of UTE deterioration during rapid urbanization, and the findings can be a reference for other cities with similar development backgrounds.
Based on the 2000, 2010 and 2020 LSTs of Shenyang, this study analyzed the spatial and temporal differentiation characteristics and driving mechanisms of the UTE. The main objectives of our study were (1) to detect the spatial and temporal differentiation characteristics of LST and (2) to construct an analytical framework to quantify the relationship between landscape composition, landscape configuration, human activities and LST heterogeneity in a hierarchical manner and to reveal the changing characteristics of the driving mechanisms at different stages of rapid urbanization. This study can enhance the understanding of the mechanisms driving the UTE during rapid urbanization and provide marginal contributions to the formulation of strategies for land use and urban planning.
4. Discussion
4.1. Spatial and Temporal Differences
Researchers in urban ecology believe that the UTE reflects the result of the surface and atmospheric energy balance [
11]. The study of the spatial and temporal differentiation of the UTE is fundamental to understanding ecological change and urban development [
12,
31]. From 2000 to 2020, LST in Shenyang continued to rise. The hotspots consistently extended in the northeast–southwest direction, and the hot/cold spots in the city center and its suburbs kept expanding. These findings indicate that rapid urbanization has indeed resulted in the deterioration and spatial heterogeneity of the UTE [
58].
Among all land use types, we found that the hotspot regions were always concentrated on impervious surfaces. Impervious surfaces, including impervious landscapes and exposed surfaces in cities, are thought to always increase urban temperatures by holding heat and reducing evaporative cooling [
59]. It is noteworthy that, although water and forest were always the land types with the lowest mean values of LST, the cold spots in Shenyang were always clustered in croplands near water during the 20 years. This may be influenced by the area of land cover [
60]. As shown in
Table 8, cropland was always the dominant land type in the study area, with a consistent percentage of over 58%, although it continued to decrease. In contrast, forest and water continued to expand, but the sum of the areas never exceeded 9%. For the whole study area, the smaller area proportions make it difficult to form spatial clusters with significant cold spots in forest and water. However, the growth of large areas of crops in summer, such as corn and rice, can lead to an increase in surface soil moisture and evapotranspiration and thus can promote the clustering of cold spots [
61,
62].
An interesting phenomenon is that the LST hotspots in the Hun River and its surrounding areas in the central city of Shenyang have been disappearing over the past 20 years, creating two heat islands to the north and south along the Hun River. This may be due to the pollution control and ecological protection of the Hun River. As an important industrial city in China, Shenyang has well-developed food and chemical industries, but the uncontrolled discharge of pollutants led to extremely serious pollution of the river until 2002 [
63]. The Shenyang municipal government then embarked on river management and gradually established the landscape corridor to optimize the quality of the river and its surrounding ecology [
63]. Through these measures, the LST in the area has also been reduced, gradually creating a corridor separating LST hotspots.
4.2. Driving Mechanism at the Class Level
The relationship between the landscape configuration and LST varies considerably across land types at the class level. On longer time scales, the PLAND and LPI of cropland and impervious surfaces were always the main landscape configuration indices that affected LST. The intensity of LST decreases as the area and patches of cropland increase, while the opposite is true for impervious surfaces. As part of the urban green space, the positive impact of the increased size and abundance of cropland on LST is associated with increased irrigation and an increased scale of transpiration during crop growth peaks [
64,
65]. Furthermore, the aggregation and large-scale expansion of built-up land can lead to a reduction in surface infiltration and surface moisture, which in turn leads to the deterioration of LST [
66].
It is noteworthy that the LST of cropland always decreased rapidly in the first two levels of LPI and PLAND, and the LST of impervious surfaces always increased rapidly in the last two levels of LPI and PLAND. This implies that the transpiration cooling effect of cropland on LST occurs rapidly with increasing area and abundance [
67]. Moreover, when the impervious surface increases to a certain extent, unrestricted urban expansion leads to difficulties in finding a balance between urban development and urban ecology, resulting in a rapid rise in LST [
68]. However, over the 20-year period, the difference in LST increased for each level of PLAND and LPI, and the relationship between PLAND, LPI and LST gradually changed from nonlinear to near-linear. This can be attributed to rapid urbanization [
69]. Rapid urbanization has led to massive urban expansion, rapid land cover changes and ever-increasing LST. In this case, while the PLAND and LPI of cropland and impervious surfaces still have effects on LST, their marginal effects are more difficult to specify, and further attention needs to be paid to the interaction of landscape configuration to maximize LST reduction.
For other landscape configuration indices, even if the individual explanatory power was not strong, the interactions between them tended to enhance their impact on LST. This implies that the complex synergy of factors is more important than the role of a single factor [
38]. The strongest explanatory power of the interactions appeared in the area and shape indices of the impervious surface, including LPI, PLAND, ED and FRAC_AM. This suggests that the combination of area, abundance, shape and connectivity patterns of different patches of impervious surface needs to be focused on when optimizing the UTE [
70]. The PLAND, LPI and ED of cropland also showed strong interactions, suggesting that the combined configuration of the area, abundance and patch connectivity patterns of cropland also needs to be taken into account. Attention also needs to be paid to the interacting configurations of the LPI and FRAC_AM of water, as they also showed stronger interactions in 2010 and 2020.
4.3. Driving Mechanism at the Landscape Level
Landscape composition and human activities are key drivers that influence LST at the landscape level. The NDBI, population density, and nighttime light were significantly and positively correlated with LST from 2000 to 2020. During the rapid expansion phase of cities, population migration and economic activity can increase anthropogenic heat release and LST [
71,
72]. In contrast, FVC showed a significantly negative correlation with LST. Urban green areas can promote air ventilation by improving the thermodynamic properties of the land surface, which can help reduce LST in urban areas [
71,
73]. Unlike previous studies, the MNDWI and LST were not always significantly negatively correlated, but rather, there was a threshold effect. The positive effect of the MNDWI on LST could be attributed to the fact that the remote-sensing images were acquired at night and in late summer. Gunawardena et al. [
74] showed that urban heat island intensity is greatest at night and in late summer, which may lead to warming effects on water bodies. In addition, we speculate that LST will rapidly decrease when the surface water content reaches a certain threshold.
The landscape configuration indices had less explanatory power for LST at the landscape level than at the class level. This finding might further indicate that the configuration of land cover is more valuable than that of the entire landscape when studying the driving forces of LST [
75,
76]. However, we still found strong explanatory power for FRAC_AM for LST at the landscape level. Complex landscape patches can lead to a reduction in LST, especially in the first two levels of FRAC_AM. Zhao et al. [
27] argued that the shape complexity of urban landscapes should receive more attention when aiming to reduce the LST of cities, which is consistent with our findings.
With the exception of the MNDWI and FRAC_AM, all drivers showed a near-linear correlation trend with LST over 20 years, similar to the results at the class level. This suggests that more attention also needs to be paid to the interaction of drivers at the landscape level during the rapid urbanization phase.
The two variables with the strongest interaction effects on LST were constantly changing over 20 years. However, one of the interacting variables always included the NDBI. The single explanatory power of the NDBI for LST is very strong, but the explanatory power can still be enhanced by its interaction with other factors, such as nighttime lighting, population density, FVC and SHDI. These findings emphasize the importance of built-up areas in LST at the landscape level. At the same time, urban planners need to pay attention to the combination of buildings and vegetation, population, economic activity and the diversity of the landscape when optimizing the UTE.
4.4. Impact on Urban Planning
This study explored the spatial and temporal differentiation characteristics and driving mechanisms of the UTE. In addition, the differences in driving mechanisms across different stages of urbanization were investigated. The results of this study may therefore have important implications for UTE-based urban planning. At the landscape level, due to the high correlation between the NDBI, population density and nighttime light and LST, the high density of buildings, concentrated population and frequent economic activities can have a significant impact on UTE deterioration. Therefore, urban sprawl and human activities should first be strictly controlled during rapid urbanization. FVC and FRAC_AM are significantly negatively correlated with LST, so during rapid urbanization, increasing vegetation cover and landscape diversity can help to rapidly optimize the UTE, for example, by creating pocket parks and landscape corridors in high-density urban cores and increasing the protection of ecological reserves in the outskirts of cities. The marginal contribution of the MNDWI to LST suggests that the stock of water bodies should be maintained, their quality should be improved, and large lakes should be built on the outskirts of the city. At the class level, the relationship between PLAND and LPI and LST for cropland and impervious surfaces indicates that an ecological red line for cropland should be drawn, and continuous cropland should be built on the outskirts of the city to avoid the encroachment on cropland caused by concentrated urban development during rapid urbanization. The integration of the main drivers at the landscape level and class level allows for the efficient optimization of the UTE during rapid urbanization.
In the later stage of rapid urbanization, the development of large urban areas is constrained, and the marginal contribution of a single driver to UTE is reduced. At this stage, therefore, urban planners need to focus on the interactions between drivers to maximize the reduction in LST. Firstly, according to our study, the combination of the area, abundance, shape and patchy connectivity patterns of impervious surfaces needs to be focused on. This means that when controlling urban sprawl, attention should also be paid to the diversity of morphology and the complexity of the boundaries of urban built-up areas. Secondly, the area, abundance and patch connectivity patterns of cropland have strong effects on LST. This suggests that when conserving cropland, attention needs to be paid to the size of the cropland, the type of crop and the complexity of the cropland boundaries. Thirdly, the abundance and morphology of water can also significantly interact with LST, and it is therefore recommended that when increasing water area, it should be combined with a diversity of morphological features. Finally, the combination of buildings and vegetation, population, economic activity and landscape diversity has a significant impact on LST, suggesting that, in the later stage of rapid urbanization, urban development should be intensified, and the ecological benefits of the landscape should be achieved by allocating diverse vegetation and landscape functions to high-density built-up areas.
5. Conclusions
Based on three Landsat images of Shenyang city in 2000, 2010 and 2020, we investigated the spatial and temporal differentiation characteristics of the UTE and its driving mechanisms during rapid urbanization using the SDE, hotspot analysis and the geographical detector. We identified the driving mechanisms by which landscape composition, landscape configuration and human activities affect the UTE at multiple levels. The key findings of this study are as follows.
Firstly, the average LST in Shenyang continued to rise and had significant spatial and temporal differentiation characteristics during the twenty-year period. The hotspot regions extended in the northeast–southwest direction. Hotspot and cold spot areas continued to expand in urban and suburban areas, respectively. The hotspots were always concentrated on impervious surfaces, while the cold spots were scattered in croplands near water. Over a period of 20 years, the hotspot in the central city has gradually been divided into two parts along the north and south banks of the Hun River.
Secondly, the spatial configuration indices at the class level are more important than those at the landscape level when studying the UTE. PLAND and LPI for class-level cropland and impervious surfaces were strong drivers of LST over the 20-year period, whereas only FRAC_AM at the landscape level had a consistently significant effect on LST.
Thirdly, as urbanization progresses, the marginal effects of some drivers on LST decrease, showing a near-linear correlation. At the class level, increases in the PLAND and LPI of cropland could rapidly reduce LST, while increases in the PLAND and LPI of impervious surfaces could rapidly increase LST. At the landscape level, the NDBI, population density and nighttime light were positively correlated with LST, FVC and FRAC_AM could effectively reduce LST, and there was a threshold for the response of the MNDWI to LST. However, over a period of 20 years, the nonlinear relationship between all factors and LST, except the MNDWI and FRAC_AM, gradually shifted to a near-linear relationship.
Finally, the interaction between the drivers should receive more attention when optimizing the UTE in the later stage of rapid urbanization. All class-level and landscape-level factors had enhanced effects on LST through their interactions. At the class level, the focus needs to be on the combination of the area, abundance, shape and patch connectivity patterns of impervious surfaces. The combined configuration of the area, abundance and patch connectivity patterns of cropland and the patch abundance and shape of water also need to be considered. At the landscape level, attention needs to be paid to the integration of buildings and vegetation, population, economic activity and landscape diversity. These results will provide the basis for a more comprehensive understanding of the UTE and the implementation of urban planning strategies during rapid urbanization.
Our study also has some limitations. First, our study was conducted only in Shenyang, a typical city in cold regions. The spatial pattern of LST may vary greatly among different cities due to differences in urban development patterns. Recent studies have concluded that conducting studies on LST in different cities or urban clusters can avoid bias in the results [
13,
77]. Second, in this study, only three remote-sensing images were selected for the time-scale research. Although the meteorological conditions of the selected remote-sensing images are similar and good, it is difficult to reflect the results of the study on a time scale of 20 years with data from only three remote-sensing images. Follow-up studies could increase the number of remote-sensing images to draw more precise conclusions. Third, we studied only one analysis unit at 2 km*2 km. Although this unit is recognized, the modifiable areal unit problem (MAUP) may exist, and different analysis scales may lead to differences in the results [
54]. Therefore, in future UTE studies, the inclusion of different statistical scales should be considered to increase the accuracy of the results. Finally, the drivers affecting changes in the UTE may not be limited to two-dimensional composition and configuration. Three-dimensional landscape indicators, such as the floor area ratio and shadow patterns, have been found to influence the UTE [
78,
79]. However, this was not discussed in this study. The relationship between three-dimensional urban landscape metrics and the UTE should be emphasized and combined with two-dimensional metrics for a comprehensive analysis.