1. Introduction
The rapid pace of urbanization has become a global phenomenon, presenting a myriad of challenges that threaten the sustainability and well-being of cities worldwide. According to the United Nations, in 2018, 55% of the world’s population resided in urban areas, and this figure is expected to rise to 68% by 2050 [
1]. However, the problem of unbalanced development within urban agglomerations is becoming increasingly prominent [
2], especially in developing countries. In 2019, the Yangtze River Delta region experienced unbalanced internal development, with Shanghai’s per capita GDP being two to three times that of Anhui Province [
3]. Similarly, in the BosWash urban agglomeration on the Atlantic coast of the northeastern United States, New York City’s per capita GDP in 2019 was
$200,259, while Pennsylvania’s was only
$60,334 [
4]. Based on the experience of developed countries [
5], as China’s urbanization process gradually enters the late stage, how to promote high-quality and high-level urbanization development has become a key issue, especially in terms of how to promote balanced development in narrow regional disparities [
6]. Recently, the Chinese government has formulated a strategy with urban regions as the main carrier, advocating new-type urbanization and promoting collaborative development among cities [
7]. In this context, studying the spatial patterns of urban agglomerations to facilitate future growth is an urgent task.
In recent years, the interaction between cities affects urban growth and spatial patterns [
8,
9,
10]. The urban flows [
11], i.e., the flow of populations, materials, information, and other elements between cities, are the intrinsic motivation influencing the evolution of urban spatial structures [
12,
13]. Traditional gravity-based models have proven to be effective for measuring the spatial interactions of urban flows, but the connections between cities tend to be more multi-sourced [
14,
15]. Some scholars have constructed the population flow, information flow, and gravitational effects between cities to reflect the cooperative expansion of cities under regional development [
16]. In the field of land use prediction, there are generally two ways for urban flow elements to be embedded in land use models. One approach is to quantify urban flow as an influence factor [
14]; the second approach is to use means such as binary logistic regression [
17,
18,
19,
20] to construct new spatial interaction rules and thus embed them in land use models. Based on this, many studies have also focused on the insufficient consideration of hierarchical differences within urban agglomerations in land use. Some researchers have used the hierarchical generalized linear model (HGLM) [
21] or other functional approaches to construct a multilevel vector cellular automata (CA) model [
22,
23] to explore the impact of differences within urban agglomerations on land expansion. These studies show that the urban flow model has achieved good results in measuring the spatial connections within urban agglomerations and thus predicting land use.
In terms of land use prediction, various models are commonly used as simulation tools, including the CA-Markov model [
24,
25], CLUE-S model [
26,
27], the FLUS model [
28,
29], and artificial neural networks [
30]. Among them, the patch-generating land use simulation (PLUS) model adopts raster cells as the basic simulation unit, which can better express spatial heterogeneity, capture the characteristics of land use change on the local scale, and can better simulate the evolution of future patches [
31]. However, existing studies using the PLUS model to predict future land use patterns often focus on the selection of influencing factors and the consideration of scale effects [
32]. In contrast, existing studies using the PLUS model to predict future land use patterns tend to focus on the selection of impact factors [
32], the consideration of scale effects [
33], the spatial interactions between cities [
34], etc. However, existing PLUS studies have given insufficient consideration to the internal hierarchical differences in urban agglomerations and cannot take urban agglomerations as a whole and explore the impact of internal differences on land expansion [
35]. Meanwhile, the emergence of multi-source big data presents new opportunities for exploring the influence of human socio-economic factors on land use change when selecting PLUS influencing factors. Multi-source big data, such as nighttime lighting (NTL) [
36], remote sensing data, Tencent User Density (TUD) [
37], and Point of Interest (POI) data [
38], can reflect human socio-economic activities at a fine spatial resolution. Therefore, integrating modern multi-source data will be more favorable to the final prediction results for high-resolution land use conditions.
Considering these factors, this paper proposes the construction of a multi-scenario PLUS model that integrates inter-city interactions and a hierarchical structure. Using the Beijing–Tianjin–Hebei (BTH) region as a case study, we consider four dimensions: economy, transportation, population, and information. We construct multidimensional city interaction indicators, carry out hierarchical divisions, and introduce HGLM into the PLUS model. By integrating multi-source big data, we simulate and analyze the spatial and temporal evolution of the BTH region’s land use characteristics and explore the impacts of the hierarchical structure of inter-city interactions on the city cluster’s land expansion. Furthermore, based on the background of coordinated city cluster development, we set up differentiated scenarios to predict the future expansion pattern of the BTH region, aiming to provide decision-making references for regional coordinated development.
3. Methods
3.1. Overview
This paper integrates multiple models and methodologies to conduct multi-scenario simulations of urban expansion in the BTH region. First, it employs the urban flow model to develop urban flow indicators across four dimensions: economy, transportation, population, and information. Subsequently, the study selects various factors, including urban flow indicators, distances to city centers, distances to county centers, railways, distances to national highways, distances to expressways, and distances to provincial roads, to construct an HGLM, which calculates a hierarchical urban flow layer. This layer is then incorporated into the driving factors of the PLUS model for land use simulation, resulting in an integrated PLUS model. Furthermore, the paper effectively utilizes multi-source big data, such as nighttime light data and Points of Interest (POI) data, as influential factors in the simulation, thereby enhancing the accuracy of the predictions.
3.2. Urban Flow Model
Urban flow models are essential tools for studying various flow phenomena within cities and for informing practices in recent years. These models portray the spatio-temporal dynamics of cities from multiple dimensions, such as population, transportation, and economic factors [
12,
13,
46,
47]. Based on the basis of these studies, we select the economic flow (urban flow model) and four dimensions of urban flow (economic flow, traffic flow, population flow, and information flow) to provide a multidimensional representation of the urban flow. Most of the economic flow data have been evaluated by gravitational models [
10,
14]. However, compared with the gravity model, the urban flow model can introduce the concept of the comprehensive quality of the city and consider the impact of multiple indicators on the city’s economic strength, rather than relying solely on a single GDP indicator. As a result, the urban flow model can provide a more multidimensional and comprehensive representation of economic indicators. For traffic flow, population flow, and information flow, the weighted sum is adopted. Specific indicator selection and data sources are shown in
Table 2.
The intensity of urban flow indicates the extent of influence generated by a city’s outward functions (including agglomeration and radiation) in the linkages between cities within a city cluster. This indicator can reflect the degree of connectivity between each city in the cluster and the outside world, essentially measuring a city’s ability to radiate influence beyond its boundaries [
48]. Based on this model, the economic flow of city
i is calculated as follows:
where
N denotes the functional efficiency, measured in conjunction with the
GDP, while
represents the urban functional efficiency of city
i. The term
denotes the amount of industrial outward function, which is closely related to the location quotient (
). To measure the relative specialization of urban economic activities in the BTH region, this paper employs the following formula to calculate the location quotient of employees:
where
represents the number of people employed in industry
j in city
i;
denotes the total number of persons employed in city
i. The variable
represents the number of employees in industry
j in the study area, and
denotes the total number of persons employed in the study area.
To describe the regional economic linkages between the cities, the following formula is used to express the amount of industry outward function (
):
In this equation, represents the number of people employed in industry i in city j. A value greater than zero indicates that the industry is economically exportable to other cities.
The total amount of outward functionality for the m sectors in city
i can be expressed as follows, indicating that the total amount of economic linkages in city
i is equal to the sum of the outward functionality across all industries within that city:
3.3. HGLM
The HGLM is a statistical model designed to handle nested structured data [
49]. As a generalized form of hierarchical linear models (HLMs), which are used to deal with multilevel data with nested structures, the HGLM can accommodate various types of dependent variables, such as continuous, dichotomous, counting, and ordinal variables. This expands the scope of application for HLMs [
50]. For the binary classification of whether an area is urban land or not, it is more appropriate to use HGLMs. When determining whether a city-wide binary classification is appropriate, the HGLM is more suitable. Compared to traditional logistic regression models that assume individual independence, the HGLM takes into account the hierarchical structure of the data, resulting in better statistical properties and a stronger explanatory power.
In this paper, all the 30 m × 30 m tuples in the BTH region are considered to be level 1, while the prefecture-level cities in the same city cluster are treated as level 2. This approach constructs a data model with a nested structure.
Before employing the HGLM, a null model determination is required, using the Intraclass Correlation Coefficient (ICC) to assess the between-group differences in the hierarchical linear model.
In the above equation,
denotes the between-group variance, and
represents the within-group variance. The
ICC values range from 0 to 1, with a low within-group correlation when
ICC < 0.059, moderate within-group correlation when 0.059 ≤
ICC < 0.138, and high within-group correlation when
ICC ≥ 0.138 [
50].
To investigate the effects of the meta-cellular and urban layer variables on the dependent variable, this study first incorporates the meta-cellular layer variables to the null model and constructs a Random Coefficient Model (RCM). This step verifies whether the effects of the meta-cellular layer variables on the dependent variable are significant and assesses the degree to which they explain the variance.
Subsequently, the urban layer variable is incorporated into the null model to construct a Random Intercept Model (RIM). This step tests the significance of the effect of the urban layer variable on the dependent variable and evaluates the extent to which it explains the variance.
The specific formulas of the level 1 model are as follows:
where
is the suitability of the meta-cell to be transformed into urban land;
is the probability that tuple
i is judged to be urban land use at the tuple level;
. is an explanatory variable at the tuple level, and
is the regression coefficient of
.
The formulas for the level 2 in HGLM are as follows:
where
and
denote the intercept and slope of the regression of the dependent variable on the independent variable for the
jth prefecture-level city, respectively;
denotes the
mth independent variable of the
jth prefecture-level city in the city layer;
denotes the intercepts for
, and
denotes the intercepts for
;
denotes the slopes of
, and
denotes the slopes of
; and
and
are the random error term.
3.4. PLUS Model
The PLUS model is a CA model for predicting and modeling land cover change [
47]. The core function of the PLUS model is to analyze historical land use change data to explore the driving factors and patterns of various types of land use expansion, and to simulate future land use change based on this model [
31]. The PLUS model consists of two modules: the land expansion analysis strategy (LEAS) module and the multi-type stochastic plate seed (CARS) module [
27].
The main role of the LEAS module is to extract samples of various types of land use expansion from historical land use change data and use the Random Forest algorithm to establish the relationship between land use expansion and its drivers. The Random Forest algorithm can handle high-dimensional data and measure the importance of each driver. The outputs of the LEAS module include the development probability of each type of land use and the contribution of the drivers. The development probability can be expressed as follows:
where
Pi denotes the development probability of the
ith land use category,
,
, …,
denote the n drivers, and
f denotes the functional relationship established by the Random Forest algorithm.
The main role of the CARS module is to automatically generate land use change simulation results that align with the historical pattern of change under the constraints of the development probability output from the LEAS module. The CARS module adopts the random seed generation and threshold decreasing mechanism and controls the expansion speed and amount of each type of land use by constantly adjusting the threshold, ultimately generating simulation results similar to the actual changes. The calculation formula for the threshold value is as follows:
where
Ti(
t) denotes the threshold value of land use in category
i at the
tth iteration,
Ti(
t − 1) denotes the threshold value of the previous iteration, and Δ
Ti denotes the decreasing amount of the threshold value, which can be set according to the actual situation.
The PLUS model realizes the prediction and simulation of land use change through the organic combination of two modules, LEAS and CARS. Compared with the traditional CA model, the PLUS model can better explore the driving factors of land use change and generate more realistic simulation results.
4. Results
4.1. Multilevel Analysis of Urban Regions Based on Urban Flows
In this paper, four dimensions of economic flow (urban flow model), traffic flow, population flow, and information flow are selected to provide a multidimensional representation of urban flow. Seven driving factors, including the urban flow intensity, municipal center, district and county centers, railroads, national highways, high speeds, provincial highways, and roads, are selected for the simulation. Finally, a hierarchical urban flow layer is calculated and integrated into the driving factors of PLUS for the calculation.
In the HGLM, for continuous dependent variables, the ICC is calculated as before; however, for dichotomous dependent variables, since the level 1 residuals obey a logistic distribution, their variance is fixed at . The null model test of this paper’s model yields a between-group variance of 0.88162 and an ICC of 0.211337. An ICC ≥ 0.138 indicates a high degree of within-group correlation, with some between-group dependence and between-group differences. The use of multilevel modeling can more accurately portray the structure of the data and improve the validity of statistical inference.
In this paper, a multilevel logistic regression model is employed to classify the independent variables into two levels: the meta-cellular level (level 1) and the urban level (level 2). By constructing a series of nested models, including the stochastic covariate model, stochastic intercept model, and complete model, the independent variables are screened and categorized into two categories: explanatory variables and control variables.
At the meta-cellular level (level 1), six independent variables are included in this paper. Among them, the distance from the city center, distance from the district and county centers, and distance from the provincial highway are regarded as explanatory variables; the distance from the railroad, distance from the highway, and distance from the national highway are regarded as control variables.
At the city level (level 2), this paper introduces the urban flow intensity as a group-level variable. In this study, it is assumed that the urban flow intensity varies depending on the city in which it is located, but all metropolitan cells within the same city enjoy the same urban flow intensity. By introducing the city-level variable, this paper can examine the impact of inter-city differences on the spatial expansion of urban agglomerations.
The dependent variable in this study is binary, with a value of 1 indicating that the metropole is classified as urban land and a value of 0 indicating that it is classified as non-urban land. Through the multilevel logistic regression model, this paper estimates the effects of different levels of independent variables on the probability of urbanization of the tuple and reveal the driving mechanism of the spatial expansion of the urban agglomerations. Meanwhile, by comparing the goodness of fit and explanatory power of different models, this paper can assess the relative importance of the variables at different levels and provide a comprehensive and detailed empirical basis for understanding the spatial evolution of urban agglomerations.
The results presented in
Table 3 indicate that the spatial expansion of urban agglomerations is mainly affected by the intensity of urban flow, distance from county-level centers, distance from municipal centers, and distance from provincial highways. Among these factors, the urban flow intensity has a significant positive effect on the probability of metropolitan urbanization, while the probability of metropolitan urbanization decreases as the distance from county-level centers, municipal centers, and provincial highways increases. Furthermore, the urban flow intensity enhances the negative effect of distance from county-level centers and city-level centers on the probability of metropolitan urbanization. These results suggest that the spatial expansion of urban agglomerations is subject to a combination of multi-scale factors, including both macro-level drivers at the city level and micro-level locational influences at the meta-cellular level.
4.2. Driver Factor Selection
To clarify the complex driving forces of the BTH region expansion, this paper selects socio-economic factors and natural environment factors as the basic driving factor layers (
Figure 2). Based on the results of the urban flow’s multilevel driving force analysis of urban agglomeration expansion, hierarchical urban flow layers are added to the BTH region simulation.
To explore the relationship between the selected drivers and land use changes in the central city, this paper uses a logistic regression model for quantitative analysis. Logistic regression is a widely used statistical method for classification problems, which can establish a regression relationship with a dichotomous variable (0 or 1) [
51]. Logistic regression allows this study to assess the extent to which each driver influences land use change and reveals the intrinsic links between them.
It is generally accepted that when VIF > 10, it indicates a serious multicollinearity problem [
52]. As shown in
Table 4, the factors selected for this study do not suffer from the problem of multicollinearity, thus ensuring the reliability of the influencing factors and the robustness of the model.
The fitting accuracy of logistic regression models is usually assessed using the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve [
53]. The ROC curve is plotted with the True Positive Rate (TPR) as the vertical coordinate and the False Positive Rate (FPR) as the horizontal coordinate. The AUC value indicates the discriminative ability of the model, which ranges from 0 to 1 and is positively correlated with the model’s goodness of fit. When the AUC > 0.75, it is usually considered that the model is well fitted and has certain discriminative and predictive abilities. The ROC curves of the influencing factors are shown in
Figure 3.
As can be seen in
Figure 3, the values of the ROC are all greater than 0.75, and the curve is close to the upper left corner. The AUC value is 0.804, indicating that the 10 selected influencing factors in this paper can influence the distribution of the city limits of the BTH region and have a clear correlation with the influencing factors in terms of the spatial distribution.
4.3. Model Implementation and Calibration
The default value of the neighborhood category in the CA simulation parameter setting is 3, and both the automata take a 3 × 3 molar neighborhood. Based on the results of repeated experiments, the threshold of plaque generation is set to 0.2, and the diffusion coefficient is set to 0.1, with a higher value indicating that it is easier to generate new plaques. The random seed is set to 0.1, with a higher value indicating that the generated plaques are more dispersed. Using the Markov model based on the initial state and the transfer probability between different types, the future size of each land use type can be predicted. Based on relevant paper references [
54] and the experimental results, the neighborhood weights were finally set to 0.2 for non-urban land use and 1 for urban land use.
The final accuracy validation serves as a criterion for assessing the reasonableness of the model parameter settings and their applicability to the study area. This validation is measured using the Kappa coefficient, overall accuracy, and the figure of merit (FoM) index. These metrics quantitatively reflect the simulation’s effectiveness and can also be employed to evaluate the consistency between the two maps. In this paper, the data from 2000 and 2010 are used to predict the urban land in 2020. The overall accuracy of urban land prediction by integrating the hierarchical urban flow layer is found to be 0.981, the Kappa coefficient is 0.906, and the FoM index is 0.979. This shows that an excellent simulation effect can be obtained by considering the hierarchicality among cities when predicting the urban expansion of the BTH region.
4.4. Multi-Scenario Simulation of Urban Expansion
Building on the research of previous papers [
55] and the policy background of the coordinated development of the BTH region, this paper uses data from 2000 and 2010 for urban land forecasting, respectively, and designs four different scenarios to verify the accuracy of the model:
Scenario I: overall, this scenario overall considers the BTH synergy and incorporates the hierarchical urban flow layer.
Scenario II: this scenario considers the BTH synergy but does not incorporate the hierarchical urban flow layer.
Scenario III: this scenario does not consider the BTH synergy but incorporates the hierarchical urban flow layer.
Scenario IV: this scenario does not consider the BTH synergy and does not incorporate the hierarchical urban flow layer.
The comparison results used in this paper are shown in
Table 5 and
Figure 4. These comparisons are made to further verify the accuracy of the model.
Scenario I has the best simulation results. Its overall accuracy reaches 0.981, the Kappa coefficient is 0.906, and the FoM index is 0.979, all of which are the highest among the four scenarios. This suggests that when predicting the urban area of the BTH region in 2020, the simultaneous consideration of the inter-city hierarchical mobility and regional synergistic development can significantly improve the simulation accuracy of the model.
Scenario II has the worst simulation results, with an overall accuracy of 0.934, a Kappa coefficient of 0.699, and an FoM index of 0.928. This result suggests that considering only regional synergy and ignoring hierarchical flows between cities when predicting urban expansion significantly reduces the model’s simulation accuracy.
Scenario III is slightly less effective than Scenario IV, with an overall accuracy of 0.953, a Kappa coefficient of 0.751, and an FoM index of 0.949. This suggests that considering only hierarchical flows between cities and ignoring regional synergistic development when predicting urban expansion may reduce the model’s simulation accuracy.
Scenario IV has excellent simulation results for the second imitation, with an overall accuracy of 0.979, a Kappa coefficient of 0.859, and an FoM index of 0.976. These results indicates that the model is still able to predict urban expansion well, even if hierarchical inter-city flows and regional synergistic development are not taken into account.
In summary, when predicting the urban expansion of the BTH region, Scenario I provides the most accurate simulation results. Neglecting either of these factors will reduce the simulation accuracy of the model to some extent. This result suggests that inter-city hierarchical mobility and regional synergistic development are two key factors affecting the urban expansion of the BTH region, which need to be taken into account when constructing the model.
As shown in
Figure 5 and
Figure 6, under different scenarios, the longitudinal comparison reveals that the urban land area will expand by 19.15% in 2020–2030 and 11.29% in 2030–2040. In 2040, the urban land area will reach 1,164,502 km
2, reflecting that the level of urbanization in BTH is still increasing, and the expansion rate will slow down as time goes by. This indicates a shift towards more internal high-quality development as the level of urbanization increases.
In terms of the direction of expansion, all four sets of scenarios predict significant urban expansion in Beijing–Tianjin–Tangshan and south-central Hebei. Scenarios I and II not only predict the significant outward expansion of cities in these regions but also predict a clear trend of the group-type expansion of cities driven by the central cities of the Beijing–Tianjin Rim, Tangshan, and its environs. These regions are the high ground of urbanization in BTH and are also the focus of population and industrial agglomeration. They are likely to show group and integrated expansion in the future.
5. Discussion
5.1. Comparison of Simulation Results Under Different Scenarios
As illustrated in
Figure 7, a side-by-side comparison of the scenario results reveals that the predicted changes under the BTH synergistic development scenario exhibit greater stability, demonstrating a smoother trend of urbanization expansion. This stability may be closely related to regional synergistic development and the strong interconnections among the cities. With the acceleration of the regional integration process in BTH, transportation, economic, and resource sharing among cities have been significantly enhanced. This development has contributed to the rapid urbanization of the entire urban agglomeration while ensuring a degree of coordination and order.
In contrast, the magnitude of change in the projections under the scenario of non-synergy between Beijing, Tianjin, and Hebei shows large fluctuations. Such fluctuations may reflect the fact that the lack of regional synergistic mechanisms characterizes the expansion of cities and the allocation of resources in a more unstable and uneven manner.
In addition, the simulation integrating hierarchical urban flows (
Figure 8) showed the consistency of results across scenarios, with more reasonable and stable predictions.
On the whole, the results from Scenario I are the most consistent with the development pattern of the BTH region, indicating that under the synergistic effect of regional overall planning, the division of urban functions, infrastructure connectivity, and other regional synergies, and relying on the hierarchical and staggered urban system, the BTH region is likely to show a networked expansion pattern of “driven by the core, connected by axes and belts, and breakthrough by clusters”. The future of the BTH region is likely to show a “core-driven, axes and belts connected, clusters breakthrough” network expansion.
In the case of the BTH urban region, factors such as the radiation from core cities, the flow of population and information, the transfer of functional industries, the interconnectivity of infrastructure, and the guidance of regional policies are quantified into hierarchical urban flows, profoundly influencing the process of urban expansion in the region. Therefore, the planners and policy makers develop plans based on the perspective of the urban flow hierarchy, fully considering the complex spatial interactions within the urban region, identifying the hierarchical structure and functional division within the urban agglomeration, and set differentiated development goals and strategies for cities of different levels and types, ultimately achieving the high-quality coordinated development of regional urbanization and the economy and society.
5.2. Policy Implications
The simulation results indicate that, under the coordinated development scenario, the urban expansion of the BTH region will exhibit a networked spatial pattern. The core cities will drive regional development, while urban clusters will be interconnected through transportation corridors and functional axes, leading to a gradual achievement of integrated development across the entire region. This research outcome carries significant policy implications for fostering the coordinated development of the BTH region and other urban agglomerations. First, it is essential to optimize the regional spatial structure and establish a multi-centered, networked urban system. Second, enhancing the industrial division of labor and collaboration among cities will promote regional economic integration. Third, improving regional infrastructure and increasing the overall accessibility is crucial. Fourth, efforts should be made to equalize basic public services and reduce regional development disparities. Finally, establishing and refining a regional coordination and development mechanism will provide institutional support for the integrated development of urban agglomerations. This study offers new perspectives and methodologies for regional coordinated development and serves as a valuable scientific reference for advancing the high-quality integrated development of the BTH region and other agglomerations.
6. Conclusions
In this paper, the spatial interaction between cities within the urban agglomeration is more comprehensively portrayed by constructing a multidimensional urban flow model, which comprehensively considers the economic flow, transportation flow, population flow, and information flow. The HGLM-PLUS model is coupled to simulate urban land use in 2020. Compared with the traditional PLUS model, the overall simulation accuracy, Kappa coefficient, and FoM value are improved by 0.047, 0.207, and 0.051, respectively. This indicates that the HGLM-PLUS model, which takes into account the differences in the urban hierarchical structure, can better simulate the land use expansion of the BTH region. Under the background of BTH collaborative development, this paper conducts a simulation prediction study under four development scenarios. The results show that the urbanization level of the BTH region will continue to increase in the future, but the speed of this will slow down. The spatial presentation shows a trend of urban group expansion. When comparing the scenarios, it is evident that the predicted changes under the synergistic development scenario are more stable, whereas the fluctuation in the prediction results is larger under the scenario lacking a regional synergistic mechanism. This reflects the necessity and superiority of the BTH synergistic policy. In addition, the simulation results considering hierarchical urban flows are more consistent, and the prediction results are more robust.
The multidimensional urban flow indicator system proposed in this paper can more comprehensively characterize the interactions among cities within urban agglomerations, providing a reference for urban interaction research in other urban agglomerations. Second, this paper introduces urban hierarchical structure into land use simulation, highlighting the impact of internal structural differences within urban agglomerations on urban expansion. This approach can be extended to other urban agglomerations to better understand and guide their coordinated development. However, due to the availability and accessibility of data, the indicators used in this study are limited. More comprehensive and systematic indicators can more accurately explore the mechanism of urban expansion and obtain more realistic and usable simulation results. Future research could focus on incorporating more comprehensive indicators, exploring the long-term effects of collaborative development policies, and comparing the findings with other urban agglomerations to identify common patterns and region-specific challenges.