1. Introduction
With the development of society and the improvement of quality of life, there has been increasing attention paid to the urban environment. Urban street greening (including border trees, shrubs (Bush), grass, and other forms of vegetation) has long been recognized as an important component of urban ecosystems [1], providing significant environmental, economic, and social benefits [2]. It not only reflects urban characteristics but also mitigates the negative impacts of human activities on the natural environment [3]. Good urban street greening can bring multiple environmental benefits, such as carbon sequestration and oxygen production [4], the absorption of air pollutants [5], the alleviation of the urban heat island effect [6,7], and a reduction in noise pollution and stormwater runoff [8,9]. Moreover, urban street greening also reflects various social attributes, such as the population density, economic development, healthcare, and education, across different city areas, which are often mirrored in the Urban Street Greening General Structure (USGGS) [10]. Research has shown that residents in areas with higher income levels tend to benefit more from street greenery, raising concerns about environmental justice. Therefore, urban street greening not only serves environmental functions but is also closely tied to social, economic, and cultural contexts, making it a crucial indicator of urban sustainability and human well-being.
To effectively assess and improve urban street greening and its ecological functions, it is necessary to quantify various aspects of street greening. This requires not only identifying and describing the types of vegetation, but also quantifying their proportions and distribution in street spaces [11]. Traditional methods of urban street greening assessment rely heavily on field surveys conducted by professionals, but these methods are time-consuming, inefficient, and prone to errors due to external influences [12]. In large-scale urban surveys especially, data collection often relies on volunteers without professional expertise, which further increases uncertainty [13]. Therefore, enhancing the accuracy of assessments, reducing human resources, and scaling assessments to larger urban areas remain key challenges.
In recent years, remote sensing technologies, particularly satellite and aerial imagery, have provided new avenues for urban green space analysis [2]. These remote sensing methods allow researchers to analyze green space coverage on a large scale [14]. However, challenges remain with these methods, such as their inability to accurately capture the fine details of street-level vegetation and their sensitivity to weather and lighting conditions [15]. As a result, assessing street-level greenery with high precision remains a significant challenge that requires innovative solutions [10].
In this context, the use of street view images has gained traction as a promising approach to urban greening assessment. Unlike traditional remote sensing imagery, street view images offer a ground-level perspective that is more aligned with human visual perception, providing a more intuitive and accurate view of urban street greening. Researchers have increasingly turned to open-source street view data, such as Google Street View (GSV), Baidu Street View (BSV), and Tencent Street View (TSV), to quantify street greening and explore the relationship between street greenery and human perception [16,17]. By calculating metrics such as the Green View Index (GVI) and the Green Landscape Index (GLI), these methods provide insights into the distribution and effectiveness of street greening from a pedestrian perspective [18,19]. However, while street view images offer a novel methodological framework, current approaches have limitations. GVI, for example, can only calculate the percentage of green vegetation, without distinguishing between different plant types, making it difficult to assess the structural quality of street greening comprehensively.
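The limitation of GVI noted above can be illustrated with a minimal sketch. It assumes a per-pixel label mask is already available; the class ids, the helper names `green_view_index` and `class_shares`, and the two toy masks are hypothetical and not taken from the study. Two streets with very different vegetation structure receive the same GVI, while per-class pixel shares tell them apart.

```python
import numpy as np

# Hypothetical class ids for a per-pixel label mask (not taken from the paper).
OTHER, TREE, BUSH, GRASS = 0, 1, 2, 3
VEGETATION = {"tree": TREE, "bush": BUSH, "grass": GRASS}


def green_view_index(mask):
    """Share of pixels labelled as any vegetation class."""
    return float(np.isin(mask, list(VEGETATION.values())).mean())


def class_shares(mask):
    """Share of pixels for each vegetation class separately."""
    return {name: float((mask == cid).mean()) for name, cid in VEGETATION.items()}


# Toy street A: 40% of the pixels are tree canopy, nothing else is green.
street_a = np.zeros((10, 10), dtype=int)
street_a[:4, :] = TREE

# Toy street B: 20% tree, 10% bush, 10% grass.
street_b = np.zeros((10, 10), dtype=int)
street_b[:2, :] = TREE
street_b[2:3, :] = BUSH
street_b[3:4, :] = GRASS

gvi_a = green_view_index(street_a)
gvi_b = green_view_index(street_b)
shares_a = class_shares(street_a)
shares_b = class_shares(street_b)

print(gvi_a, gvi_b)   # both streets have the same green share
print(shares_a)       # only trees
print(shares_b)       # trees, bushes, and grass
```

Both masks have the same overall green share, so a GVI-only comparison would treat the two streets as equivalent even though only the second contains all three vegetation layers.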
With the advancement of computer vision technologies, deep-learning-based image recognition methods offer a new solution to address these challenges. Through techniques such as semantic segmentation, deep neural networks can accurately identify and classify objects in images, improving classification efficiency [20]. Recent studies have applied deep learning models to automatically process street view images and classify street greening into different vegetation types [16,19]. However, these methods still rely on public datasets, such as ADE20K and Cityscapes, which were not specifically optimized for street greening, leading to an unsatisfactory performance when applied to street view imagery [21].
In summary, previous studies have primarily focused on assessing urban greenery at the street level [16,19,22,23,24], quantifying trees at the urban level through tree cover calculations [18,25,26,27] and exploring urban street perception [28]. However, the assessment of the street-level greening quality has primarily focused on the Green View Index (GVI), which only calculates the percentage of greenery without categorizing different types of vegetation. As a result, it is unable to assess the quality of street greening from a spatial structural perspective. Recently, Zhang et al. introduced a new dataset, SGSS [29]; models trained on this dataset are capable of accurately quantifying the generalized structure of urban street greening, filling a gap in the field.
In this context, the aim of this study is to investigate the spatial distribution and structural characteristics of urban greening in Beijing, focusing on three types of greening structures: Single Tree (S-T), Tree–Bush (T-B), and Tree–Bush–Grass (T-B-G). By analyzing their distribution patterns across different socio-economic and temporal contexts, particularly before, during, and after the COVID-19 pandemic, this study seeks to uncover the underlying socio-economic dynamics that shape urban greening structures. The study hypothesizes that socio-economic factors such as commercial activity, population mobility, and economic conditions significantly influence the resilience and adaptability of these greening structures, with simpler forms like S-T being more resilient during economic stress, while complex forms like T-B-G are more vulnerable to socio-economic disruptions.
3. Methodology
3.1. Data Verification
To ensure the scientific rigor and validity of the research, the data were subjected to essential statistical analyses, such as tests for missing values and multicollinearity. Before testing, all data were normalized. In this study, MMS normalization (Max-Min normalization) was chosen.
The Variance Inflation Factor (VIF) is a statistic used to detect multicollinearity in regression models. Multicollinearity occurs when independent variables are highly correlated, leading to unstable regression coefficients and unreliable model interpretations. Usually, the VIF value should not be greater than 10 and is preferably less than 5. High VIF values indicate strong correlations among variables, which inflate the variance of the coefficients and affect the model’s predictive ability. VIF quantifies this effect by measuring how much the variance of a coefficient estimate is inflated relative to the case in which the corresponding variable is uncorrelated with the other independent variables, helping identify multicollinearity and improve the model’s stability and reliability. The calculation formula is as follows:
$$\mathrm{VIF}_i = \frac{1}{1 - R_i^2}$$
Here, $\mathrm{VIF}_i$ is the variance inflation factor of the independent variable $X_i$, and $R_i^2$ is the coefficient of determination obtained from a regression with $X_i$ as the dependent variable and all other independent variables as independent variables. $R_i^2$ measures the strength of the linear relationship between $X_i$ and the other independent variables. If the value of $R_i^2$ is high, it indicates a strong correlation between $X_i$ and the other independent variables, and the VIF value increases accordingly.
When the VIF value is low (usually less than 10), it indicates that the correlation between the independent variable and other independent variables is low, that the estimation of regression coefficients is stable, and that there is no significant multicollinearity problem. When the VIF value is high (usually greater than 10), it indicates a strong correlation between the independent variable and other independent variables, which may cause multicollinearity problems and lead to unstable regression coefficients.
The advantages of using matrix operations to calculate the VIF values are as follows. Firstly, compared with calculating each VIF value through a separate auxiliary regression, the matrix operation method is more efficient, especially when there are many independent variables. Secondly, matrix operations calculate the VIF values of all independent variables simultaneously, avoiding the repeated calculations required when regressing the variables one by one. Finally, both approaches give the same values, so the inverse of the correlation matrix yields the VIF of each independent variable without loss of accuracy.
Firstly, all independent variables are organized into a matrix $X$, where each column represents an independent variable and each row represents an observation. The second step is to calculate the correlation matrix $R$ of the independent variable matrix $X$, which describes the linear correlation between the independent variables. Finally, the inverse of the correlation matrix is used to calculate the VIF value of each independent variable. Specifically, the VIF value of the $i$th variable is the $i$th diagonal element of the inverse of the correlation matrix. The formula is
$$\mathrm{VIF}_i = \left[ R^{-1} \right]_{ii}$$
Here, $R$ is the correlation matrix of the independent variable matrix $X$ (equivalently, the covariance matrix of the standardized variables), $R^{-1}$ is its inverse, and $\left[ R^{-1} \right]_{ii}$ is the $i$th diagonal element of the inverse, which reflects the linear correlation between the $i$th independent variable and all other independent variables.
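The two ways of computing VIF described above can be compared in a minimal NumPy sketch. The function names `vif_by_regression` and `vif_by_inverse` and the synthetic data are illustrative assumptions, not the variables or code used in this study. The first function runs one auxiliary regression per variable and applies $1/(1 - R_i^2)$; the second takes the diagonal of the inverse correlation matrix.

```python
import numpy as np


def vif_by_regression(X):
    """VIF_i = 1 / (1 - R_i^2), with R_i^2 from regressing column i on the others."""
    n, p = X.shape
    vifs = np.empty(p)
    for i in range(p):
        target = X[:, i]
        others = np.delete(X, i, axis=1)
        design = np.column_stack([np.ones(n), others])  # add an intercept
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        centered = target - target.mean()
        r2 = 1.0 - (resid @ resid) / (centered @ centered)
        vifs[i] = 1.0 / (1.0 - r2)
    return vifs


def vif_by_inverse(X):
    """VIF_i as the i-th diagonal element of the inverse correlation matrix."""
    R = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(R))


# Synthetic example: the third variable is almost the sum of the first two.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
c = a + b + 0.1 * rng.normal(size=200)
X = np.column_stack([a, b, c])

vif_reg = vif_by_regression(X)
vif_inv = vif_by_inverse(X)
print(vif_reg)
print(vif_inv)
```

Because the correlation matrix is unchanged by a positive linear rescaling of each column, min-max normalization of the inputs should not alter these values.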
We found through the variance inflation factor (VIF) test that there was a high correlation between independent variables, and that the VIF values of multiple independent variables were much higher than the common thresholds (e.g., VIF > 10), indicating serious multicollinearity issues in the model. When there is high collinearity between independent variables, standard ordinary least squares (OLS) regression may lead to unstable regression coefficients, thereby affecting the predictive ability and interpretability of the model. To address this issue, we chose to use the ridge regression model as an alternative.
3.2. DeepLabV3+ Neural Network Model Quantification of USGGS
In this study, a semantic segmentation network based on the DeepLabV3+ neural network architecture, which was open-sourced by Chen et al. [30], was selected. This decision was guided by the model’s strong performance in terms of both accuracy and processing speed, which makes it stand out among the wide array of models available in the domain of computer vision (CV). Compared to traditional segmentation models, DeepLabV3+ represents a significant advancement, offering enhanced capabilities in complex tasks such as urban feature extraction. Its design is particularly suitable for urban scene analysis, enabling the precise identification and segmentation of landscape features within urban streetscapes.
The DeepLabV3+ neural network architecture builds upon DeepLabV3 as its encoder, leveraging advanced convolutional operations to generate multidimensional feature representations. A key feature of this model is its utilization of Atrous convolutions, which allow for the extraction of features at multiple scales without losing spatial resolution. To further enhance its feature extraction capabilities, DeepLabV3+ employs the Atrous Spatial Pyramid Pooling (ASPP) strategy, which facilitates multi-scale analysis by aggregating contextual information at different receptive fields. This multi-scale extraction mechanism is crucial for addressing the inherent complexity and variability of urban landscapes, where features often appear at diverse scales and orientations.
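The idea behind Atrous convolution and ASPP can be shown with a one-dimensional NumPy sketch. This is a simplified illustration under stated assumptions, not the DeepLabV3+ implementation: the helper names `atrous_conv1d` and `aspp_1d`, the three-tap kernel, and the rates are hypothetical. The same small kernel is applied with gaps of `rate` samples between its taps, so the receptive field grows while the number of weights and the output length stay the same; ASPP runs several rates in parallel and stacks the results.

```python
import numpy as np


def atrous_conv1d(x, kernel, rate):
    """Zero-padded 1D cross-correlation whose kernel taps are `rate` samples apart.

    Assumes an odd kernel length so that the output has the same length as x.
    """
    k = len(kernel)
    pad = (k - 1) * rate // 2
    xp = np.pad(x, pad)
    return np.array([
        sum(kernel[j] * xp[i + j * rate] for j in range(k))
        for i in range(len(x))
    ])


def receptive_field(kernel_size, rate):
    """Number of input samples spanned by one output value."""
    return (kernel_size - 1) * rate + 1


def aspp_1d(x, kernel, rates):
    """Parallel atrous branches at several rates, stacked as feature channels."""
    return np.stack([atrous_conv1d(x, kernel, r) for r in rates])


# A single impulse makes the sampling pattern of each rate visible.
x = np.zeros(21)
x[10] = 1.0
kernel = np.array([1.0, 1.0, 1.0])

out_rate1 = atrous_conv1d(x, kernel, rate=1)
out_rate4 = atrous_conv1d(x, kernel, rate=4)
features = aspp_1d(x, kernel, rates=(1, 2, 4))

print(np.nonzero(out_rate1)[0], receptive_field(3, 1))
print(np.nonzero(out_rate4)[0], receptive_field(3, 4))
print(features.shape)
```

With rate 1 the impulse influences three neighbouring outputs; with rate 4 it influences outputs four samples apart, which is how the same three weights cover a wider context.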
In addition to the encoder, DeepLabV3+ integrates a cascaded decoder mechanism designed to refine the segmentation output, particularly in terms of boundary details. Urban features, such as the edges of sidewalks, greenery, and other streetscape elements, often have intricate and fine-grained details that require precise delineation. The decoder mechanism in DeepLabV3+ ensures that these details are accurately captured. Furthermore, the model incorporates depthwise separable convolutions, a computationally efficient operation that simplifies the overall structure of the model, reduces the number of parameters, and enhances both accuracy and speed. These combined features position DeepLabV3+ as a highly effective tool for semantic segmentation tasks, especially those involving the analysis of urban environments.
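The parameter saving from depthwise separable convolutions can be checked with simple arithmetic. The sketch below counts weights only (biases ignored) and uses a hypothetical layer with a 3 × 3 kernel and 256 input and output channels; these sizes are illustrative and are not taken from the network configuration used in this study.

```python
def standard_conv_params(kernel_size, c_in, c_out):
    """Weights in a standard convolution: one k x k filter per (input, output) channel pair."""
    return kernel_size * kernel_size * c_in * c_out


def separable_conv_params(kernel_size, c_in, c_out):
    """Weights in a depthwise separable convolution.

    Depthwise step: one k x k filter per input channel.
    Pointwise step: a 1 x 1 convolution mixing c_in channels into c_out channels.
    """
    depthwise = kernel_size * kernel_size * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


standard = standard_conv_params(3, 256, 256)
separable = separable_conv_params(3, 256, 256)
reduction = standard / separable

print(standard)    # 589824
print(separable)   # 67840
print(round(reduction, 2))
```

For this hypothetical layer, the separable form needs roughly one ninth of the weights, which is the source of the efficiency gain described above.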
Figure 3 illustrates the operational workflow of the DeepLabV3+ neural network, highlighting its key components and processes.
To train the DeepLabV3+ model, this study utilized the Cityscapes dataset, a benchmark dataset widely used for understanding urban scenes. However, recognizing the limitations of using a single dataset, the training process was further augmented with the SGSS dataset, which was specifically tailored to the study’s objectives. By integrating these two datasets, the model was equipped to accurately identify key vegetation elements within urban streetscapes, such as trees, bushes, and ground cover plants. These elements were identified based on their distinct features, enabling the model to extract and represent the structure of urban street greenery with high precision.
The enhanced DeepLabV3+ model proved particularly effective in extracting not only greenery structures but also other characteristic elements of urban streetscapes. For instance, the model demonstrated its ability to segment roads, sidewalks, buildings, and other features commonly found in urban environments. This comprehensive segmentation capability is critical for urban studies, where a detailed understanding of the spatial arrangement and composition of streetscapes is often required.
The DeepLabV3+ training parameters used in the study are as follows. For the freeze stage: Init_Epoch = 0, Freeze_Epoch = 1000, Freeze_Batch_Size = 16, and Freeze_Lr = 5 × 10⁻⁴. For the unfreeze (thawing) stage: UnFreeze_Epoch = 2000, UnFreeze_Batch_Size = 16, and UnFreeze_Lr = 5 × 10⁻³.
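A two-stage schedule of this kind can be written down as a small configuration sketch. The numeric values come from the parameters listed above; everything else is an assumption made for illustration: that "freeze" means the backbone weights are held fixed, that Freeze_Epoch and UnFreeze_Epoch are absolute epoch indices (so the unfreeze stage runs from epoch 1000 to epoch 2000), and the names `Stage`, `STAGES`, and `stage_for_epoch`, which are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    start_epoch: int          # inclusive
    end_epoch: int            # exclusive
    batch_size: int
    learning_rate: float
    backbone_trainable: bool  # assumption: "freeze" keeps the backbone fixed


# Values from the text; the epoch boundaries assume absolute epoch indices.
STAGES = (
    Stage("freeze", 0, 1000, 16, 5e-4, False),
    Stage("unfreeze", 1000, 2000, 16, 5e-3, True),
)


def stage_for_epoch(epoch):
    """Return the training stage that a given epoch index falls into."""
    for stage in STAGES:
        if stage.start_epoch <= epoch < stage.end_epoch:
            return stage
    raise ValueError(f"epoch {epoch} is outside the training schedule")


for epoch in (0, 999, 1000, 1999):
    s = stage_for_epoch(epoch)
    print(epoch, s.name, s.batch_size, s.learning_rate, s.backbone_trainable)
```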
The results of the DeepLabV3+ model’s application are illustrated in Figure 3, which shows how the model effectively extracts urban street greenery structures and highlights the characteristic elements of urban streets. These outputs validate the model’s capacity to handle complex urban environments and provide high-quality segmentation results.
Table 2 further presents some of the segmentation results, offering a quantitative assessment of the model’s performance across various feature categories.
Overall, the integration of state-of-the-art techniques in the DeepLabV3+ architecture, including ASPP and depthwise separable convolutions, combined with the use of diverse datasets for training, underscores the model’s superiority in urban scene segmentation. The ability to capture the intricate details of urban features and greenery structures marks a significant advancement in the field, providing a robust tool for urban studies and landscape analysis. Through this study, the DeepLabV3+ model has demonstrated its potential to contribute to a deeper understanding of urban environments, paving the way for further applications in landscape architecture and urban design research.
3.3. Construction of Ridge Regression Model
In linear regression models, when there is collinearity between input features (i.e., high linear correlation between multiple features), the regression coefficients estimated by least squares can become very large, causing the model to perform poorly in predicting new data. Ridge regression reduces the sensitivity of the model to the data by introducing a regularization term (the square of the L2 norm of the coefficients) in the loss function, thus placing a limit on the size of the regression coefficients. The goal is to estimate the regression coefficients by minimizing the following objective function:
$$J(\beta_0, \beta) = \sum_{i=1}^{n} \left( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \right)^2 + \lambda \sum_{j=1}^{p} \beta_j^2$$
where $y_i$ is the actual output of the $i$th observation, $x_{ij}$ is the $j$th feature value of the $i$th sample, $\beta_0$ is the bias term (intercept), $\beta_j$ is the regression coefficient corresponding to the $j$th feature, and $\lambda \geq 0$ is the regularization parameter, which is called the “ridge coefficient” or the “penalty coefficient”. When $\lambda = 0$, the ridge regression reduces to an ordinary linear regression model. Meanwhile, in ridge regression, adding the penalty term forces the regression coefficients to become smaller, thus controlling the complexity of the model and avoiding overfitting. With the features and the output centered, so that the intercept can be handled separately, the objective function expressed in the form of a matrix is as follows:
$$J(\beta) = (y - X\beta)^{\top} (y - X\beta) + \lambda \beta^{\top} \beta$$
where $y$ is the $n \times 1$ output vector, $X$ is the $n \times p$ feature matrix, $\beta$ is the $p \times 1$ vector of regression coefficients, and $\lambda$ is the regularization parameter. The gradient of the objective function is as follows:
$$\nabla_{\beta} J(\beta) = -2 X^{\top} (y - X\beta) + 2 \lambda \beta$$
Setting this gradient to 0 yields an expression for the solution:
$$\hat{\beta} = \left( X^{\top} X + \lambda I \right)^{-1} X^{\top} y \tag{4}$$
where $I$ is the $p \times p$ identity matrix. The $\lambda I$ term in Equation (4) regularizes the $X^{\top} X$ matrix so that it is invertible, avoiding the case in which multicollinearity leads to a singular (non-invertible) matrix.
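The role of the penalty term in making the system solvable can be seen in a minimal NumPy sketch. The function name `ridge_closed_form` and the synthetic data are illustrative assumptions. Two identical feature columns make the cross-product matrix of the features singular, so ordinary least squares has no unique solution, yet the ridge solution exists and satisfies the zero-gradient condition.

```python
import numpy as np


def ridge_closed_form(X, y, lam):
    """Solve (X^T X + lam * I) beta = X^T y without forming the inverse explicitly."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


# Synthetic data with perfect collinearity: both columns are the same variable.
rng = np.random.default_rng(0)
a = rng.normal(size=50)
X = np.column_stack([a, a])
y = 2.0 * a

gram_rank = np.linalg.matrix_rank(X.T @ X)   # rank-deficient, so not invertible

lam = 1.0
beta = ridge_closed_form(X, y, lam)

# Gradient of the ridge objective evaluated at the solution; it should vanish.
gradient = -2 * X.T @ (y - X @ beta) + 2 * lam * beta

print(gram_rank)
print(beta)
print(gradient)
```

The effect of the two identical columns is split evenly between the two coefficients, which is the kind of stabilization that motivates ridge regression under multicollinearity.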
The regularization parameter $\lambda$ is very important in the ridge regression model because it controls the complexity of the model. When $\lambda = 0$, the ridge regression degenerates into an ordinary linear regression without any regularization. When $\lambda$ is too large, the regression coefficients tend to be close to 0, the model becomes too simple, and underfitting may occur. When $\lambda$ takes a suitable value, it can effectively reduce the variance of the model, prevent overfitting, and improve the generalization capabilities of the model.
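The influence of the regularization parameter can be illustrated by tracing the coefficients over a grid of values, which is the numerical counterpart of a ridge trace plot. This is a sketch on synthetic data; the function name `ridge`, the grid, and the data are assumptions and do not reproduce the values used in this study.

```python
import numpy as np


def ridge(X, y, lam):
    """Closed-form ridge coefficients for centered data (no intercept term)."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)


# Two strongly correlated synthetic features.
rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
X = np.column_stack([x1, x2])
X = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize the features
y = x1 + x2 + rng.normal(size=n)
y = y - y.mean()                           # center the output

lambdas = [0.0, 0.1, 1.0, 10.0, 100.0, 1e4]
trace = np.array([ridge(X, y, lam) for lam in lambdas])
norms = np.linalg.norm(trace, axis=1)

ols, *_ = np.linalg.lstsq(X, y, rcond=None)  # reference: ordinary least squares

for lam, coef, norm in zip(lambdas, trace, norms):
    print(lam, coef, norm)
```

The first row of the trace coincides with ordinary least squares, and the coefficient norm shrinks as the penalty grows, moving from the overfitting-prone end of the range toward the underfitting end.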
In the construction of the model, data standardization is needed first. In ridge regression, it is usually necessary to standardize the input features (i.e., the mean of each feature is changed to 0 and the variance is changed to 1) because the regularization term is affected by the scale of each feature, which would otherwise cause the regression coefficients of some features to be over-penalized. Second, the regularization parameter $\lambda$ needs to be selected. The most appropriate value is typically selected through methods such as cross-validation; in this study, $\lambda$ was determined from the ridge trace plots and takes a relatively small value. This means that the model penalizes the coefficients to some extent but still retains enough fitting ability to strike a balance between the training and test sets. Third, the regression coefficients are calculated using the closed-form solution of ridge regression (Equation (4)). Finally, the obtained regression coefficients are used to construct a model that is used to make predictions for new samples.
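The four steps above can be sketched end to end in NumPy. This is an illustrative outline under stated assumptions, not the procedure used in this study: the regularization parameter is chosen here by k-fold cross-validation over a hypothetical grid (the study itself reports using ridge trace plots), and the function names, data, and grid are invented for the example.

```python
import numpy as np


def standardize(X_train, X_new):
    """Scale both sets using the training mean and standard deviation."""
    mu = X_train.mean(axis=0)
    sd = X_train.std(axis=0)
    return (X_train - mu) / sd, (X_new - mu) / sd


def fit_ridge(X, y, lam):
    """Closed-form ridge on standardized X; the intercept is the mean of y."""
    intercept = y.mean()
    gram = X.T @ X + lam * np.eye(X.shape[1])
    beta = np.linalg.solve(gram, X.T @ (y - intercept))
    return intercept, beta


def cv_mse(X, y, lam, k=5):
    """Mean squared prediction error averaged over k contiguous folds."""
    indices = np.arange(len(y))
    errors = []
    for test_idx in np.array_split(indices, k):
        train_idx = np.setdiff1d(indices, test_idx)
        X_tr, X_te = standardize(X[train_idx], X[test_idx])
        b0, beta = fit_ridge(X_tr, y[train_idx], lam)
        errors.append(np.mean((y[test_idx] - (b0 + X_te @ beta)) ** 2))
    return float(np.mean(errors))


# Synthetic data with two nearly collinear features and one independent feature.
rng = np.random.default_rng(2)
n = 120
z = rng.normal(size=n)
X = np.column_stack([z, z + 0.05 * rng.normal(size=n), rng.normal(size=n)])
y = 3.0 + X @ np.array([1.0, 1.0, -0.5]) + rng.normal(size=n)
X_new = rng.normal(size=(3, 3))

# Step 2: choose the penalty on a hypothetical grid.
lambdas = [0.01, 0.1, 1.0, 10.0, 100.0]
best_lam = min(lambdas, key=lambda lam: cv_mse(X, y, lam))

# Steps 1, 3, and 4: standardize, solve in closed form, predict new samples.
X_std, X_new_std = standardize(X, X_new)
intercept, beta = fit_ridge(X_std, y, best_lam)
predictions = intercept + X_new_std @ beta

print(best_lam)
print(beta)
print(predictions)
```

Standardization is refitted inside each fold so that the held-out rows do not leak into the scaling statistics, which keeps the cross-validated error an honest estimate.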
5. Discussion
5.1. Changes in Urban Greening Structure Before and After COVID-19: Distribution Patterns and Their Impact
This study reveals significant insights into the spatial distribution of three types of urban greening structures—Single Tree (S-T), Tree–Bush (T-B), and Tree–Bush–Grass (T-B-G)—and how their distribution is influenced by socio-economic factors in the context of Beijing. The results indicate that the distribution patterns of these structures are closely linked to the degree of commercialization, population mobility, and available space for greening in different urban areas. Understanding these patterns is crucial for interpreting the performance of each structure and its response to socio-economic changes during and after the COVID-19 pandemic.
Single-Tree (S-T) structures are most commonly found in areas with dense road networks and higher levels of commercialization. These areas, characterized by a prevalence of hard surfaces such as pavements, restrict the available planting space, which consequently affects the distribution of S-T structures. The pandemic highlighted the resilience of these structures, as S-T proved to be well suited to areas with high commercial activity and foot traffic but limited space. The low maintenance and space-efficient nature of S-T structures made them more adaptable to economic constraints and reduced public spending during the pandemic. However, while S-T structures are practical in commercial zones, their ecological value is limited compared to more complex green systems. Future research should explore the long-term ecological impacts of the widespread use of S-T structures in high-density urban areas, as well as strategies for enhancing their biodiversity.
Tree–Bush (T-B) structures are typically located in residential areas with lower levels of commercialization. These areas tend to have more available space for planting, making them better suited to the development of T-B structures, which require more room and maintenance. The decline in support for T-B structures during the pandemic, especially in areas facing greater economic pressure, reflects a broader trend of prioritizing simpler, low-maintenance greening forms in times of economic uncertainty. While T-B structures provide a range of ecological and social benefits, including enhanced air quality and social cohesion, their higher maintenance costs and space requirements make them vulnerable during financial crises. The reduced support for T-B structures during the pandemic underscores the challenge of maintaining complex green infrastructure in areas with limited resources. Future urban greening strategies should focus on integrating T-B structures into residential areas in ways that balance ecological benefits with cost-effective management.
The Tree–Bush–Grass (T-B-G) structure is primarily found in suburban and lower-commercial activity areas, where space is more abundant. However, this structure experienced a significant decline in support during the pandemic, particularly in highly commercialized regions. T-B-G structures, which require larger spaces and more resources for maintenance, were less likely to be prioritized in areas where economic pressures led to the reallocation of funds. The reduction in support for T-B-G highlights the growing tension between the demand for multifunctional green spaces and the financial constraints faced by urban areas. While T-B-G structures offer essential ecological services, such as biodiversity enhancement and stormwater management, their complexity makes them less feasible during periods of economic downturn. The pandemic, by shifting priorities toward immediate economic recovery and basic infrastructure, has highlighted the need for more flexible and resilient green infrastructure that can adapt to varying economic conditions. Future urban greening efforts should explore how T-B-G structures can be integrated into more densely built environments or adapted to smaller spaces to ensure their continued ecological contribution.
As the post-pandemic recovery continues, urban greening strategies must be more flexible and resilient to changing socio-economic conditions. The distribution of S-T, T-B, and T-B-G structures suggests that different areas of the city require different approaches to urban greening. In highly commercialized districts, where space is limited, simpler greening forms like S-T are more feasible. Meanwhile, in suburban or less commercialized areas, where more space is available, T-B and T-B-G structures can be implemented to enhance ecological diversity and environmental quality. Future urban greening policies should aim to balance the needs of different urban areas, integrating simple and complex green structures in a way that maximizes both ecological benefits and social well-being.
5.2. The Influence of Socio-Economic Factors: Shifts in Consumption Patterns and Population Mobility
The analysis of socio-economic variables, such as the per capita disposable income (SR) and consumption expenditure (XF), provides important context for understanding how economic shifts influence urban greening. During the pandemic, as disposable income increased, there was a significant rise in the demand for more complex green spaces, particularly T-B-G structures. This trend indicates that as residents’ incomes grew, they began to place greater emphasis on environmental quality, particularly in residential areas. The increase in disposable income likely reflects a broader societal shift toward prioritizing quality of life and environmental health. However, this shift also underscores the disparity in green space demand across different socio-economic groups. Wealthier neighborhoods, where residents have more disposable income, are more likely to invest in multifunctional greening forms, while lower-income areas may face challenges in prioritizing green infrastructure. Future research should investigate how socio-economic disparities influence residents’ engagement with urban greening and how policies can bridge this gap to ensure that all communities benefit from green space investments.
During the pandemic, XF expenditures were largely directed toward essential goods and healthcare, leading to a reduction in financial support for urban greening projects. This was particularly noticeable for T-B-G structures, which are resource-intensive and require higher levels of investment. The shift in consumption patterns reveals a direct link between economic pressure and the prioritization of urban infrastructure. As financial resources were diverted towards immediate needs, such as healthcare, the long-term sustainability of green infrastructure was compromised. The findings suggest that, in future crises, urban greening projects must be designed with greater flexibility to withstand financial fluctuations. Diversifying funding sources, including public–private partnerships (PPP), could ensure the continued support and development of green spaces, even during periods of economic instability.
The significant negative impact of the number of high school students (ZXS) on T-B and T-B-G structures, particularly during the pandemic, highlights how disruptions to education affect the demand for green spaces. With the shift to online learning, there was a reduction in the time students spent in school and, by extension, a decrease in the need for green spaces around schools. This was especially true for T-B-G structures, which are often present in school environments and their surrounding areas. The redirection of funds from campus greening to online education infrastructure further exacerbated the decline in green space investment in educational settings. This trend raises concerns about the future role of school-based greening in promoting students’ mental and physical health. Urban greening policies should consider the importance of green spaces in educational environments, even as educational models shift, to ensure that students continue to have access to beneficial green spaces.
5.3. Limitations and Future Research Directions
This study also has some limitations that provide directions for future research. First, due to limited data sources, this study focuses solely on Beijing and does not compare the impact of individual characteristics versus urban environments across different cities or countries. Future research should extend this analysis to multiple cities or representative cities from different countries, and obtain more granular socio-economic and urban environmental data to conduct regression analyses on various types of urban greening and infrastructure.
Second, although we differentiate between trees, bushes, and grass within the streetscape physical environment, we did not account for factors such as tree health or aesthetic quality. The “broken windows theory” suggests that such factors are important, which underscores the value of active street monitoring and maintenance. Future studies should explore how these factors influence the effectiveness and sustainability of urban greening projects, as maintaining green spaces in a healthy, aesthetically pleasing state can significantly enhance their ecological and social benefits.
Despite these limitations, the findings of this study provide valuable insights into the role of urban greening in shaping socio-economic dynamics and highlight important areas for further research in the field of urban sustainability and resilience.