1. Introduction
As cities continue to develop, a notable shift from urban expansion to urban renewal has occurred. Demand for urban environmental quality and livability has become more pronounced [
1,
2]. In this new urban development phase, visual-spatial quality and the built environment have become top priorities [
3,
4]. Visual complexity, a key concept in environmental perception, is pivotal in building design and in assessing environmental visual quality. It not only shapes people’s emotional responses and aesthetic experiences [
5] but is also closely linked to the visual appeal of the environment [
6], its therapeutic benefits [
7], and individual behavior patterns [
8]. Thus, scientifically measuring visual complexity is crucial for creating high-quality urban built environments.
Although methods for quantifying visual complexity have been widely applied in fields such as cognitive psychology [
6], computer science [
5,
9], and the arts [
10,
11], research in environmental visual studies remains limited, and a comprehensive methodological system has yet to be established. Most studies still rely on manual evaluation as the primary data collection method. There are two main approaches: (1) expert-based objective evaluation, which follows predefined standards to systematically assess visual complexity based on factors such as image color and landscape elements; and (2) public-based subjective evaluation [
12,
13], which often encounters challenges such as high time consumption, limited evaluation scope, delayed feedback, and interference from subjective preferences.
As machine learning and computer vision technologies continue to advance, methods for evaluating image complexity are also evolving. Many studies have begun to employ machine learning techniques to assist manual scoring [
14,
15], aiming to evaluate large-scale datasets efficiently. However, this approach often faces challenges in terms of accuracy. Meanwhile, assessment methods based on the combination of image features have emerged as the mainstream [
16,
17], as these models are better at understanding and learning the underlying complex patterns in the data. Additionally, the decision-making process of such models is more transparent, and they tend to achieve higher accuracy. Cavalcante et al. utilized spatial frequency statistics and local contrast to calculate perceived complexity in the streetscape, but their method has limitations regarding stability and accuracy across different street types [
18]. Liang et al. integrated traditional assessment methods with machine learning techniques to analyze six perceptual attributes of streetscape images, demonstrating that images with higher levels of complexity tend to elicit more positive perceptions [
19]. Ma et al. classified streetscape types based on three dimensions of complexity (texture, shape, and color) and developed a digital streetscape indexing system. However, their primary focus was not on quantifying visual complexity [
20]. Supervised learning, a branch of machine learning, has demonstrated substantial potential in simulating human behavior and decision-making, especially in integrating multiple features and addressing complex problems.
Visual complexity is characterized by a high density of available information, where the ratio of available to total information in an environment, when optimized, enhances its attractiveness and arousal potential for individuals [
21]. Psychologist Mehrabian proposed that environmental information primarily stimulates individual responses through three dimensions: intensity, complexity, and novelty [
22]. Because complexity is a critical indicator influencing visual perception, defining it is a fundamental prerequisite and a key focus for conducting quantitative research. Kaplan and Kaplan (1989) linked complexity with the number and organization of visual elements in an environment, suggesting that highly complex yet coherent landscapes can provide visual richness, thereby enhancing visual aesthetics [
23,
24]. Ode, Tveit, and others defined complexity as the diversity and richness of landscape elements, as well as the dispersion of visual patterns and variability of features [
25]. Ewing viewed complexity as visual richness, emphasizing the importance of diversity in the physical environment [
26]. Most experts consider complexity from the perspective of human subjective experience, focusing on its influence on environmental perception and aesthetic experience. They connect complexity with visual characteristics and environmental perception, highlighting the contribution of diversity and richness to visual complexity [
15,
27].
Additionally, visual perception is driven by two strategies [
28]: bottom-up, which is initiated by low-level image features such as color and texture, and top-down, which is guided by high-level semantic information [
29,
30]. Traditional methods for quantifying subjective complexity rely primarily on low-level visual features, which present certain limitations. To more accurately reflect human high-level perception, this study introduces hierarchical complexity based on semantic image segmentation as a form of high-level semantic information. This innovative approach simulates human understanding of images by extracting multi-level semantic information, thus bridging the gap between low-level visual features and high-level human semantics [
31].
As illustrated in
Figure 1, this study introduces a novel method for measuring visual complexity. It employs machine learning to establish a mapping between objective image features and subjective complexity perception, enabling the prediction of subjective streetscape complexity based on objective image characteristics. This method efficiently processes large-scale datasets. The study identifies six key indicators (compression ratio, symmetry, fractal dimension, color complexity, grayscale contrast, and hierarchical complexity) and utilizes multi-feature fusion training. This approach offers a comprehensive explanation of the multidimensional nature of image complexity, closely mirroring human visual perception. The model was subsequently applied to assess the visual complexity of streetscapes in key districts of Tianjin. The primary contributions of this research are as follows:
Introduction of a new measurement dimension: The study proposes hierarchical complexity as an advanced image feature to more accurately capture the intricacies of street scenes while bridging semantic gaps and validating its effectiveness.
Development of a quantitative model: A quantitative model for streetscape visual complexity is constructed, with an analysis of the contributions of various indicators.
Empirical validation and geographic analysis: The quantitative model is empirically validated, and an in-depth analysis of the geographic distribution of visual complexity is conducted within Tianjin’s Xiaobailou and Wudadao Districts.
3. Results
3.1. Contribution of Features in the SVM Classification Model
The contribution of each feature to the SVM classification model is derived from the model parameters obtained during training. Ranked from highest to lowest contribution, the features are: compression ratio, grayscale contrast, hierarchical complexity, fractal dimension, color complexity, and symmetry (
Table 1).
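As a rough illustration of how such a ranking might be obtained, the sketch below normalizes the absolute values of per-feature weights into relative shares. It assumes a linear-kernel SVM, where the learned weight magnitudes can be read as feature contributions; this section does not state the kernel or the exact procedure used. The function name `relative_contributions` and the weight values are hypothetical placeholders, not the values reported in Table 1.

```python
def relative_contributions(weights):
    """Turn per-feature weights into shares that sum to 1.0.

    `weights` maps feature name -> learned weight (hypothetical values).
    The absolute value is used because the sign only indicates direction.
    """
    total = sum(abs(w) for w in weights.values())
    if total == 0:
        raise ValueError("all weights are zero")
    shares = {name: abs(w) / total for name, w in weights.items()}
    # Sort from the largest contribution to the smallest.
    return dict(sorted(shares.items(), key=lambda kv: kv[1], reverse=True))


# Placeholder weights for illustration only; NOT the values in Table 1.
example_weights = {
    "compression_ratio": 1.2,
    "grayscale_contrast": -0.9,
    "hierarchical_complexity": 0.8,
    "fractal_dimension": 0.6,
    "color_complexity": 0.3,
    "symmetry": -0.2,
}
ranking = relative_contributions(example_weights)
```

For a three-class model, a linear SVM would typically hold one weight vector per class or class pair, so the weights would first need to be aggregated (for example, by averaging their magnitudes) before this normalization.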
Despite its proven close relationship with visual complexity, the compression ratio has rarely been used as a metric to quantify image complexity in previous studies. In this paper, the compression ratio is employed to measure visual complexity and is identified as the feature with the highest contribution. This validates the strong connection between the compression ratio and complexity perception. Experimental results show that images with a higher compression ratio often contain extensive details and complex structures, such as intricate building textures and plant details in street scenes. These images are difficult to compress without significant loss of information, resulting in minimal differences in file size before and after compression.
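The relationship described above can be sketched with a general-purpose compressor. The example assumes the ratio is defined as compressed size divided by original size, which matches the statement that hard-to-compress images show minimal size differences; it uses `zlib` on raw bytes as a stand-in, since this section does not name the image codec actually used. The function name `compression_ratio` is illustrative.

```python
import os
import zlib


def compression_ratio(raw: bytes) -> float:
    """Compressed size divided by original size.

    Values near 1 mean the data is hard to compress (assumed definition).
    """
    if not raw:
        raise ValueError("empty input")
    return len(zlib.compress(raw, 9)) / len(raw)


flat = bytes([128]) * 10_000   # stand-in for a uniform gray image
noisy = os.urandom(10_000)     # stand-in for dense, irregular detail

flat_ratio = compression_ratio(flat)
noisy_ratio = compression_ratio(noisy)
```

The uniform buffer shrinks to a small fraction of its size, while the random buffer barely shrinks at all, mirroring the contrast between blank facades and richly textured scenes.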
High grayscale contrast areas, characterized by their distinct light-dark distinctions, tend to capture greater attention, thereby enhancing the resolution of visual details and improving the perception of three-dimensionality in the image [
60]. As a result, grayscale contrast is particularly crucial in evaluating the complexity of visual scenes. In street scenes, certain elements exhibit high grayscale contrast due to significant brightness differences from their surroundings, while most buildings or ground structures show more gradual brightness changes, resulting in lower grayscale contrast.
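One simple way to express grayscale contrast numerically is the variance of gray levels, sketched below. This is an assumption made for illustration: the exact contrast formula is not given in this section, and other definitions (for example, texture-based contrast) would produce different values. The function name `grayscale_contrast` is hypothetical.

```python
def grayscale_contrast(pixels):
    """Variance of gray levels for a 2D list of intensities in 0..255."""
    values = [v for row in pixels for v in row]
    if not values:
        raise ValueError("empty image")
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


# A flat surface versus a pattern alternating between black and white.
uniform = [[120] * 4 for _ in range(4)]
checker = [[0 if (r + c) % 2 == 0 else 255 for c in range(4)] for r in range(4)]

uniform_contrast = grayscale_contrast(uniform)
checker_contrast = grayscale_contrast(checker)
```

Under this definition, a facade with gradual brightness changes yields a low value, while elements that differ sharply from their surroundings raise it.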
Hierarchical complexity demonstrates a strong performance in the SVM model’s contribution analysis. It reflects the diversity of the street and offers insights into its vibrancy, pedestrian activity, and traffic volume. This complexity assessment, which utilizes advanced image semantic segmentation techniques, enhances traditional methods by capturing often-overlooked details. By leveraging deep learning on extensive streetscape datasets, semantic segmentation models can accurately extract high-level semantic information from images, thereby providing a more nuanced simulation of human perception of complexity.
A higher fractal dimension indicates greater self-similarity and more complex details across different scales [
46]. Streetscape images often contain numerous self-similar structures, such as the branching of trees, the arrangement of building windows, and the patterns in brick walls. These structures reflect the visual complexity and rich detail layers of street scene images and align well with subjective perceptions of complexity.
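Fractal dimension is commonly estimated by box counting, and the sketch below shows that technique on a small binary grid (for example, an edge map). It is a minimal illustration under stated assumptions: the grid is square, its side is a power of two, and the slope is fitted by ordinary least squares. Whether the study used this exact estimator is not stated in this section, and the function names are illustrative.

```python
import math


def box_count(grid, size):
    """Number of size x size boxes containing at least one foreground cell."""
    n = len(grid)
    count = 0
    for r0 in range(0, n, size):
        for c0 in range(0, n, size):
            if any(grid[r][c]
                   for r in range(r0, min(r0 + size, n))
                   for c in range(c0, min(c0 + size, n))):
                count += 1
    return count


def fractal_dimension(grid):
    """Least-squares slope of log N(s) against log(1/s) for s = 1, 2, 4, ..."""
    n = len(grid)
    if n < 4 or any(len(row) != n for row in grid):
        raise ValueError("expected a square grid of side >= 4")
    if not any(any(row) for row in grid):
        raise ValueError("grid has no foreground cells")
    sizes = []
    s = 1
    while s < n:
        sizes.append(s)
        s *= 2
    xs = [math.log(1 / s) for s in sizes]
    ys = [math.log(box_count(grid, s)) for s in sizes]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


filled = [[1] * 16 for _ in range(16)]                # a solid plane
line = [[1] * 16] + [[0] * 16 for _ in range(15)]     # a single straight line
```

The two test shapes bracket the usual range: a straight line gives a dimension of about 1 and a filled plane about 2, so branching trees or window grids would be expected to fall in between.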
Previous research has suggested that visual features like color are key factors in increasing landscape diversity and significantly impact complexity perception in urban environments [
61,
62]. However, Ciocca et al. (2015) presented a contrasting view, finding that metrics such as color richness do not correlate as strongly with complexity perception as expected [
9]. This paper also explores the impact of color complexity on complexity perception and finds that while color complexity plays a role, its influence is relatively limited. In street scene image analysis, despite the rich visual information provided by color variation, grayscale information may more accurately reflect image complexity in certain contexts [
38]. For instance, under low light or adverse weather conditions, color information in street scene images may be diminished, whereas grayscale contrast and structural information maintain higher stability.
Symmetry contributes the least to the model, providing minimal information on complexity and not being a primary evaluative feature. Common symmetrical structures in street scenes include the symmetrical design of buildings and the mirrored layout of roads. However, these symmetrical elements often do not dominate the overall image, especially in scenes featuring complex natural landscapes. Street scene images frequently contain dynamic elements, such as pedestrians and vehicles, whose positions and shapes vary randomly, increasing the image’s asymmetry and complexity. Features like compression ratio, grayscale contrast, fractal dimension, and hierarchical complexity better capture the details and multi-scale information of images, thus more effectively reflecting the complexity of street scene images.
Although color complexity and symmetry contribute less to the model’s performance, they still marginally improve its accuracy. Therefore, these two features were retained in the current model. Additionally, the model developed in this study exhibits strong generalization capabilities. In future applications, when computational resources are limited or model simplification is necessary, omitting the color complexity and symmetry features could optimize computational efficiency without significantly impacting the model’s overall performance.
3.2. Practical Application of the SVM Classification Model
Through an in-depth analysis of 1800 coordinate points in the Xiaobailou and Wudadao Districts of Tianjin, we extracted six key features from 6844 street scene images: compression ratio, symmetry, color complexity, grayscale contrast, fractal dimension, and hierarchical complexity. These six features were input into a trained SVM classification model to predict image complexity perception.
The results indicated that images rated one point typically showed few street features and a high degree of homogeneity, with monotonous color composition and often large areas of blank space. In contrast, images rated three points demonstrated rich landscape features and diverse elements, densely distributed throughout the image.
Images rated with a score of two exhibit moderate complexity, typically containing a balanced amount of landscape features, falling between simplicity and high complexity. Buildings in these images often display some degree of detail but are neither overly dense nor highly intricate, usually showing a uniform and orderly distribution. The streetscape includes basic greenery elements, such as trees along the roadside, small greenbelts, or flowerbeds. Additionally, pedestrians and vehicles (e.g., cars, bicycles) appear in moderate numbers without causing a sense of overcrowding or chaos. Overall, these images maintain visual balance, offering enough detail to capture interest without overwhelming the viewer, resulting in a harmonious and moderately complex visual experience.
The visual complexity predictions for each streetscape image were mapped to their respective geographic coordinates. Each coordinate point had four different streetscape images, so the average complexity for each coordinate point was calculated. This approach effectively mitigates potential biases that could arise from single-direction images, ensuring a more comprehensive and objective assessment of the complexity of the street environment. As
Figure 8 shows, the visual complexity scores for the Xiaobailou and Wudadao Districts in Tianjin ranged from one to three, with colors varying from dark to light. Areas in the central and southwestern parts of the map appeared in lighter colors, which correspond to higher scores and greater complexity. These regions exhibit higher land development intensity, predominantly comprising residential and commercial areas. The variety of buildings (low-rise and high-rise residential structures as well as office towers), combined with well-maintained greenery, contributes to a richer and more diverse landscape. Additionally, the high pedestrian and vehicular traffic in these areas likely contributes to the increased complexity of the street scenes.
Conversely, areas with darker colors located along the edges of the map showed lower visual complexity. These areas generally have moderate land development intensity, shorter buildings, fewer detailed structures, and lower building density. For instance, the Wudadao Cultural Tourism District, located in the southwestern part of the map, primarily features low-rise buildings, including detached and row houses, with a floor area ratio mostly less than 1. These buildings, inspired by classical revival, romanticism, and eclectic styles popular in Europe and the United States, are also infused with traditional Chinese architectural elements, resulting in a more minimalist overall style [
63]. Additionally, the lower skyline of these low-rise buildings provides open vistas and comfortable living spaces, which may contribute to the lower complexity scores in image feature analysis.
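The per-coordinate averaging used for the map can be sketched as a simple group-and-mean step. The data layout, an iterable of `((lat, lon), score)` pairs, and the function name `mean_score_per_point` are assumptions for this illustration, and the coordinates and scores are made up.

```python
from collections import defaultdict


def mean_score_per_point(predictions):
    """Average the predicted class (1, 2 or 3) over all images at a point.

    `predictions` is an iterable of ((lat, lon), score) pairs; this layout
    is an assumption made for the sketch.
    """
    grouped = defaultdict(list)
    for point, score in predictions:
        grouped[point].append(score)
    return {point: sum(scores) / len(scores) for point, scores in grouped.items()}


# Made-up coordinates and scores, one image per viewing direction.
sample = [
    ((39.117, 117.215), 3), ((39.117, 117.215), 2),
    ((39.117, 117.215), 3), ((39.117, 117.215), 2),
    ((39.110, 117.205), 1), ((39.110, 117.205), 2),
]
averages = mean_score_per_point(sample)
```

Because the mean is taken over however many images exist at a point, the same function also handles locations where fewer than four directions are available.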
3.3. Distribution of Image Features in the Study Area
To better visualize the geographic distribution of different image feature data and to gain a detailed understanding of various dimensions of visual complexity across certain streets in Tianjin, we conducted a geographic mapping of these six features. The value at each latitude and longitude point represents the average score of street views in different directions at that location.
Among the features, the spatial distributions of compression ratio (
Figure 9a), hierarchical complexity (
Figure 9b), and fractal dimension (
Figure 10a) appear quite similar. The darker-colored regions on the maps are primarily concentrated in the northern and central parts. These areas mainly cover the Xiaobailou District, which is home to numerous historical buildings such as the former French Consulate, the former Chosun Bank, and the former Beiyang Commercial Bank. These buildings, mostly constructed in the 1930s, showcase the essence of modern Western architecture, including Baroque and Rococo styles [
64]. The Xiaobailou District is a vibrant commercial hub characterized by high foot traffic. The district’s elaborate storefronts and distinctive signage contribute to its unique visual and cultural landscape, resulting in high complexity scores for hierarchical complexity, compression ratio, and fractal dimension.
In contrast, the lighter-colored regions are mainly located in the lower and peripheral areas of the map, encompassing the Wudadao Cultural Tourism Area. This area features low-rise buildings, predominantly standalone and row villas, offering relatively open views. The lack of detailed landscape elements and the more uniform vegetation in these regions likely contribute to the lower complexity scores in compression ratio, hierarchical complexity, and fractal dimension.
In the analysis of grayscale contrast (
Figure 10b), scores varied widely, from 500 to 2500, with significant differences across areas. High-contrast points are scattered throughout the map but are particularly concentrated in certain areas of Liuzhou Road, Heilongjiang Road, and Binjiang Road. In contrast, areas with lower grayscale contrast are more evenly distributed across the map. This uniformity likely reflects consistent building materials and design or stable lighting conditions in these areas, leading to overall lower grayscale contrast.
The color complexity (
Figure 11a) mapping indicates that the central and southern areas of the Wudadao District display greater color complexity, which accurately reflects the current conditions. The architectural styles in these areas are highly diverse, with unique variations in the color combinations of buildings and other visual elements such as vegetation. The Wudadao District is a prominent tourist area in Tianjin, rich in cultural elements, with extensive use of color in architectural carvings, signage, decorations, and advertisements. These factors collectively contribute to the high scores in color complexity in this region.
In terms of symmetry (
Figure 11b) scores, locations with higher scores are predominantly concentrated in the Wudadao District. This may be attributed to the architectural style of the Wudadao District, which tends to feature lower-rise buildings and more uniform architecture on both sides of the streets. Dagu North Road also exhibits notable symmetry but scores poorly in color complexity and grayscale contrast. This may be related to the modern high-rise office buildings on either side of the road, which tend toward uniformity in design, resulting in a highly consistent visual experience on both sides of the street. Because the evaluation of symmetry relies primarily on the analysis of grayscale histograms, symmetry scores are naturally higher when the grayscale distribution on both sides of the street is similar.
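The histogram-based view of symmetry can be illustrated by comparing the gray-level histograms of the left and right halves of an image. The sketch below uses histogram intersection as the similarity measure; this measure, the bin count, and the function names are assumptions, since this section only states that symmetry relies on grayscale histograms.

```python
def histogram(values, bins=16):
    """Normalized histogram of gray levels in 0..255."""
    counts = [0] * bins
    for v in values:
        counts[min(v * bins // 256, bins - 1)] += 1
    total = len(values)
    return [c / total for c in counts]


def symmetry_score(pixels, bins=16):
    """Histogram intersection of left and right halves (1.0 = identical)."""
    width = len(pixels[0])
    if width < 2:
        raise ValueError("image must be at least two pixels wide")
    half = width // 2
    left = [v for row in pixels for v in row[:half]]
    right = [v for row in pixels for v in row[width - half:]]
    h_left, h_right = histogram(left, bins), histogram(right, bins)
    return sum(min(a, b) for a, b in zip(h_left, h_right))


mirrored = [[10, 200, 200, 10], [10, 200, 200, 10]]   # same tones on both sides
lopsided = [[0, 0, 255, 255], [0, 0, 255, 255]]       # dark left, bright right
```

Because histograms discard spatial position, two sides with similar tonal distributions score high even if their layouts differ, which is consistent with uniform high-rise frontages scoring high on symmetry.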
Visual complexity arises from the interplay of multiple visual features, with each feature contributing differently to the overall complexity. Taking the Wudadao District as an example, it performs well in color complexity and symmetry but is relatively average in compression ratio, grayscale contrast, hierarchical complexity, and fractal dimension. Because these four image features account for over 90% of the overall visual complexity, the Wudadao District receives a lower overall complexity score.
3.4. Complexity Scores and Typical Streets in the Study Area
3.4.1. Complexity Scores of Streetscapes
The streetscape of Tianjin is renowned for its diverse characteristics and the fusion of historical and modern elements. This study focused on the Xiaobailou District and the Wudadao District, encompassing 76 streets of various classifications, including major roads, secondary roads, and alleys. To ensure the reliability and validity of the research results, a rigorous selection process was employed. Streets with fewer than five sampling points or fewer than 20 street view images were excluded, as their shorter lengths and insufficient sample sizes would not provide adequate statistical representativeness and analytical value. After this selection process, a total of 57 streets were included in the study. The average complexity scores of these streets are shown in
Figure 12, while the distribution of complexity levels across different streets is depicted in
Figure 13.
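The exclusion rule described above can be written as a small filter. The dictionary layout, the function name `eligible_streets`, and the street names and counts are hypothetical; only the two thresholds come from the text.

```python
def eligible_streets(streets, min_points=5, min_images=20):
    """Keep streets with at least `min_points` points and `min_images` images.

    `streets` maps street name -> {"points": int, "images": int}; this
    layout is an assumption made for the sketch.
    """
    return [
        name for name, stats in streets.items()
        if stats["points"] >= min_points and stats["images"] >= min_images
    ]


# Made-up streets and counts for illustration.
sample_streets = {
    "Street A": {"points": 12, "images": 46},
    "Street B": {"points": 4, "images": 16},   # too few sampling points
    "Street C": {"points": 6, "images": 18},   # too few images
}
kept = eligible_streets(sample_streets)
```

A street is dropped as soon as it fails either criterion, which matches the "fewer than five sampling points or fewer than 20 street view images" wording.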
Through a detailed analysis of
Figure 12 and
Figure 13, we identified significant variations in visual complexity levels among the streets within the study area. As depicted in
Figure 12, the average complexity scores for most streets range between 1.6 and 2.4, suggesting that the streetscapes generally display a moderate level of visual complexity. Additionally,
Figure 13 shows that images with a complexity score of two are the most prevalent across the majority of streets. This observation corroborates the findings from
Figure 12, reinforcing the conclusion that the overall visual complexity of the streetscapes in the study area is moderate.
3.4.2. Moderate Complexity Streets
As illustrated in
Figure 12 and
Figure 13, the visual complexity of the streetscape in the Xiaobailou and Wudadao Districts predominantly remains at a moderate level. Shanxi Road, Qufu Road, and Hebei Road are typical examples of streets with moderate complexity within the study area. These streets not only represent the historic districts of Tianjin, reflecting the city’s rich historical and cultural characteristics, but also hold significant commercial value.
Figure 14 shows the basic distribution of street views, revealing distinct visual styles among the three roads. Shanxi Road and Hebei Road, being one-way streets, are relatively narrow, with predominantly residential buildings that are low-rise and inclined toward Western classical architectural styles. In contrast, Qufu Road, as a secondary arterial road, is wider and more accessible, with modern high-rise buildings. The color-coded annotations in the figure represent different complexity levels, with a central concentration of moderate complexity and fewer at the extremes. Most street views exhibit moderate complexity, contributing to a balanced, appealing urban environment.
3.4.3. High-Complexity Streets
Although most streets in the study area exhibit moderate visual complexity, certain streets significantly exceed this level. As shown in
Figure 13, more than 40% of the streetscape images on Yueyang Ave, Yantai Road, Xi’an Ave, Shashi Ave, Liuzhou Road, and Dalian Road have a complexity score of three.
Figure 12 and
Figure 13 indicate that Liuzhou Road has the highest average visual complexity, with over 70% of the images scoring three. As shown in
Figure 15, only three images scored one and six scored two, while the majority received the highest rating of three. Although some studies suggest that excessively complex built environments are less favored [
65], many scholars have noted that the relationship between complexity and preference is not straightforward. Seemingly chaotic environments might lack consistency, but highly consistent landscapes can still positively influence environmental preferences even if they are complex [
23,
66].
Liuzhou Road, as shown, features a rich array of visual elements and diverse landscape characteristics. The area is densely populated with commercial elements, resulting in a rich green layer and a well-defined skyline. The streets are lined with commercial advertisements and signs, creating visual contrasts and rhythms between distant high-rise buildings and nearby low-rise structures, which adds depth and complexity to the landscape. High pedestrian and vehicle traffic contributes to a rich visual layer with compact spatial usage within the district. Despite its high complexity, the street maintains a degree of consistency, helping to avoid visual confusion.
3.4.4. Low-Complexity Streets
Conversely, on Taierzhuang Road, Nanning Road, Jiefang North Road, Hejiang Road, and Datong Road (including Baoding Bridge), more than 40% of the street view images have a complexity score of one. These streets show relatively weak performance in terms of visual diversity and richness. For instance, Nanning Road (
Figure 16), with the lowest average visual complexity in the study area, has most of its street view images scoring one or two. The overall streetscape of Nanning Road is characterized by uniformity and lack of appeal. The buildings, primarily mid-to-low-rise, feature simple facades and minimal greenery, failing to meet the visual preferences of nearby residents, which may affect overall environmental satisfaction. Additionally, low-complexity streets do not provide sufficient sensory stimulation, potentially leading to a lack of psychological relaxation and enjoyment [
45]. Future planning and optimization are necessary to enhance the overall landscape effect and improve residents’ experience.
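The 40% criterion used to single out high- and low-complexity streets can be sketched as follows. The function names, the "moderate" and "mixed" fallback labels, and the sample scores are assumptions for illustration; the text only specifies the 40% share of images scored three (high) or one (low).

```python
from collections import Counter


def score_shares(scores):
    """Fraction of images at each complexity level (1, 2, 3)."""
    if not scores:
        raise ValueError("no scores given")
    counts = Counter(scores)
    return {level: counts.get(level, 0) / len(scores) for level in (1, 2, 3)}


def complexity_group(scores, threshold=0.40):
    """Label a street by which extreme score exceeds the threshold share."""
    shares = score_shares(scores)
    high, low = shares[3] > threshold, shares[1] > threshold
    if high and low:
        return "mixed"      # hypothetical label for the ambiguous case
    if high:
        return "high"
    if low:
        return "low"
    return "moderate"       # assumed default when neither extreme dominates


# Made-up score lists for three streets.
busy = [3] * 7 + [2] * 2 + [1]
plain = [1] * 5 + [2] * 5
typical = [2] * 6 + [1] * 2 + [3] * 2
```

Working with shares rather than averages keeps streets with a polarized mix of very simple and very complex views from being mistaken for uniformly moderate ones.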
4. Discussion
4.1. SVM Quantification Model
Visual complexity is a crucial dimension in theories such as urban quality assessment [
26], arousal theory [
21], and environmental preference theory [
67]. Traditional assessment methods rely heavily on subjective evaluations, which present several challenges. First, they involve high time costs, as evaluators must invest considerable effort in researching and analyzing subjects to establish standardized criteria. Second, the scope of evaluation is limited, with manual assessments typically confined to small-scale images and exhibiting poor generalizability. Lastly, there is a feedback delay issue, where rapidly changing environments can quickly render evaluations obsolete.
To address these problems, this study developed an objective and reliable visual complexity quantification model by learning the mapping between image features and human subjective perception. We employed the SVM as the core model and selected six key image features as inputs: compression ratio, symmetry, fractal dimension, color complexity, grayscale contrast, and hierarchical complexity. These features encompass both high-level semantic understanding of images and low-level feature computations, offering a comprehensive perspective on quantifying visual complexity in landscapes.
The SVM visual complexity quantification model demonstrates significant advantages, achieving an accuracy of 84.05%. It efficiently and accurately predicts visual complexity in large-scale streetscape data using a small sample for training. Compared to existing complexity quantification methods, this study innovatively introduces the concept of hierarchical complexity, specifically designed to quantify visual richness in landscapes. This approach effectively simulates human perceptual processes, bridging the gap between high-level semantic understanding and low-level image features, thus covering a broader range of dimensions in visual complexity.
Hierarchical complexity is computed from semantic segmentation based on the Cityscapes dataset, which labels streetscape images with 19 categories. Although real-world streetscapes feature various types of buildings and vegetation, using hierarchical complexity alone to directly reflect the diversity of these elements has certain limitations. This is primarily due to the reliance of hierarchical complexity on semantic segmentation techniques and the limited availability of mature computer vision models suitable for this task. Additionally, constraints in computational resources further increase the difficulty. However, other complexity features introduced in this study effectively compensate for these limitations. For instance, fractal dimension measures the complexity of shapes, color complexity captures the richness of hues, compression ratio indirectly reveals the density of information, and grayscale contrast captures detailed variations within the image. By integrating these features for analysis, we can achieve a more comprehensive assessment of the overall visual complexity of street scene images despite the inherent limitations of hierarchical complexity.
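To make the idea of a segmentation-based diversity measure concrete, the sketch below computes the Shannon entropy of class proportions in a label map. This is only one plausible proxy and is not presented as the study's definition of hierarchical complexity, which is not given in this section. The function name `label_entropy` and the tiny label maps are hypothetical; the class ids stand in for segmentation categories.

```python
import math
from collections import Counter


def label_entropy(label_map):
    """Shannon entropy (bits) of class proportions in a 2D map of class ids."""
    labels = [lab for row in label_map for lab in row]
    if not labels:
        raise ValueError("empty label map")
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


# Hypothetical label maps; integers stand in for segmentation class ids.
single_class = [[0, 0], [0, 0]]    # e.g. a view filled by one category
four_classes = [[0, 1], [2, 3]]    # four categories with equal area
```

A measure of this kind is bounded by the number of categories the segmentation model can distinguish, which illustrates why a fixed 19-class label set limits how much element diversity can be captured.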
Furthermore, this study focused on the Xiaobailou and Wudadao Districts in Tianjin to predict and analyze the visual complexity of streets in these regions while validating the effectiveness of the SVM quantification model. The results indicate that most streets in the study area exhibit moderate visual complexity. High- and low-complexity streetscapes show distinct visual differences, which can be broadly categorized into three aspects: pedestrian and vehicular traffic volume, the richness of architectural façade decorations, and the layering and arrangement of vegetation. Streets with high visual complexity are typically associated with higher development intensity, featuring abundant commercial facilities and diverse landscape structures, providing a richer and more dynamic visual experience. This finding aligns with existing knowledge. Conversely, streets with low visual complexity exhibit lower development levels, with relatively uniform visual effects and lack of appeal.
To further validate the effectiveness of the evaluation results in this study, we compared them with the street conditions reported in the “Tianjin Heping District 2023 Government Work Report” [
68]. We found that the visual complexity scores from our study are consistent with the actual street conditions. Streets highlighted in the report as needing improvement, such as Binjiang Road and Hebei Road, have an average visual complexity score slightly above two. This reflects issues with the streetscape, such as inconsistent styles of street-facing shops and outdated residential buildings, leading to high visual complexity but also visual disorder. Future efforts should focus on enhancing the development and updating of these streets, continually optimizing spatial layouts to further enhance their visual appeal.
4.2. Analysis of Streetscape Complexity Features
Visual complexity is a crucial explanatory factor influencing environmental preferences [
7]. It directly affects the visual quality assessment of streetscapes and, in turn, impacts residents’ satisfaction with their environment. This study identifies several factors that shape streetscapes’ visual complexity: compression ratio, grayscale contrast, hierarchical complexity, fractal dimension, color complexity, and symmetry. Notably, compression ratio, grayscale contrast, hierarchical complexity, and fractal dimension have particularly significant effects on street visual complexity. When designing street views, evaluating performance across these dimensions and adjusting image features accordingly can help strategically enhance or reduce visual complexity.
Specifically, compression ratio and grayscale contrast primarily affect image detail, which can be adjusted by modifying texture or fine structural elements. Hierarchical complexity is closely related to the layering and diversity of the environment, which can be altered by increasing or decreasing the layout of streetscape elements and plants. The fractal dimension is linked to the richness of natural elements, which can be adjusted by varying the distribution of trees and grass to enhance the natural aesthetics of the street view. Detailed analysis and adjustments based on these findings can ensure that street designs are visually attractive while harmoniously integrating into their environment.
4.3. Strategies for Optimizing Street Visuals
In the study area, most streetscapes exhibit a medium level of complexity, with some streetscapes showing high or low complexity. It is widely believed that environments with moderate complexity are aesthetically pleasing [
69,
70], as they provide a balanced amount of information and avoid monotony or information overload. The study reveals that there is an optimal cognitive load level when processing visual information. Overly simplified landscapes may lack appeal, often leading to feelings of boredom and depression [
19], while excessively complex landscapes can cause visual fatigue and lead to information overload [
21]. Medium-complexity environments achieve visual balance and harmony, stimulating brain activity without causing stress, which is more likely to induce pleasure. Alpak’s research further confirms that people exhibit a marked preference for urban built environments with moderate complexity [
67]. These environments not only resonate with residents’ visual aesthetics and environmental preferences but also have a positive impact on their psychological well-being. To further enhance the visual appeal and cultural atmosphere of streets with moderate complexity, future efforts could focus on regular maintenance of plantings and building facades, the introduction of vertical greenery, and the incorporation of public art installations, thereby elevating their overall aesthetic value.
High-complexity streetscapes are closely related to the volume of information they convey, presenting greater challenges for design. During the design process, it is essential to meticulously coordinate various elements of the environment and employ a layered design approach, which includes the organized arrangement of the foreground, middle ground, and background. By skillfully integrating architectural features and selecting appropriate paving materials, color schemes, and plants of varying heights, designers can maintain overall visual richness while avoiding visual clutter.
Low-complexity streetscapes often suffer from insufficient development, leading to minimal greenery and beautification, monotonous architectural styles, a lack of commercial facilities, and inadequate infrastructure, which can also affect foot traffic. Design strategies should focus on adding detail to increase visual complexity. For example, incorporating diverse commercial signage systems and integrated service facilities, as well as creating visual depth with varied plant configurations and decorative patterns on building facades, can attract visual interest and enhance the overall appeal of the environment.
In urban environment design, the visual complexity of streetscapes is a key factor affecting their attractiveness and harmony. Adjusting street designs to balance visual complexity can optimize the visual effects of streetscapes. This approach helps maintain street attractiveness while avoiding excessive complexity or monotony, thereby creating aesthetically pleasing and harmonious street environments.
4.4. Limitations and Future Directions
This study has some limitations. Although the proposed streetscape complexity quantification model predicts visual complexity with an accuracy of 84.05%, there remains some discrepancy compared to human subjective perception. This discrepancy may arise from the following reasons. (1) Visual complexity dimensions: human subjective perception of complexity encompasses a more diverse range of dimensions than the indicators proposed in this study, which may not fully cover all aspects. (2) Technological limitations: current computer vision technologies are not yet fully mature, and the accuracy of algorithms for quantifying complexity indicators needs improvement. For instance, the performance of image semantic segmentation, which is required for calculating hierarchical complexity, is inconsistent across different element types and scenes, affecting the precise calculation of complexity indicators. (3) Image conditions: despite preprocessing, variations in image capture conditions (such as lighting, exposure, and focal length) can affect calculation accuracy. For example, lighting changes can influence edge detection, and inadequate or excessive exposure can lead to increased noise or loss of detail.
Despite numerous studies confirming a close relationship between perceived complexity and theories of environmental quality assessment, environmental preferences, arousal theory, and restorative benefits, a unified definition of these relationships has not yet been established due to the challenges of quantifying complexity and technological limitations. This research aims to provide a more scientific quantitative approach for future studies, facilitating a deeper exploration of these theoretical connections. Additionally, differences in image features and human environmental preferences have been observed, suggesting that future research could further investigate this direction.
Moreover, the model training samples in this study were drawn primarily from the Xiaobailou and Wudadao Districts, so the model may be biased toward the features of these areas. The numerical distribution of streetscape image features may differ significantly across regions. Therefore, future studies that use this model to predict streetscape complexity in other areas should use local street view images as training samples to obtain more accurate predictions of perceived complexity.
5. Conclusions
This research aims to apply artificial intelligence technology to develop an objective quantification model for streetscape complexity perception, addressing limitations in traditional visual complexity assessments such as scope restrictions, data acquisition challenges, and measurement accuracy. The Xiaobailou and Wudadao Districts in Tianjin were selected as the study area. Streetscape images were collected using the Baidu Maps API, with a small subset of images used for model training. The trained model was then applied to evaluate the complexity perception of the remaining images. The main findings of the study are the following:
This study introduces an innovative dimension for measuring visual complexity—hierarchical complexity. This dimension serves as a tool designed specifically for assessing landscape visual complexity and bridges the gap between low-level semantic features in streetscape images and high-level human semantic cognition. Hierarchical complexity is based on high-level semantic information from images, allowing for a more comprehensive capture of visual complexity in streetscapes and reflecting the richness of streetscape elements. The DeepLabv3+ model was utilized for precise semantic segmentation of streetscape images, marking the smallest environment elements and creating corresponding nodes in the image’s semantic tree structure. By analyzing the inclusion relationships between these nodes, a hierarchical semantic tree was constructed, with the number of nodes in the hierarchy serving as an indicator of hierarchical complexity.
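As an illustration of the node-counting idea, the sketch below counts the distinct nodes of a semantic tree that are reached by the element labels found in one segmented image. It rests on several assumptions that are not taken from the paper: inclusion relationships are modeled by a fixed, hypothetical category taxonomy (`PARENT`), the root node is counted, and the labels are invented examples rather than the actual DeepLabv3+ output classes.

```python
# Hypothetical taxonomy: each node maps to its parent (None marks the root).
PARENT = {
    "scene": None,
    "nature": "scene",
    "built": "scene",
    "tree": "nature",
    "grass": "nature",
    "sky": "nature",
    "building": "built",
    "road": "built",
    "signboard": "building",
}

def hierarchical_complexity(labels, parent=PARENT):
    """Count distinct tree nodes covered by the labels and their ancestors."""
    nodes = set()
    for label in labels:
        node = label
        # Walk up to the root, stopping early at an already-visited ancestor.
        while node is not None and node not in nodes:
            nodes.add(node)
            node = parent[node]
    return len(nodes)

sparse_street = {"road"}
rich_street = {"tree", "grass", "signboard"}
```

Under this model, an image showing only a road covers three nodes (road, built, scene), while one with trees, grass, and signboards covers seven, so a richer mix of elements gives a higher hierarchical complexity.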
This study developed an SVM-based model for quantifying streetscape visual complexity, achieving an average accuracy of 84.05%. This model can automate the processing of large volumes of image data, significantly improving efficiency and reducing the influence of subjective factors, thereby ensuring consistency and objectivity in measurements. The SVM model incorporates six input features across multiple dimensions, encompassing both low-level image features and high-level semantic information. The contribution of these features, ranked from highest to lowest, is as follows: compression ratio, grayscale contrast, hierarchical complexity, fractal dimension, color complexity, and symmetry. Among these, compression ratio, grayscale contrast, hierarchical complexity, and fractal dimension are the key factors influencing visual complexity. This ranking makes it easier to identify and understand each feature's role in streetscape visual complexity. This, in turn, allows urban planners and environmental designers to tailor street designs and optimize visual complexity, thereby improving both the aesthetic appeal and functional quality of urban spaces.
The quantification model for streetscape visual complexity was applied to the Xiaobailou and Wudadao Districts in Tianjin. Detailed statistical and analytical evaluations were conducted for each street within the study area. Most streets exhibited moderate levels of visual complexity, with fewer streets falling into low- or high-complexity categories. High-complexity streetscapes typically featured diverse building types, rich vegetation, and high foot and vehicle traffic, such as in Yueyang Ave, Yantai Road, Xi’an Ave, Shashi Road, Liuzhou Road, and Dalian Road. Conversely, low-complexity street views often had uniform landscape elements and large blank areas, as seen in Taierzhuang Road, Nanning Road, Jiefang North Road, Hejiang Road, Datong Road, and Baoding Bridge. Based on the typical streets of different complexity levels, corresponding improvement recommendations have been proposed.
This study successfully developed a quantification model for urban streetscape visual complexity, offering a more scientific and precise tool for future research. It also emphasizes the importance of visual complexity in urban space design, offering a new perspective on balancing urban aesthetics and functionality. We hope that the development of this model will advance in-depth discussions in the fields of built environment and visual perception, providing urban planners and environmental designers with a robust data foundation. This will enable more precise assessment and adjustment of visual complexity in urban spaces, ultimately optimizing urban environments, enhancing residents’ quality of life, and elevating the aesthetic value of cities.