Analyzing Land Shape Typologies in South Korean Apartment Complexes Using Machine Learning and Deep Learning Techniques

Yoon, Sung-Bin; Hwang, Sung-Eun

doi:10.3390/buildings14061876


by Sung-Bin Yoon ¹ and Sung-Eun Hwang ²,*

¹ Department of Architecture, Graduate School, Seoul National University of Science and Technology, Seoul 01811, Republic of Korea
² Department of Architectural Engineering, Hyupsung University, Hwaseong 18330, Republic of Korea
* Author to whom correspondence should be addressed.

Buildings 2024, 14(6), 1876; https://doi.org/10.3390/buildings14061876

Submission received: 18 May 2024 / Revised: 8 June 2024 / Accepted: 17 June 2024 / Published: 20 June 2024

(This article belongs to the Special Issue Advanced Technologies for Urban and Architectural Design)


Abstract

In South Korea, the configuration of land parcels within apartment complexes plays a pivotal role in optimizing land use and facility placement. Given the significant impact of land shape on architectural and urban planning outcomes, its analysis is essential. However, studies on land shape have been limited due to the lack of definitive survey criteria. To address these challenges, this study utilized a map application programming interface (API) to gather raw data on apartment complex layouts in South Korea and processed these images using a Python-based image library. An initial analysis involved categorizing the data through K-means clustering. Each cluster’s average image was classified into four distinct groups for comparison with the existing literature. Shape indices were employed to analyze land configurations and assess consistency across classes. These classes were annotated on a parcel level using the Roboflow API, and YOLOv8s-cls was developed to classify the parcels effectively. The evaluation of this model involved calculating accuracy, precision, recall, and F1-score from a confusion matrix. The results show a strong correlation between the identified and established classes, with the YOLO model achieving an accuracy of 86% and demonstrating robust prediction capabilities across classes. This confirms the effective typification of land shapes in the studied apartment complexes. This study introduces a methodology for analyzing parcel shapes through machine learning and deep learning. It asserts that this approach transcends the confines of South Korean apartment complexes, extending its applicability to architectural and urban design planning on a global scale. Analyzing land shapes earmarked for construction enables the formulation of diverse design strategies for building placement and external space arrangement. This highlights the potential for innovative design approaches in architectural and urban planning worldwide.

Keywords: machine learning; deep learning; YOLOv8; image classification; apartment complex

1. Introduction

In South Korea, a parcel is defined as an independent land unit depicted on the cadastral map, an official document detailing the location, shape, and area of land parcels. Each is assigned a unique identifier, providing essential data for understanding ownership, usage conditions, purpose, and boundaries. Parcels are primarily used in real estate transactions, land use planning, and legal disputes. For land use planning, parcel information is crucial for assessing a land’s purpose and potential for development during urban and architectural planning [1]. Historically, regulations such as road diagonals dictated building shape and density according to road frontage dimensions. These regulations have since been abolished, increasing the influence of land shape on building form and density [2].

Parcel shapes in South Korea pose challenges due to the difficulty in logically distinguishing them and the ambiguity of survey standards, leading to research uncertainties. It is hypothesized that parcel shapes can be classified into patterns that combine geometric structure with the effective area, allowing for analysis through characteristic patterns by organizing geometric land information into structured sequences [3].

While newly developed or restructured areas typically feature regular-shaped parcels, irregular parcels remain common in older downtown districts. Previously, redevelopment and land readjustment efforts often disrupted and reorganized existing urban spatial structures into regular shapes. However, current trends favor gradual, restorative development, like urban regeneration, which preserves existing parcel shapes. Understanding the effects of parcel shapes on architectural forms and densities, particularly how irregular shapes influence building behaviors, is increasingly vital.

In South Korean apartment complexes, the parcel is integral to the design and development process, determining land use efficiency and optimal facility placement, as well as managing and preserving environmental elements such as green spaces. Parcels, which legally define land boundaries, are essential in obtaining building permits, transferring property, and conducting real estate transactions. The shape of the parcel significantly influences architectural and urban planning, affecting development conditions and necessitating a careful analysis of parcel typology to tailor development plans accordingly.

Despite its importance, research on parcel typology is limited, largely due to the complexity and resource-intensive nature of analyzing distinctly labeled data. To address these challenges, this study proposes using unsupervised learning techniques to efficiently organize large datasets into meaningful groups without prior labeling, followed by a detailed analysis and typification using supervised deep learning methods. This study lays the groundwork for typifying and analyzing unlabeled raw data in architecture and urban design. It presents a methodology employing machine learning for data classification and deep learning for analysis and verification. Additionally, it proposes that this approach extends beyond South Korean apartment complexes, holding promise for global application. The analysis of land shapes can inform diverse design strategies for building and external space layouts tailored to site configurations. This suggests the potential for evolving design strategies worldwide based on the landform analysis.

This study aims to classify parcel shapes in South Korean apartment complexes using machine learning, building upon previous studies to develop a deep learning model that accurately identifies and categorizes these shapes. This study departs from traditional methodologies that heavily relied on visual inspection for parcel analysis. Through the utilization of artificial intelligence, including machine learning and deep learning, this research mitigates the subjectivity of researchers, enabling an objective analysis. This approach sets it apart from previous studies, signifying a significant advancement in the field. The goal is to enable the detailed analysis of apartment complex designs based on parcel type, facilitating strategic planning for building arrangements and external spaces.

2. Related Work

2.1. Survey and Analysis of Parcel Shapes

2.1.1. Classification of Parcel Shapes

According to domestic guidelines, parcel shapes are classified into six categories: square, horizontal rectangle, vertical rectangle, trapezoid, irregular, and sack, with irregular parcels being difficult to categorize into a specific shape [4]. A square parcel closely resembles a square, a horizontal rectangle has its longer side along the road, a vertical rectangle has its shorter side along the road, a trapezoid mirrors the geometric shape, and a sack-shaped parcel is either narrow or shaped like an inverted triangle or trapezoid, with the point meeting the road.

Irregular parcels are defined as those with irregular or triangular shapes in which more than one-third of the area is deemed lost under the minimum bounding box standard; that is, the portion of the minimum bounding box occupied by the parcel (the effective area) is distinguished from the remainder (the lost area). Prior studies have used machine learning to typify such irregular parcels, allowing classification rules to be learned automatically from data rather than hand-coded [5,6]. These studies employed the K-means algorithm from scikit-learn, a widely used Python machine learning library, to analyze 500 sample parcels, categorizing them into six types of irregular shapes.

Unlike previous research that manually measured and categorized land into six shapes, this study uses K-means clustering to objectively classify 3000 apartment complex parcels, further verifying the classifications through deep learning, therefore marking a significant advancement over earlier efforts.

2.1.2. Analysis of Parcel Shapes

Numerous studies have examined the irregularity of parcels by utilizing indices that quantify this irregularity, predominantly employing the shape index (SI). The SI is calculated from the ratio of a parcel’s perimeter to the square root of its area, with higher values indicating greater irregularity because the perimeter lengthens disproportionately compared to the area [3,7]. Additionally, research has considered previously defined parcel types that incorporate the minimum bounding box, applying the standard index (STI). The STI measures parcel irregularity by dividing the parcel’s area by that of its minimum bounding box, producing values between 0 and 1. Values approaching 1 signify a regular shape, whereas values nearing 0 denote greater irregularity. Furthermore, the width–depth ratio (WR) of the minimum bounding box has been employed to assess parcel shapes, with lower values suggesting shapes closer to a square [8]. Another study proposed a Parcel Shape Index (PSI), which uses parameters such as perimeter, area, internal angles, and the number of external points to assign scores that differentiate parcels by shape. However, applying this classification system requires a complex computation process, which limits its efficiency [9].

2.2. Trends in Architecture Using AI

2.2.1. Machine Learning in Architecture and Urban Studies

Machine learning methodologies are broadly categorized into supervised and unsupervised learning. Supervised learning has been applied in architecture and urban planning to predict real estate indices [10], estimate building energy consumption [11], determine occupancy rates [12], and simulate land use changes [13]. Conversely, unsupervised learning involves techniques such as clustering, which groups similar entities together and is particularly useful in handling unlabeled data [14]. Clustering methods include K-means clustering, DBSCAN, hierarchical clustering, and spectral clustering. Among these, K-means clustering is favored for its simplicity, rapid execution, and robust performance [15]. It has been used to classify subjects such as apartment building shapes [16], disaster intensities [17], and land use categories [6].

Furthermore, research aimed at enhancing efficiency in architecture and urban management through machine learning includes studies on various approaches. These include investigations into detecting road potholes and their locations [18]; comparing the effectiveness of machine learning algorithms such as support vector machine (SVM), Maximum Likelihood (ML), and Random Trees (RT) for extracting impervious surfaces in residential complexes and illustrating their spatial distribution [19]; and developing a machine learning technique called Building Detection with Shadow Verification (BDSV) based on high-resolution satellite images to automatically detect buildings within urban areas [20]. These studies explore the applicability of machine learning in the field of architecture and urban planning and contribute to solving relevant problems.

2.2.2. Research in Urban and Architectural Fields Using Deep Learning

Studies in urban and architectural engineering increasingly leverage deep learning, reflecting a notable rise in research from 2010 to 2020. This surge in artificial intelligence, machine learning, and deep learning applications, particularly in structural engineering, utilizes algorithms such as artificial neural networks (ANNs), genetic algorithms (GAs), genetic programming (GP), and support vector machines (SVMs). While these studies generally lack a defined optimal dataset size, they typically involve dividing the dataset to train models. Performance metrics used include the correlation coefficient (R), mean squared error (MSE), and root mean squared error (RMSE) [21].

Additionally, deep learning has spurred innovative changes in evaluating the durability of building materials and the structural integrity of infrastructure. For example, research employing deep convolutional neural networks and particle swarm optimization to predict the compressive strength of cement-based materials exposed to sulfate in marine environments [22] and studies utilizing an ensemble of CNNs and data fusion techniques to assess corrosion and coating defects at coal handling and preparation facilities [23] have overcome the limitations of traditional assessment methods. These studies offer higher accuracy and efficiency, making significant contributions to advancements in the field of structural engineering.

In architectural planning and design, deep learning is extensively used to analyze two-dimensional images like floor plans, with convolutional neural networks (CNNs) demonstrating exceptional ability in this area. This capability is particularly advantageous for studies that convert architectural spaces into two- or three-dimensional grid points for further analysis. Research trends indicate that applications like YOLO have been instrumental in constructing spatial relationships within architectural spaces. These studies train YOLO models on house floor plan images to identify components such as rooms, doors, and windows, thereby forming spatial relationships depicted in bubble diagrams [24].

Following the recognition of floor plans through deep learning, extensive drawing information is interpreted into graph diagrams. Algorithms are developed to automatically extract information from these plans, transforming it into graph form. This process aids in creating datasets that facilitate the planning stages of architectural design, suggesting models that recommend spatial relationships [25]. Additionally, GAN-based methodologies are utilized to segment site areas, design layout plans, and propose furniture arrangements [26]. Bubble diagrams also serve as inputs for GANs to generate room masks, or nodes, within floor plans and to artificially create interior designs [27].

3. Materials and Methods

3.1. Apartment Complex Parcel Shape Dataset

3.1.1. Scope of Apartment Complexes

Regarding data sources, this study relies on information from the Public Housing Management Information System (K-apt) [28], which provides extensive details on multi-unit housing complexes with 100 or more households, as required by South Korean public housing regulations. These regulations mandate the disclosure of management fees and other critical data, including the address, completion year, number of households, buildings, and floors for over 18,000 complexes. For this study, we focused on apartment complexes with at least 300 households that are legally required to include facilities like senior citizen centers, playgrounds, daycare centers, and community facilities [29]. From this dataset, 13,264 apartment complexes meeting these criteria were selected as the study population for analyzing apartment complex layouts.

3.1.2. Parcel Shape Types

According to prior studies, parcel shapes are distinguished between regular and irregular forms (see Figure 1). Regular types are identified as square, horizontal rectangle, vertical rectangle, trapezoid, and sack. These classifications were initially based on visual inspections conducted by government agencies and were later validated through comparisons using indices such as the SI, STI, and WR. On the other hand, irregular parcel types, including Avocado, Potato, Corner, Bell, Stick, and L-shape, were categorized using K-means clustering on a dataset of 500 sample parcels. The optimal number of clusters (K) was established via the Elbow method, which takes the point where the graph shows a significant bend as the most suitable value. Following the determination of 20 clusters, the average shape image of each cluster was visually analyzed, and types were assigned, with subsequent validations through SI, STI, and WR comparisons [30].

3.2. Image Dataset

3.2.1. Image Acquisition

In the context of apartment layouts, KakaoMap (see Figure 2) [31], a mapping application programming interface (API) similar to Google Maps used in South Korea [32], facilitated the collection of various details such as complex boundaries, main building layouts, playgrounds, badminton courts, waterfront areas, roadways, sidewalks, and parking lots. The addresses of each apartment complex were input into KakaoMap, enabling the capture and preservation of images to secure map files for each complex. These maps were then refined by cropping out the boundaries of the apartment complexes as depicted on the map, thereby focusing solely on the layouts of each complex. This process utilized a database previously compiled by the same researcher; further details are available in Yoon, S. B. (2024) [33].

3.2.2. Image Processing

The map service marks the boundary corresponding to each address as a pink area, based on national cadastral maps. These boundaries were observed to deviate slightly from the actual site boundaries. Where the address-based area did not accurately delineate the complex boundary, the case was handled as an exception and the data were refined. As a result, layouts for 3000 apartment complexes were acquired. According to prior studies, these images must be normalized before K-means clustering, which involves converting each image to a standardized resolution of 100 × 100 pixels. For the subsequent deep learning models, however, the images were resized to a more detailed resolution of 640 × 640. Although a higher resolution increases memory usage in K-means clustering, preserving detail is important for clustering accuracy, and it also improves performance in deep learning models.

The image resizing step standardizes the data format by centering the parcel and setting the image boundary to the parcel’s bounding box, which is also needed to calculate the minimum bounding box in later analyses of parcel shapes. The parcel images, initially in vector format, were first converted into raster images made up of pixels. To normalize the image data, the pixel values, which ranged from 0 to 255, were divided by 255, yielding a NumPy matrix of values between 0 and 1. This normalization, a common preprocessing technique in image processing, produced 10,000 pixel values per parcel (100 × 100) and ultimately 3000 black-and-white images [5]. The Python Imaging Library (PIL) was used for this image processing.
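The normalization described above can be illustrated with a minimal sketch. It assumes the parcels have already been rasterized to 100 × 100 8-bit grayscale arrays; the function name `normalize_parcel_images` and the random demo data are hypothetical and are not taken from the study's code.

```python
import numpy as np

def normalize_parcel_images(images):
    """Scale 8-bit grayscale images to [0, 1] and flatten each to one row."""
    stack = np.asarray(images, dtype=np.float32)      # shape: (n, height, width)
    if stack.ndim != 3:
        raise ValueError("expected a stack of 2-D grayscale images")
    scaled = stack / 255.0                            # 0..255 -> 0.0..1.0
    return scaled.reshape(stack.shape[0], -1)         # shape: (n, height * width)

# Hypothetical stand-in for rasterized parcels: three random 100 x 100 images.
rng = np.random.default_rng(0)
demo_images = rng.integers(0, 256, size=(3, 100, 100), dtype=np.uint8)
features = normalize_parcel_images(demo_images)
print(features.shape)  # (3, 10000)
```

Each row of `features` then holds the 10,000 pixel values of one parcel, which is the layout a clustering routine such as K-means expects.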

Following the data analysis, the performance of several clustering methods—K-means clustering, hierarchical clustering, DBSCAN, and spectral clustering—was compared. K-means clustering emerged as the most effective method among the options evaluated due to its simplicity and efficiency. Leveraging the scikit-learn library, a widely used Python machine learning library, K-means clustering was implemented to categorize the parcels into K clusters. Average images for each cluster were then derived, which facilitated the classification of the parcels according to the types identified in previous research.

To develop a deep learning model capable of learning and classifying data by parcel type, each image was labeled according to its identified parcel type. For the classification, the YOLOv8 network model was employed, with original images annotated using the Roboflow API [34]. A total of 3000 apartment complex parcel images were categorized into four classes (Avocado, Potato, Trapezoid, and Stick), reflecting parcel types identified in previous research as well as findings from this study. To meet the input requirements of the pre-trained models, the images were resized to 640 × 640 and augmented to enlarge the dataset, which helps prevent overfitting. The dataset was divided into training, testing, and validation sets in a 7:2:1 ratio. These image preprocessing steps (see Figure 3) ensured the dataset was adequately prepared for subsequent deep learning tasks.
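As an illustration of the 7:2:1 division, the following sketch shuffles a list of image identifiers and splits it into training, testing, and validation subsets. The identifiers, the fixed seed, and the function name `split_dataset` are assumptions made for the example; the study itself prepared its dataset through Roboflow, and a real split would likely also be stratified by class.

```python
import random

def split_dataset(items, ratios=(7, 2, 1), seed=0):
    """Shuffle items and split them into train/test/validation lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    total = sum(ratios)
    n_train = len(shuffled) * ratios[0] // total
    n_test = len(shuffled) * ratios[1] // total
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    val = shuffled[n_train + n_test:]   # remainder goes to validation
    return train, test, val

image_ids = [f"parcel_{i:04d}" for i in range(3000)]  # hypothetical IDs
train, test, val = split_dataset(image_ids)
print(len(train), len(test), len(val))  # 2100 600 300
```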

3.3. Research Methods

This study aimed to typify and analyze the shapes of apartment complex parcels based on the methodology illustrated in Figure 4. In summary, the study examined the classification system and analytical methodology for parcel shapes, drawing on prior research. Raw data were processed to train machine learning and deep learning models. K-means clustering was employed to typify the shapes of apartment parcels, and a YOLO model was then trained on the typified shapes to analyze and validate them. The detailed analysis method is as follows.

3.3.1. Image Clustering

Image clustering involves grouping images into clusters by identifying inherent patterns or structures within the data without the use of pre-assigned labels. This process is intended to elucidate the structure of an image dataset or classify images based on specific characteristics. Several clustering methods are typically employed, including K-means clustering, hierarchical clustering, DBSCAN, and spectral clustering, each with its own set of strengths and weaknesses. K-means clustering is noted for its simplicity and efficiency, even with large datasets, although it requires the number of clusters to be predetermined. Hierarchical clustering generates a dendrogram presenting a hierarchical structure of clusters but is computationally intensive and inflexible once clusters are defined. DBSCAN, which does not presuppose the number of clusters or their shapes, is sensitive to parameter settings and may perform inconsistently across datasets with varying densities. Spectral clustering excels in identifying clusters of diverse shapes and is suitable for nonlinear data structures, yet it demands significant computational resources, and the choice of parameters critically influences its performance. In this study, after evaluating the advantages and disadvantages of these methods, K-means clustering was selected as the most fitting due to its straightforwardness and efficiency.

In this study, the Elbow method was employed to determine the optimal K value for K-means clustering, addressing one of the method’s inherent challenges—the difficulty in selecting the ideal number of clusters. This method utilizes the point at which the slope of the graph markedly changes, or “bends”, as an indicator of the appropriate K value [5]. However, pinpointing a definitive bending point proved challenging, leading to the selection of 20 as the most suitable K value after multiple trials. Through K-means clustering, 3000 images were effectively categorized into 20 clusters. Consistent with prior research, average images for each cluster were generated, facilitating the typification of regular and irregular parcel shapes and enabling the initial clustering of unlabeled image data. In the K-means clustering process, key parameters include the number of clusters (K), initialization method, maximum number of iterations, distance measurement method, and tolerance. The number of clusters (K) was optimized using the Elbow method, which assesses the rate of decrease in the cost function for each cluster number to determine the appropriate K value. For initialization, the K-means++ method was employed, which considers the distance between data points to achieve more stable results. Furthermore, the maximum number of iterations was set at 300, the Euclidean distance was utilized as the distance measurement method, and the tolerance was set at 0.0001 for clustering. These parameter settings were implemented using the features of the scikit-learn library. They were determined as the optimal experimental conditions through multiple experiments.
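The parameter settings listed above map directly onto scikit-learn's `KMeans` class, as the sketch below suggests. The random stand-in data, the `n_init` and `random_state` values, and the helper names are assumptions for illustration only; scikit-learn's `KMeans` always uses Euclidean distance, so no distance argument appears.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_parcels(features, k=20, seed=0):
    """Fit K-means with the settings reported in the text and return the model."""
    model = KMeans(
        n_clusters=k,
        init="k-means++",   # initialization named in the text
        max_iter=300,       # maximum iterations named in the text
        tol=1e-4,           # tolerance named in the text
        n_init=10,          # assumption: number of restarts is not stated
        random_state=seed,  # assumption: fixed seed for repeatability
    )
    model.fit(features)
    return model

def cluster_average_images(features, labels, k, side):
    """Average the member images of each cluster (empty clusters stay zero)."""
    averages = np.zeros((k, side, side))
    for c in range(k):
        members = features[labels == c]
        if len(members):
            averages[c] = members.mean(axis=0).reshape(side, side)
    return averages

# Hypothetical stand-in data: 300 random 10 x 10 "images" flattened to 100 values.
rng = np.random.default_rng(0)
features = rng.random((300, 100))

model = cluster_parcels(features, k=20)
averages = cluster_average_images(features, model.labels_, k=20, side=10)

# Elbow curve: inertia (within-cluster sum of squares) for several candidate K.
inertias = {k: cluster_parcels(features, k=k).inertia_ for k in (5, 10, 20, 30)}
```

Plotting `inertias` against K gives the Elbow curve; with real parcel images the `side` argument would be 100 rather than 10.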

3.3.2. Image Classifier

Image classification is commonly executed using pre-trained models, drawing on vast image databases like ImageNet [35], COCO dataset [36], and CIFAR [37]. ImageNet, for instance, features more than a thousand varied classes and houses over a million images, offering a rich resource that significantly enhances the generalization capabilities of models. Pre-trained models on this dataset include ResNet [38], VGGNet [39], EfficientNet [40], GoogleNet [41], MobileNet [42], and YOLO [43], each engineered with distinct attributes and applications. While ResNet and VGGNet are characterized by a high number of parameters, rendering them heavier and slower, other models like EfficientNet and GoogleNet suffer from high computational demands despite their sophistication. MobileNet, though advantageous for mobile and edge device applications, may struggle with more complex tasks. YOLO stands out as a One-Stage Detector that facilitates real-time object detection and classification, boasting ease of training and scalability across diverse datasets, which makes it particularly adaptable for modifications and customizations.

Given the vast volume of images and the need for rapid inference, the YOLO model was chosen for its efficiency in processing and its capability to analyze objects of varying sizes swiftly and effectively.

3.3.3. You Only Look Once (YOLO) Model

YOLO typically features a standard structure for object detectors that comprises two main components: the head and the backbone. The backbone is tasked with extracting feature maps from images, which are essential for classifying objects. The head, on the other hand, focuses on localizing objects by interpreting the feature maps produced via the backbone [44]. Originally, YOLO was designed primarily to identify objects and their locations within images. However, the latest iteration, YOLOv8, has seen significant enhancements in terms of its structure and architecture. These improvements make YOLOv8 more user-friendly, fast, and accurate, thus broadening its applicability to a range of tasks including object detection, image segmentation, and classification.

During the planning and execution phases of this study, the latest version available was YOLOv8, which incorporated the most advanced technology and had been verified for stability. While newer versions such as YOLOv9 and YOLOv10 have since been released, YOLOv8 was utilized for this research.

One of the standout features of YOLOv8 is its scalability. It is designed as a comprehensive framework that supports all previous versions of YOLO, making it easier for users to transition between different versions and compare performance effectively. Key advancements in YOLOv8 include a new backbone network, an anchor-free detection head, and a novel loss function (see Figure 5). These features enable YOLOv8 to execute efficiently on a variety of hardware platforms, from CPUs to GPUs, providing enhancements in speed and accuracy compared to its predecessors [45,46].

For the purposes of this study, the YOLOv8x-cls model trained on ImageNet—a repository that includes over 1000 classes—was considered. Among the available models, YOLOv8s-cls was selected. It is the second fastest model in the series and operates with relatively fewer parameters, making it well suited for the high-performance classification needs of this study, particularly in analyzing the diverse characteristics of parcel shapes in apartment complexes [45].
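A fine-tuning run of this kind might look like the following sketch, which assumes the `ultralytics` Python package and a dataset folder laid out with `train`, `val`, and `test` subfolders containing one folder per class. The function name, the folder layout, and the `epochs=50` default are placeholders, not values reported by the study.

```python
def train_parcel_classifier(data_dir, epochs=50, imgsz=640):
    """Fine-tune the pretrained YOLOv8s classification checkpoint on parcel images."""
    # Imported lazily so the sketch can be loaded without the package installed.
    from ultralytics import YOLO

    model = YOLO("yolov8s-cls.pt")                 # ImageNet-pretrained weights
    model.train(data=data_dir, epochs=epochs, imgsz=imgsz)
    metrics = model.val()                          # evaluates on the validation split
    return model, metrics
```

Calling `train_parcel_classifier("path/to/parcel_dataset")` would download the checkpoint on first use and start training; the path shown here is hypothetical.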

3.3.4. Analysis of Parcel Shapes

Following the classification of parcel shapes, the differences in related indices were analyzed. The indices were calculated using methods from previous research applied to the newly classified types. One such index is the SI, calculated as follows (Equation (1)): the perimeter length of the parcel is divided by the square root of the area, then multiplied by 0.25. A higher SI value indicates a greater irregularity in shape, as it suggests a longer perimeter relative to the area, thereby pointing to a higher degree of irregularity.

SI = 0.25 \times \left( \frac{\text{Perimeter of Parcel}}{\sqrt{\text{Area of Parcel}}} \right).

(1)

The STI is a measure of parcel irregularity, calculated as shown in Equation (2): the area of the parcel divided by the area of its minimum bounding box. This index ranges from 0 to 1, with values approaching 1 indicating more regular shapes, and those closer to 0 denoting more irregular shapes.

STI = \frac{\text{Area of Parcel}}{\text{Area of Minimum Bounding Box}}.

(2)

The WR is defined in Equation (3) as the aspect ratio of the minimum bounding box, calculated by dividing its width by its length (depth). This ratio approaches 1 as the parcel becomes more square-like and increases as the shape elongates.

WR = \frac{\text{Width of Minimum Bounding Box}}{\text{Length of Minimum Bounding Box}}.

(3)

To calculate SI, STI, and WR, the Python OpenCV library was used to compute the area and perimeter of the parcel in each image, and custom code was developed to determine the bounding box that encloses each parcel (see Figure 6). Based on these computations, average values of these indices for each type of parcel were derived and subsequently analyzed.
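To make Equations (1) to (3) concrete, the sketch below evaluates them for a polygon given as a list of vertices. It is not the study's OpenCV pipeline: it swaps in the shoelace formula and an axis-aligned bounding box in pure Python, and it assumes WR is taken as the longer box side over the shorter one so that the ratio is 1 for a square and grows with elongation. The function name `shape_indices` is hypothetical.

```python
import math

def shape_indices(vertices):
    """Return (SI, STI, WR) for a simple polygon given as [(x, y), ...]."""
    n = len(vertices)
    area2 = 0.0      # twice the signed area (shoelace formula)
    perimeter = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        area2 += x1 * y2 - x2 * y1
        perimeter += math.hypot(x2 - x1, y2 - y1)
    area = abs(area2) / 2.0

    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    box_w = max(xs) - min(xs)   # axis-aligned box (assumption, not a rotated box)
    box_h = max(ys) - min(ys)

    si = 0.25 * perimeter / math.sqrt(area)
    sti = area / (box_w * box_h)
    wr = max(box_w, box_h) / min(box_w, box_h)  # assumption: longer side over shorter
    return si, sti, wr

square = [(0, 0), (2, 0), (2, 2), (0, 2)]
triangle = [(0, 0), (4, 0), (0, 3)]
print(shape_indices(square))    # (1.0, 1.0, 1.0)
print(shape_indices(triangle))
```

The square returns 1 for all three indices, while the right triangle fills only half of its bounding box (STI = 0.5) and has a higher SI, which matches the interpretation of the indices given above.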

3.3.5. Evaluation Metrics Accuracy

The evaluation of the classification model’s performance utilized a confusion matrix, which is a tool that helps represent the number of data points correctly classified by the model in comparison to the actual categories. As illustrated in Figure 7, this matrix includes a TP in the first row and first column, indicating a true positive (TP) with the prediction of ‘1’ and an actual category of ‘1’; a TN in the second row and second column, representing a true negative (TN) with both the prediction and actual value of ‘0’; an FP in the first row and second column, indicating a false positive (FP) with the prediction of ‘1’ but an actual value of ‘0’; and an FN in the second row and first column, indicating a false negative (FN) with the prediction of ‘0’ but an actual value of ‘1’.

The confusion matrix is instrumental in calculating key performance metrics such as accuracy, sensitivity, precision, and F1-score.

Each of these metrics ranges between 0 and 1, with a precision and sensitivity of 0.8 or higher considered indicative of high performance. A small difference between these metrics suggests that the model does not have significant issues with bias or variance. An F1-score of 0.9 or higher is typically regarded as exemplary, denoting an excellent classification model [47]. Accuracy, defined as the ratio of correctly classified data to the total dataset, serves as a fundamental measure of the model’s overall performance.

Accuracy = \frac{TP + TN}{TP + TN + FP + FN}

(4)

In this study, precision is defined as the proportion of the model’s positive predictions that are correct, reflecting its reliability in making positive identifications. Meanwhile, recall (also known as sensitivity) measures the proportion of actual positives that the model correctly identifies. These metrics are crucial for assessing the model’s accuracy and completeness, respectively. The formulas for calculating these scores are given in Equations (5) and (6). The F1-score (Equation (7)), which is the harmonic mean of precision and recall, is particularly useful for evaluating model performance on unbalanced datasets. A high F1-score indicates robust model performance, balancing both precision and recall effectively.

\mathrm{Precision}\ (P) = \frac{TP}{TP + FP} \qquad (5)

\mathrm{Recall}\ (R) = \frac{TP}{TP + FN} \qquad (6)

\mathrm{F1\text{-}Score} = 2 \times \frac{P \times R}{P + R} \qquad (7)
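The metrics in Equations (4) to (7) can be sketched in a few lines of Python. This is a minimal illustration rather than the evaluation code used in the study: the function names and the toy counts are hypothetical, and the matrix layout is assumed to follow the binary example above (rows are predictions, columns are actual categories). For a multi-class matrix, each class is treated in turn as the positive class.

```python
from typing import Dict, List


def per_class_metrics(cm: List[List[int]]) -> List[Dict[str, float]]:
    """Precision, recall and F1 per class from a square confusion matrix.

    Assumed layout: cm[i][j] counts samples predicted as class i
    whose actual class is j (rows = predicted, columns = actual).
    """
    n = len(cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[k][j] for j in range(n)) - tp  # predicted k, actually another class
        fn = sum(cm[i][k] for i in range(n)) - tp  # actually k, predicted another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        results.append({"precision": precision, "recall": recall, "f1": f1})
    return results


def accuracy(cm: List[List[int]]) -> float:
    """Share of all samples that lie on the diagonal of the matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / total if total else 0.0


# Made-up binary counts for illustration: TP = 40, FP = 10, FN = 5, TN = 45.
toy_cm = [[40, 10],
          [5, 45]]
toy_metrics = per_class_metrics(toy_cm)
toy_accuracy = accuracy(toy_cm)
print(toy_accuracy, toy_metrics[0])
```

In this toy case the positive class has a precision of 0.80 and a recall of about 0.89, so the small gap between the two is visible directly in the F1-score.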

4. Results and Discussions

For this study, apartment complexes with at least 300 households were selected. Using a map API, the raw layout data of the complexes were retrieved and subsequently transformed into rasterized, black-and-white images. These images were then normalized and converted into NumPy arrays to prepare a dataset for classifying parcel shapes. The shapes were initially typified using K-means clustering and then categorized into four classes (Avocado, Potato, Trapezoid, and Stick) based on both previous research and findings from this study. Parcel shapes were labeled using the Roboflow API data annotation tool. Subsequently, a model for analyzing the apartment complex parcel dataset was developed based on the YOLOv8s-cls model, and the research findings are presented in the following subsections.

4.1. Parcel Shape Clustering

4.1.1. K-Means Clustering

The optimal K value for K-means clustering was examined using the Elbow method (see Figure 8), a technique that identifies the number of clusters beyond which adding another cluster no longer substantially improves the fit to the data. The resulting curve showed no clear bending point, prompting the exploration of several K values (20, 30, 50, and 100), with 20 ultimately being selected as the most appropriate for classifying parcel shapes.
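The idea behind the Elbow method can be illustrated with a small, self-contained sketch. It is not the code used in this study: it runs a basic K-means on hypothetical 2-D toy points rather than on flattened parcel images, and `kmeans_inertia` and `toy_points` are illustrative names. The within-cluster sum of squares (inertia) is computed for several K values, and the analyst looks for the K at which the curve flattens.

```python
import random
from typing import List, Sequence

Point = Sequence[float]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_inertia(points: List[Point], k: int, iters: int = 50, seed: int = 0) -> float:
    """Run a basic Lloyd-style K-means and return the within-cluster sum of squares."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        # Assignment step: attach every point to its nearest center.
        groups: List[List[Point]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its previous center).
        new_centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else centers[c]
            for c, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


# Toy stand-in data: three 2-D blobs (not the parcel images).
data_rng = random.Random(1)
toy_points = [
    (cx + data_rng.gauss(0, 0.5), cy + data_rng.gauss(0, 0.5))
    for cx, cy in [(0, 0), (5, 5), (0, 5)]
    for _ in range(30)
]

inertias = {k: kmeans_inertia(toy_points, k) for k in range(1, 7)}
for k, value in inertias.items():
    print(f"K={k}: inertia={value:.2f}")
```

On well-separated toy data the inertia usually drops sharply up to the true number of groups and then levels off. When no such bend appears, as reported here for the parcel images, candidate K values have to be compared by inspecting the resulting clusters.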

The analysis of the average images derived from each cluster highlighted that similar parcel shapes often differed based on their orientation, an observation attributed to pixel positioning, which varies based on image orientation. Table 1 summarizes the shapes and numbers (N) of parcels in each cluster. This detailed classification and the derived images aid in our understanding of parcel shape variability within the dataset.
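The orientation effect can be made concrete with a toy example. The sketch below is an illustration under simplifying assumptions (a hypothetical 5 × 5 binary raster and plain Euclidean distance on pixel values, which is the distance K-means minimizes). It shows that a shape and its 90° rotation are far apart in pixel space even though they are the same shape.

```python
import math
from typing import List

Raster = List[List[int]]


def rotate90(img: Raster) -> Raster:
    """Rotate a square raster by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def pixel_distance(a: Raster, b: Raster) -> float:
    """Euclidean distance between two rasters compared pixel by pixel."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))


# Hypothetical 5 x 5 raster of a horizontal, stick-like parcel.
horizontal_stick = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
vertical_stick = rotate90(horizontal_stick)

same_shape_distance = pixel_distance(horizontal_stick, horizontal_stick)
rotated_distance = pixel_distance(horizontal_stick, vertical_stick)
print(same_shape_distance, rotated_distance)
```

Because the two rasters overlap in only one pixel, their distance is large, which is consistent with identical shapes in different orientations landing in different clusters.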

4.1.2. Parcel Shape Classification and Analysis

The analysis of average images derived from K-means clustering revealed four distinct parcel types (see Table 2), refining the classification from previous research that distinguished between five regular and six irregular types. Observations indicate that, unless an area is newly developed or extensively restructured, parcels generally exhibit irregular shapes, with only a few closely resembling regular forms. Moreover, the categorization of irregular shapes in earlier studies showed some ambiguity; for instance, irregular types such as Avocado, Potato, Corner, Bell, Stick, and L-shape were identified, yet only a small number of parcels actually conformed to the Bell (48 parcels) and L-shape (13 parcels) classifications. The Bell type closely resembles the Potato type in shape, and the L-shape is predominantly elongated, akin to the Stick type. The Corner type, which shares a similar WR with the Avocado but is more asymmetric, can be reasonably grouped with the Avocado due to shared characteristics.

Consequently, this study streamlined the categorization of irregular parcel shapes from previous studies into three types and introduced a new type, close to regular, named Trapezoid.

The analysis of the characteristics of the four parcel shapes derived from K-means clustering was conducted by comparing the average values of SI, STI, and WR for each type, as presented in Table 3.

SI revealed that the Stick type exhibited the highest average SI at 1.321, indicating the greatest degree of irregularity due to more convolutions in its shape. This was followed by the Avocado type at 1.194, Potato at 1.129, and Trapezoid at 1.104. Higher SI values suggest more complex shapes with greater irregularities.

The STI compares the parcel area to the area of the minimum bounding box, with higher values closer to 1 indicating shapes that are more rectangular. The Trapezoid type displayed the highest average STI at 0.845, signifying its closer resemblance to regular shapes. This was followed by Potato at 0.801, Avocado at 0.729, and Stick at 0.687. Notably, the Stick type was identified in previous research as having a loss area greater than 30%, indicating significant irregularity. The Avocado type also showed notable irregularity with a 27.1% loss area.

The WR is a key index used to describe the aspect ratio of the minimum bounding box and serves as an indicator of shape elongation. A higher WR value suggests a more elongated shape, while a value closer to 1 indicates a more square-like configuration. In this analysis, the Stick type displayed the highest WR at 2.269, marking it as the most elongated among the types. This was followed by the Avocado type at 1.619, which is less elongated but still significantly non-square. The Trapezoid and Potato types showed WR values of 1.329 and 1.278, respectively, indicating that these types are bulkier and closer to square than the others.
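The three indices can be sketched for a single polygon as follows. This is a simplified illustration, not the computation used in the study. Two points are assumptions: the SI is written in the FRAGSTATS style, as perimeter divided by four times the square root of the area, which may differ from the exact definition given in the methods section; and an axis-aligned bounding box stands in for the minimum bounding box. The loss area is taken as 1 − STI, which agrees with the 27.1% quoted for the Avocado type (1 − 0.729). The L-shaped test polygon is hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vertex = Tuple[float, float]


def polygon_area(vertices: List[Vertex]) -> float:
    """Area of a simple polygon using the shoelace formula."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


def polygon_perimeter(vertices: List[Vertex]) -> float:
    """Sum of the edge lengths of the closed polygon."""
    n = len(vertices)
    return sum(math.dist(vertices[i], vertices[(i + 1) % n]) for i in range(n))


def shape_indices(vertices: List[Vertex]) -> Dict[str, float]:
    """SI, STI, WR and loss area for a non-degenerate polygon."""
    area = polygon_area(vertices)
    perimeter = polygon_perimeter(vertices)
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    # Assumption: axis-aligned box instead of the rotated minimum bounding box.
    width, height = max(xs) - min(xs), max(ys) - min(ys)
    si = perimeter / (4 * math.sqrt(area))          # assumed FRAGSTATS-style form
    sti = area / (width * height)                   # parcel area / bounding-box area
    wr = max(width, height) / min(width, height)    # elongation of the box
    return {"SI": si, "STI": sti, "WR": wr, "loss_area": 1 - sti}


# Hypothetical L-shaped parcel (units are arbitrary).
l_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 2), (0, 2)]
indices = shape_indices(l_shape)
print({name: round(value, 3) for name, value in indices.items()})
```

For this polygon the bounding box is 4 × 2, so the WR is 2 and the STI is 5/8. A square would give an SI, STI, and WR of exactly 1 under these definitions.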

The application of machine learning-based shape classification has demonstrated that the identified parcel shapes maintain a consistent relationship with established shape indices. The next step in this research is to validate these classified types through deep learning models specifically tailored to these four classes.

4.2. Parcel Shape Type Classifier

4.2.1. Evaluation Matrix

In a series of experiments utilizing the YOLOv8 model, an optimized model configuration was achieved, as depicted in Figure 9. The model was configured with approximately 0.5 million parameters, a balance that optimizes both efficiency and performance. The computational cost was quantified at 12.6 GFLOPs, and the learning rate was set at 0.01, paired with a momentum of 0.937 to ensure model convergence and stability. Additional settings such as a weight decay of 0.005 and a batch size of 16 were implemented to enhance model generalization performance and learning stability.

The performance of the developed classification model was systematically evaluated using a confusion matrix, visualized in Figure 9. The resulting metrics—accuracy, precision, recall, and F1-score—were computed based on the matrix data. The model achieved a commendable accuracy of 0.86, indicating a strong performance. Precision in this context quantifies the accuracy of positive predictions made by the model, while recall measures the model’s ability to identify all relevant instances correctly. The F1-score, being the harmonic mean of precision and recall, is a critical measure that reflects the balance between these two metrics. A high F1-score in this study suggests that the model performs well in both precision and recall, confirming its effectiveness in classifying parcel shapes based on the identified categories.

The performance metrics for each parcel type as classified by the YOLOv8 model demonstrate robust accuracy and a balanced evaluation between precision and recall, reflecting high performance across the board (see Table 4).

For the Avocado type, the model achieved a precision of 0.86, meaning that 86% of parcels classified as Avocado by the model were correct. The recall of 0.85 indicates that 85% of actual Avocado-type parcels present in the dataset were successfully identified by the model. The corresponding F1-score of 0.85 highlights a balanced performance between precision and recall, illustrating the model’s effectiveness in classifying this particular shape with high reliability.

For the Potato type, the precision was slightly higher at 0.87, suggesting that 87% of predictions made by the model for the Potato type were accurate. The recall also stood at 0.86, indicating that 86% of all Potato-type instances were correctly detected. An F1-score of 0.86 further indicates that the performance between precision and recall is evenly matched, showing no significant discrepancy and suggesting robust classification capabilities for this type.

The Trapezoid type showed equal precision and recall values of 0.87. This indicates that 87% of parcels predicted as Trapezoid were correctly identified, and 87% of all actual Trapezoid parcels were detected by the model. The F1-score of 0.87 underscores a highly balanced performance, reflecting the model’s capability to classify this regular-approaching shape with high accuracy.

Lastly, the Stick type exhibited a precision of 0.84 and a slightly higher recall of 0.89. This means that while 84% of parcels identified as Stick by the model were correct, the model was able to detect 89% of all actual Stick-type parcels, indicating a slightly greater sensitivity. The F1-score of 0.87 shows that despite the minor difference between precision and recall, the overall performance is still highly effective, demonstrating the model’s strength in identifying this more elongated parcel type.

Overall, these metrics not only reflect the individual strengths of the model in recognizing each parcel type but also emphasize the model’s overall efficacy in handling a variety of shapes with high precision and recall, confirming its suitability for complex classification tasks in urban planning and architecture.
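As a worked check of how the F1-score combines precision and recall, the per-class values reported in Table 4 can be fed back into the harmonic-mean formula. The sketch below uses the rounded two-decimal values from the table, so a recomputed score may differ from the reported one in the last digit (the reported scores were presumably derived from the raw counts). The variable names are illustrative.

```python
from typing import Dict, Tuple


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) per class, as reported in Table 4 (rounded values).
table4: Dict[str, Tuple[float, float]] = {
    "Avocado": (0.86, 0.85),
    "Potato": (0.87, 0.86),
    "Trapezoid": (0.87, 0.87),
    "Stick": (0.84, 0.89),
}

recomputed = {name: f1_score(p, r) for name, (p, r) in table4.items()}
macro_f1 = sum(recomputed.values()) / len(recomputed)  # unweighted mean over classes

for name, value in recomputed.items():
    print(f"{name}: F1 = {value:.3f}")
print(f"Macro-F1 = {macro_f1:.3f}")
```

The macro-F1 is the plain average of the four class scores, whereas the weighted-F1 in Table 4 additionally weights each class by its number of samples.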

4.2.2. Test Data Results

This study divided the dataset into training, validation, and test sets in a 7:2:1 ratio. The YOLOv8s-cls model was trained and optimized on the training and validation sets only, yielding the results detailed in Section 4.2.1. The derived confusion matrix and evaluation metrics confirmed that the model effectively classifies each class. To assess the reliability and generalization ability of the developed model, it was then applied to the test data, which had not been used in training.
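A 7:2:1 split of this kind can be sketched as follows. This is a minimal illustration with hypothetical names (`split_indices`, the fixed seed) that shuffles item indices at random. Whether the split in this study was random or stratified by class is not stated, so the random shuffle is an assumption.

```python
import random
from typing import List, Sequence, Tuple


def split_indices(
    n_items: int, ratios: Sequence[int] = (7, 2, 1), seed: int = 42
) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle item indices and cut them into train/validation/test parts."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    total = sum(ratios)
    n_train = n_items * ratios[0] // total
    n_val = n_items * ratios[1] // total
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# 3000 items is the dataset size reported in this study.
train_idx, val_idx, test_idx = split_indices(3000)
print(len(train_idx), len(val_idx), len(test_idx))
```

With 3000 items this yields 2100 training, 600 validation, and 300 test items, and the three index lists do not overlap.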

Of the 3000 items in the dataset, 300 (10%) formed the test set applied to the YOLOv8s-cls model, and the outcomes were visualized in a confusion matrix, as shown in Figure 10. Based on this, the accuracy, precision, recall, and F1-score for the test set were calculated (see Table 5). The accuracy achieved was 0.86, consistent with the results obtained from the training and validation sets. Compared with the training and validation results for each class, precision for the Potato type decreased slightly from 0.87 to 0.86, recall decreased from 0.85 to 0.84 for the Avocado type and from 0.89 to 0.87 for the Stick type, and the F1-score for the Stick type decreased from 0.87 to 0.85. These values are marginally lower than the training and validation results but remain at a comparable level.

The findings from the test set indicate that the model’s performance on new, unseen data is similar to the outcomes based on the training and validation data. Although some metrics for specific types were slightly lower, the differences were minimal, suggesting that the model has secured a strong capability to generalize. This demonstrates the model’s consistent performance and robustness.

5. Conclusions

5.1. Summary

This study aimed to classify parcel shapes in South Korean apartment complexes using machine learning techniques informed by prior research and to refine these classifications using a deep learning model. The initial data collection was performed using a map API, with the data being categorized into 20 clusters via K-means clustering. These clusters were subsequently refined into four classes based on comparisons with average shapes and previous typologies. The classes were then validated using SI, STI, and WR as analytical methods, assessing the irregularity and convexity of each shape.

For model evaluation, the YOLOv8 classification model was trained with these labeled data and assessed using precision, recall, and F1-score metrics, which collectively demonstrated high accuracy and predictive performance. Despite this performance, the possibility of errors in classifying land shapes must be acknowledged. Some shapes may not clearly belong to any of the predefined classes, in which case the model simply predicts the class with the highest probability. For such shapes, membership in more than one class should be considered. To address these challenges, enhancing the diversity of training data, optimizing parameters, utilizing higher-performance YOLO models, and applying data augmentation techniques are crucial for improving model performance. Furthermore, analyzing the limitations of the dataset and their impact on model performance is essential for exploring new research directions to overcome the model’s constraints.

K-means clustering also presents limitations, notably the requirement to predefine the K value. Determining an optimal K value can be challenging without a clear inflection point, as observed in this study, potentially leading to time-consuming trials. Nevertheless, clustering unlabeled data computationally continues to offer significant time and resource efficiencies, particularly when human analysis is impractical. Interpreting the results from an architectural and urban planning perspective still requires professional expertise, so the final shape classification remains the researcher’s responsibility.

Contrary to previous studies that categorized parcel shapes into eleven types, this study simplified the classification to four types. Although this reduction may appear overly simplistic, it is justified by various experiments. For example, training the deep learning model initially with 20 clusters without expert input resulted in low accuracy. Subsequent re-clustering and further analysis failed to improve performance, indicating that a more streamlined classification could be more effective.

This finding underscores the necessity of expert insight and analysis. Moreover, models trained with classifications identical to those from previous studies exhibited low performance, likely due to differences in the study subjects. This highlights the need for new research and analyses tailored specifically to apartment complexes, as successfully demonstrated in this study where optimal experimental conditions led to a streamlined classification of parcel shapes into four distinct classes.

This study is noteworthy for acquiring extensive data on parcel shapes in South Korean apartment complexes and for developing a novel methodology that utilizes machine learning to classify unlabeled raw data and deep learning to validate these classifications. The use of machine and deep learning in analyzing parcel shapes enhances time efficiency and objectivity during the planning and development stages of apartment complexes, providing a data-driven approach to architectural and urban planning.

Future research should explore different apartment complex types based on these parcel classifications to develop strategic plans for the arrangement of main buildings and external spaces. Such studies could significantly advance strategic planning for apartment complexes.

5.2. Limitations

The scope of this study was limited to K-means clustering for analyzing parcel types and did not include a comparative analysis with other machine learning methodologies. Likewise, the deep learning analysis was conducted solely with the YOLOv8s-cls model, without comparison to other models. Nevertheless, prior research has reported that the two methodologies selected in this study outperform other methods. DBSCAN and spectral clustering suffer from performance degradation in datasets with uneven density due to their sensitive parameter settings, and hierarchical clustering faces high computational costs and difficulty in modifying established clusters. This supports the view that K-means clustering, with its simplicity and efficiency, is more suitable for handling large volumes of data like those used in this research. Additionally, while there are various deep learning models such as ResNet, VGGNet, and EfficientNet, most require a large number of parameters and have relatively slow processing speeds, and their complex structures complicate the analysis of large datasets. In contrast, the YOLO model can be modified and customized as needed and, with fewer parameters, provides faster inference speeds, making it well suited to efficiently analyzing objects of various sizes. Despite these limitations, this research marks a significant step forward in demonstrating how architectural and urban planning can benefit from advanced analytical methods that utilize unlabeled raw data.

Future studies will utilize the dataset from this research to conduct comparative analyses using upgraded versions such as YOLOv9 and YOLOv10, as well as evaluating other non-YOLO deep learning models. While the YOLOv8s-cls model proposed in this study demonstrated excellent performance under basic noise conditions, its robustness under specific conditions has not yet been fully verified. Therefore, future research will involve artificially introducing various types of noise, such as Gaussian and salt-and-pepper noise, to assess the model’s response. Additionally, data augmentation techniques including geometric transformations, color modifications, and random cropping will be employed to enhance the robustness of the data, enabling performance evaluation across different noise levels. These efforts are expected to contribute to the development of models optimized for parcel shape classification.
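The two noise types named above can be sketched on a grayscale raster with pixel values in [0, 1]. This is a hypothetical illustration of how such robustness tests might be set up, not a procedure from this study. The function names, the noise levels (`sigma`, `amount`), and the toy image are assumptions.

```python
import random
from typing import List

Image = List[List[float]]


def add_gaussian_noise(img: Image, sigma: float = 0.1, seed: int = 0) -> Image:
    """Add zero-mean Gaussian noise to every pixel and clip to [0, 1]."""
    rng = random.Random(seed)
    return [[min(1.0, max(0.0, px + rng.gauss(0.0, sigma))) for px in row]
            for row in img]


def add_salt_and_pepper(img: Image, amount: float = 0.05, seed: int = 0) -> Image:
    """Set a fraction `amount` of pixels to 0 (pepper) or 1 (salt) at random."""
    rng = random.Random(seed)
    noisy = []
    for row in img:
        new_row = []
        for px in row:
            u = rng.random()
            if u < amount / 2:
                new_row.append(0.0)   # pepper
            elif u < amount:
                new_row.append(1.0)   # salt
            else:
                new_row.append(px)    # pixel left unchanged
        noisy.append(new_row)
    return noisy


# Toy 8 x 8 black-and-white parcel mask: a filled rectangle on a dark background.
toy_image: Image = [[1.0 if 2 <= r <= 5 and 1 <= c <= 6 else 0.0 for c in range(8)]
                    for r in range(8)]

gaussian_version = add_gaussian_noise(toy_image, sigma=0.2)
salt_pepper_version = add_salt_and_pepper(toy_image, amount=0.1)
```

Evaluating a trained classifier on such perturbed copies at increasing `sigma` or `amount` would give a simple robustness curve for each noise type.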

Moreover, this approach will enable a comprehensive survey and analysis of the entire dataset of apartment complexes in Korea. By doing so, cross-sectional and regression analyses based on various indicators such as construction periods and regional differences will facilitate an in-depth analysis of trends and impact relationships in Korean apartment complexes.

Author Contributions

Conceptualization, S.-B.Y.; Methodology, S.-B.Y.; Software, S.-B.Y.; Validation, S.-E.H.; Investigation, S.-B.Y.; Resources, S.-B.Y.; Data curation, S.-E.H.; Writing—original draft, S.-B.Y.; Writing—review & editing, S.-E.H.; Visualization, S.-B.Y.; Supervision, S.-E.H.; Project administration, S.-E.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. RS-2022-0016605712482147660003) and the Hyupsung University Research Grant of 2024.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflict of interest.

References

Lee, I.; Lim, S.; Kim, C. Analysis of the effect of parcel shape on the development density. J. Urban Des. Inst. Korea 2009, 10, 151–162.
Ko, Y. A Comparative Analysis of Floor Area Ratio According to the Standardization of Small-Size Irregular Parcel. Master’s Thesis, Seoul National University of Science and Technology, Seoul, Republic of Korea, 2016.
Choi, J.; Lee, S. A study on the method of land shape recognition using the geometric feature information of parcel and the technique of decision tree. J. Real Estate Anal. 2017, 3, 19–34.
MOLIT. Guidelines for Investigation and Calculation of Individual Official Land Price Applied in 2023; MOLIT: Sejong, Republic of Korea, 2023; pp. 105–107.
Park, H. Self-Study Machine Learning and Deep Learning; Hanbit Media: Seoul, Republic of Korea, 2020.
Hong, I. Land use classification using LBSN (Location-Based Social Network) data and machine learning technique. J. Korean Cartogr. Assoc. 2017, 17, 59–67.
McGarigal, K.; Marks, B.J. FRAGSTATS: Spatial Pattern Analysis Program for Quantifying Landscape Structure; USDA Forest Service General Technical Report PNW-351, Corvallis; U.S. Department of Agriculture, Forest Service, Pacific Northwest Research Station: Portland, OR, USA, 1995.
Choi, J.; Sim, J.; Kim, B. A Study on Improvement of Land Characteristic Survey; Research Institute of REB: Daegu, Republic of Korea, 2015.
Demetriou, D.; See, L.; Stillwell, J. A Parcel Shape Index for Use in Land Consolidation Planning. Trans. GIS 2013, 17, 861–882.
Lee, J.; Park, S.; Cho, S.; Kim, J. Comparison of Models to Forecast Real Estates Index Introducing Machine Learning. J. Archit. Inst. Korea 2021, 37, 191–199.
Yoo, D.; Kim, K.; Choi, C.; Cho, S.; Jang, H. Development of a Machine Learning-based Low-rise Residential Building Energy Consumption Prediction Model. J. KIAEBS 2021, 15, 152–165.
Shin, D. Discovering Anomalous Power Usage Patterns in Rental Housing Through Small-Scale Data. J. Archit. Inst. Korea 2024, 40, 81–88.
Yun, S.B.; Mun, S.; Park, S.Y.; Kim, T. Data-driven analysis for future land-use change prediction: Case study on Seoul. J. Broadcast Eng. 2021, 25, 176–184.
Min, M. Classification of Seoul metro stations based on boarding/alighting patterns using machine learning clustering. J. Inst. Internet Broadcast. Commun. 2018, 18, 13–18.
Kang, M.; Jung, Y.; Jang, D. A study on the search of optimal aquaculture farm condition based on machine learning. J. Inst. Internet Broadcast. Commun. 2017, 17, 135–140.
Han, S.; Seo, J.; Purwaningati, S.; Oh, J.; Kim, J. Application and development of a machine learning based model for identification of apartment building types. J. Korean Assoc. Geogr. Inf. Stud. 2023, 26, 55–67.
Lee, S.; Baek, S.; Lee, J.; Kim, K.; Kim, S.; Kim, H. Development of disaster severity classification model using machine learning technique. J. Korea Water Resour. Assoc. 2023, 56, 261–272.
Talha, S.A.; Manasreh, D.; Nazzal, M.D. The Use of Lidar and Artificial Intelligence Algorithms for Detection and Size Estimation of Potholes. Buildings 2024, 14, 1078.
Sobieraj, J.; Fernández, M.; Metelski, D. A Comparison of Different Machine Learning Algorithms in the Classification of Impervious Surfaces: Case Study of the Housing Estate Fort Bema in Warsaw (Poland). Buildings 2022, 12, 2115.
Ghandour, A.J.; Jezzini, A.A. Autonomous Building Detection Using Edge Properties and Image Color Invariants. Buildings 2018, 8, 65.
Tapeh, A.T.G.; Naser, M.Z. Artificial Intelligence, Machine Learning, and Deep Learning in Structural Engineering: A Scientometrics Review of Trends and Best Practices. Arch. Computat. Methods Eng. 2023, 30, 115–159.
Yu, Y.; Zhang, C.; Xie, X.; Yousefi, A.M.; Zhang, G.; Li, J.; Samali, B. Compressive strength evaluation of cement-based materials in sulphate environment using optimized deep learning technology. Dev. Built Environ. 2023, 16, 100298.
Yu, Y.; Hoshyar, A.N.; Samali, B.; Zhang, G.; Rashidi, M.; Mohammadi, M. Corrosion and coating defect assessment of coal handling and preparation plants (CHPP) using an ensemble of deep convolutional neural networks and decision-level data fusion. Neural Comput. Appl. 2023, 35, 18697–18718.
Park, H.J. A Study on Architectural Spatial Relationship Interpretation and Recommendation Model Using Deep Learning. Ph.D. Thesis, Kyungpook National University, Daegu, Republic of Korea, 2022.
Choo, S.Y.; Seo, J.H.; Park, H.J.; Ku, H.M.; Lee, J.K.; Kim, K.T.; Park, S.H.; Kim, J.S.; Song, J.Y.; Lee, S.H.; et al. AI-Based Architectural Design Automation Technology Development; Korea Agency for Infrastructure Technology Advancement: Seoul, Republic of Korea, 2020.
Chaillou, S. ‘Archigan: A Generative Stack for Apartment Building Design’; NVIDIA Corporation: Santa Clara, CA, USA, 2019.
Nauata, N.; Chang, K.H.; Cheng, C.Y.; Mori, G.; Furukawa, Y. ‘House-gan: Relational generative adversarial networks for graph-constrained house layout generation’. In Proceedings of the European Conference on Computer Vision (ECCV), Glasgow, UK, 23–28 August 2020; pp. 162–177.
K-apt. Available online: http://www.k-apt.go.kr/cmmn/main.do (accessed on 23 October 2023).
Park, I.S.; Park, N.H.; Chun, H.S. Changes in Apartment Unit Plan Caused by the Revision of Regulations for Area Calculating Criteria and Balcony Use. J. Korean Hous. Assoc. 2014, 25, 27–36.
Hong, S.J.; Lee, Y.S. Typification of Irregular Shaped Land Parcels Using Machine Learning. J. Archit. Inst. Korea 2022, 38, 59–67.
KakaoMap. Available online: https://map.kakao.com/ (accessed on 23 October 2023).
Google Map. Available online: https://www.google.com/maps/?hl=ko (accessed on 23 October 2023).
Yoon, S.-B.; Hwang, S.-E.; Kang, B.S.; Lee, J.H. An analysis of South Korean apartment complex types by period using deep learning. Buildings 2024, 14, 776.
Roboflow. Available online: https://roboflow.com/ (accessed on 20 December 2023).
Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.; Bernstein, M.; et al. ImageNet Large Scale Visual Recognition Challenge. Int. J. Comput. Vis. 2015, 115, 211–252.
Lin, T.Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common Objects in Context. In Proceedings of the 2014 European Conference on Computer Vision (ECCV), Zurich, Switzerland, 6–12 September 2014.
Krizhevsky, A. Learning Multiple Layers of Features from Tiny Images; Technical Report; University of Toronto: Toronto, ON, Canada, 2009.
He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. In Proceedings of the 3rd International Conference on Learning Representations, San Diego, CA, USA, 7–9 May 2015; pp. 1–14.
Tan, M.; Le, Q. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. In Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; pp. 6105–6114.
Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going Deeper with Convolutions. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015.
Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv 2017, arXiv:1704.04861.
Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv 2020, arXiv:2004.10934.
Myung, H.J.; Song, J.W. Deep Learning-based Poultry Object Detection Algorithm. J. Digit. Contents Soc. 2022, 23, 1323–1330.
Ultralytics. Available online: https://github.com/ultralytics/ultralytics (accessed on 21 December 2023).
Medium. Available online: https://sidharkal.medium.com/image-classification-with-yolov8-40a14fe8e4bc (accessed on 21 December 2023).
Lee, J.K.; Son, Y.H. Development of Image Classification Model for Urban Park User Activity Using Deep Learning of Social Media Photo Posts. J. Korean Inst. Landsc. Archit. 2022, 50, 42–57.

Figure 1. Parcel shape types derived from previous research.

Figure 2. Apartment layout data using maps.

Figure 3. Example of data labeling by parcel shape type in apartment complexes.

Figure 4. Workflow of the proposed methodology: (a) Reviewing parcel shape classification and analysis methodologies based on prior studies. (b) Initial classification of parcel shapes within an apartment complex using K-means clustering, followed by type verification through YOLO classification, after processing the raw data. (c) Analyzing irregular shapes according to the parcel shape analysis methodology and evaluating the performance of the YOLO model.

Figure 5. YOLOv8 architecture.

Figure 6. Data detection example for parcel shape analysis.

Figure 7. Example of a confusion matrix.

Figure 8. Results of the Elbow method for determining the optimal K value.

Figure 9. Apartment complex parcel type model learning results: (a) confusion matrix; (b) training graphs.

Figure 10. Test data results of apartment complex parcel type.

Table 1. Parcel shape clustering results for apartment complexes.

Cluster	N
Cluster 0	135
Cluster 1	134
Cluster 2	128
Cluster 3	191
Cluster 4	177
Cluster 5	160
Cluster 6	197
Cluster 7	136
Cluster 8	171
Cluster 9	187
Cluster 10	158
Cluster 11	148
Cluster 12	97
Cluster 13	68
Cluster 14	129
Cluster 15	220
Cluster 16	146
Cluster 17	153
Cluster 18	79
Cluster 19	186

(The clustering result images and center images for each cluster are not reproduced here.)

Table 2. Parcel shapes classification for apartment complexes.

Type	Clusters
Avocado (N = 900)	Cluster 5	Cluster 11	Cluster 12	Cluster 14	Cluster 15	Cluster 16
Potato (N = 678)	Cluster 1	Cluster 8	Cluster 9	Cluster 19
Trapezoid (N = 876)	Cluster 3	Cluster 4	Cluster 6	Cluster 10	Cluster 17
Stick (N = 546)	Cluster 0	Cluster 2	Cluster 7	Cluster 13	Cluster 18

Table 3. Analyzing parcel shape types for apartment complexes.

Type	Clusters	SI	STI	WR
Avocado	Cluster 5	1.222	0.723	1.827
	Cluster 11	1.157	0.705	1.316
	Cluster 12	1.153	0.737	1.306
	Cluster 14	1.206	0.727	1.663
	Cluster 15	1.172	0.779	1.712
	Cluster 16	1.251	0.703	1.894
Average		1.194	0.729	1.619
Potato	Cluster 1	1.129	0.752	1.159
	Cluster 8	1.126	0.826	1.355
	Cluster 9	1.138	0.803	1.404
	Cluster 19	1.125	0.824	1.193
Average		1.129	0.801	1.278
Trapezoid	Cluster 3	1.099	0.837	1.265
	Cluster 4	1.148	0.813	1.593
	Cluster 6	1.078	0.874	1.245
	Cluster 10	1.067	0.886	1.090
	Cluster 17	1.126	0.816	1.451
Average		1.104	0.845	1.329
Stick	Cluster 0	1.262	0.708	1.929
	Cluster 2	1.358	0.668	2.425
	Cluster 7	1.306	0.714	2.340
	Cluster 13	1.274	0.681	1.958
	Cluster 18	1.405	0.662	2.692
Average		1.321	0.687	2.269
Total		1.187	0.766	1.624

Table 4. Evaluation of the parcel shape classification model for apartment complexes.

Class	Precision	Recall	F1-Score
Avocado	0.86	0.85	0.85
Potato	0.87	0.86	0.86
Trapezoid	0.87	0.87	0.87
Stick	0.84	0.89	0.87
Accuracy	0.86
Macro-F1	0.86
Weighted-F1	0.86

Parameters (millions): 0.5; GFLOPs: 12.6; Learning rate: 0.01; Momentum: 0.937; Weight decay: 0.005; Batch size: 16; Epochs: 40.

Table 5. Evaluation of test dataset of the parcel shape classification model.

Class	Precision	Recall	F1-Score
Avocado	0.86	0.84	0.85
Potato	0.86	0.86	0.86
Trapezoid	0.87	0.87	0.87
Stick	0.84	0.87	0.85
Accuracy	0.86
Macro-F1	0.85
Weighted-F1	0.86

Parameters (millions): 0.5; GFLOPs: 12.6; Learning rate: 0.01; Momentum: 0.937; Weight decay: 0.005; Batch size: 16; Epochs: 40.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
