1. Introduction
In the architectural and interior design fields, referring to previous design works to solve new problems is a common approach [
1,
2]. In particular, reference images are useful visual materials for communication during the design process because the general layperson (e.g., the client) prefers images that represent the design results [
3]. Reference data must be appropriately interpreted, structured, classified, and organized to be reused efficiently [
4]. Reference-sharing platforms generally provide reference images with various kinds of information about, for example, the designer, site, cost, and materials. Several types of information, such as the cost, designer, site location, and area, are static and do not need interpretation. By contrast, other types of information, such as the design style, are qualitative and change according to the different criteria or individuals.
In general, a style is derived from common visual features that repeatedly appear in multiple design instances [
5]. Many researchers have defined distinctive styles by identifying the common features of each type of interior design. However, a design work can be understood in various styles. This is because a design style is associated with the culture, time, region, philosophy, or individuals [
6]. Moreover, the definitions of styles are expressed in text or images. Thus, recognizing the style of a given design is a matter of interpretation. In practice, people often match reference images to design keywords on their own [
7]. This behavior can cause confusion when trying to manage and use design reference images with style information.
To overcome this problem, a deep learning-based image recognition method was adopted to determine the interior design style for reference images. Deep learning and convolutional neural networks (CNNs) are data-driven approaches for the independent identification of significant patterns of given images [
8]. Deep learning and CNNs have demonstrated that computers can distinguish general objects (e.g., dogs or cats) and domain-specific subjects (e.g., faces and room usage) [
9]. They can also be applied to qualitative problems, such as the Go game, in which the best rather than the correct answer is determined [
10]. Rather than defining interior design styles or establishing classification criteria, this study proposes a data-driven method that uses a CNN to infer the design styles of interior design reference images. It also describes an application that uses the resulting style recognition model to make design reference images more efficient to use.
2. Literature Review
2.1. Usage of References in Architectural and Interior Design
Previous design projects have generally been used as references to derive ideas or solutions in the fields of architectural and interior design [
2,
11]. In general, design reference data are represented in diverse forms (e.g., design documents, drawings, specifications, 3-D digital models, rendered images, and photographs). Therefore, many researchers have proposed approaches for managing and using design reference data efficiently. These approaches include case-based design and precedent-based design systems [
12,
13,
14,
15]. To reuse references from previous design projects efficiently, important information must be carefully stored and managed [
4].
Among the types of information used for design references, visual materials, such as drawings or photographs, were used as primary sources to support analyses and distinguish the different types of designs [
16]. Although designers are accustomed to using visual materials in which design intentions are abstractly and conceptually expressed, the general public (including clients) prefers photographs or images that directly present the design results [
3]. Therefore, design reference image databases containing interior photographs or facades help communicate with clients in the early stages of the design process.
In general, each architectural or interior design firm has an in-house design reference database or library. Web-based reference-sharing platforms where anyone can collect and share design reference images with well-organized information are growing in popularity. For example, “Houzz.com” is the most visited interior design reference platform. It provides over 10 million interior design image references with useful information for interior design decision making.
Table 1 lists examples of the provided information with reference images for the platforms. In particular, room usage is provided as fundamental information for residential design. The platforms also provide information about the design style, area, firm, location, budget, color, and material for each design component.
However, current approaches for storing and managing reference images and related information have several limitations. Some types of information, such as the cost, designer, site location, and area, do not require interpretation and are static. By contrast, information such as the design style is qualitative and changes according to different criteria or individuals. The same style term can be interpreted differently by different people. Moreover, the references do not indicate who entered the style information or which criteria were used; this can confuse users when they retrieve and browse reference data.
2.2. Design Style of Interior Design Reference Images
According to Simon, H. A., (1975), style in design includes various aspects, such as the common features presented in design works and the manner of designing something [
17]. Style plays an important role in classifying design instances into meaningful categories because well-defined keywords for style can help users understand different designs and communicate with each other effectively during the design process. Based on different combinations of features, design style can be defined in various ways. In architectural and interior design, design style is typically associated with culture, time, region, philosophy, or individuals [
6].
A certain style of architectural design can be recognized according to the common features that repeatedly appear in design instances [
5]. Chan, C.-S., (2000) proposed a method for identifying style with the theory of feature matching [
18]. Design style can be measured according to the degree of style rather than with an absolute determination. The degree of style is affected by the number of common features and their quality. Therefore, the recognition of design styles can be considered a matter of interpretation. Features can include the shape, pattern, material, texture, and color. Some styles can be defined as a combination of features that were popular during specific periods or in a specific culture or region, such as the Baroque, Rococo, Victorian, Zen, and Tropical styles. Other styles can be defined as combinations of features that elicit specific feelings, such as natural, minimal, or casual styles [
19]. Many researchers have defined categories of design styles to help users understand the design style in South Korea [
7,
20,
21,
22,
23,
24,
25]. Although the categories and terms of design styles defined in previous studies exhibit differences,
Table 2 lists the commonly considered terms of design style categories.
Among these terms, the modern, natural, classic, and casual styles are handled at the highest frequency. Because they can be easily understood by non-design experts [
7,
26], this study focused on these four design styles. The general definitions of the styles from previous research studies are provided in
Table 3. Modern style is generally defined by simple spaces with clean lines and a lack of decoration. In modern style, monochrome colors are common, and primary colors, such as red or blue, are sometimes used as accent colors. Glass, marble, and metal materials are commonly used. Natural style is generally defined by cozy spaces with natural rather than artificial elements. In natural style, colors and materials that can be derived from nature are commonly used. Classic style is generally defined by gorgeous and luxurious spaces with traditional decorations and patterns. In classic style, deep and dark colors are generally used, and wood, gold, and silk materials are used in complex patterns. Furthermore, casual style is generally defined by warm, comfortable, and informal spaces with colorful materials.
The detailed definition of each interior design style differed among researchers, but it was possible to extract common contents as shown in
Table 2. Based on these general definitions of design style, interior design reference images were collected, classified, and used for deep learning model training.
2.3. Deep-Learning Algorithms for Interior Design Image Recognition
Deep learning has achieved great success in various machine learning fields, such as computer vision and natural language processing. Deep-learning mechanisms enable computational models composed of multiple processing layers to learn data representations with multiple levels of abstraction [
8]. Convolutional neural networks (CNNs) are most commonly applied in computer vision applications, such as facial detection [
9]. The convolutional and pooling layers in CNNs can be considered imitations of human neurons [
27]. A CNN consists of a feature extractor and classifier. Before deep learning and CNNs were developed, the features of images were manually extracted by domain experts. By contrast, the feature extractor of a CNN learns to calculate the main features of images for classification. The calculated feature data are used to train a classifier consisting of neural network layers and a classification layer, such as a support vector machine or softmax activation function.
In the architectural and interior design domain, some researchers have proposed CNN-based approaches for the recognition of design-related information in images. Zhou, B., et al. (2017) constructed a large image database of outdoor and indoor locations for deep learning and provided the source code and weights for a CNN model trained on their database [
28]. Their database contains approximately 10 million images representing more than 400 space categories. Liu, X., et al. (2019) used deep learning to explore the trends of interior design in different regions [
29]. Moreover, Kim, J. and Lee, J.-K., (2018) implemented auto-recognition of room usage in indoor images of apartments in South Korea as a component of an intelligent management system for interior reference images [
30].
In addition, researchers have investigated individual design elements. For example, Hu, Z., et al. (2017) proposed a visual classification method for furniture styles (e.g., the American, Baroque, Rococo, modern, Japanese, and Chinese styles) with a CNN model [
31]. Traditional handcrafted approaches that consider the furniture details (e.g., the shape, color, material, and size) were compared with a CNN-based approach to extract visual feature maps. The CNN-based approach achieved an accuracy of 68.5%, which was approximately 5.5% greater than the accuracy of the traditional approach. Kim, J., et al. (2019) proposed a recognition method for design components and their information with a CNN model [
32]. The functional features, materials, seating capacity, style, and type of seating furniture in input interior design images were automatically recognized. Furthermore, Bell, S. and Bala, K., (2015) proposed an approach for learning the visual similarities between interior design components. The visual similarities were used to search various design cases based on an input design component [
33].
In summary, various approaches for the automatic recognition of interior spaces and components based on deep learning and CNNs have been proposed. Although data-driven design recognition models have been presented, there has been little discussion on how the results can be implemented in interior design practice. Hence, a deep-learning-based approach that enables the more efficient use of design references based on interior design styles is proposed in this paper.
3. Method for Recognition of Interior Design Styles of Reference Images Using the CNN Model
This section describes the research methods for the implementation of the interior design style recognition model. The research steps can be summarized as follows: (1) Propose a conceptual model of interior design style information based on a style recognition model; (2) prepare an interior design style reference image dataset based on survey data for interior design styles; (3) train and evaluate the interior design style recognition model with a CNN; and (4) apply the quantitative style information and reference databases with interior design style recognition (
Figure 1).
3.1. Conceptual Modeling for Interior Design Information of Reference Images
Design reference images are useful because they contain significant information for design decision making. In practice, most platforms on which design projects are shared provide reference images with various types of information. The provided information includes image metadata, general design information, qualitative design information, etc.
Figure 2 presents a conceptual structure of the information contained in design reference images. A proposed style recognition model automatically appends style and its related data on given images.
Image metadata that can be automatically appended by a device include the date, time, and geolocation. These data do not provide any information directly related to an interior design project. By contrast, general design information, which is the basic information related to a design project, includes the area and usage of a target space, budget, designers, and sites. This information can be useful for identifying proper references for a specific design project. In general, such information is handled manually by users or database managers. The general information can be identified quantitatively by calculation rather than by interpretation or judgment. Qualitative information, such as the design style, must instead be interpreted and can vary depending on the standards and subjects of interpretation. To ensure the reliability of such qualitative information, interpretation and assessment by well-trained experts or specific classification rules may be important.
However, even with experts, such rule-based approaches yield deterministic values. As discussed in
Section 2.2, style is recognized based on common features that repeatedly appear in design instances. A keyword of style is defined by a description of common features and how they are combined. Therefore, style information in reference images contains, for example, common features, descriptions (definitions), and style names. Design references with evident style information are more useful. However, the current manual approaches for entering such information are very time-consuming and require significant human resources.
This paper focuses on a deep-learning-based approach to handling qualitative information for the efficient management and utilization of interior design references. The proposed deep-learning-based style recognition model automatically appends the inferred style names and their probabilities to given reference images as reference data.
3.2. Interior Design Style Image Data Preparation
This section describes the development process for an interior design style image dataset for training a deep-learning model. In this study, the target interior design images for training the style recognition model showed living rooms with some common design styles of residential design in South Korea. Explicit definitions of specific design styles are outside the scope of this study.
Based on previous research studies of the definition and categorization of design style in South Korea, a dataset of interior design style reference images was constructed, which can be used to train and evaluate a design style model. The target data were limited to living room design images and collected from the interior design information sharing platforms of “Daum real estate”, “Ohouse”, “Ggumim”, “Houzz”, “Zipdoc”, “Zipdeco”, and “Zigbang” from February to June 2020. Each platform provides images with tag information about, for example, the residence type and room usage. First, data were acquired with the living room tag, and only images with a good view of the living room and components were retained. In the next step, the images were classified into different design styles according to the general definitions and characteristics of design styles described in
Section 2.2. Finally, an image dataset was constructed with a total of 480 images, where each style had 120 images.
Figure 3 presents representative examples of each design style that was collected and classified.
Before the training of the CNN model, the interior design images had to be preprocessed. Preprocessing included (1) labeling the collected image data, (2) resizing data, and (3) augmenting data. A dataset for deep learning is generally divided into a training and a test dataset. The training dataset includes training and validation data. In general, the data are separated according to a fixed ratio. The augmentation of the training dataset is an optional step to enhance the performance of a trained model. In this study, various subsets of training datasets generated with different augmentation options were used to train models, and the evaluation results were compared. The test dataset for evaluating the models was fixed.
Figure 4 describes a flow of data preparation.
Resizing the data is necessary for both model training and model utilization. CNN models use image data with specific horizontal and vertical pixel dimensions as inputs. For example, the VGG model accepts images with dimensions of 224 × 224 pixels, and the Inception model requires an input size of 299 × 299 pixels. However, different interior design images generally have different aspect ratios and pixel sizes. Therefore, each image must be converted into an image with a specific aspect ratio and image size. The aspect ratio of an image data instance is transformed based on the greater dimension of the original image, and the nearest-neighbor method is applied to prevent image degradation in image resizing. This transformation was implemented with the Python Keras and PIL libraries.
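The resizing step can be illustrated with a minimal pure-Python sketch. It is not the Keras/PIL implementation used in the study. It assumes one plausible reading of the description: the image is scaled so that its greater dimension matches the target size, sampled with the nearest-neighbor rule, and centered on a square canvas. The function names `fit_to_square` and `resize_nearest` and the zero-valued padding are hypothetical choices made for illustration.

```python
def fit_to_square(width, height, target=224):
    """Scale so the greater dimension equals `target`; return the new size and padding offsets."""
    longest = max(width, height)
    new_w = max(1, round(width * target / longest))
    new_h = max(1, round(height * target / longest))
    pad_x = (target - new_w) // 2  # left padding that centers the image horizontally
    pad_y = (target - new_h) // 2  # top padding that centers the image vertically
    return new_w, new_h, pad_x, pad_y


def resize_nearest(pixels, target=224, fill=0):
    """Nearest-neighbor resize of a row-major pixel grid onto a padded target x target canvas."""
    height, width = len(pixels), len(pixels[0])
    new_w, new_h, pad_x, pad_y = fit_to_square(width, height, target)
    canvas = [[fill] * target for _ in range(target)]
    for y in range(new_h):
        # Map each output row and column back to the closest source pixel.
        src_y = min(height - 1, int(y * height / new_h))
        for x in range(new_w):
            src_x = min(width - 1, int(x * width / new_w))
            canvas[pad_y + y][pad_x + x] = pixels[src_y][src_x]
    return canvas


# A 640 x 480 photograph keeps its 4:3 ratio inside a 224 x 224 input.
print(fit_to_square(640, 480))
```

Padding keeps the furniture proportions intact, whereas stretching the image to a square would distort them; which of the two the original implementation used is not stated in the text.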
Data augmentation is useful in the training process of a model. Deep-learning algorithms operate directly on data, and the number of available training data is a crucial factor. Data augmentation can compensate for insufficient data through image transformation. Commonly used methods include (1) rotation, (2) shifting, (3) rescaling, (4) flipping, (5) shearing, (6) zooming, and (7) stretching. However, excessive deformation can lead to model overfitting for a modified dataset. Therefore, the characteristics of indoor reference image data must be considered in the modification. Examples of reasonable transformations include flipping left and right, zooming, and rotating.
The process of augmentation enabled the production of a flipped dataset containing 800 images and a processed dataset containing 4000 images. In summary, the original interior design style image dataset has 480 images with 120 per class. The test set has 80 images with 20 per class. Training and evaluation datasets with 100, 200, 400, 800, and 4000 images were constructed separately through data processing.
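The flipped dataset can be illustrated with a short sketch. It assumes that images are held as row-major pixel grids paired with a style label; the helper names `flip_horizontal` and `augment_with_flips` are hypothetical and do not come from the study, which relied on Keras utilities.

```python
def flip_horizontal(image):
    """Mirror a row-major pixel grid left to right."""
    return [list(reversed(row)) for row in image]


def augment_with_flips(dataset):
    """Return the original (image, label) pairs followed by their mirrored copies."""
    flipped = [(flip_horizontal(image), label) for image, label in dataset]
    return list(dataset) + flipped


# Toy stand-in for the 400 non-test images: tiny 2 x 2 grids with a style label.
originals = [([[1, 2], [3, 4]], "modern") for _ in range(400)]
augmented = augment_with_flips(originals)
print(len(augmented))
```

A left-right flip is a reasonable transformation for interior photographs because a mirrored room is still a plausible room, whereas strong rotations or shears produce views that do not occur in practice.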
3.3. Implementation of the Interior Design Style Recognition Model with the Pre-Trained CNN Model
Experiments with transfer learning, different backbone networks, and trained weights were performed to compare the performance characteristics of the models. The CNN-based image classification model includes a backbone network and classifier (
Figure 5). The backbone network extracts visual feature data from the input images. The visual feature data are multi-dimensional vectors that represent the features of an image. Subsequently, these multi-dimensional vectors are inputted into classifiers to determine the styles of the corresponding images. In this study, a VGG16 network was used as a backbone network.
Transfer learning is a deep-learning approach in which a model trained for one problem is reused for similar new problems. For example, a model trained on a dog and cat dataset can extract the proper visual features from other animal images. However, such a model would have difficulties extracting the proper features from room images. This is because a pre-trained model relies on its learned weights to extract visual feature data from images. If the dataset used for pre-training does not include images similar to those in the new dataset used for retraining, transfer learning may therefore not be a suitable approach. When the two datasets are similar, transfer learning generally yields an enhanced performance and saves computational power and time compared to training from scratch. There are various public image datasets available for training deep-learning models. The ImageNet dataset is a large dataset for general objects [
34]; Places365 is a popular dataset for various types of scenes (e.g., indoor, outdoor, office, and cafeteria). In this study, a model pre-trained on the Places365 dataset was chosen. The transfer learning process for interior design style detection can be summarized as follows:
(1) Begin with the backbone network of a model pre-trained on the Places365 dataset;
(2) Select layers for retraining by reinitializing the weights of those layers;
(3) Add a new classifier on top of the backbone network;
(4) Train the new model on the interior design style dataset.
The new classifier consisted of a fully connected layer with 256 outputs, a dropout (0.5) layer, and a classification layer with softmax activation (Bishop, 2006). For classification with multiple classes, support vector machines and softmax functions are typically used. The softmax function calculates the probability of a class $c_i$ for a given input $X$, where $z_i$ represents the score for the $i$th class among $C$ total classes:

$$P(c_i \mid X) = \frac{e^{z_i}}{\sum_{j=1}^{C} e^{z_j}}$$
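The softmax computation described above can be sketched in a few lines of standard-library Python. The class order in `STYLES` and the score values are hypothetical and serve only as an illustration; subtracting the maximum score before exponentiation is a common numerical-stability measure and does not change the result.

```python
import math


def softmax(scores):
    """Convert raw class scores z_i into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(z - peak) for z in scores]  # shifting by the maximum avoids overflow
    total = sum(exps)
    return [e / total for e in exps]


STYLES = ["casual", "classic", "modern", "natural"]  # hypothetical class order
scores = [1.0, 0.5, 2.0, 0.1]                        # hypothetical classifier outputs
probs = softmax(scores)

for style, p in zip(STYLES, probs):
    print(f"{style}: {p:.3f}")
```

Because every style receives a probability, a reference image can be described as, for example, mostly modern with some natural character, rather than being assigned a single label.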
The classifier was constructed with a softmax function so that probabilities could be calculated for the four style labels. The purpose of training a CNN is to find the parameters that minimize the loss value (also called “cost” or “error”). During training, a loss function measures the classification error of the CNN, and an optimizer updates the parameters to reduce the loss value. TensorFlow, the Keras library, and Colab were used to train the CNN model for interior design style recognition. TensorFlow is a deep-learning framework developed by Google that supports GPU-based deep learning with Nvidia GPUs, the CUDA Toolkit, and the cuDNN library. Keras is a Python-based deep-learning library that provides high-level methods to help simplify the construction of deep-learning models. Moreover, Colab provides a deep-learning research environment and high computing power through a Tesla K80 GPU.
4. Training and Evaluation of Interior Design Style Detection
This section describes the results of training and evaluating the interior design style detection model and demonstrates the application of the trained model to the interior design reference database. The trained model is evaluated by measuring the accuracy, precision, recall, and F1-scores on the test dataset. To assess the performance of the trained design style recognition model, its accuracy is compared with that of models for other domains. The effect of the amount of training data on the model’s performance is also presented. Finally, this paper demonstrates how the proposed interior design style detection can be used for managing interior design references.
4.1. Results of Training and Evaluation
A dataset consisting of 120 images for each of the four target styles was used for training and evaluation. The test dataset has 80 images, with 20 images per style class that were randomly selected from the dataset. The remaining 400 images were doubled to 800 by flipping them left and right and then used to construct a training dataset. Finally, 80% of the resulting 800 images were used as training data, and 20% were used as validation data during the training. A VGG16 model with Places365-trained weights was used as a backbone network. The batch size of the training dataset was set to 32, and the batch size of the validation dataset was set to 160, which is the total size of the validation dataset. In addition, the learning rate was set to 0.00001. The Adam optimization algorithm was applied to update the network weights during training, and cross entropy was used as the loss function. The maximal number of training epochs was 500; however, the training was halted at 145 epochs to prevent overfitting.
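The dataset bookkeeping described above can be checked with a small sketch. It only tracks file names and labels, not pixel data. The file names, the random seed, and the function name `split_dataset` are hypothetical, and the sketch assumes that the 80/20 split is applied after flipping, as the text suggests.

```python
import random

STYLES = ["casual", "classic", "modern", "natural"]


def split_dataset(items_by_style, test_per_style=20, val_fraction=0.2, seed=0):
    """Hold out a fixed test set per style, mirror the rest, then split into train and validation."""
    rng = random.Random(seed)
    test, remaining = [], []
    for style in STYLES:
        items = list(items_by_style[style])
        rng.shuffle(items)
        test += [(item, style) for item in items[:test_per_style]]
        remaining += [(item, style) for item in items[test_per_style:]]
    # Each remaining image contributes itself and a left-right mirrored copy.
    augmented = [(item, style, flipped)
                 for item, style in remaining
                 for flipped in (False, True)]
    rng.shuffle(augmented)
    n_val = round(len(augmented) * val_fraction)
    return augmented[n_val:], augmented[:n_val], test


# Hypothetical file names standing in for the 120 collected images per style.
items_by_style = {s: [f"{s}_{i:03d}.jpg" for i in range(120)] for s in STYLES}
train, val, test = split_dataset(items_by_style)
print(len(train), len(val), len(test))
```

With these settings the counts are 640 training, 160 validation, and 80 test images, which matches the validation batch size of 160 reported above.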
Subsequently, the accuracy, precision, recall, and F1-scores were measured to evaluate the trained model. The F1-score, which is the harmonic mean of precision and recall, is one of the most popular measures for evaluating the performance of machine learning models when the datasets are imbalanced [
35]. A true positive (TP) outcome indicates that the model correctly predicted the class of an input data sample. For a given class A, a false positive (FP) outcome indicates that the model predicted class A, whereas the actual class was B or C. A false negative (FN) outcome indicates that the actual class was A, whereas the model predicted B or C. Precision, recall, and F1-score are defined as follows:

$$\text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN}, \qquad \text{F1-score} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$
Figure 6 shows the results of the recognition test as confusion matrices. The X axis represents the style predicted by the trained model, and the Y axis represents the true style of the given test images.
Figure 6a presents the numbers of correct and incorrect predictions for each style as counts.
Figure 6b describes the normalized value of correct predictions. The true positive, false positive, and false negative of prediction results per style class can be calculated using the confusion matrix. A true positive value of the casual style class is calculated as 18, a false positive is calculated as 8, and a false negative is calculated as 2. A true positive value of the classic style class is calculated as 13, a false positive is calculated as 3, and a false negative is calculated as 7. A true positive value of the modern style class is calculated as 17, a false positive is calculated as 10, and a false negative is calculated as 3. Using their calculated values, precision, recall, F1-score, and accuracy are calculated to measure the performance of the style recognition model.
Table 4 presents the precision, recall, F1-score, and accuracy per class and their macro average values to measure the trained model performance. The precision value of classic style prediction is the highest at 0.812; that is, the model's classic predictions were correct more often than its predictions for any other style. The recall value for the casual style is the highest (0.900). In addition, the casual style yields the highest F1-score. The F1-scores of the casual, classic, and modern styles are all 0.7 or higher; that of the natural style is 0.452. The mean values for precision, recall, and F1-score are 0.693, 0.688, and 0.670, respectively.
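The reported metrics can be recomputed from the per-class counts as a consistency check. The counts for the casual, classic, and modern styles are taken from the text. The counts for the natural style are not stated there; the values used below (7 true positives, 4 false positives, 13 false negatives) are an inference from the 80-image test set, the 20 images per class, and the overall accuracy of 0.688, and should be treated as an assumption.

```python
# (true positives, false positives, false negatives) per style
COUNTS = {
    "casual": (18, 8, 2),
    "classic": (13, 3, 7),
    "modern": (17, 10, 3),
    "natural": (7, 4, 13),  # assumption: inferred from the totals, not stated in the text
}


def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1-score from raw outcome counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


per_class = {style: precision_recall_f1(*c) for style, c in COUNTS.items()}
# Macro averages weight every style equally, regardless of how often it is predicted.
macro = [sum(m[i] for m in per_class.values()) / len(per_class) for i in range(3)]
accuracy = sum(tp for tp, _, _ in COUNTS.values()) / 80

for style, (p, r, f1) in per_class.items():
    print(f"{style:8s} precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
print("macro", [round(m, 3) for m in macro], "accuracy", round(accuracy, 3))
```

Under this assumption, the macro averages and the natural-style F1-score agree with the values reported above to three decimal places.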
There was a difference in the recognition performance for each class. The model predicted the modern and casual styles more frequently than the classic and natural styles: it predicted the modern style 27 times and the casual style 26 times. In both cases, images of the natural style were mispredicted more often than those of any other style.
4.2. Performance of Interior Design Style Detection Models
To assess the performance of the proposed design style recognition model, the performance characteristics of the models trained for general objects and location images other than the targeted domains were compared to that of the proposed design style recognition model. The performance characteristics of the general object image recognition models, such as those trained on ImageNet, are evaluated with top-1 and top-5 accuracy. The recent models exhibit top-5 accuracies of 0.94–0.98 and top-1 accuracies of 0.78–0.88. Human top-5 accuracy is typically around 0.95, and the average top-1 accuracy of similar models is 0.80 (Image Classification on ImageNet, 2020).
Therefore, the proposed method can compete with methods trained on other datasets, even if it does not reach human-level performance.
Table 5 compares the accuracies between models for other domains and the proposed model. In the ImageNet challenge, the VGG16 model adopted in this study achieved a top-1 accuracy of 0.744 and top-5 accuracy of 0.919. By contrast, with the Places365 dataset, which contains data similar to the target data in this study, the VGG16 model achieved a top-1 accuracy of 0.552 and top-5 accuracy of 0.850. The model trained in this study achieved a top-1 accuracy of 0.688, which is 5.6 percentage points lower than the ImageNet challenge result. However, it is 13.6 percentage points higher than the Places365 challenge result. In other words, the accuracy of the proposed model is not significantly lower than those of models for other domains.
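For readers unfamiliar with the top-1 and top-5 measures used in this comparison, the following sketch shows how top-k accuracy is computed. The function name `top_k_accuracy` and the score values are hypothetical; they are not taken from the study.

```python
def top_k_accuracy(score_rows, true_labels, k=1):
    """Fraction of samples whose true class index is among the k highest-scoring classes."""
    hits = 0
    for scores, truth in zip(score_rows, true_labels):
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


# Hypothetical class scores for three samples and three classes.
score_rows = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
true_labels = [1, 1, 2]

top1 = top_k_accuracy(score_rows, true_labels, k=1)
top2 = top_k_accuracy(score_rows, true_labels, k=2)
print(top1, top2)
```

Because the style recognition model has only four classes, a top-5 measure is not meaningful for it, which is why the comparison above rests on top-1 accuracy.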
In addition, the accuracy can be improved by increasing the size of the dataset.
Table 6 presents the changes in the test accuracy with respect to the size of the training dataset. Each test was conducted with the same testing set. To evaluate the changes in the model learning performance according to the size of the training dataset, the training was conducted by randomly selecting 100, 200, 400, 800, and 4000 images. The augmented datasets were prepared by preprocessing all 400 original images. The first augmented dataset only included 800 images that were generated by flipping the original images left and right. The second augmented dataset included a total of 4000 images, which were generated with left and right inversion, rotation, shifting, zooming, and distortion.
The F1-score is the lowest for the training with 100 images (0.324). Training with 800 images, in which all 400 original images are also flipped horizontally, yields the highest F1-score (0.670). In general, the F1-score increases with an increasing number of original images. In addition, guided data augmentation (e.g., flipping) improves the F1-score of the trained model by 0.07. By contrast, random data augmentation only slightly improves the F1-score of the trained model by 0.017.
4.3. Application to Stochastic Detection of Interior Design Styles
This section describes the application of design reference images supported by stochastic detection (
Figure 7). In this application, the design style recognition model and design reference image database play important roles. The proposed model detects the styles of the input reference images and extracts visual feature data from images. These data are used to retrieve reference images based on keywords or to recommend similar design style reference images. Scenario (1) demonstrates that the interior design styles of reference images that users input are automatically identified and saved in the database. Designers and reference managers can use the database to create their own collections of references. Clients can identify which styles they want and view the related styles. Furthermore, scenario (2) demonstrates that a more detailed retrieval of references is possible with design style keywords and probability values. In scenario (3), users can implement and use the proposed style recognition model constructed with the reference datasets according to user-defined styles.
The proposed framework is a powerful tool that can be applied to novel knowledge-based design approaches during design communication and decision-making processes. Data processing and information recognition are handled by a deep-learning model based on design knowledge. Based on the results of this study, it is expected that designers will be able to contribute to a new design environment that will allow them to communicate with clients and focus more on their original creative design tasks. At the beginning of the design phase, clients can input their own design references to obtain information about the style of a desired design and recommendations about similar cases. Choosing a few of the recommended references can help to start the design phase because information about the designer responsible for the recommended references is provided.
4.3.1. Scenario (1): Interior Design Reference Database with Style Recognition
This scenario demonstrates an application that automatically recognizes and stores the style information of a reference image using an interior design style learning model. In addition to the inferred style names and probabilities, the inference model instance and its training data are stored together with the results.
Figure 8 illustrates the process for constructing a database with design style recognition.
The data returned by the style recognition are stored in a recognition result table and a visual feature map table in the database. The recognition result table contains recognition IDs, recognized style names, and probability values as attributes. One image can have multiple style names, each with its own probability value (this study considered four).
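The recognition result table described above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not the schema used in this study: the table name, the column names, the `image_path` column, and the example probabilities are all assumptions, and the visual feature map table is omitted.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions.
def create_reference_db(conn):
    """Create a recognition result table with one row per (image, style) pair."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS recognition_result (
            recognition_id INTEGER NOT NULL,
            image_path     TEXT    NOT NULL,
            style_name     TEXT    NOT NULL,
            probability    REAL    NOT NULL,
            PRIMARY KEY (recognition_id, style_name)
        )
        """
    )

def store_recognition(conn, recognition_id, image_path, style_probs):
    """Store every style name and probability returned for one image."""
    rows = [
        (recognition_id, image_path, style, prob)
        for style, prob in style_probs.items()
    ]
    conn.executemany(
        "INSERT INTO recognition_result VALUES (?, ?, ?, ?)", rows
    )
    conn.commit()

# Example with made-up probabilities for the four styles considered here.
conn = sqlite3.connect(":memory:")
create_reference_db(conn)
store_recognition(
    conn,
    1,
    "living_room_001.jpg",
    {"casual": 0.10, "classic": 0.05, "modern": 0.60, "natural": 0.25},
)
```

Storing one row per style, rather than one column per style, keeps the table usable when a differently trained model recognizes a different set of styles.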
A prototype application was developed using the trained model and a Python graphical user interface (GUI) development library (Figure 9). A user can input design reference images that do not contain design style information. The application provides an option to select the recognition model. The recognition results are presented in a table view alongside each image, so the user can check each recognition case and selectively change a result that is not reliable.
It took about 10 s to recognize the style information of the 480 images used in this study, which corresponds to about 48 images per second; this rate can increase with the performance of the hardware. The recognition results can be saved in a database format, such as Microsoft Access or Excel, which enables data management, analysis, and utilization.
4.3.2. Scenario (2): Retrieval of Design Style References with Quantitative Style Information
This scenario describes the retrieval of design style references using the design style database and the style recognition model. The proposed retrieval method uses style names and probability values. Users can combine conditions with AND/OR logical operators. For example, users can search for interior images with a >50% probability of corresponding to the modern style and a >20% probability of corresponding to the natural style.
Figure 10 presents example results of the retrieval of reference images with various input value conditions.
Existing design reference search approaches use the file name or style name as the query parameter, and the basis for the order of the results is not clearly provided. With the proposed method, by contrast, it is possible to search for design references by style name and probability value, which helps to find specific types of designs. It is also possible to sort the query results by inference probability. Searches that combine keywords and probabilities can be useful for finding interior design cases that exhibit complex styles. Unlike existing methods based on simple file names, the proposed method enables an efficient, data-based search.
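The probability-based retrieval described in this scenario can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the prototype: the helper names (`above`, `all_of`, `any_of`, `retrieve`), the in-memory dictionary standing in for the database, and the example probabilities are all hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, float]           # style name -> probability for one image
Condition = Callable[[Record], bool]

def above(style: str, threshold: float) -> Condition:
    """Condition: the probability of `style` is strictly above `threshold`."""
    return lambda probs: probs.get(style, 0.0) > threshold

def all_of(*conds: Condition) -> Condition:
    """AND: every condition must hold."""
    return lambda probs: all(c(probs) for c in conds)

def any_of(*conds: Condition) -> Condition:
    """OR: at least one condition must hold."""
    return lambda probs: any(c(probs) for c in conds)

def retrieve(database: Dict[str, Record], cond: Condition, sort_by: str) -> List[str]:
    """Return matching image names, sorted by the probability of `sort_by`."""
    hits = [name for name, probs in database.items() if cond(probs)]
    return sorted(hits, key=lambda n: database[n].get(sort_by, 0.0), reverse=True)

# Made-up recognition results for four reference images.
database = {
    "img_a.jpg": {"casual": 0.05, "classic": 0.05, "modern": 0.65, "natural": 0.25},
    "img_b.jpg": {"casual": 0.10, "classic": 0.10, "modern": 0.70, "natural": 0.10},
    "img_c.jpg": {"casual": 0.10, "classic": 0.05, "modern": 0.55, "natural": 0.30},
    "img_d.jpg": {"casual": 0.60, "classic": 0.10, "modern": 0.10, "natural": 0.20},
}

# Modern > 50% AND natural > 20%, ordered by the modern probability.
query = all_of(above("modern", 0.5), above("natural", 0.2))
results = retrieve(database, query, sort_by="modern")
```

Because conditions are plain functions, AND and OR can be nested freely, which mirrors the complex input conditions mentioned in the text.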
4.3.3. Scenario (3): User-Customized Style Recognition Model
This scenario describes the implementation of another interior design style recognition method with different style image datasets.
Figure 11 presents the results of the CNN model training for two different styles: (1) a traditional East Asian style including traditional South Korean, Japanese, and Chinese styles; and (2) a traditional European style including Baroque, Rococo, and Victorian styles. Users can easily train a new model with a custom dataset and the GUI. Such a custom style recognition model can then be used to develop additional data tables for each reference image.
As described in Section 3, style classification rules do not need to be specified to implement design style recognition with a CNN model. Instead, the data collection and preparation processes are the most important factors. Of course, complex techniques are available for fine-tuning a CNN model to achieve a high style recognition performance. However, if the representative images for each style are well constructed, a new style recognition model can be trained relatively easily.
5. Discussion
In the architecture, engineering, construction, and facility management (AEC-FM) industry, the trends of automation and digital transformation are attracting significant interest in the fourth industrial revolution era. The Korean government has promoted various projects, such as the “Smart Construction 2025” and “Digital New Deal”, to improve the quality and productivity in the AEC-FM industry. Therefore, effectively collecting, classifying, and using data is crucial in the AEC-FM industry. This research study presents an effort to automate and digitalize the interior design domain. The automated recognition and quantification of interior design styles can improve the management and use of big data in the interior design domain and enhance the efficiency of design communication and decision making with reference data.
The main contributions of the stochastic detection method for the interior design styles of reference images described in this paper can be summarized as follows:
(1) Increased efficiency of finding design reference images
Quantitative design style information (e.g., style names and their probabilities) and style recognition models can help users search and browse references with more detailed and restrictive conditions. It is expected that people who are unfamiliar with interior design will be able to identify the style information of selected reference images in real time and use the results to find other design instances that are similar or related.
(2) Automation of reference data management
A design style is qualitative information that has traditionally been entered based on the interpretations of designers or reference managers. As a result, style information is entered manually according to criteria that vary between individuals. With the method developed in this study, styles can be automatically appended to reference images that do not contain style information. This can reduce the time and effort required by reference managers, designers, and users to classify and store reference images according to different design styles. The proposed method can process large amounts of data in real time, which can increase the size of the available reference pools. As a result, users can access more reference data.
(3) User-customized style recognition model
In this study, a deep-learning model that can identify four interior design styles was implemented. Users can easily train new design style recognition models by constructing new datasets. Rather than defining specific style classification rules or algorithms, a user can train the proposed style recognition model by collecting and classifying data according to consistent criteria. Therefore, one image can have various kinds of style information that are recognized by many different models; thus, users can search for reference data by selecting a model that matches their preferred style. In other words, trained recognition models can serve as new design reference data to find suitable reference images.
This study was an initial exploration of the question of whether qualitative design information can be processed automatically with artificial intelligence technologies, such as deep learning. It was demonstrated that the method can automatically determine the design styles of reference images. In the future, it will be necessary to study how such a technology can be used to solve problems and difficulties in an actual design process, such as an interior remodeling project.
The recognizable space usage was limited to living room images in this study. The design process must consider the styles of individual rooms and entire living spaces. Therefore, the target space usages must be expanded to determine additional styles.
In summary, a method for learning and determining the overall characteristics of an image for style recognition was proposed. However, a style is also determined by key elements, such as the shapes, materials, and colors of floors, walls, and furniture. In the future, a style recognition method that considers these detailed factors comprehensively will be developed.
6. Conclusions
This paper proposed a form of quantitative interior design style data for design reference images and a deep-learning-based recognition method for automatically appending such style data to references. The goal of this study was to automatically infer and quantify style information that had previously been analyzed manually by experts. In addition, a novel information system for the design styles presented in design images was proposed, and a style recognition method using deep-learning-based image recognition technology was implemented and presented.
By using a data-driven CNN algorithm, a style recognition model that can identify casual, classic, modern, and natural styles in living room design images was trained. The structure and weights of a pre-trained VGG model were used to train the new model. Living room design images were collected and categorized according to the four styles. A total of 20 images were extracted for each style and used to construct a test dataset containing 80 images. Moreover, 100 images were extracted for each style and used to construct a training dataset containing 400 images. The recognition results for the test dataset were evaluated in terms of F1-score values.
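For reference, an F1-score over the four style classes can be computed as sketched below. This is a minimal illustration that assumes macro-averaging (the unweighted mean of the per-class F1-scores); the averaging scheme, the function names, and the toy labels are assumptions and are not taken from this study.

```python
def f1_per_class(y_true, y_pred, label):
    """F1-score of one class: harmonic mean of its precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0  # no correct predictions for this class
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1-scores (an assumed averaging scheme)."""
    return sum(f1_per_class(y_true, y_pred, l) for l in labels) / len(labels)

STYLES = ["casual", "classic", "modern", "natural"]

# Toy ground truth and predictions (two images per style), for illustration only.
y_true = ["casual", "casual", "classic", "classic",
          "modern", "modern", "natural", "natural"]
y_pred = ["casual", "modern", "classic", "classic",
          "modern", "modern", "natural", "casual"]

score = macro_f1(y_true, y_pred, STYLES)
```

With a balanced test set such as 20 images per style, macro-averaging weights every style equally, so a weak class lowers the score as much as a strong class raises it.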
The F1-score increased with the number of training images. The model trained on a total of 100 images yielded an F1-score of 32.4%, that trained on 200 images yielded an F1-score of 54.3%, and that trained on 400 images yielded an F1-score of 59.8%. Appropriate data augmentation techniques improved the F1-score of the model further. The model trained on a dataset in which the 400 original images were increased to 800 images with the left and right flipping technique achieved the highest F1-score (69.9%). However, the F1-score of a model trained on a dataset with 4000 images generated by randomly applying various image distortion techniques (e.g., left and right flipping, shifting, and zooming) was 61.5%, which is only 1.7 percentage points higher than the F1-score of the model trained on 400 images. The accuracy of a model trained on an ImageNet dataset for classifying object images was similar to that of a human (77.4%); however, the accuracy of a model trained on the Places365 dataset was only 55.2%. Considered against these results from other domains, the accuracy of the model developed in this study is meaningful.
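The left and right flipping augmentation that doubles a dataset (e.g., from 400 to 800 images) can be sketched as follows. This is a minimal illustration in plain Python, assuming each image is a list of pixel rows; the function names and the tiny example image are hypothetical, and the tooling actually used in this study is not specified here.

```python
def flip_left_right(image):
    """Mirror an image, given as a list of pixel rows, about its vertical axis."""
    return [list(reversed(row)) for row in image]

def augment_with_flips(dataset):
    """Return the original (image, label) pairs plus one mirrored copy of each.

    The label is unchanged because mirroring a room does not alter its style.
    """
    flipped = [(flip_left_right(image), label) for image, label in dataset]
    return dataset + flipped

# A tiny 2 x 3 "image" stands in for a living room photograph.
tiny_dataset = [([[1, 2, 3], [4, 5, 6]], "modern")]
augmented = augment_with_flips(tiny_dataset)
```

Flipping preserves the layout cues of a room, which is one plausible reason it helped more here than the heavier random distortions.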
By using the trained model, quantitative style information was extracted from reference images, and a database was constructed. It took only 30 s to collect style information from 480 images. Each reference image was annotated with information about, for example, the model used for recognition, the scope of styles that the model can identify, and the name and probability of each style. Finally, the retrieval of reference images based on complex conditions regarding the style probability was demonstrated.