Mixed Reality-Based Concrete Crack Detection and Skeleton Extraction Using Deep Learning and Image Processing

Shojaei, Davood; Jafary, Peyman; Zhang, Zezheng

doi:10.3390/electronics13224426


by Davood Shojaei *, Peyman Jafary and Zezheng Zhang

Centre for Spatial Data Infrastructures and Land Administration, Department of Infrastructure Engineering, The University of Melbourne, Melbourne, VIC 3053, Australia

* Author to whom correspondence should be addressed.

Electronics 2024, 13(22), 4426; https://doi.org/10.3390/electronics13224426

Submission received: 9 October 2024 / Revised: 1 November 2024 / Accepted: 9 November 2024 / Published: 12 November 2024

(This article belongs to the Special Issue Advances in Digital Signal and Image Processing, Techniques, and Computations with Multidisciplinary Applications)


Abstract: Advancements in image processing and deep learning offer considerable opportunities for automated defect assessment in civil structures. However, these systems cannot work interactively with human inspectors. Mixed reality (MR) can be adopted to address this by involving inspectors in various stages of the assessment process. This paper integrates You Only Look Once (YOLO) v5n and YOLO v5m with the Canny algorithm for real-time concrete crack detection and skeleton extraction with a Microsoft HoloLens 2 MR device. YOLO v5n demonstrates a superior mean average precision at an Intersection Over Union threshold of 0.5 (mAP 0.5) and a superior speed, while YOLO v5m achieves the highest mAP 0.5:0.95 among the YOLO v5 structures. The Canny algorithm also outperforms the Sobel and Prewitt edge detectors with the highest F1 score. The developed MR-based system could not only be employed for real-time defect assessment but also be utilized for the automatic recording of the location and other specifications of the cracks for further analysis and future re-inspections.

Keywords: mixed reality; crack detection; crack skeleton extraction; deep learning; YOLO v5; image processing

1. Introduction

Structures such as buildings, bridges, and pavements are subject to fatigue stresses, deterioration, and damage [1,2]. Cracks are among the major indicators of the degradation and poor technical condition of structures, and they can form and propagate in different construction materials and elements such as concrete, brick, slabs, asphalt, and beams [3,4]. Different factors such as aging, cyclic loading, poor workmanship and design, environmental parameters, and natural disasters can lead to the development of cracks [5,6,7]. Defect assessment and the maintenance of structures are extremely expensive exercises [6]. Accordingly, the early, fast, and efficient detection of cracks provides an opportunity to prevent further damage and minimize the risk of failure of building objects by taking necessary measures in a timely manner [4,5]. If cracks remain undetected, they can spread across the surface of structures and eventually cause them to collapse, which may cause various financial and safety problems [6].

Traditional approaches to defect assessment and crack detection are based on visual inspection. Such manual surveys, however, are associated with different challenges, including the following:

A high dependency on the experience and skill set of individual inspectors;
The subjective judgments of building surveyors;
The need for excessive time, cost, and labor;
Placing inspectors in dangerous situations [3,8,9].

To address these challenges, the literature emphasizes the increasing trend of the application of automatic procedures developed based on image processing (IP), computer vision, and Artificial Intelligence (AI) for the accurate and rapid classification, detection, and segmentation of cracks [3,10]. These cutting-edge technologies support the efficient evaluation and detection of surface defects to ensure the reliability of structure services, public safety, and financial stability [10]. Edge detection methods in IP, such as the Canny and Sobel operators, have been widely used for the automatic extraction of the skeleton of the cracks of concrete surfaces [11,12,13]. However, using IP methods alone may not lead to very high accuracies in crack detection due to factors such as a high sensitivity to the quality of the used images, noisy outputs, and a dependency on the appropriate selection of threshold parameters [14,15,16]. Accordingly, the application of Machine Learning (ML) and deep learning (DL) techniques has been considered to increase the accuracy of automatic crack detection practices [17,18]. A convolutional neural network (CNN) is the most developed and widely used DL method for various purposes such as classification, segmentation, recognition, detection, etc. [19]. In terms of defect assessment, a CNN is utilized for the extraction of deep convolutional features from crack and non-crack visual data and images through learning techniques [20]. Since the mid-2000s, deep CNNs (DCNNs) have been used to develop efficient and fast damage detection systems for civil structures [15]. In addition, more sophisticated CNN-based methods have been developed both for crack detection, e.g., the region-based CNN (RCNN) family, a Single Shot Detector (SSD), and the You Only Look Once (YOLO) family, and segmentation, e.g., U-Net, SegNet, and DenseNet [17,21,22].

For the development of state-of-the-art crack detection algorithms, the required visual input data can be provided by digital cameras, smartphones [23], Unmanned Aerial Vehicles (UAVs) [10], laser scanners [24], Ground Penetrating Radar (GPR) devices [25], etc. However, the developed techniques only provide input data for human inspectors. Despite the mentioned limitations regarding manual surveys, the inspectors’ expertise and experiences are critical [26]. Therefore, another solution could be the integration of human capabilities with IP and AI-based crack detection techniques in a systematic manner [27] to tackle the main challenges of manual inspections [28]. Accordingly, augmented reality (AR) and mixed reality (MR) can provide a great opportunity to capture, visualize, and process visual data while involving human expertise in different stages [29,30]. Among different devices that have been developed for AR and MR, Microsoft HoloLens has been introduced as a tool for the monitoring, maintenance, and management of civil structures. It has been equipped with different cameras and sensors to provide inputs and a powerful Central Processing Unit (CPU) and Graphics Processing Unit (GPU) for the automatic detection of cracks [26,30,31,32,33].

Considering the importance of developing advanced crack detection methods while benefiting from human proficiency, this paper aims to develop an integrated crack detection and skeleton extraction method based on IP and DL through MR capabilities. Previous studies have demonstrated the high accuracy of DL-based crack detection and segmentation models in structural images by using semantic segmentation networks and specialized convolutional models [34,35,36,37,38]. While these approaches achieve significant accuracy in controlled environments, they are computationally intensive, often requiring dedicated GPU resources and processing times that are impractical for real-time applications on MR devices such as the Microsoft HoloLens 2. In particular, end-to-end segmentation models like Mask RCNN and U-Net necessitate substantial processing power, which may reduce the frame rate and negatively impact the user experience in real time [39]. Therefore, this study adopts a hybrid approach, integrating CNN-based crack detection with IP-based edge detection for skeleton extraction. This combination leverages the strengths of DL for object detection and the efficiency of IP algorithms for edge detection, optimizing real-time performance within hardware constraints. Through this integrated approach, we develop an interactive HoloLens application that empowers human inspectors to facilitate automatic crack detection. The primary contributions of this paper are as follows:

The introduction of an MR-based interactive system using Microsoft HoloLens 2. This system enables human inspectors to perform real-time concrete crack detection and skeleton extraction, enhancing the assessment process.
The integration of DL and IP methods: combining these techniques facilitates the real-time assessment of concrete surfaces, balancing high accuracy and processing speed.
Comprehensive performance analysis and model comparison. This paper evaluates different models to identify the most accurate and fastest solutions for integration into the MR system.

The remainder of this paper is organized as follows. In Section 2, related works are reviewed. Section 3 introduces the proposed method based on the integration of the YOLO v5 model for crack detection and different considered IP-based edge detectors. Section 4 discusses the transformation of the designed integrated defect assessment of concrete surfaces to the MR environment with Microsoft HoloLens 2. Section 5 provides the experimental settings and evaluation results of the considered methods, followed by related discussions. Finally, the conclusion and future work are presented in Section 6.

2. Related Works

This section reviews works related to crack detection using DL, crack edge detection using IP, and possible applications of AR and MR in the defect assessment of concrete structures. These three concepts are the main foundations of this paper, and together they provide an integrated crack detection and skeleton extraction model through an MR-based system.

2.1. Crack Detection Using Deep Learning

Deep learning (DL), as a branch of Machine Learning (ML), uses multilayered neural networks (MNNs) for learning data features and feature extraction [17]. The term “deep” represents the large number of layers between the input and output layers [21]. DL methods have been used in different fields of science for classification, segmentation, recognition, detection, identification, sentiment analysis, prediction, social media services, medical services, robotics, machine translation, and surveillance with promising results [19]. With regard to crack detection, DL has the following capabilities:

Classification: labelling an image patch as crack or non-crack;
Detection: localization of the crack by drawing a bounding box around the crack region;
Segmentation: dividing the image pixels into crack and non-crack pixels [16,21].

DL-based crack detection methods have less dependency on the quality of the input images, less sensitivity to noise, and a higher speed [16]. In DL, a CNN, introduced in 1980 [40], is the most established, innovative, and widely used neural network structure for image analysis [41].

For crack detection, three major deep learning architectures have been used for the localization of cracks (i.e., drawing a bounding box around the crack region): the RCNN family, SSD, and the YOLO family. RCNNs, Fast RCNNs, Faster RCNNs, and Mask RCNNs are members of the RCNN family [17,42]. An RCNN follows a two-step detection process. In the first step, a region proposal algorithm generates region proposals that may contain an object. Subsequently, CNN features are extracted from the region proposals, and the objects are classified using the extracted features to obtain the final bounding boxes [43]. Kim et al. [44] used an RCNN for the crack detection of concrete bridges based on UAV images. Concrete road crack detection was also carried out using the Faster RCNN method by Hacıefendioğlu and Başağa [45]. Attard et al. [46] utilized a Mask RCNN to localize the cracks on concrete surfaces. Although the Faster RCNN algorithm achieves a high accuracy in crack detection and is faster than other members of the RCNN family, it is still difficult to achieve real-time detection due to the large amount of time required to train the network through a two-step detection process [43].

The SSD and YOLO families are other types of CNN-based detection algorithms that perform all the computations in a single network instead of using an extra module to discover the candidate regions of objects in the image space. This process is called “one-stage detection” [47,48]. For crack detection, Yan and Zhang [49] used an SSD for asphalt highway pavement crack detection and highlighted the higher accuracy and speed of the one-stage detection method compared to two-stage object detection networks like the RCNN family.

2.1.1. YOLO for the Crack Detection of Civil Structures

Since YOLO was first presented as a new approach for object detection in 2015 [50], many researchers have investigated its application to the detection of cracks in civil structures such as concrete walls [51], pavements [52,53], and bridges [43]. In addition to being much faster than a Faster RCNN, YOLO has been reported to achieve a significantly higher accuracy in some cases [54,55,56,57,58]. Furthermore, some authors pointed out the superiority of YOLO over an SSD in terms of accuracy in civil crack localization [54,58,59]. By comparing the ability of 11 well-known CNN models for concrete crack detection, Teng et al. [60] asserted that the YOLO method can lead to outstanding detection results if an appropriate feature extractor model and layer, training epoch, and testing image size are selected.

2.1.2. YOLO v5

YOLO v5 is a version of the YOLO architecture series with a very high detection accuracy and a fast inference speed, reaching up to 140 frames per second [61]. Some recent studies have attempted to investigate the capabilities of YOLO v5 for the detection of cracks in civil structures such as concrete structures and pavements.

Regarding the detection of cracks in asphalt pavements, Liu et al. [62] developed a transfer learning method called the Deep Domain Adaptation-based Crack Detection Network (DDACDN) based on YOLO v5 and highlighted its detection accuracy in comparison to other methods. Li et al. [52] investigated the accuracy and speed of different versions of YOLO for the detection of concealed cracks in asphalt pavements using GPR data. They reported the highest accuracy for YOLO v5 and the highest speed for YOLO v4. In addition, they indicated that, among different series of YOLO v5, the small model called “YOLO v5s” has the highest speed. The higher speed of YOLO v5s for detecting cracks in pavements was also demonstrated in a study conducted by Hu et al. [63]. They compared different series of YOLO v5 and also asserted a higher accuracy for YOLO v5x.

In relation to defect assessment in concrete surfaces, Agyemang et al. [64] improved YOLO v5 using a non-local means (NLM) filter and feature refinement capabilities of a Convolutional Block Attention Module (CBAM) and highlighted the accuracy of the model for concrete crack detection. Zhao et al. [65] also detected the crack regions in images of concrete structures by YOLO v5 using bounding boxes and segmented the cracks by a Crack Feature Pyramid Network (FPN). Finally, an integrated approach for crack detection using YOLO v5 and crack quantification using IP led to a very high accuracy in a study conducted by Yu et al. [66].

2.2. Crack Edge Detection Using Image Processing

IP techniques detect cracks through the utilization of different filters, thresholds, morphological analysis, statistical methods, and percolation procedures [6]. The general structure of IP methods for crack detection includes image acquisition, image pre-processing, image processing, crack detection, and crack feature extraction [2]. Edge detection, thresholding and segmentation, region growing, and percolation-based techniques are different aspects of crack detection using IP [21].

For the skeleton extraction of cracks, IP-based edge detection can be used. Edge detectors are mathematical algorithms that are employed to identify the pixels in images at which the gray-level intensity varies dramatically. These pixels can be grouped into areas called edges. Since crack locations in concrete surfaces are highly associated with pixels having a discontinued gray-level intensity, edge detection methods can be significantly appropriate for identifying cracks in such surfaces [13].

IP techniques have been investigated in various studies to assess their capabilities for the recognition of the edges of cracks in civil structures, like pavements [67] and concrete surfaces [13], and have obtained promising results. These edge detection algorithms include the Canny [68], Sobel [69], Prewitt [13], Roberts [70], Fourier Transform [11], Wavelet Transform [71], Fast Haar Transform [11], Gaussian high-pass filter [70], bidimensional empirical mode decomposition (BEMD) [72], and Laplace edge detection [73] methods.

Many authors emphasize that crack detection using IP is a challenging process since cracks possess poor continuity and low contrast among neighboring pixels [19]. However, if the cracks are detected and the bounding boxes are drawn around them using DL technology, IP techniques are able to identify the skeleton of the cracks inside the bounding boxes with a high accuracy and a high speed [13,66].

2.3. AR and MR

During the last decades, state-of-the-art technologies such as AR and MR have promoted digitization and automation in the simulation and visualization of information [74]. AR superimposes virtual objects into the real world. In an AR system, the physical world is supplemented with virtual elements as computer-generated content in the same space [75,76,77]. MR, however, takes the process a step further and embeds the virtual content into the real world in such a way that users can interact with it [75]. Figure 1 illustrates the difference between AR and MR. In AR, digital information overlays the real world, enhancing the environment without enabling direct interaction between real and virtual elements. For instance, Pokémon GO exemplifies AR by layering digital objects on physical surroundings but without deeper interaction. MR, on the other hand, integrates real and virtual elements to create an interactive experience. This allows users to engage directly with both digital and physical elements, manipulating them in real time through advanced sensing and imaging technologies. MR thus enables a more immersive experience, where users can simultaneously be present in both real and virtual environments, as shown in the figure. This capability opens up new possibilities in areas like gaming, professional work, and, as in our study, defect inspection by merging human expertise with digital automation in real-world settings [78].

Milgram and Kishino [79] were the first to introduce and define MR as a particular subset of immersive technologies while incorporating the virtual world within real environments in a “reality–virtuality continuum” context. Since then, different research and varied relevant technologies have been developed, but there is no universally agreed definition for MR [80]. Microsoft HoloLens is the most widely used device for the development of MR-based applications. It is the first self-contained holographic computer device that provides users with the experience of astonishing holographic representations, engaging with high-fidelity holographic three-dimensional (3D) models, and interacting with a hybrid reality [74,81].

2.3.1. AR and MR in Civil Engineering

In the building industry and civil engineering, some researchers have attempted to investigate the capabilities of AR and MR for design, engineering, construction, and facility management. Kamat and El-Tawil [82] studied the practical application of AR for assessing earthquake-induced building damage. In the same year, the application of the Global Positioning System (GPS) and Three Degrees of Freedom (3DoF) angular tracking for addressing the registration problem during the interactive visualization of construction graphics in outdoor AR environments was investigated [83]. Bae et al. [84] presented a context-aware, vision-based mobile augmented reality system to enable field teams to query and access 3D information on-site. Later, a conceptual design was developed to improve structural inspections using the HoloLens [85]. Baek et al. [86] developed an AR-based system for facility management based on an image-based indoor localization method. The presented system can estimate the indoor position and orientation of the users by comparing their perspective to Building Information Modelling (BIM) using DL. Napolitano et al. [29] presented a framework aiming at the documentation and visualization of the data inside and outside of the built environment based on the fusion of image-based documentation and AR.

2.3.2. AR and MR for Defect Detection in Civil Structures

Specifically for defect detection and maintenance operations using AR and MR, scholars have recently developed some frameworks and methodologies. Table 1 provides the aims, tools, and models in these studies. The point that should be highlighted here is that, despite emphasizing the importance and capabilities of MR-related devices, especially Microsoft HoloLens, the practical use of these technologies, while benefiting from advanced crack detection algorithms in DL and IP, is limited. The lack of a practical implementation and evaluation of the proposed models, the lack of providing information on the accuracy and speed of the developed models, a low accuracy, and the lack of the application of advanced crack detection methods based on IP and DL are among the different limitations of these relevant studies. Accordingly, further studies are required in the field to develop advanced crack detection methods using AR and MR capabilities to incorporate human expertise into the automated detection process with a high accuracy and speed.

3. An Integrated Method for Crack Detection and Skeleton Extraction

As previously mentioned, implementing a fully end-to-end DL approach for both crack detection and edge extraction can be computationally intensive and may reduce processing speed, especially on devices with limited hardware capacity, such as the Microsoft HoloLens 2. IP-based crack detection algorithms have been found to have lower accuracies in comparison to DL techniques. However, if the cracks are detected by advanced DL methods like YOLO, the crack edge detection methods in IP could be employed to extract the skeleton of the detected cracks by the DL techniques with an acceptable accuracy and a high speed [13]. Hence, developing an integrated DL and IP-based approach for crack detection and quantification can lead to promising results [66]. Accordingly, to achieve a balance between speed and accuracy for real-time inspection, this study attempts to develop an integrated methodology for crack detection and skeleton extraction (Figure 2) and subsequently implement the developed model in a Microsoft HoloLens.

In order to achieve this aim, different series of YOLO v5 are first adopted for real-time crack detection. YOLO v5 is selected as the primary DL model for crack detection due to its well-established balance of accuracy, processing speed, and computational efficiency [90], which are critical for real-time applications on resource-constrained MR devices like the Microsoft HoloLens 2. While newer versions of YOLO have been recently introduced, their capabilities in crack detection tasks have not yet been thoroughly investigated. YOLO v5’s extensive validation in recent crack detection studies [91,92,93,94] further supports its reliability and suitability for our application. By using YOLO v5, we achieve an effective trade-off between detection accuracy and real-time processing requirements, ensuring a high performance and responsiveness in the MR environment.

Subsequently, after crack detection, the performance of three edge detection algorithms, namely Canny, Sobel, and Prewitt, is compared for the skeleton extraction of the detected cracks to choose the most accurate edge detector. That is, after YOLO v5 detects the cracks and draws bounding boxes around them, the selected edge detector is used to extract the skeleton of the cracks inside the bounding boxes. YOLO v5 and these edge detection methods are further described in the following sub-sections.
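The two-stage idea described above (detect first, then extract edges only inside the detected regions) can be illustrated with a minimal sketch. It is not the implementation used in this study: the function names, the box format `(x_min, y_min, x_max, y_max)`, and the `toy_detector` stand-in for Canny, Sobel, or Prewitt are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]          # grayscale image as rows of 0-255 values
Box = Tuple[int, int, int, int]  # assumed format: (x_min, y_min, x_max, y_max), max exclusive


def edges_inside_boxes(
    image: Image,
    boxes: List[Box],
    edge_detector: Callable[[Image], Image],
) -> Image:
    """Run edge_detector only on the detected regions and paste the
    results into an otherwise empty, full-size edge map."""
    height, width = len(image), len(image[0])
    edge_map = [[0] * width for _ in range(height)]
    for x0, y0, x1, y1 in boxes:
        # Clip the box to the image so slicing never leaves the image.
        x0, y0 = max(0, x0), max(0, y0)
        x1, y1 = min(width, x1), min(height, y1)
        if x0 >= x1 or y0 >= y1:
            continue  # empty box after clipping
        crop = [row[x0:x1] for row in image[y0:y1]]
        crop_edges = edge_detector(crop)
        for dy, row in enumerate(crop_edges):
            for dx, value in enumerate(row):
                # Overlapping boxes keep the strongest response.
                current = edge_map[y0 + dy][x0 + dx]
                edge_map[y0 + dy][x0 + dx] = max(current, value)
    return edge_map


def toy_detector(crop: Image) -> Image:
    """Hypothetical stand-in for a real edge detector: marks dark pixels."""
    return [[255 if v < 50 else 0 for v in row] for row in crop]
```

Restricting the edge detector to the boxes means that pixels outside the detections never contribute spurious edges, and the per-frame cost scales with the box area instead of the full image area.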

3.1. YOLO v5 Algorithm Network

For the detection of objects and drawing a bounding box around them, the YOLO v5 utilizes four main stages in its architecture, including the input, backbone, neck, and head [65] (Figure 3). These stages will be further discussed below:

In the input part, mosaic data augmentation, adaptive anchor box calculation, and adaptive image scaling are applied to enrich the dataset, calculate the optimal anchor boxes, and avoid creating a lot of redundant information, respectively. These steps improve training efficiency [63,95].
The backbone module focuses on the feature extraction of the input image data. It is based on the Focus structure, Cross Stage Partial Network (CSPNet), and Spatial Pyramid Pooling (SPP). The Focus structure is mainly used for image slicing. The CSPNet is used to extract rich feature information from the input image. Finally, the SPP converts feature maps of all sizes into feature vectors with fixed sizes, thereby enlarging the receptive field of the network and capturing features of various sizes [61,95].
The neck module of the YOLO v5 is responsible for performing the multi-scale feature fusion. It is established using FPN and Path Aggregation Network (PAN) structures that are mainly used to generate a feature pyramid. The FPN is a top–down structure that is used to fuse the feature information with the backbone network part to realize the communication of semantic features, whereas the PAN is a bottom–up structure that is used to realize the communication of strong localization feature information [63,66,96].
The head module of the YOLO v5 is the final detection stage. Anchor boxes are used in this part to build final output vectors with class probabilities, objectness scores, and bounding boxes. Based on the Intersection Over Union (IOU) (Equation (1)), the generalized Intersection Over Union (GIOU) (Equation (2)) is employed as the loss function of the bounding box during the prediction step [95,96].

$IOU = \frac{|A \cap B|}{|A \cup B|}$

(1)

where $A$ is the target box and $B$ is the prediction box (Figure 4).

$GIOU = IOU - \frac{|C \setminus (A \cup B)|}{|C|}$

(2)

where $C$ is the smallest box that encloses both $A$ and $B$.
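As a worked illustration of Equations (1) and (2), the following sketch computes both quantities for axis-aligned boxes. The box format `(x_min, y_min, x_max, y_max)` and the function names are assumptions made here; the GIOU term subtracts from the IOU the fraction of the enclosing box $C$ that is not covered by the union of $A$ and $B$.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x_min, y_min, x_max, y_max)


def area(box: Box) -> float:
    """Area of a box; degenerate (inverted) boxes have zero area."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou_and_giou(a: Box, b: Box) -> Tuple[float, float]:
    """Return (IOU, GIOU) for target box a and prediction box b."""
    # Intersection box: overlap of the two boxes (may be empty).
    inter = area((max(a[0], b[0]), max(a[1], b[1]),
                  min(a[2], b[2]), min(a[3], b[3])))
    union = area(a) + area(b) - inter
    if union == 0:
        raise ValueError("both boxes have zero area")
    # C: smallest axis-aligned box enclosing both a and b.
    c = area((min(a[0], b[0]), min(a[1], b[1]),
              max(a[2], b[2]), max(a[3], b[3])))
    iou = inter / union
    giou = iou - (c - union) / c
    return iou, giou


# Two 2 x 2 boxes offset by one unit: intersection 1, union 7, enclosing box 9.
iou, giou = iou_and_giou((0, 0, 2, 2), (1, 1, 3, 3))
```

For the example boxes, the IOU is 1/7 and the GIOU is 1/7 − 2/9. Unlike the IOU, the GIOU stays informative for non-overlapping boxes, where it becomes negative and decreases as the boxes move apart.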

YOLO v5 is divided into different basic network structures based on network depth and width, including YOLO v5n (nano), YOLO v5s (small), YOLO v5m (medium), YOLO v5l (large), and YOLO v5x (extra-large). This categorization is based on the width and depth of the BottleneckCSP module, while the backbone, neck, and head are kept the same [61]. Out of these network structures, Zhao et al. [97] introduced YOLO v5s as the best architecture for real-time defect detection, considering that it met the real-time requirements of the production line. In this study, the possible application of different types of YOLO v5 for crack detection using MR is analyzed in terms of speed and accuracy to employ the best and most compatible model with the Microsoft HoloLens 2.

3.2. Image Processing-Based Crack Edge Detection

This study assesses the application of the Canny, Sobel, and Prewitt edge detection methods to select the best detector to be integrated with the YOLO v5 crack detection model. Short descriptions of these edge detection algorithms are presented in this section.

3.2.1. Sobel

The Sobel edge detector [98] recognizes edges by using two 3 × 3 convolution kernels, $G_{x}$ (Equation (3)) and $G_{y}$ (Equation (4)), to compute the derivatives in the x and y directions, respectively.

$G_{x} = \begin{bmatrix} 1 & 0 & -1 \\ 2 & 0 & -2 \\ 1 & 0 & -1 \end{bmatrix}$

(3)

$G_{y} = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix}$

(4)

The final outputs of the gradient approximations are combined to provide the gradient magnitude using Equation (5).

$G_{s} = \sqrt{(G_{x} \cdot N_{IS})^{2} + (G_{y} \cdot N_{IS})^{2}}$

(5)

where $N_{IS}$ denotes an image neighborhood with a size of 3 × 3 pixels.

Furthermore, a threshold value TS controls which edges are displayed in the output image. Pixels whose Sobel gradient value is less than TS are assigned the threshold value [13].
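A minimal sketch of Equations (3) to (5) and the thresholding rule is given below. It assumes that the product of a kernel and a 3 × 3 neighborhood means the sum of element-wise products, and that border pixels, which lack a full neighborhood, are simply set to the threshold; both choices are assumptions for illustration, as is the function name `sobel_magnitude`.

```python
import math
from typing import List, Sequence

G_X = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]   # Equation (3)
G_Y = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]   # Equation (4)


def apply_kernel(neigh: Sequence[Sequence[float]],
                 kernel: Sequence[Sequence[float]]) -> float:
    """Sum of element-wise products of a 3 x 3 kernel and neighborhood."""
    return sum(kernel[i][j] * neigh[i][j] for i in range(3) for j in range(3))


def sobel_magnitude(image: Sequence[Sequence[float]],
                    threshold: float = 0.0) -> List[List[float]]:
    """Gradient magnitude per Equation (5); values below the threshold
    are replaced by the threshold, as described in the text."""
    h, w = len(image), len(image[0])
    # Assumption: border pixels have no full neighborhood, so they start
    # (and stay) at the threshold value.
    out = [[float(threshold)] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            neigh = [row[x - 1:x + 2] for row in image[y - 1:y + 2]]
            gx = apply_kernel(neigh, G_X)
            gy = apply_kernel(neigh, G_Y)
            out[y][x] = max(math.hypot(gx, gy), float(threshold))
    return out


# A left-to-right intensity step: only the G_X response is non-zero.
step = [[0, 0, 10, 10]] * 3
magnitude = sobel_magnitude(step, threshold=5.0)
```

For the step image, both interior pixels obtain a magnitude of 40, while the border pixels hold the threshold value of 5.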

3.2.2. Canny

The Canny edge detector [99] employs a multi-step algorithm to detect a wide range of edges while suppressing noise in the images. In the first step of the edge detection, a Gaussian filter (Equation (6)) is applied to smooth the image and remove noise and redundant details or textures.

$g(m, n) = G_{\sigma}(m, n) \times f(m, n)$

(6)

where $G_{\sigma}$ is a Gaussian function with a variance of $\sigma^{2}$, as given in Equation (7), $m$ and $n$ are the indices used to specify the location of a pixel within an image, and $f(m, n)$ represents an image neighborhood at the pixel coordinate $(m, n)$.

$G_{\sigma} = \frac{1}{\sqrt{2 \pi \sigma^{2}}} \exp\left(-\frac{m^{2} + n^{2}}{2 \sigma^{2}}\right)$

(7)

In the next step, the gradient of $g(m, n)$ is computed using a certain gradient operator (e.g., Sobel), as in Equation (8).

$g_{m, n}(m, n) = \sqrt{g_{m}^{2}(m, n) + g_{n}^{2}(m, n)}$

(8)

Subsequently, non-maximum suppression is performed to convert the blurred edges in the image of the gradient magnitudes to sharp ones. After the suppression step, it is still possible that some pixels caused by noise or color variations have been incorrectly marked as edges. Hence, a double thresholding strategy is applied. Edge pixels that are stronger than the upper threshold T2 are marked as strong edges, while edge pixels that are weaker than the lower threshold T1 are suppressed. Edge pixels with values between T1 and T2 are identified as weak edges. Finally, the Canny algorithm suppresses all weak edge pixels that are not connected to strong edge pixels [100].
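The double thresholding and edge tracking steps can be sketched as follows. The sketch covers only this final stage (Gaussian smoothing, gradient computation, and non-maximum suppression are assumed to have been applied already), and it assumes 8-connectivity and that weak pixels linked to a strong pixel through other weak pixels are also kept; these are common choices but are assumptions here, as is the function name `hysteresis`.

```python
from collections import deque
from typing import List, Sequence

STRONG, WEAK, NONE = 2, 1, 0


def hysteresis(magnitude: Sequence[Sequence[float]],
               t1: float, t2: float) -> List[List[int]]:
    """Binary edge map: strong pixels (> t2) plus weak pixels
    (between t1 and t2) that are connected to a strong pixel."""
    h, w = len(magnitude), len(magnitude[0])
    labels = [[STRONG if v > t2 else WEAK if v >= t1 else NONE for v in row]
              for row in magnitude]
    edges = [[1 if lab == STRONG else 0 for lab in row] for row in labels]
    # Breadth-first search that grows outward from every strong pixel.
    queue = deque((y, x) for y in range(h) for x in range(w)
                  if labels[y][x] == STRONG)
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and labels[ny][nx] == WEAK and not edges[ny][nx]):
                    edges[ny][nx] = 1  # weak pixel linked to a strong one
                    queue.append((ny, nx))
    return edges


# One strong pixel, two weak pixels attached to it, and one isolated weak pixel.
result = hysteresis([[10, 5, 5, 0, 5]], t1=3, t2=8)
```

In the example row, the two weak pixels next to the strong pixel are kept, whereas the isolated weak pixel at the end is suppressed.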

3.2.3. Prewitt

The Prewitt algorithm was first developed by Prewitt [101]. It is based on two 3 × 3 filters, which facilitate the estimation of the derivatives of each location within an image: one for horizontal (Equation (9)) and one for vertical changes (Equation (10)).

$P_{x} = \begin{bmatrix} -1 & -1 & -1 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{bmatrix}$

(9)

$P_{y} = \begin{bmatrix} -1 & 0 & 1 \\ -1 & 0 & 1 \\ -1 & 0 & 1 \end{bmatrix}$

(10)

These two horizontal and vertical filters are combined to calculate the gradient magnitude using Equation (11).

$P_{p} = \sqrt{(P_{x} \cdot N_{IP})^{2} + (P_{y} \cdot N_{IP})^{2}}$

(11)

where $N_{IP}$ denotes the source image [13].
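As a small worked example of Equations (9) to (11), the sketch below evaluates the Prewitt response at a single 3 × 3 neighborhood. It assumes that applying a filter to the neighborhood means the sum of element-wise products; the function name `prewitt_at` and the sample intensities are illustrative only.

```python
import math
from typing import Sequence, Tuple

P_X = [[-1, -1, -1], [0, 0, 0], [1, 1, 1]]   # Equation (9)
P_Y = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]   # Equation (10)


def prewitt_at(neigh: Sequence[Sequence[float]]) -> Tuple[float, float, float]:
    """Return the two filter responses and the magnitude of Equation (11)
    for one 3 x 3 neighborhood."""
    px = sum(P_X[i][j] * neigh[i][j] for i in range(3) for j in range(3))
    py = sum(P_Y[i][j] * neigh[i][j] for i in range(3) for j in range(3))
    return px, py, math.hypot(px, py)


# Bright concrete (200) with a darker diagonal region (40) in the lower right.
px, py, magnitude = prewitt_at([[200, 200, 200],
                                [200, 200, 40],
                                [200, 40, 40]])
```

For this neighborhood, both filter responses are −320, which gives a magnitude of 320√2 (approximately 452.5). Compared with the Sobel kernels, the Prewitt filters weight all three rows or columns equally.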

4. Transformation of the Model to the HoloLens

The experiments were conducted using the Microsoft HoloLens 2, an advanced MR headset equipped with a Qualcomm Snapdragon 850 Compute Platform and a custom Holographic Processing Unit. It features 4 GB of DRAM and 64 GB of UFS 2.1 storage, designed to meet the extensive computational demands of spatial computing and interactive applications. The visual output is facilitated by a high-resolution 2k 3:2 light engine, which delivers a resolution of 1440 × 936 pixels per eye. The field of view provided by the device is 43 degrees horizontally, enhancing the immersive experience necessary for detailed visual analysis. The HoloLens 2 also includes an RGB camera with a resolution of 2272 × 1278, utilized for capturing the video essential for concrete crack detection. This setup is critical for the accurate operation of deep learning models and image processing algorithms in our research.

For the implementation of the integrated crack detection and skeleton extraction model in the Microsoft HoloLens 2, first, the YOLO v5 detection models were trained using the PyTorch 1.11.0 framework [102]. Subsequently, the trained PyTorch models were converted into the Open Neural Network Exchange (ONNX) format, which is a lightweight model format compatible with the HoloLens and the Unity real-time development platform [103]. To test the YOLO v5 models, an application was implemented in C# 9.0 with .NET 5.0, including the post-processing steps of parsing the detection results and applying non-maximum suppression.
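The non-maximum suppression post-processing step can be illustrated with a short sketch. The application in this study was written in C#; the Python version below is only an illustration of the greedy technique, and the function names, the box format (x1, y1, x2, y2, score), and the IOU threshold of 0.45 are assumptions rather than values reported here.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2, ...)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def non_max_suppression(detections, iou_threshold=0.45):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it."""
    remaining = sorted(detections, key=lambda d: d[4], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        remaining = [d for d in remaining if iou(best, d) <= iou_threshold]
    return kept

# Two overlapping candidate boxes around one crack and one separate box.
detections = [
    (0, 0, 10, 10, 0.9),
    (1, 1, 11, 11, 0.8),
    (50, 50, 60, 60, 0.7),
]
final_boxes = non_max_suppression(detections)
```

In this example the second box overlaps the first with an IOU of about 0.68 and is discarded, while the distant third box is kept.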

Since the nature, length, thickness, and other characteristics of cracks can differ, the MR-based system was developed in such a way that, if necessary, a combination of different real-time crack detection and skeleton extraction models could be used. To be more specific, the accuracy and speed of different types of YOLO v5 methods for crack detection and of the considered IP-based edge detection algorithms were assessed. The real-time system was subsequently developed so that users can switch between the models in the HoloLens and compare the results on the fly. The MR application was developed in Unity, which has capabilities for integrating various applications in HoloLens.

5. Experimental Analysis, Results, and Discussion

This section analyzes and evaluates the performance of the proposed MR-based crack edge detection and skeleton extraction algorithm.

5.1. Dataset Preparation

Two datasets were used in this study: one dataset for the training, validation, and testing of the YOLO v5 crack detection algorithm and one dataset for the evaluation of the edge detection methods. These two datasets are further discussed in the following sub-sections.

5.1.1. Dataset for Crack Detection

For the acquisition of the images required for crack detection using DL, the initial image dataset, including 410 images at a resolution of 256 × 256 pixels, was obtained from Bianchi and Hebdon [104] and Maguire et al. [105]. The input images were then manually annotated. A larger set of input images can improve the training speed and accuracy of the model [54]. Hence, in order to expand the dataset, several augmentation methods recommended by different authors to enhance model performance were automatically applied to the initial images, including the following:

Flip: horizontal and vertical [106];
Grayscale [107];
Brightness [108]: between −25% and +25%;
Exposure [109]: between −15% and +15%;
Blur [110]: up to 0.5 px.

These augmentation processes not only increased the number of input images to 996 but also provided images in the dataset with different lighting conditions. The latter helped to maximize the training accuracy of the detection algorithm under various complex lighting conditions and to avoid over-fitting [111].
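Some of the augmentations listed above can be sketched on a small grayscale image stored as nested lists. This is only an illustration under stated assumptions: the function names are hypothetical, the brightness change is interpreted as scaling intensities by a factor between 0.75 and 1.25, and the exposure and blur steps are omitted. It does not reproduce the tool or settings actually used to generate the 996 images.

```python
import random

def flip_horizontal(image):
    """Mirror each row (left-right flip)."""
    return [row[::-1] for row in image]

def flip_vertical(image):
    """Reverse the row order (up-down flip)."""
    return [row[:] for row in image[::-1]]

def adjust_brightness(image, change):
    """Scale intensities by (1 + change), clamped to the 8-bit range."""
    factor = 1.0 + change
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]

def random_augment(image, rng):
    """Randomly flip the image, then change brightness within +/-25%."""
    if rng.random() < 0.5:
        image = flip_horizontal(image)
    if rng.random() < 0.5:
        image = flip_vertical(image)
    return adjust_brightness(image, rng.uniform(-0.25, 0.25))

image = [[10, 200], [100, 240]]
augmented = random_augment(image, random.Random(0))
```

Passing a seeded `random.Random` instance keeps the augmentation reproducible between runs.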

After the finalization of the image dataset for crack detection using YOLO v5, 78% of the images in the dataset were randomly selected for the training of the model, 8% were selected for the validation of the model, and the remaining 14% were used for testing. Figure 5 shows some of the images used for crack detection using YOLO v5 and their annotated bounding boxes under different conditions.
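A random split with these proportions could be implemented as in the sketch below. The function name, the fixed seed, and the rounding of the subset sizes are assumptions; the exact number of images per subset is not reported in the text.

```python
import random

def split_dataset(items, train_frac=0.78, val_frac=0.08, seed=0):
    """Shuffle the items and split them into train, validation, and test sets.

    The test set receives whatever remains after the first two subsets.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

# Image identifiers standing in for the 996 augmented images.
train, val, test = split_dataset(range(996))
sizes = (len(train), len(val), len(test))
```

With this rounding rule, 996 images yield subsets of 777, 80, and 139 images.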

5.1.2. Dataset for Crack Skeleton Extraction

For the evaluation of the edge detection algorithms, another dataset was utilized, including 444 images, which were downloaded from [112]. The dataset includes 4032 × 3024 high-resolution images together with their alpha maps indicating the crack presence. Figure 6 shows a sample of the used images with their relevant ground truths.

5.2. Assessment Metrics

An accuracy assessment of both the crack detection (bounding box identification) and crack skeleton extraction procedures can be carried out using the classical evaluation metrics of precision (P) (Equation (12)), recall (R) (Equation (13)), and F1 score (Equation (14)).

P = \frac{TP}{TP + FP}

(12)

R = \frac{TP}{TP + FN}

(13)

F1 = \frac{2 \times P \times R}{P + R}

(14)

where TP (true positive) is the number of the target boxes that are correctly predicted as defects, FP (false positive) is the number of the boxes that are incorrectly predicted as defects, and FN (false negative) is the number of the target boxes that are incorrectly predicted as non-defects [54,63]. Typically, an IOU ≥ 0.5 is referred to as a positive detection (TP), whereas an IOU < 0.5 means a false detection (FP) [16].

P is the proportion of images that were correctly labelled among all the images labelled positive (including TP and FP) by the classifier, while R is the proportion of all the positive images in the dataset that were correctly labelled as positive. In addition, the F1 score expresses the performance of the detection algorithm by calculating the harmonic mean of the P and R rates, with the best value equal to 1 and the worst value equal to 0 [113].
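For the skeleton extraction case, Equations (12) to (14) can be evaluated by comparing a predicted binary edge map with its ground truth. The sketch below assumes a pixel-wise comparison, which is an assumption made for illustration because the matching procedure is not detailed in the text; the function names and the sample maps are hypothetical.

```python
def confusion_counts(predicted, truth):
    """Count TP, FP, and FN pixel-wise for two binary maps of equal size."""
    tp = fp = fn = 0
    for pred_row, true_row in zip(predicted, truth):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    return tp, fp, fn

def precision_recall_f1(tp, fp, fn):
    """Equations (12)-(14); returns 0.0 where a denominator would be zero."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

predicted = [[1, 1, 0, 0], [0, 1, 1, 0]]
truth = [[1, 1, 0, 0], [0, 1, 0, 1]]
tp, fp, fn = confusion_counts(predicted, truth)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
```

Here three edge pixels are matched, one is spurious, and one is missed, so P, R, and F1 all equal 0.75.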

Average precision (AP) summarizes the precision–recall (PR) curve as the area enclosed by the curve and the coordinate axes, where the PR curve plots P values on the y-axis against R values on the x-axis. It is calculated as the weighted mean of the precisions achieved at each threshold, with the increase in R from the previous threshold used as the weight. The mean AP (mAP) is subsequently computed as the mean of the AP over all damage modes [54]. The mAP 0.5 is the mAP calculated when a detection with an IOU ≥ 0.5 is counted as a TP. Furthermore, the mAP 0.5:0.95 corresponds to the average AP for IOU thresholds from 0.5 to 0.95 with a step size of 0.05. Both the mAP 0.5 and mAP 0.5:0.95 are considered for the evaluation of the different types of YOLO v5 crack detection methods.
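The weighted-mean form of AP described above, and the averaging over IOU thresholds, can be sketched as follows. This is a simplified illustration and not the evaluation code of YOLO v5, which may interpolate the PR curve differently; the function names and the sample values are hypothetical.

```python
def average_precision(precisions, recalls):
    """AP as the sum of precision times the increase in recall.

    Both lists are ordered by increasing recall, one entry per threshold.
    """
    ap, previous_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - previous_recall) * p
        previous_recall = r
    return ap

def iou_thresholds():
    """IOU thresholds from 0.5 to 0.95 with a step size of 0.05."""
    return [round(0.5 + 0.05 * i, 2) for i in range(10)]

def mean_over_thresholds(ap_per_threshold):
    """Average the AP values obtained at each IOU threshold."""
    return sum(ap_per_threshold) / len(ap_per_threshold)

# Illustrative PR points only, not results from this study.
ap = average_precision(precisions=[1.0, 0.5], recalls=[0.5, 1.0])
thresholds = iou_thresholds()
```

With a single class such as cracks, the mAP at one threshold equals the AP of that class, and the 0.5 to 0.95 metric averages ten such values.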

5.3. Model Implementation and Testing

5.3.1. Crack Detection

Considering that the aim of this study was real-time crack detection, and given the limitations of implementing large networks in the Microsoft HoloLens 2, the three smaller and faster types of YOLO v5, namely YOLO v5n, YOLO v5s, and YOLO v5m, were selected to be run.

In the first step, the models were pre-trained, and it was found that their loss functions stopped decreasing significantly after about 180 epochs, where the number of epochs is the number of times that the YOLO v5 models explore the whole training dataset. Beyond this point, the models stop learning from the data. Since too many training epochs can lead to over-fitting issues, the models were trained for 150 epochs. Figure 7, Figure 8 and Figure 9 present the changes in the mAP 0.5 and mAP 0.5:0.95 during the training process as the number of epochs increases for the three structures of nano, small, and middle, respectively.

After the training and validation of the models, they were tested, and the results are provided in Table 2. Accordingly, YOLO v5n was associated with the highest mAP 0.5 and the highest speed. However, in terms of the mAP 0.5:0.95, YOLO v5m showed the highest accuracy. In other words, YOLO v5n was the fastest model and the best model in terms of the ability to find cracks, while YOLO v5m was the most accurate model under the stricter IOU thresholds. Hence, it was decided to implement both of these models and give the user the possibility to try both models in the field. Figure 10 provides an example of the bounding boxes identified around the detected cracks in different images by the YOLO v5n (orange), YOLO v5s (blue), and YOLO v5m (green) models.

Our system has been optimized to detect cracks with widths as narrow as 0.06 mm within the Australian standard range, which classifies fine cracks in concrete structures up to 1 mm in width [114]. Additionally, the system operates effectively within the recommended viewing range for the Microsoft HoloLens 2, specifically between 1.25 and 5 m, with an optimal focal distance of approximately 2 m [114]. This range adheres to the HoloLens 2 specifications, maintaining visual clarity and user comfort while enabling accurate and responsive crack detection, as shown in Table 2 and Figure 10.

5.3.2. Crack Edge Extraction

All three considered edge detectors were applied to the prepared dataset, and their P, R, and F1 scores were computed (Table 3). The Canny algorithm achieved the highest F1 score. According to Figure 11, which provides box plots of the P, R, and F1 scores of the three edge detectors, the Inter-Quartile Range (IQR) of the Canny algorithm was shorter for both the F1 score and R. This indicates that the dispersion of the F1 score and R of the Canny algorithm was smaller than that of the other two algorithms, which demonstrates a more stable performance. Although the Prewitt algorithm had a better result in the precision box plot, its R and, accordingly, its F1 score were very low. The Canny algorithm’s P was also close to that of Prewitt. Hence, the Canny algorithm was selected to be implemented in the HoloLens 2 for the extraction of the skeleton of the cracks identified by the selected detection methods of YOLO v5n and YOLO v5m.
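The IQR used in such a box plot comparison can be computed from per-image scores as sketched below. The scores are invented for illustration and are not results from this study; the function name and the choice of the inclusive quartile method are assumptions.

```python
import statistics

def interquartile_range(values):
    """IQR as the distance between the third and first quartiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

# Hypothetical per-image F1 scores for two detectors (made-up numbers).
stable_scores = [0.60, 0.62, 0.61, 0.63, 0.59, 0.61]
variable_scores = [0.20, 0.75, 0.40, 0.90, 0.10, 0.55]

stable_iqr = interquartile_range(stable_scores)
variable_iqr = interquartile_range(variable_scores)
```

A smaller IQR means that the middle half of the scores lies in a narrower band, which is the sense in which a detector is described as more stable.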

Figure 12 also provides an example of the outputs of the Sobel, Canny, and Prewitt algorithms for the extraction of the skeleton of the cracks in a whole image and inside the bounding box produced by the applied DL-based crack detection. This figure illustrates the efficiency of Canny in comparison to the other two methods, whose detections have a high level of noise. In contrast, the Canny filter leads to less noise, especially inside the bounding box specified around the crack by YOLO v5. Accordingly, the Canny algorithm was selected as the edge detection technique for the extraction of the skeletons of the detected cracks. This indicates the reliability of the integrated method developed in this paper. Indeed, by crack detection using YOLO v5, the cracks are detected with a high accuracy and bounding boxes are drawn around them. Subsequently, the skeletons of the cracks are extracted using the Canny algorithm inside the bounding boxes only, which avoids noise, the main error associated with IP-based edge detectors.
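The idea of running the edge detector only inside a detected bounding box can be sketched as follows. The detector below is a deliberately simple placeholder and not the Canny algorithm; the function names, the box format (x1, y1, x2, y2 with exclusive upper bounds), and the sample image are assumptions made for illustration.

```python
def simple_edge_detector(patch, threshold=50):
    """Placeholder detector (NOT Canny): marks a pixel when the intensity
    jump to its right-hand neighbour exceeds the threshold."""
    return [[1 if c + 1 < len(row) and abs(row[c + 1] - row[c]) > threshold else 0
             for c in range(len(row))]
            for row in patch]

def edges_inside_box(image, box, detector=simple_edge_detector):
    """Run the detector on the cropped box and paste the result into an
    otherwise empty edge map of the full image size."""
    x1, y1, x2, y2 = box
    patch = [row[x1:x2] for row in image[y1:y2]]
    patch_edges = detector(patch)
    full = [[0] * len(image[0]) for _ in image]
    for r, edge_row in enumerate(patch_edges):
        full[y1 + r][x1:x2] = edge_row
    return full

# Columns 0-1 contain a noise-like jump; columns 3-4 contain the "crack".
image = [[200, 0, 200, 200, 0, 200]] * 3
whole_image_edges = simple_edge_detector(image)
boxed_edges = edges_inside_box(image, box=(2, 0, 6, 3))
```

Applied to the whole image, the placeholder detector also fires on the jump in the first columns, whereas restricting it to the box keeps only the responses around the crack.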

5.3.3. Integrated Model in the HoloLens

The MR application was developed in Unity by integrating the YOLO v5n and YOLO v5m crack detection methods and the Canny edge detector, with the capability for users to switch between the crack detection models and compare the detection results of the nano and middle models in the field. Accordingly, by wearing the HoloLens 2, the inspector is able to observe the real world through the headset while the model deployed on the device simultaneously scans the concrete surfaces and detects possible cracks and their edges. Therefore, this system can detect and analyze cracks in an MR environment in real time and provide insight to users for further assessments.

Figure 13 provides some examples of testing the implemented model in the HoloLens 2. In this figure, some cracks in concrete walls and the corresponding detection of those cracks using the HoloLens are presented. For this aim, the real-time experiment of defect assessment in concrete surfaces using the developed MR system was carried out in a building. The videos were recorded through the HoloLens and the corresponding photos were taken with a mobile phone for comparison and visual evaluation. Figure 14 also provides an example of the detection of a crack using both the YOLO v5n and YOLO v5m methods and edge detection using the Canny algorithm in the HoloLens 2.

Despite the acceptable accuracy of the system in detecting cracks in concrete surfaces during practical testing, some limitations were also observed. First, the accuracy of the Canny algorithm varies when the user’s distance and angle to the surface change. Indeed, the best edge detection occurs when the individual looks directly at the crack at a distance of about one meter. In addition, factors such as noise, a lack of sufficient light, and the width of the crack can decrease the accuracy of the Canny algorithm in extracting the edges of the cracks. In the photo on the right in Figure 13, it is observed how shadow and a lack of light can negatively affect the crack edge detection process.

Device movement and stability are critical factors in the performance of MR applications. Our results, detailed in Table 2, show that the YOLO v5n with Canny achieved 4.5 fps, resulting in a new inference approximately every 222 ms, while the YOLO v5s with Canny achieved 3 fps, with a new inference approximately every 333 ms. The YOLO v5m with Canny, at a lower rate of 1 fps, presented a new inference every 1000 ms. Despite the variations in frame rates, the high fps configurations (v5n and v5s) meant that typical device movements, such as head shaking, did not significantly affect the detection results. However, in the v5s configuration, rapid movements of the head and device led to some degradation in detection accuracy, indicating that more intense or quicker movements can still pose challenges to detection stability in the HoloLens 2.
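The inference intervals quoted above follow directly from the frame rates, as the short calculation below shows. The function name is hypothetical; the frame rates are those stated in the text.

```python
def inference_interval_ms(fps):
    """Time between two consecutive inferences, in milliseconds."""
    return 1000.0 / fps

# Frame rates reported in the text for the three configurations.
for fps in (4.5, 3.0, 1.0):
    print(f"{fps} fps -> {inference_interval_ms(fps):.0f} ms per inference")
```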

Generally, in the real experiment, both the YOLO v5 nano and middle methods showed a very high level of accuracy. However, the edge detection accuracy of the Canny algorithm was not as high as expected based on the results of the accuracy assessment using the P, R, and F1 score metrics. An alternative solution could be to use DL for both crack detection and segmentation. Nevertheless, the YOLO v5n model, as the smallest and fastest type of YOLO v5, requires 0.23 s to detect cracks in one frame. We estimate that an AI-based segmentation model with a similar performance would require an even longer time to identify the skeleton of the cracks. Therefore, an application integrating two AI-based models, one for detection and one for segmentation, would need at least about 0.46 s per frame and thus could not run faster than about 2 FPS on the HoloLens 2, given its Qualcomm Snapdragon 850 Compute Platform SoC [115]. Accordingly, such an application on the HoloLens 2 cannot guarantee an acceptable user experience. That is the main reason for using IP-based crack edge detection algorithms in addition to AI-based crack detection methods in this study. Improvements in the CPU and GPU capabilities of future versions of the HoloLens may provide a better opportunity to develop CNN-based models for both crack detection and segmentation in an MR system.

By using YOLO v5, we achieved an effective trade-off between detection accuracy and real-time processing requirements, ensuring a high performance and responsiveness in the MR environment. The promising performance of the developed system, as demonstrated in this study, highlights the suitability of YOLO v5 for real-time crack detection in resource-constrained MR applications. Specifically, the balance between accuracy and processing speed achieved in our results validates YOLO v5 as the optimal choice for delivering both reliable detection and a responsive user experience. These findings support the use of YOLO v5 in MR applications where computational efficiency is essential, indicating that this model can provide a solid foundation for future advancements in real-time structural inspection systems.

6. Conclusions

The automatic assessment of defects in civil structures using advancements in IP and DL technologies while benefiting from the experiences and expertise of inspectors is of major importance. Hence, this study developed an integrated crack detection and skeleton extraction method in the Microsoft HoloLens 2 MR device for real-time crack monitoring. The main novelty of this research is the development of both crack detection and skeleton extraction using the YOLO v5 method and Canny edge detector in an MR environment along with a comprehensive accuracy and speed assessment of different structures of YOLO v5, as well as IP-based edge detectors.

This study’s hybrid approach, combining YOLO v5 for crack detection with the Canny edge detection algorithm, has proven effective for real-time MR-based crack assessments on the Microsoft HoloLens 2. This approach successfully balances processing speed and detection accuracy, optimizing performance within the hardware constraints of current MR devices. Future improvements in the HoloLens hardware may open opportunities to implement fully DL-based detection and segmentation models, further enhancing the accuracy and automation of defect assessment. Continued development in MR technology, coupled with enhancements in model efficiency, will likely enable more sophisticated systems for structural monitoring and maintenance in the years to come.

There are also some other issues that should be addressed in future studies by improving the training process and the dataset used, including the impact of distance, angle, light, shadow, and crack width. Furthermore, newer versions of YOLO have recently been released. Hence, they could also be employed to evaluate possible increases in the accuracy and speed of crack detection. Finally, further research should incorporate the existing standards for crack classification, which could be embedded in the MR system. Indeed, after crack detection and skeleton extraction, the system could be advanced in such a way as to help inspectors during their defect assessment practices. The developed MR-based defect assessment model has various advantages for the application of automatic state-of-the-art methods for the management and maintenance of structures. Among them, the possibility of recording the location of the crack (position), name of the building, floor, and other descriptions for future maintenance activities, as well as future investigations, can be highlighted. To be more specific, first, the inspector can quickly scan a concrete surface to detect possible cracks in different parts of the surface. This speeds up evaluation and reduces human error. At the same time, with the crack skeleton extracted automatically, inspectors can use their experience and expertise, together with the available instructions, to determine feasible actions based on the crack conditions. Second, some cracks may need further assessment to determine possible future changes in their size and width. Recording the specifications and location of such cracks is complex for inspectors. However, the designed MR-based system can help them to automatically save the relevant properties and make future evaluations much faster and more cost-effective.

Author Contributions

Conceptualization, D.S. and Z.Z.; Methodology, D.S. and P.J.; Software, P.J. and Z.Z.; Validation, Z.Z.; Investigation, P.J. and Z.Z.; Supervision, D.S.; Funding acquisition, D.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the University of Melbourne.

Data Availability Statement

The data used in this study were derived from the following resources available in the public domain: https://data.lib.vt.edu/articles/dataset/Concrete_Crack_Conglomerate_Dataset/16625056/1, accessed on 12 November 2024, https://digitalcommons.usu.edu/all_datasets/48/, accessed on 12 November 2024, and https://data.mendeley.com/datasets/jwsn7tfbrp/1, accessed on 12 November 2024.

Acknowledgments

The authors acknowledge the support of the University of Melbourne for providing financial support.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Habbal, F.; Alnuaimi, A.; Shamsi, M.; Alshaibah, S.; Aldarmaki, T. Cracks Detection using Artificial Intelligence to Enhance Inspection Efficiency and Analyze the Critical Defects. In Proceedings of the International Symposium on Automation and Robotics in Construction, Kitakyushu, Japan, 27–28 October 2020.
Mohan, A.; Poobal, S. Crack detection using image processing: A critical review and analysis. Alex. Eng. J. 2018, 57, 787–798.
Hoang, N.-D. Detection of Surface Crack in Building Structures Using Image Processing Technique with an Improved Otsu Method for Image Thresholding. Adv. Civ. Eng. 2018, 2018, 3924120.
Stałowska, P.; Suchocki, C.; Rutkowska, M. Crack detection in building walls based on geometric and radiometric point cloud information. Autom. Constr. 2022, 134, 104065.
Dhital, D.; Lee, J.R. A Fully Non-Contact Ultrasonic Propagation Imaging System for Closed Surface Crack Evaluation. Exp. Mech. 2012, 52, 1111–1122.
Munawar, H.S.; Hammad, A.W.A.; Haddad, A.; Soares, C.A.; Waller, S.T. Image-Based Crack Detection Methods: A Review. Infrastructures 2021, 6, 115.
Yang, G.; Liu, K.; Zhang, J.; Zhao, B.; Zhao, Z.; Chen, X.; Chen, B.M. Datasets and processing methods for boosting visual inspection of civil infrastructure: A comprehensive review and algorithm comparison for crack classification, segmentation, and detection. Constr. Build. Mater. 2022, 356, 129226.
Hsieh, Y.-A.; Tsai, Y.J. Machine Learning for Crack Detection: Review and Model Performance Comparison. J. Comput. Civ. Eng. 2020, 34, 04020038.
Deng, J.; Singh, A.; Zhou, Y.; Lu, Y.; Lee, V.C.-S. Review on computer vision-based crack detection and quantification methodologies for civil structures. Constr. Build. Mater. 2022, 356, 129238.
Munawar, H.S.; Ullah, F.; Heravi, A.; Thaheem, M.J.; Maqsoom, A. Inspecting Buildings Using Drones and Computer Vision: A Machine Learning Approach to Detect Cracks and Damages. Drones 2022, 6, 5.
Abdel-Qader, I.; Abudayyeh, O.; Kelly, M.E. Analysis of Edge-Detection Techniques for Crack Identification in Bridges. J. Comput. Civ. Eng. 2003, 17, 255–263.
Agaian, S.; Almuntashri, A.; Papagiannakis, A.T. An improved canny edge detection application for asphalt concrete. In Proceedings of the 2009 IEEE International Conference on Systems, Man and Cybernetics, San Antonio, TX, USA, 11–14 October 2009; pp. 3683–3687.
Hoang, N.-D.; Nguyen, Q.-L. Metaheuristic Optimized Edge Detection for Recognition of Concrete Wall Cracks: A Comparative Study on the Performances of Roberts, Prewitt, Canny, and Sobel Algorithms. Adv. Civ. Eng. 2018, 2018, 7163580.
An, Q.; Chen, X.; Wang, H.; Yang, H.; Yang, Y.; Huang, W.; Wang, L. Segmentation of Concrete Cracks by Using Fractal Dimension and UHK-Net. Fractal Fract. 2022, 6, 95.
Kheradmandi, N.; Mehranfar, V. A critical review and comparative study on image segmentation-based techniques for pavement crack detection. Constr. Build. Mater. 2022, 321, 126162.
Nguyen, S.D.; Tran, T.S.; Tran, V.P.; Lee, H.J.; Piran, M.J.; Le, V.P. Deep Learning-Based Crack Detection: A Survey. Int. J. Pavement Res. Technol. 2022, 16, 943–967.
Hamishebahar, Y.; Guan, H.; So, S.; Jo, J. A Comprehensive Review of Deep Learning-Based Crack Detection Approaches. Appl. Sci. 2022, 12, 1374.
Laxman, K.C.; Tabassum, N.; Ai, L.; Cole, C.; Ziehl, P. Automated crack detection and crack depth prediction for reinforced concrete structures using deep learning. Constr. Build. Mater. 2023, 370, 130709.
Ali, R.; Chuah, J.H.; Talip, M.S.A.; Mokhtar, N.; Shoaib, M.A. Structural crack detection using deep convolutional neural networks. Autom. Constr. 2022, 133, 103989.
Gupta, P.; Dixit, M. Image-based crack detection approaches: A comprehensive survey. Multimed. Tools Appl. 2022, 81, 40181–40229.
Ali, L.; Alnajjar, F.; Khan, W.; Serhani, M.A.; Al Jassmi, H. Bibliometric Analysis and Review of Deep Learning-Based Crack Detection Literature Published between 2010 and 2022. Buildings 2022, 12, 432.
Li, R.; Yu, J.; Li, F.; Yang, R.; Wang, Y.; Peng, Z. Automatic bridge crack detection using Unmanned aerial vehicle and Faster R-CNN. Constr. Build. Mater. 2023, 362, 129659.
Ni, T.Y.; Zhou, R.X.; Yang, Y.; Yang, X.C.; Zhang, W.Y.; Lin, C. Research on Detection of Concrete Surface Cracks Based on Smartphone Image. Acta Metrol. Sinica 2021, 42, 1–8.
Chen, X.; Li, J.; Huang, S.; Cui, H.; Liu, P.; Sun, Q. An Automatic Concrete Crack-Detection Method Fusing Point Clouds and Images Based on Improved Otsu’s Algorithm. Sensors 2021, 21, 1581.
Fernandes, F.M.; Pais, J.C. Laboratory observation of cracks in road pavements with GPR. Constr. Build. Mater. 2017, 154, 1130–1138.
Karaaslan, E.; Bagci, U.; Catbas, F.N. Artificial Intelligence Assisted Infrastructure Assessment using Mixed Reality Systems. Transp. Res. Rec. 2019, 2673, 413–424.
Wang, S.; Zargar, S.A.; Yuan, F.-G. Augmented reality for enhanced visual inspection through knowledge-based deep learning. Struct. Health Monit. 2020, 20, 426–442.
Moreu, F.; Malek, K. Bridge Cracks Monitoring: Detection, Measurement, and Comparison Using Augmented Reality; Transportation Consortium of South-Central States: Baton Rouge, LA, USA, 2021.
Napolitano, R.; Liu, Z.; Sun, C.; Glisic, B. Combination of Image-Based Documentation and Augmented Reality for Structural Health Monitoring and Building Pathology. Front. Built Environ. 2019, 5, 50.
Yamaguchi, T.; Shibuya, T.; Kanda, M.; Yasojima, A. Crack Inspection Support System for Concrete Structures Using Head Mounted Display in Mixed Reality Space. In Proceedings of the 2019 58th Annual Conference of the Society of Instrument and Control Engineers of Japan (SICE), Hiroshima, Japan, 10–13 September 2019; pp. 791–796.
Smith, A.; Duff, C.; Sarlo, R.; Gabbard, J.L. Wearable Augmented Reality Interface Design for Bridge Inspection. In Proceedings of the 2022 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW), Christchurch, New Zealand, 12–16 March 2022; pp. 497–501.
Maharjan, D.; Aguero, M.; Lippitt, C.; Moreu, F. Infrastructure Stakeholders’ Perspective in Development and Implementation of New Structural Health Monitoring (SHM) Technologies for Maintenance and Management of Transportation Infrastructure. MATEC Web Conf. 2019, 271, 01010.
Jain, A.; Gajwani, R.; Manjunath, C.R.; Shetty, S. Holographic imaging system to detect fractures. Int. J. Adv. Res. Ideas Innov. Technol. 2018, 4, 63–68.
Dais, D.; Bal, İ.E.; Smyrou, E.; Sarhosis, V. Automatic crack classification and segmentation on masonry surfaces using convolutional neural networks and transfer learning. Autom. Constr. 2021, 125, 103606.
Zhang, X.; Rajan, D.; Story, B. Concrete crack detection using context-aware deep semantic segmentation network. Comput.-Aided Civ. Infrastruct. Eng. 2019, 34, 951–971.
Pham, M.-V.; Ha, Y.-S.; Kim, Y.-T. Automatic detection and measurement of ground crack propagation using deep learning networks and an image processing technique. Measurement 2023, 215, 112832.
Miao, P.; Srimahachota, T. Cost-effective system for detection and quantification of concrete surface cracks by combination of convolutional neural network and image processing techniques. Constr. Build. Mater. 2021, 293, 123549.
Ali, L.; Alnajjar, F.; Jassmi, H.A.; Gocho, M.; Khan, W.; Serhani, M.A. Performance Evaluation of Deep CNN-Based Crack Detection and Localization Techniques for Concrete Structures. Sensors 2021, 21, 1688.
Sapkota, R.; Ahmed, D.; Karkee, M. Comparing YOLOv8 and Mask R-CNN for instance segmentation in complex orchard environments. Artif. Intell. Agric. 2024, 13, 84–99.
Fukushima, K. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol. Cybern. 1980, 36, 193–202.
Li, B.; Wang, K.C.P.; Zhang, A.; Yang, E.; Wang, G. Automatic classification of pavement crack using deep convolutional neural network. Int. J. Pavement Eng. 2020, 21, 457–463.
Xu, X.; Zhao, M.; Shi, P.; Ren, R.; He, X.; Wei, X.; Yang, H. Crack Detection and Comparison Study Based on Faster R-CNN and Mask R-CNN. Sensors 2022, 22, 1215.
Zhang, Y.; Huang, J.; Cai, F. On Bridge Surface Crack Detection Based on an Improved YOLO v3 Algorithm. IFAC-Pap. 2020, 53, 8205–8210.
Kim, I.-H.; Jeon, H.; Baek, S.-C.; Hong, W.-H.; Jung, H.-J. Application of Crack Identification Techniques for an Aging Concrete Bridge Inspection Using an Unmanned Aerial Vehicle. Sensors 2018, 18, 1881.
Hacıefendioğlu, K.; Başağa, H.B. Concrete Road Crack Detection Using Deep Learning-Based Faster R-CNN Method. Iran. J. Sci. Technol. Trans. Civ. Eng. 2022, 46, 1621–1633.
Attard, L.; Debono, C.J.; Valentino, G.; Castro, M.D.; Masi, A.; Scibile, L. Automatic Crack Detection using Mask R-CNN. In Proceedings of the 2019 11th International Symposium on Image and Signal Processing and Analysis (ISPA), Dubrovnik, Croatia, 23–25 September 2019; pp. 152–157.
Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.-Y.; Berg, A.C. SSD: Single Shot MultiBox Detector. In Proceedings of the Computer Vision—ECCV 2016, Amsterdam, The Netherlands, 11–14 October 2016; pp. 21–37.
Kumar, S.S.; Wang, M.; Abraham, D.M.; Jahanshahi, M.R.; Iseley, T.; Cheng, J.C.P. Deep Learning–Based Automated Detection of Sewer Defects in CCTV Videos. J. Comput. Civ. Eng. 2020, 34, 04019047.
Yan, K.; Zhang, Z. Automated Asphalt Highway Pavement Crack Detection Based on Deformable Single Shot Multi-Box Detector Under a Complex Environment. IEEE Access 2021, 9, 150925–150938. [Google Scholar] [CrossRef]
Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–15 June 2015. [Google Scholar]
Park, S.E.; Eem, S.-H.; Jeon, H. Concrete crack detection and quantification using deep learning and structured light. Constr. Build. Mater. 2020, 252, 119096.
Li, S.; Gu, X.; Xu, X.; Xu, D.; Zhang, T.; Liu, Z.; Dong, Q. Detection of concealed cracks from ground penetrating radar images based on deep learning algorithm. Constr. Build. Mater. 2021, 273, 121949.
Majidifard, H.; Adu-Gyamfi, Y.; Buttlar, W.G. Deep machine learning approach to develop a new asphalt pavement condition index. Constr. Build. Mater. 2020, 247, 118513.
Jiang, Y.; Pang, D.; Li, C. A deep learning approach for fast detection and classification of concrete damage. Autom. Constr. 2021, 128, 103785.
Li, W.; Shen, Z.; Li, P. Crack Detection of Track Plate Based on YOLO. In Proceedings of the 2019 12th International Symposium on Computational Intelligence and Design (ISCID), Hangzhou, China, 14–15 December 2019; pp. 15–18.
Deng, J.; Lu, Y.; Lee, V.C.-S. Imaging-based crack detection on concrete surfaces using You Only Look Once network. Struct. Health Monit. 2020, 20, 484–499.
Nie, M.; Wang, C. Pavement Crack Detection based on yolo v3. In Proceedings of the 2019 2nd International Conference on Safety Produce Informatization (IICSPI), Chongqing, China, 28–30 November 2019; pp. 327–330.
Du, Y.; Pan, N.; Xu, Z.; Deng, F.; Shen, Y.; Kang, H. Pavement distress detection and classification based on YOLO network. Int. J. Pavement Eng. 2021, 22, 1659–1672.
Mandal, V.; Uong, L.; Adu-Gyamfi, Y. Automated Road Crack Detection Using Deep Convolutional Neural Networks. In Proceedings of the 2018 IEEE International Conference on Big Data (Big Data), Seattle, WA, USA, 10–13 December 2018; pp. 5212–5215.
Teng, S.; Liu, Z.; Chen, G.; Cheng, L. Concrete Crack Detection Based on Well-Known Feature Extractor Model and the YOLO_v2 Network. Appl. Sci. 2021, 11, 813.
Oh, C.; Dang, L.M.; Han, D.; Moon, H. Robust Sewer Defect Detection With Text Analysis Based on Deep Learning. IEEE Access 2022, 10, 46224–46237.
Liu, H.; Yang, C.; Li, A.; Ge, Y.; Huang, S.; Feng, X.; Ruan, Z. Deep Domain Adaptation for Pavement Crack Detection. arXiv 2021, arXiv:2111.10101.
Hu, G.X.; Hu, B.L.; Yang, Z.; Huang, L.; Li, P. Pavement Crack Detection Method Based on Deep Learning Models. Wirel. Commun. Mob. Comput. 2021, 2021, 5573590.
Agyemang, I.O.; Zhang, X.; Adjei-Mensah, I.; Mawuli, B.C.; Agbley, B.L.Y.; Fiasam, L.D.; Sey, C. On Salient Concrete Crack Detection Via Improved Yolov5. In Proceedings of the 2021 18th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), Chengdu, China, 17–19 December 2021; pp. 175–178.
Zhao, W.; Liu, Y.; Zhang, J.; Shao, Y.; Shu, J. Automatic pixel-level crack detection and evaluation of concrete structures using deep learning. Struct. Control Health Monit. 2022, 29, e2981.
Yu, L.; He, S.; Liu, X.; Jiang, S.; Xiang, S. Intelligent Crack Detection and Quantification in the Concrete Bridge: A Deep Learning-Assisted Image Processing Approach. Adv. Civ. Eng. 2022, 2022, 1813821.
Cubero-Fernandez, A.; Rodriguez-Lozano, F.J.; Villatoro, R.; Olivares, J.; Palomares, J.M. Efficient pavement crack detection and classification. EURASIP J. Image Video Process. 2017, 2017, 39.
Wang, G.; Tse, P.W.; Yuan, M. Automatic internal crack detection from a sequence of infrared images with a triple-threshold Canny edge detector. Meas. Sci. Technol. 2018, 29, 025403.
Yang, J.; Choi, J.; Hwang, S.; An, Y.-K.; Sohn, H. A reference-free micro defect visualization using pulse laser scanning thermography and image processing. Meas. Sci. Technol. 2016, 27, 085601.
Dorafshan, S. Comparing Automated Image-Based Crack Detection Techniques in Spatial and Frequency Domains. In Proceedings of the 26th ASNT Research Symposium, Jacksonville, FL, USA, 13–16 March 2017.
Wang, K.C.P.; Li, Q.; Gong, W. Wavelet-Based Pavement Distress Image Edge Detection with À Trous Algorithm. Transp. Res. Rec. 2007, 2024, 73–81.
Ayenu-Prah, A.; Attoh-Okine, N. Evaluating Pavement Cracks with Bidimensional Empirical Mode Decomposition. EURASIP J. Adv. Signal Process. 2008, 2008, 861701.
Qingbo, Z. Pavement Crack Detection Algorithm Based on Image Processing Analysis. In Proceedings of the 2016 8th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC), Hangzhou, China, 27–28 August 2016; pp. 15–18.
Nguyen, D.-C.; Nguyen, T.-Q.; Jin, R.; Jeon, C.-H.; Shim, C.-S. BIM-based mixed-reality application for bridge inspection and maintenance. Constr. Innov. 2021. ahead-of-print.
Rokhsaritalemi, S.; Sadeghi-Niaraki, A.; Choi, S.-M. A Review on Mixed Reality: Current Trends, Challenges and Prospects. Appl. Sci. 2020, 10, 636.
Hönig, W.; Milanes, C.; Scaria, L.; Phan, T.; Bolas, M.; Ayanian, N. Mixed reality for robotics. In Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany, 28 September–2 October 2015; pp. 5382–5387.
Azuma, R.; Baillot, Y.; Behringer, R.; Feiner, S.; Julier, S.; MacIntyre, B. Recent advances in augmented reality. IEEE Comput. Graph. Appl. 2001, 21, 34–47.
Intel. VR vs. AR vs. MR: What You Need to Know. Available online: https://www.intel.com/content/www/us/en/tech-tips-and-tricks/virtual-reality-vs-augmented-reality.html (accessed on 29 October 2024).
Milgram, P.; Kishino, F. A Taxonomy of Mixed Reality Visual Displays. IEICE Trans. Inf. Syst. 1994, E77-D, 1321–1329.
Speicher, M.; Hall, B.; Nebeling, M. What is Mixed Reality? In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, Glasgow, UK, 4–9 May 2019.
Microsoft. What Is Mixed Reality? Available online: https://learn.microsoft.com/en-us/windows/mixed-reality/discover/mixed-reality (accessed on 30 October 2024).
Kamat, V.R.; El-Tawil, S. Evaluation of Augmented Reality for Rapid Assessment of Earthquake-Induced Building Damage. J. Comput. Civ. Eng. 2007, 21, 303–310.
Behzadan, A.H.; Kamat, V.R. Georeferenced Registration of Construction Graphics in Mobile Outdoor Augmented Reality. J. Comput. Civ. Eng. 2007, 21, 247–258.
Bae, H.; Golparvar-Fard, M.; White, J. High-precision vision-based mobile augmented reality system for context-aware architectural, engineering, construction and facility management (AEC/FM) applications. Vis. Eng. 2013, 1, 3.
Moreu, F.; Bleck, B.; Vemuganti, S.; Rogers, D.; Mascarenas, D. Augmented Reality Tools for Enhanced Structural Inspection. 2017. Available online: https://www.researchgate.net/publication/322847804_Augmented_Reality_Tools_for_Enhanced_Structural_Inspection (accessed on 30 October 2024).
Baek, F.; Ha, I.; Kim, H. Augmented reality system for facility management using image-based indoor localization. Autom. Constr. 2019, 99, 18–26.
Park, C.-S.; Lee, D.-Y.; Kwon, O.-S.; Wang, X. A framework for proactive construction defect management using BIM, augmented reality and ontology-based data collection template. Autom. Constr. 2013, 33, 61–71.
Dang, N.; Shim, C. BIM-based innovative bridge maintenance system using augmented reality technology. In CIGOS 2019, Innovation for Sustainable Infrastructure; Springer: Singapore, 2020; pp. 1217–1222.
Kilic, G.; Caner, A. Augmented reality for bridge condition assessment using advanced non-destructive techniques. Struct. Infrastruct. Eng. 2021, 17, 977–989.
Hussain, M. Yolov5, yolov8 and yolov10: The go-to detectors for real-time vision. arXiv 2024, arXiv:2407.02988.
Karimi, N.; Mishra, M.; Lourenço, P.B. Automated Surface Crack Detection in Historical Constructions with Various Materials Using Deep Learning-Based YOLO Network. Int. J. Archit. Herit. 2024, 1–17.
Xing, Y.; Han, X.; Pan, X.; An, D.; Liu, W.; Bai, Y. EMG-YOLO: Road crack detection algorithm for edge computing devices. Front. Neurorobotics 2024, 18, 1423738.
Zhang, Y.; Lu, Y.; Huo, Z.; Li, J.; Sun, Y.; Huang, H. USSC-YOLO: Enhanced Multi-Scale Road Crack Object Detection Algorithm for UAV Image. Sensors 2024, 24, 5586.
Hu, H.; Li, Z.; He, Z.; Wang, L.; Cao, S.; Du, W. Road surface crack detection method based on improved YOLOv5 and vehicle-mounted images. Measurement 2024, 229, 114443.
Liao, D.; Cui, Z.; Zhang, X.; Li, J.; Li, W.; Zhu, Z.; Wu, N. Surface defect detection and classification of Si3N4 turbine blades based on convolutional neural network and YOLOv5. Adv. Mech. Eng. 2022, 14, 16878132221081580.
Li, S.; Wang, X. YOLOv5-based Defect Detection Model for Hot Rolled Strip Steel. J. Phys. Conf. Ser. 2022, 2171, 012040.
Zhao, Z.; Yang, X.; Zhou, Y.; Sun, Q.; Ge, Z.; Liu, D. Real-time detection of particleboard surface defects based on improved YOLOV5 target detection. Sci. Rep. 2021, 11, 21777.
Sobel, I. Neighborhood coding of binary images for fast contour following and general binary array processing. Comput. Graph. Image Process. 1978, 8, 127–135.
Canny, J. A Computational Approach to Edge Detection. IEEE Trans. Pattern Anal. Mach. Intell. 1986, PAMI-8, 679–698.
Hoang, N.-D.; Nguyen, Q.-L.; Tran, V.-D. Automatic recognition of asphalt pavement cracks using metaheuristic optimized edge detection algorithms and convolution neural network. Autom. Constr. 2018, 94, 203–213.
Prewitt, J.M. Object enhancement and extraction. Pict. Process. Psychopictorics 1970, 10, 15–19.
PyTorch. Available online: https://pytorch.org/ (accessed on 30 June 2022).
Unity. Unity Real-Time Development Platform. Available online: https://unity.com/ (accessed on 30 June 2022).
Bianchi, E.; Hebdon, M. Concrete Crack Conglomerate Dataset; University Libraries, Virginia Tech: Blacksburg, VA, USA, 2021.
Maguire, M.; Dorafshan, S.; Thomas, R.J. SDNET2018: A Concrete Crack Image Dataset for Machine Learning Applications; Utah State University: Logan, UT, USA, 2018.
Safdar, M.F.; Alkobaisi, S.S.; Zahra, F.T. A Comparative Analysis of Data Augmentation Approaches for Magnetic Resonance Imaging (MRI) Scan Images of Brain Tumor. Acta Inf. Med. 2020, 28, 29–36.
Ryu, S.-E.; Chung, K.-Y. Detection Model of Occluded Object Based on YOLO Using Hard-Example Mining and Augmentation Policy Optimization. Appl. Sci. 2021, 11, 7093.
Abdulghani, A.M.; Abdulghani, M.M.; Walters, W.L.; Abed, K.H. Data Augmentation Using Brightness and Darkness to Enhance the Performance of YOLO7 Object Detection Algorithm. In Proceedings of the 2023 Congress in Computer Science, Computer Engineering, & Applied Computing (CSCE), Las Vegas, NV, USA, 24–27 July 2023; pp. 351–356.
Zhang, W.; Kinoshita, Y.; Kiya, H. Image-Enhancement-Based Data Augmentation for Improving Deep Learning in Image Classification Problem. In Proceedings of the 2020 IEEE International Conference on Consumer Electronics—Taiwan (ICCE-Taiwan), Taoyuan, Taiwan, 28–30 September 2020; pp. 1–2.
Abdulghani, A.M.; Abdulghani, M.M.; Walters, W.L.; Abed, K.H. Data Augmentation with Noise and Blur to Enhance the Performance of YOLO7 Object Detection Algorithm. In Proceedings of the 2023 Congress in Computer Science, Computer Engineering, & Applied Computing (CSCE), Las Vegas, NV, USA, 24–27 July 2023; pp. 180–185.
Liu, J.; Yang, X.; Lau, S.; Wang, X.; Luo, S.; Lee, V.C.-S.; Ding, L. Automated pavement crack detection and segmentation based on two-step convolutional neural network. Comput.-Aided Civ. Infrastruct. Eng. 2020, 35, 1291–1305.
Özgenel, Ç.F. Concrete Crack Segmentation Dataset; 2019. Mendeley Data, V1. Available online: https://data.mendeley.com/datasets/jwsn7tfbrp/1 (accessed on 8 October 2024).
Tan, Y.; Cai, R.; Li, J.; Chen, P.; Wang, M. Automatic detection of sewer defects based on improved you only look once algorithm. Autom. Constr. 2021, 131, 103912.
Cement Concrete and Aggregates Australia. Guide-to-Concrete-Construction. 2020. Available online: https://ccaa.com.au/common/Uploaded%20files/CCAA/Publications/Technical%20Publications/Complete_Guide_to_Concrete_Construction_2020_Edition.pdf (accessed on 30 October 2024).
Microsoft. HoloLens 2. Available online: https://www.microsoft.com/en-us/hololens/hardware (accessed on 16 June 2022).

Figure 1. Augmented reality (AR) vs. mixed reality (MR).

Figure 2. An integrated deep learning and image processing-based method for crack detection and skeleton extraction.

Figure 3. Architecture of YOLO v5s network.

Figure 4. (a) Area of overlap; and (b) area of union.
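The two areas named in this caption combine into the intersection over union (IoU) ratio that underlies the mAP thresholds reported in the tables. The following is a minimal sketch in Python, assuming boxes are given as (x_min, y_min, x_max, y_max) corner coordinates; the box values and the function name `iou` are hypothetical and are not taken from the authors' implementation.

```python
from typing import Tuple

# Assumed convention: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Area of overlap: rectangle bounded by the innermost edges of both boxes.
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    overlap = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    # Area of union: both box areas minus the overlap counted twice.
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - overlap
    return overlap / union if union > 0 else 0.0


predicted = (10.0, 10.0, 50.0, 50.0)     # hypothetical detected crack box
ground_truth = (30.0, 30.0, 70.0, 70.0)  # hypothetical annotated crack box
score = iou(predicted, ground_truth)     # 400 / 2800, about 0.143
```

With these hypothetical boxes the IoU is below 0.5, so the detection would not count as a true positive under the mAP 0.5 criterion.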

Figure 5. Sample of images used for crack detection using YOLO v5 and their annotated bounding boxes [104,105].

Figure 6. Left: a sample of the original images used for the evaluation of crack edge detection algorithms. Right: the relevant ground truth [112].

Figure 7. mAP 0.5 and mAP 0.5:0.95 analysis of YOLO v5n.

Figure 8. mAP 0.5 and mAP 0.5:0.95 analysis of YOLO v5s.

Figure 9. mAP 0.5 and mAP 0.5:0.95 analysis of YOLO v5m.

Figure 10. Example of the bounding boxes identified around the detected cracks by the YOLO v5n (orange), YOLO v5s (blue), and YOLO v5m (green) models.

Figure 11. Precision, recall, and F1 score analysis of the crack edge detectors.

Figure 12. Example of the outputs of the crack edge detectors in an image: (a) the original image with a bounding box around the crack and a smaller area for a better comparison between the outputs; (b) the ground truth of the considered image; (c) the edge detection output using Canny in the whole image, inside the bounding box and inside the smaller area; (d) the edge detection output using Sobel in the whole image, inside the bounding box and inside the smaller area; and (e) the edge detection output using Prewitt in the whole image, inside the bounding box and inside the smaller area.

Figure 13. Real-time testing of the developed model in the HoloLens for defect assessment in concrete surfaces (the percentages are the confidence scores for the cracks).

Figure 14. Example of the execution of the integrated model using the YOLO v5n (left) and YOLO v5m (right) structures in the HoloLens.

Table 1. Studies focusing on the application of AR and MR for the crack detection of civil structures.

Source	Aim	Tools and Models
[87]	Presenting a framework for proactive construction defect management.	Integration of AR, BIM, and ontology-based data collection template.
[26]	MR for smart assessment of bridge cracks.	HoloLens and SSD for crack detection and SegNet for crack segmentation.
[30]	Detection of cracks in concrete structures through MR.	HoloLens and computer vision (feature extraction and matching).
[27]	Real-time automated damage detection using AR smart glasses.	Hands-free Epson BT-300 smart glasses and SSD MobileNet clustering for crack detection and Mask R-CNN for segmentation.
[88]	Presenting an innovative bridge maintenance system for storing, manipulating, and sharing inspection data and maintenance history.	HoloLens, BIM, and image processing.
[89]	Using AR for bridge condition assessment and the detection of bridge deck delamination, crack formation, and rebar corrosion.	GPR, Laser Distance Sensors (LDSs), and Infra-Red Thermography (IRT) cameras.

Table 2. The results of comparing the YOLO v5n, YOLO v5s, and YOLO v5m crack detection methods.

Method	mAP 0.5	mAP 0.5:0.95	Speed (FPS)
YOLO v5n	0.927	0.701	4.5
YOLO v5s	0.895	0.702	3
YOLO v5m	0.890	0.728	1
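To make the accuracy and speed trade-off in Table 2 concrete, the short Python sketch below converts the reported frame rates into per-frame latency and picks the best model under each metric. The numbers are copied from Table 2; the dictionary keys and variable names are illustrative only.

```python
# Values copied from Table 2; the key names are illustrative.
results = {
    "YOLO v5n": {"map_50": 0.927, "map_50_95": 0.701, "fps": 4.5},
    "YOLO v5s": {"map_50": 0.895, "map_50_95": 0.702, "fps": 3.0},
    "YOLO v5m": {"map_50": 0.890, "map_50_95": 0.728, "fps": 1.0},
}

# Per-frame latency in milliseconds is the reciprocal of the frame rate.
latency_ms = {name: 1000.0 / r["fps"] for name, r in results.items()}

# Best model under each criterion.
best_map_50 = max(results, key=lambda n: results[n]["map_50"])
best_map_50_95 = max(results, key=lambda n: results[n]["map_50_95"])
fastest = max(results, key=lambda n: results[n]["fps"])

for name in results:
    print(f"{name}: {latency_ms[name]:.0f} ms per frame")
print("Best mAP 0.5:", best_map_50)
print("Best mAP 0.5:0.95:", best_map_50_95)
print("Fastest:", fastest)
```

Read this way, YOLO v5n needs roughly 222 ms per frame against 1000 ms for YOLO v5m, while YOLO v5m gains about 0.027 in mAP 0.5:0.95.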

Table 3. The results of comparing the Sobel, Canny, and Prewitt crack edge detectors.

Algorithm	Precision (P)	Recall (R)	F1 Score
Sobel	0.89	0.70	0.77
Canny	0.93	0.87	0.88
Prewitt	0.94	0.55	0.67
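As a worked check on Table 3, the Python sketch below recomputes F1 as the harmonic mean of the reported precision and recall. The recomputed values come out 0.01 to 0.02 above the reported F1 scores. One possible explanation, which is an assumption and not confirmed by the source, is that the reported F1 was averaged per image, since a mean of per-image F1 scores cannot exceed the F1 of the mean precision and recall. The ranking of the three detectors is the same either way.

```python
# (precision, recall, reported F1) copied from Table 3.
table3 = {
    "Sobel": (0.89, 0.70, 0.77),
    "Canny": (0.93, 0.87, 0.88),
    "Prewitt": (0.94, 0.55, 0.67),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2.0 * precision * recall / total if total > 0 else 0.0


# F1 recomputed from the aggregate precision and recall in the table.
recomputed = {name: f1(p, r) for name, (p, r, _) in table3.items()}

for name, (p, r, reported) in table3.items():
    print(f"{name}: reported F1 = {reported:.2f}, "
          f"harmonic mean of P and R = {recomputed[name]:.2f}")

best = max(recomputed, key=recomputed.get)
print("Highest recomputed F1:", best)
```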

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

