Deep-Learning- and Unmanned Aerial Vehicle-Based Structural Crack Detection in Concrete
Abstract
1. Introduction
2. Introduction of U-Net, DeepLab v3, and the Crack Datasets
2.1. Related Network Introduction
2.2. Selected Models
2.3. Evaluation Indices
2.4. Data Collection
3. Model Training
3.1. Hyperparameter Tuning
3.2. UAV Image Testing
3.3. Field Crack Identification Experiment Based on UAV
4. Conclusions
- (i) A publicly available crack image dataset with 11,000 images from handheld cameras was adopted for training. In addition, a small UAV image-based dataset with 648 images was newly established for testing. Four different DNNs—U-Net, DeepLab v3 (MobileNet v3), DeepLab v3 (ResNet50), and TransUNet—were compared to study the performance of different architectures.
- (ii) The four DNN models were first trained on the existing dataset and behaved differently on different crack types. As measured by precision, mIoU, recall, and F1 (restated after this list), TransUNet performed best, with U-Net a close second. DeepLab v3 (ResNet50) was comparable to TransUNet and U-Net, while DeepLab v3 (MobileNet v3) was the least accurate of the four. The image tests showed that all four models detected most crack regions, but dim and low-contrast backgrounds caused incorrect detections.
- (iii) The tests on the UAV image-based dataset showed that the performance of all four models decreased. Specifically, the mIoU reductions for U-Net, DeepLab v3 (MobileNet v3), DeepLab v3 (ResNet50), and TransUNet were 6.9%, 4.8%, 3.2%, and 7.8%, respectively. The two U-Net-related models (U-Net and TransUNet) gained more than 10% in recall, while their precision and F1 dropped. Using the classic U-Net model, the influence of the proportion of crack training samples was investigated through k-fold cross-validation and two additional mix ratios. The mix ratio did affect crack detection performance, but the effect was not consistent, and the largest change across all six groups was 7.4%. This suggests that the large training dataset was diverse enough to alleviate the class-imbalance issue in the training samples.
- (iv) The UAV sub-image tests showed that the trained models detected most crack regions, but low-contrast backgrounds and fine cracks caused incorrect detections. Quantitative evaluation of the crack areas indicated that TransUNet performed best, with the smallest relative error of 6.2% and an average relative error of 16.3%. The raw UAV image tests revealed that TransUNet and U-Net performed similarly: the TransUNet results were more continuous and smoother, but tiny cracks still induced errors. DeepLab v3 (ResNet50) outperformed DeepLab v3 (MobileNet v3), but both suffered from discontinuity and were less accurate than the two U-Net-based models.
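For reference, the four evaluation indices cited above follow the standard segmentation definitions in terms of the confusion-matrix counts (TP, FP, FN, TN; see the confusion-matrix table later in this section). The mIoU formula below assumes the usual two-class (crack/background) averaging, which may differ in detail from the paper's exact implementation:

```latex
\begin{aligned}
\text{Precision} &= \frac{TP}{TP+FP}, &
\text{Recall} &= \frac{TP}{TP+FN}, \\
F1 &= \frac{2\,\text{Precision}\cdot\text{Recall}}{\text{Precision}+\text{Recall}}, &
\text{mIoU} &= \frac{1}{2}\!\left(\frac{TP}{TP+FP+FN} + \frac{TN}{TN+FP+FN}\right).
\end{aligned}
```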
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
| Ground Truth | Model Prediction: True | Model Prediction: False |
|---|---|---|
| True | True positive (TP) | False negative (FN) |
| False | False positive (FP) | True negative (TN) |
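As an illustrative sketch (not the authors' code), the four indices can be computed directly from the confusion-matrix counts above; the pixel counts in the example call are hypothetical:

```python
def segmentation_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute precision, recall, F1, and two-class mIoU from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    # IoU of the crack (positive) class and the background (negative) class.
    iou_crack = tp / (tp + fp + fn)
    iou_background = tn / (tn + fp + fn)
    miou = (iou_crack + iou_background) / 2
    return {"precision": precision, "recall": recall, "f1": f1, "miou": miou}

# Hypothetical pixel counts for one test image.
print(segmentation_metrics(tp=8_000, fp=2_000, fn=3_000, tn=250_000))
```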
| Category | Parameter | Specification |
|---|---|---|
| General information | Weight | 1391 g |
| | Maximum ascent speed | 6 m/s (automatic flight); 5 m/s (manual operation) |
| | Maximum tilt angle | 25° (positioning mode); 35° (attitude mode) |
| | Hover accuracy (RTK enabled) | Vertical: ±0.1 m; horizontal: ±0.1 m |
| Mapping function | Ground sampling distance | (H/36.5) cm/pixel |
| | Controllable rotation range (pitch) | −90° to +30° |
| | Height measurement range | 0–10 m |
| | Accurate hover range | 0–10 m |
| Camera | Mechanical shutter | 8–1/2000 s |
| | Electronic shutter | 8–1/8000 s |
| | Maximum photo resolution | 4864 × 3648 (4:3); 5472 × 3648 (3:2) |
| | Photo format | JPEG |
| Intelligent flight battery | Capacity | 5870 mAh |
| | Specification | PH4-5870 mAh-15.2 V |
| | Battery weight | 468 g |
| | Maximum charging power | 160 W |
| Remote controller smart battery | Capacity | 4920 mAh |
| | Specification | WB37-4920 mAh-7.6 V |
| | Type | LiPo 2S |
| | Energy | 37.39 Wh |
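The ground sampling distance entry above gives a simple relation between flight height and the ground footprint of one pixel, which bounds the finest crack width the camera can resolve. A minimal sketch, assuming H is the flight height in metres (the usual convention for this specification):

```python
def ground_sampling_distance_cm(height_m: float) -> float:
    """GSD in cm/pixel from the (H/36.5) cm/pixel relation in the spec table."""
    return height_m / 36.5

# At a 10 m standoff, each pixel covers roughly 0.27 cm on the surface.
print(f"{ground_sampling_distance_cm(10.0):.3f} cm/pixel")
```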
Evaluation results on the public (handheld-camera) test set:

| Model Name | Precision | mIoU | Recall | F1 |
|---|---|---|---|---|
| U-Net | 0.797 | 0.540 | 0.684 | 0.736 |
| DeepLab v3 (MobileNet v3) | 0.546 | 0.415 | 0.691 | 0.610 |
| DeepLab v3 (ResNet50) | 0.780 | 0.554 | 0.708 | 0.742 |
| TransUNet | 0.781 | 0.567 | 0.675 | 0.724 |
Evaluation results on the UAV image-based test set:

| Model Name | Precision | mIoU | Recall | F1 |
|---|---|---|---|---|
| U-Net | 0.720 | 0.503 | 0.775 | 0.592 |
| DeepLab v3 (MobileNet v3) | 0.531 | 0.395 | 0.606 | 0.453 |
| DeepLab v3 (ResNet50) | 0.759 | 0.536 | 0.647 | 0.628 |
| TransUNet | 0.596 | 0.523 | 0.806 | 0.685 |
U-Net performance on the UAV test set under k-fold cross-validation and two additional crack-sample mix ratios:

| Model Name | Precision | mIoU | Recall | F1 |
|---|---|---|---|---|
| U-Net | 0.720 | 0.503 | 0.775 | 0.592 |
| k-fold cross-validation-1 | 0.724 | 0.543 | 0.622 | 0.621 |
| k-fold cross-validation-2 | 0.716 | 0.511 | 0.663 | 0.596 |
| k-fold cross-validation-3 | 0.731 | 0.532 | 0.654 | 0.616 |
| U-Net (mix ratio 4:1) | 0.732 | 0.532 | 0.624 | 0.616 |
| U-Net (mix ratio 5:1) | 0.728 | 0.518 | 0.633 | 0.605 |
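The three cross-validation rows above correspond to one retraining per fold of a standard 3-fold split. A minimal sketch using scikit-learn, where `train_crack_patches` is a hypothetical placeholder for the training samples drawn from the 11,000-image public dataset:

```python
from sklearn.model_selection import KFold

# Hypothetical filenames standing in for the crack training samples.
train_crack_patches = [f"patch_{i:05d}.png" for i in range(11_000)]

kfold = KFold(n_splits=3, shuffle=True, random_state=42)
for fold, (train_idx, val_idx) in enumerate(kfold.split(train_crack_patches), start=1):
    # Each fold retrains U-Net on train_idx and evaluates on val_idx,
    # yielding one "k-fold cross-validation" row in the table above.
    print(f"fold {fold}: {len(train_idx)} train / {len(val_idx)} validation samples")
```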
Crack areas extracted from UAV sub-images Pic7–Pic12, compared with the ground truth:

| Crack Area Source | Pic7 | Pic8 | Pic9 | Pic10 | Pic11 | Pic12 |
|---|---|---|---|---|---|---|
| Ground Truth | 1565 | 2026 | 3189 | 1374 | 717 | 3812 |
| U-Net | 2301 | 3043 | 3761 | 2055 | 800 | 3677 |
| DeepLab v3 (MobileNet v3) | 2785 | 3254 | 3574 | 2272 | 0 | 4760 |
| DeepLab v3 (ResNet50) | 1767 | 2554 | 1564 | 608 | 583 | 4187 |
| TransUNet | 1457 | 1989 | 1835 | 1277 | 956 | 3577 |
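The relative errors quoted in conclusion (iv) follow directly from this table, assuming the entries are directly comparable area measurements; the snippet below reproduces TransUNet's 16.3% average:

```python
ground_truth = [1565, 2026, 3189, 1374, 717, 3812]  # Pic7-Pic12
transunet = [1457, 1989, 1835, 1277, 956, 3577]

rel_errors = [abs(pred - gt) / gt for pred, gt in zip(transunet, ground_truth)]
print(", ".join(f"{e:.1%}" for e in rel_errors))  # per-image relative errors
print(f"mean relative error: {sum(rel_errors) / len(rel_errors):.1%}")  # ~16.3%
```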