Data-Augmented Deep Learning Models for Abnormal Road Manhole Cover Detection
Abstract
1. Introduction
- A sample expansion method for the abnormal manhole cover dataset is proposed. The method extracts a variety of abnormal manhole cover samples from images by exploiting geometric information and perspective transformations, providing samples for the subsequent data augmentation (a minimal extraction sketch follows this list).
- Using the extracted abnormal manhole cover samples, we propose a visually guided copy–paste data augmentation method, namely VGCopy-Paste. The method combines prior visual and spatial information to paste abnormal manhole cover samples onto images more plausibly, alleviating the sample imbalance and the insufficient number of training samples.
- Better performance under different training configurations and epochs compared with current state-of-the-art object detection models: the experimental results show that networks trained with the proposed data augmentation method achieve higher accuracy and faster convergence than identically configured networks trained without it.
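To make the first contribution concrete, the following is a minimal Python/OpenCV sketch of the extraction step: fit an ellipse to the cover edge, then rectify the cover to a front-facing patch with a perspective transform built from the ellipse axes. The function name, the binary `edge_mask` input, and the axis-endpoint construction are illustrative assumptions; they stand in for, and do not reproduce, the paper's Equations (2) and (3).

```python
import cv2
import numpy as np

def extract_cover_sample(image, edge_mask):
    """Rectify an abnormal manhole cover to a front-facing patch.

    edge_mask: binary uint8 mask covering the manhole cover region.
    Returns a square patch in which the elliptical cover appears
    approximately circular, ready for later pasting.
    """
    # Fit an ellipse to the largest contour of the cover mask.
    contours, _ = cv2.findContours(edge_mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    contour = max(contours, key=cv2.contourArea)
    (cx, cy), (w, h), angle = cv2.fitEllipse(contour)
    a, b = w / 2.0, h / 2.0                      # semi-axis lengths
    t = np.deg2rad(angle)
    u = np.array([np.cos(t), np.sin(t)])         # first axis direction
    v = np.array([-np.sin(t), np.cos(t)])        # second axis direction
    c = np.array([cx, cy])

    # Map the four axis endpoints of the ellipse onto the axis
    # endpoints of a circle of radius r: a perspective rectification.
    r = max(a, b)
    src = np.float32([c + a * u, c - a * u, c + b * v, c - b * v])
    dst = np.float32([[2 * r, r], [0, r], [r, 2 * r], [r, 0]])
    M = cv2.getPerspectiveTransform(src, dst)
    size = int(round(2 * r))
    return cv2.warpPerspective(image, M, (size, size))
```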
2. Materials and Methods
2.1. Data Augmentation for Deep Learning
2.2. Deep Learning Manhole Cover Detection
3. Data-Augmented Deep Learning Model
3.1. Abnormal Manhole Cover Sample Expansion
3.2. Visually Guided Copy–Paste Data Augmentation
3.2.1. Pasting Method of Abnormal Manhole Cover Samples
3.2.2. Adaptive Pasting Method Combined with Scene Semantics Information
Algorithm 1 VGCopy-Paste Data Augmentation for Road Manhole Cover Detection
(1) Input the abnormal manhole cover image taken by a mobile device;
(2) Fit the cover edge with an ellipse and use Equations (2) and (3) to calculate the transformation parameters;
(3) Extract the abnormal manhole cover sample;
(4) Input the image taken by the vehicle camera;
(5) Find the pasting position of the cover sample in the vehicle-camera image through the road segmentation model;
(6) Calculate the pasting scale by Equation (9) and paste the sample into the image;
(7) Repeat Steps 1–6 until all images in the training set are enhanced.
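A minimal Python sketch of Steps 4–6 follows, assuming a binary road mask produced by the segmentation model and the vertical pasting band reported in the ablation study (0.67–0.91 of image height). The linear row-proportional scale rule is an assumption that stands in for Equation (9), and the blend is a hard mask paste rather than image harmonization.

```python
import cv2
import numpy as np

def vg_copy_paste(image, road_mask, sample, sample_mask,
                  band=(0.67, 0.91), base_scale=1.0, rng=None):
    """Paste one abnormal-cover sample at a plausible road position.
    Sketch of Algorithm 1, Steps 4-6; the exact scale rule of
    Equation (9) is replaced by a linear row-proportional assumption.
    """
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]

    # Step 5: candidate positions = road pixels inside the vertical
    # pasting band (fractions of image height, cf. the ablation table).
    ys, xs = np.nonzero(road_mask)
    keep = (ys > band[0] * h) & (ys < band[1] * h)
    if not keep.any():
        return image                       # no valid road area; skip
    i = rng.integers(keep.sum())
    cy, cx = ys[keep][i], xs[keep][i]

    # Step 6: perspective-aware scale -- positions lower in the frame
    # are closer to the camera, so the pasted sample is larger.
    scale = base_scale * (cy / h)
    sh = max(int(sample.shape[0] * scale), 4)
    sw = max(int(sample.shape[1] * scale), 4)
    patch = cv2.resize(sample, (sw, sh))
    pmask = cv2.resize(sample_mask, (sw, sh))

    # Clip the paste window to the image bounds and blend by the mask.
    y0, x0 = max(cy - sh // 2, 0), max(cx - sw // 2, 0)
    y1, x1 = min(y0 + sh, h), min(x0 + sw, w)
    patch, pmask = patch[:y1 - y0, :x1 - x0], pmask[:y1 - y0, :x1 - x0]
    m = (pmask > 0)[..., None]
    image[y0:y1, x0:x1] = np.where(m, patch, image[y0:y1, x0:x1])
    return image
```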
4. Discussion
4.1. Experimental Data
4.2. Models
4.3. Implementation Details
4.4. Main Results and Analysis
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
No. | Input | Operator | Description | No. Parameters |
---|---|---|---|---|
1 | 512 × 512 × 3 | conv2d, 3 × 3 | First conv layer | 464 |
2 | 256 × 256 × 16 | bneck, 3 × 3 | Inverted ResBlock, s = 2 | 744 |
3 | 128 × 128 × 16 | bneck, 3 × 3 | Inverted ResBlock, s = 2 | 3864 |
4 | 64 × 64 × 24 | bneck, 3 × 3 | Inverted ResBlock, s = 1 | 5416 |
5 | 64 × 64 × 24 | bneck, 5 × 5 | Inverted ResBlock, s = 2 | 13,736 |
6 | 32 × 32 × 40 | bneck, 5 × 5 | Inverted ResBlock, s = 1 | 57,264 |
7 | 32 × 32 × 40 | bneck, 5 × 5 | Inverted ResBlock, s = 1 | 57,264 |
8 | 32 × 32 × 40 | bneck, 5 × 5 | Inverted ResBlock, s = 1 | 21,968 |
9 | 32 × 32 × 48 | bneck, 5 × 5 | Inverted ResBlock, s = 1 | 29,800 |
10 | 32 × 32 × 48 | bneck, 5 × 5 | Inverted ResBlock, s = 2 | 91,848 |
11 | 16 × 16 × 96 | bneck, 5 × 5 | Inverted ResBlock, s = 1 | 294,096 |
12 | 16 × 16 × 96 | bneck, 5 × 5 | Inverted ResBlock, s = 1 | 294,096 |
13 | 16 × 16 × 96 | shortcut + upsample | Connect to No. 12 layer | 117,824 |
14 | 32 × 32 × 24 | shortcut + upsample | Connect to No. 9 layer | 39,376 |
15 | 64 × 64 × 16 | shortcut + upsample | Connect to No. 4 layer | 56,608 |
16 | 128 × 128 × 64 | upsample | upsample to 512 × 512 × 2 | 84,800 |
Total No. parameters: 1,169,168
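The "bneck" rows above are MobileNetV3-style inverted residual blocks. Below is a minimal PyTorch sketch of one such block; the expansion ratio, ReLU6 activations, and omission of squeeze-and-excitation are assumptions, so its parameter count will not exactly reproduce the table.

```python
import torch
from torch import nn

class Bneck(nn.Module):
    """Inverted residual block ("bneck"): 1x1 expansion -> k x k
    depthwise conv -> 1x1 linear projection, with a shortcut when
    stride = 1 and the input/output channel counts match. The
    expansion ratio of 4 is an assumption, not the paper's setting.
    """

    def __init__(self, c_in, c_out, k=3, stride=1, expand=4):
        super().__init__()
        c_mid = c_in * expand
        self.use_shortcut = stride == 1 and c_in == c_out
        self.block = nn.Sequential(
            nn.Conv2d(c_in, c_mid, 1, bias=False),          # expand
            nn.BatchNorm2d(c_mid), nn.ReLU6(inplace=True),
            nn.Conv2d(c_mid, c_mid, k, stride, k // 2,
                      groups=c_mid, bias=False),            # depthwise
            nn.BatchNorm2d(c_mid), nn.ReLU6(inplace=True),
            nn.Conv2d(c_mid, c_out, 1, bias=False),         # project
            nn.BatchNorm2d(c_out),                          # linear bottleneck
        )

    def forward(self, x):
        y = self.block(x)
        return x + y if self.use_shortcut else y

# e.g., row 5 of the table: 5 x 5 bneck, stride 2, 24 -> 40 channels
layer5 = Bneck(24, 40, k=5, stride=2)
out = layer5(torch.randn(1, 24, 64, 64))   # -> (1, 40, 32, 32)
```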
Category | No. of Manhole Cover Samples Shot by the Vehicle’s Camera | No. of Manhole Cover Samples Shot by Smartphone | No. of Manhole Cover Samples after Data Augmentation |
---|---|---|---|
Dislocated | 202 | 36 | 1549 |
Damaged | 81 | 26 | 1035 |
Missing | 41 | 20 | 809 |
Normal | 4781 | 0 | 4781 |
Model | Method | Params | mAP@0.5 (%) | mAP@0.5:0.95 (%)
---|---|---|---|---
FastestDet | Without copy–paste | 4.74 M | 63.2 | 30.7
FastestDet | Random copy–paste | 4.74 M | 67.4 | 31.2
FastestDet | VGCopy-Paste | 4.74 M | 76.1 | 46.0
YOLOv5s | Without copy–paste | 7.02 M | 61.9 | 30.9
YOLOv5s | Random copy–paste | 7.02 M | 78.3 | 41.7
YOLOv5s | VGCopy-Paste | 7.02 M | 80.0 | 54.6
CenterNet-DLA34 | Without copy–paste | 20.17 M | 64.1 | 22.0
CenterNet-DLA34 | Random copy–paste | 20.17 M | 70.5 | 38.7
CenterNet-DLA34 | VGCopy-Paste | 20.17 M | 70.9 | 39.9
RetinaNet-D34 | Without copy–paste | 31.52 M | 50.5 | 17.8
RetinaNet-D34 | Random copy–paste | 31.52 M | 70.3 | 24.3
RetinaNet-D34 | VGCopy-Paste | 31.52 M | 70.7 | 25.3
YOLOv7 | Without copy–paste | 37.21 M | 73.4 | 35.9
YOLOv7 | Random copy–paste | 37.21 M | 80.5 | 50.6
YOLOv7 | VGCopy-Paste | 37.21 M | 81.8 | 56.5
Model | Params | meanIoU (%)
---|---|---
FCN | 35.31 M | 91.9 |
UNet | 4.32 M | 96.5 |
Ours | 1.17 M | 97.9 |
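For reference, meanIoU is the intersection-over-union computed per class and averaged over classes; given the backbone's final 512 × 512 × 2 output, two classes (road vs. background) are assumed in this minimal sketch.

```python
import numpy as np

def mean_iou(pred, target, num_classes=2):
    """Mean intersection-over-union over classes, as reported in the
    table; pred and target are integer label maps of the same shape.
    """
    ious = []
    for c in range(num_classes):
        p, t = pred == c, target == c
        inter = np.logical_and(p, t).sum()
        union = np.logical_or(p, t).sum()
        if union > 0:                 # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious)) if ious else 0.0
```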
Method | mAP@0.5 (%) | mAP@0.5:0.95 (%)
---|---|---
Mixup | 60.1 | 33.1
Cutout | 46.6 | 13.4
Random affine | 63.1 | 27.8
HSV augmentation | 40.8 | 13.1
Random copy–paste | 78.3 | 41.7
VGCopy-Paste | 80.0 | 54.6
No. of Pasted Samples | Model | Pasted Range | mAP@0.5 (%) | mAP@0.5:0.95 (%)
---|---|---|---|---
1 | YOLOv5 | 0.58–0.91 | 79.5 | 49.9 |
1 | YOLOv5 | 0.67–0.91 | 80.0 | 54.6 |
1 | YOLOv5 | 0.77–0.91 | 75.4 | 48.9 |
1 | YOLOv5 | 0.67–0.91 | 80.0 | 54.6 |
2 | YOLOv5 | 0.67–0.91 | 79.7 | 51.5 |
3 | YOLOv5 | 0.67–0.91 | 79.0 | 49.8 |
1 | YOLOv7 | 0.58–0.91 | 81.6 | 55.8 |
1 | YOLOv7 | 0.67–0.91 | 81.8 | 56.5 |
1 | YOLOv7 | 0.77–0.91 | 80.0 | 54.9 |
1 | YOLOv7 | 0.67–0.91 | 81.8 | 56.5 |
2 | YOLOv7 | 0.67–0.91 | 82.8 | 52.5 |
3 | YOLOv7 | 0.67–0.91 | 81.0 | 51.2 |