VPP: Visual Pollution Prediction Framework Based on a Deep Active Learning Approach Using Public Road Images
Abstract
1. Introduction
- The proposed AI-based real-time visual pollution prediction (VPP) framework aims to simultaneously detect and classify visual pollution (VP) objects in color road images.
- An end-to-end AI-based framework is trained and evaluated on a private dataset in a multi-class classification scenario to simultaneously predict several pollutant types.
- A new private VP dataset, collected by the Ministry of Municipal and Rural Affairs and Housing (MOMRAH), Saudi Arabia, is introduced. This MOMRAH benchmark dataset covers three VP classes: excavation barriers, potholes, and dilapidated sidewalks.
- Deep active learning (DAL) supports MOMRAH experts in automatically annotating the VP dataset for two tasks: detection with bounding boxes and classification with class labels. Annotation is performed at the object level rather than the image level, because a single image may contain multiple objects from different classes (a minimal illustrative sketch of such a DAL annotation loop follows this list).
- A comprehensive training process is conducted to optimize the proposed VPP framework and select the best-performing predictor. Several emerging AI detectors are evaluated: MobileNet-SSDv2, EfficientDet, Faster R-CNN, Detectron2, YOLOv5, and YOLOv7.
- An ablation (adaptation) study is conducted to verify the reliability of the proposed AI-based VPP framework on unseen images from different sources.
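The following Python sketch illustrates one way such a pool-based DAL pre-annotation loop can be organized. It is a minimal sketch under stated assumptions: the helper functions (`train_detector`, `predict_with_confidence`, `expert_review`), file names, pool sizes, and batch size are hypothetical placeholders and do not reproduce the authors' actual implementation.

```python
# Minimal pool-based deep active learning (DAL) sketch for pre-annotation.
# All helper functions, paths, and numbers are illustrative placeholders.
import random
from dataclasses import dataclass, field

@dataclass
class Sample:
    image_path: str
    boxes: list = field(default_factory=list)  # [(class_id, x, y, w, h), ...]

def train_detector(labeled):
    """Placeholder: train a detector (e.g., a YOLO variant) on the labeled pool."""
    return object()  # stands in for a trained model

def predict_with_confidence(model, sample):
    """Placeholder: run the detector and return (proposed_boxes, confidence)."""
    return [(0, 0.1, 0.1, 0.2, 0.2)], random.random()

def expert_review(sample, proposed_boxes):
    """Placeholder: a MOMRAH expert confirms or corrects the proposed boxes."""
    return proposed_boxes

labeled = [Sample(f"seed_{i}.jpg", [(0, 0.3, 0.3, 0.2, 0.2)]) for i in range(50)]
unlabeled = [Sample(f"pool_{i}.jpg") for i in range(500)]
BATCH_SIZE = 100  # images sent to experts per DAL round

for round_idx in range(3):  # a few DAL rounds
    model = train_detector(labeled)
    scored = [(s, *predict_with_confidence(model, s)) for s in unlabeled]
    # Prioritize the least confident images: they benefit most from expert labels.
    scored.sort(key=lambda item: item[2])
    batch, rest = scored[:BATCH_SIZE], scored[BATCH_SIZE:]
    for sample, boxes, _conf in batch:
        sample.boxes = expert_review(sample, boxes)  # object-level correction
        labeled.append(sample)
    unlabeled = [s for s, _, _ in rest]
    print(f"round {round_idx}: labeled={len(labeled)}, unlabeled={len(unlabeled)}")
```

The intent, as described in the bullet above, is that model proposals act as draft object-level annotations that experts only confirm or correct, which keeps object-level labeling of a large road-image pool tractable.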
2. Related Works
3. Materials and Methods
3.1. Visual Pollution Real Dataset: MOMRAH VP Dataset
3.2. Data Pre-Processing
3.3. Deep Active Learning (DAL) for Automatic Data Annotation
3.4. Training Data Enlargement via an Augmentation Strategy
3.5. The Concept of VP Object Detection—VPP-Based YOLO
3.5.1. Hyperparameter Evolution
3.5.2. Transfer Learning
3.6. Experimental Setting
3.7. Implementation Environment
3.8. Evaluation Strategy
4. Experimental Results and Discussion
4.1. The Optimization Results of the Proposed AI-Based VPP Framework
4.1.1. Evaluation Results Based on the Various YOLO Structures’ Depth and Width
4.1.2. Evaluation Results of the Best YOLO Candidate with Various Activation Functions
4.1.3. Influence of Hyperparameter Optimization on Prediction Performance
4.2. Prediction Evaluation Performance during the Deep Active Learning (DAL) Strategy
4.3. Prediction Evaluation Results Using the Whole Annotated Dataset
4.4. Evaluation Comparison Results
4.5. Work Limitation and Future Work
4.6. Ablation Study
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Campaign to Improve Saudi Arabia’s Urban Landscape. Available online: https://www.arabnews.com/node/1910761/saudi-arabia (accessed on 26 April 2022).
2. Aqeel, A.B. Quality of Life. Available online: https://www.vision2030.gov.sa/v2030/vrps/qol/ (accessed on 26 April 2022).
3. Models of Drivers of Biodiversity and Ecosystem Change. Available online: https://ipbes.net/models-drivers-biodiversity-ecosystem-change (accessed on 10 December 2022).
4. Visual Pollution, Pollution A to Z. Available online: https://www.encyclopedia.com/environment/educational-magazines/visual-pollution (accessed on 25 April 2022).
5. Ahmed, N.; Islam, M.N.; Tuba, A.S.; Mahdy, M.; Sujauddin, M. Solving visual pollution with deep learning: A new nexus in environmental management. J. Environ. Manag. 2019, 248, 109253.
6. Zhu, X.; Lyu, S.; Wang, X.; Zhao, Q. TPH-YOLOv5: Improved YOLOv5 based on transformer prediction head for object detection on drone-captured scenarios. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), Montreal, QC, Canada, 11–17 October 2021; pp. 2778–2788.
7. Wang, C.-Y.; Bochkovskiy, A.; Liao, H.-Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv 2022, arXiv:2207.02696.
8. Szczepańska, M.; Wilkaniec, A.; Škamlová, L. Visual pollution in natural and landscape protected areas: Case studies from Poland and Slovakia. Quaest. Geogr. 2019, 38, 133–149.
9. Chmielewski, S. Chaos in motion: Measuring visual pollution with tangential view landscape metrics. Land 2020, 9, 515.
10. Liu, H.; Lei, F.; Tong, C.; Cui, C.; Wu, L. Visual smoke detection based on ensemble deep CNNs. Displays 2021, 69, 102020.
11. Al-Masni, M.A.; Al-Antari, M.A.; Choi, M.T.; Han, S.M.; Kim, T.S. Skin lesion segmentation in dermoscopy images via deep full resolution convolutional networks. Comput. Methods Programs Biomed. 2018, 162, 221–231.
12. Al-Antari, M.A.; Al-Masni, M.A.; Choi, M.T.; Han, S.M.; Kim, T.S. A fully integrated computer-aided diagnosis system for digital X-ray mammograms via deep learning detection, segmentation, and classification. Int. J. Med. Inform. 2018, 117, 44–54.
13. Al-antari, M.A.; Hua, C.-H.; Bang, J.; Lee, S. Fast deep learning computer-aided diagnosis of COVID-19 based on digital chest X-ray images. Appl. Intell. 2020, 51, 2890–2907.
14. Al-Antari, M.A.; Kim, T.-S. Evaluation of deep learning detection and classification towards computer-aided diagnosis of breast lesions in digital X-ray mammograms. Comput. Methods Programs Biomed. 2020, 196, 105584.
15. Salman, A.G.; Kanigoro, B.; Heryadi, Y. Weather forecasting using deep learning techniques. In Proceedings of the 2015 International Conference on Advanced Computer Science and Information Systems (ICACSIS), Depok, Indonesia, 10–11 October 2015; pp. 281–285.
16. Wang, X.; Chen, J.; Quan, S.; Wang, Y.-X.; He, H. Hierarchical model predictive control via deep learning vehicle speed predictions for oxygen stoichiometry regulation of fuel cells. Appl. Energy 2020, 276, 115460.
17. Gunning, D.; Aha, D. DARPA’s explainable artificial intelligence (XAI) program. AI Mag. 2019, 40, 44–58.
18. Al-antari, M.A.; Hua, C.-H.; Bang, J.; Choi, D.-J.; Kang, S.M.; Lee, S. A rapid deep learning computer-aided diagnosis to simultaneously detect and classify the novel COVID-19 pandemic. In Proceedings of the 2020 IEEE-EMBS Conference on Biomedical Engineering and Sciences (IECBES), Langkawi Island, Malaysia, 1–3 March 2021; pp. 585–588.
19. Wu, X.; Sahoo, D.; Hoi, S.C. Recent advances in deep learning for object detection. Neurocomputing 2020, 396, 39–64.
20. Koch, C.; Brilakis, I. Pothole detection in asphalt pavement images. Adv. Eng. Inform. 2011, 25, 507–515.
21. Shu, Z.; Yan, Z.; Xu, X. Pavement crack detection method of street view images based on deep learning. J. Phys. Conf. Ser. 2021, 1952, 022043.
22. Yang, F.; Zhang, L.; Yu, S.; Prokhorov, D.; Mei, X.; Ling, H. Feature pyramid and hierarchical boosting network for pavement crack detection. IEEE Trans. Intell. Transp. Syst. 2019, 21, 1525–1535.
23. Wakil, K.; Naeem, M.A.; Anjum, G.A.; Waheed, A.; Thaheem, M.J.; Hussnain, M.Q.u.; Nawaz, R. A hybrid tool for visual pollution assessment in urban environments. Sustainability 2019, 11, 2211.
24. Wakil, K.; Tahir, A.; Hussnain, M.Q.u.; Waheed, A.; Nawaz, R. Mitigating urban visual pollution through a multistakeholder spatial decision support system to optimize locational potential of billboards. ISPRS Int. J. Geo-Inf. 2021, 10, 60.
25. Chiu, Y.-C.; Tsai, C.-Y.; Ruan, M.-D.; Shen, G.-Y.; Lee, T.-T. MobileNet-SSDv2: An improved object detection model for embedded systems. In Proceedings of the 2020 International Conference on System Science and Engineering (ICSSE), Kagawa, Japan, 31 August–3 September 2020; pp. 1–5.
26. Tan, M.; Pang, R.; Le, Q.V. EfficientDet: Scalable and efficient object detection. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 10781–10790.
27. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149.
28. Pham, V.; Pham, C.; Dang, T. Road damage detection and classification with Detectron2 and Faster R-CNN. In Proceedings of the 2020 IEEE International Conference on Big Data (Big Data), Atlanta, GA, USA, 10–13 December 2020; pp. 5592–5601.
29. Dima, T.F.; Ahmed, M.E. Using YOLOv5 algorithm to detect and recognize American sign language. In Proceedings of the 2021 International Conference on Information Technology (ICIT), Amman, Jordan, 14–15 July 2021; pp. 603–607.
30. Al-Masni, M.A.; Al-Antari, M.A.; Park, J.-M.; Gi, G.; Kim, T.-Y.; Rivera, P.; Valarezo, E.; Choi, M.-T.; Han, S.-M.; Kim, T.-S. Simultaneous detection and classification of breast masses in digital mammograms via a deep learning YOLO-based CAD system. Comput. Methods Programs Biomed. 2018, 157, 85–94.
31. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
32. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
33. Huang, G.; Liu, Z.; Maaten, L. Densely connected convolutional networks. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017.
34. Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), Montreal, QC, Canada, 11–17 October 2021; pp. 10012–10022.
35. He, K.; Zhang, X.; Ren, S.; Sun, J. Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 37, 1904–1916.
36. Mohammad, A.; Hafiz, A.; Jamil, H.; Mugahed, A.; Bader, A.; Areeba, A. Saudi Arabia Public Roads Visual Pollution Dataset; King Faisal University: Hofuf, Saudi Arabia, 2022.
37. Tzutalin, L. LabelImg. Available online: https://github.com/tzutalin/labelImg (accessed on 25 April 2022).
38. Kim, J.-H.; Kim, N.; Park, Y.W.; Won, C.S. Object detection and classification based on YOLO-v5 with improved maritime dataset. J. Mar. Sci. Eng. 2022, 10, 377.
39. Redmon, J.; Farhadi, A. YOLO9000: Better, faster, stronger. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017.
40. Liu, S.; Qi, L.; Qin, H.; Shi, J.; Jia, J. Path aggregation network for instance segmentation. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Salt Lake City, UT, USA, 18–22 June 2018; pp. 8759–8768.
41. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Las Vegas, NV, USA, 26 June–1 July 2016; pp. 779–788.
42. Li, Z.; Tian, X.; Liu, X.; Liu, Y.; Shi, X. A two-stage industrial defect detection framework based on improved-YOLOv5 and optimized-Inception-ResNetV2 models. Appl. Sci. 2022, 12, 834.
43. Bhatia, Y.; Rai, R.; Gupta, V.; Aggarwal, N.; Akula, A. Convolutional neural networks based potholes detection using thermal imaging. J. King Saud Univ. Comput. Inf. Sci. 2022, 34, 578–588.
44. Yousaf, M.H.; Azhar, K.; Murtaza, F.; Hussain, F. Visual analysis of asphalt pavement for detection and localization of potholes. Adv. Eng. Inform. 2018, 38, 527–537.
45. Baek, J.-W.; Chung, K. Pothole classification model using edge detection in road image. Appl. Sci. 2020, 10, 6662.
46. Al-Tam, R.M.; Al-Hejri, A.M.; Narangale, S.M.; Samee, N.A.; Mahmoud, N.F.; Al-Masni, M.A.; Al-Antari, M.A. A hybrid workflow of residual convolutional transformer encoder for breast cancer classification using digital X-ray mammograms. Biomedicines 2022, 10, 2971.
47. Roboflow Universe. Pothole Detection Dataset. Available online: https://universe.roboflow.com/aegis/pothole-detection-i00zy (accessed on 17 December 2022).
AI Model | Precision | Recall | F1-Score | mAP |
---|---|---|---|---|
YOLOv5s | 0.65 | 0.50 | 0.57 | 0.55 |
YOLOv5m | 0.69 | 0.57 | 0.62 | 0.61 |
YOLOv5l | 0.72 | 0.59 | 0.65 | 0.65 |
YOLOv5x | 0.70 | 0.62 | 0.66 | 0.67 |
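As a reading aid for the Precision, Recall, and F1-Score columns in this and the following tables, the F1-score is the harmonic mean of precision and recall. A quick check against the YOLOv5l row above (a minimal sketch; the values are taken from the table, and the formula is the standard definition):

```python
# F1 is the harmonic mean of precision and recall (standard definition).
precision, recall = 0.72, 0.59  # YOLOv5l row from the table above
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 2))  # -> 0.65, matching the reported F1-score
```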
Hyperparameter Setting | Precision | Recall | F1-Score | mAP@0.5 |
---|---|---|---|---|
hyp.scratch-low | 0.61 | 0.50 | 0.55 | 0.53 |
hyp.scratch-med | 0.70 | 0.58 | 0.63 | 0.62 |
hyp.scratch-high | 0.74 | 0.66 | 0.70 | 0.71 |
AI Predictor | Precision | Recall | F1-Score | mAP@0.5 | Inference Time (ms) | FPS |
---|---|---|---|---|---|---|
MobileNetSSDv2 | 0.70 | 0.58 | 0.63 | 0.62 | 600 | 13.2 |
EfficientDet | 0.74 | 0.66 | 0.70 | 0.72 | 583.1 | 8.32 |
Faster R-CNN | 0.84 | 0.77 | 0.80 | 0.80 | 540.2 | 98.2 |
Detectron2 | 0.87 | 0.86 | 0.86 | 0.89 | 342.0 | 120.2 |
YOLOv5x | 0.88 | 0.89 | 0.88 | 0.92 | 22.7 | 319 |
YOLOv7 | 0.89 | 0.88 | 0.89 | 0.93 | 18.5 | 325 |
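Since the best-performing candidates in the table above are YOLO-family detectors, the sketch below shows how such a single-stage predictor is typically invoked at inference time, using the public Ultralytics YOLOv5 torch.hub interface. The image path is hypothetical, and the COCO-pretrained weights are only a stand-in; the authors' VPP model would instead load weights trained on the MOMRAH classes, which are not reproduced here.

```python
# Inference sketch with the public Ultralytics YOLOv5 hub API (not the authors'
# released VPP model). "road_scene.jpg" is a hypothetical example image.
import torch

# COCO-pretrained YOLOv5x as a stand-in; a VPP model would load custom weights
# trained on the MOMRAH classes (excavation barrier, pothole, dilapidated sidewalk).
model = torch.hub.load("ultralytics/yolov5", "yolov5x", pretrained=True)
model.conf = 0.25  # confidence threshold applied to reported detections

results = model("road_scene.jpg")      # run detection on a public road image
results.print()                        # per-class counts and timing summary
detections = results.pandas().xyxy[0]  # DataFrame: boxes, confidences, class names
print(detections[["name", "confidence", "xmin", "ymin", "xmax", "ymax"]])
```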
Reference | Dataset | Target Classes | Methodology | Precision (%) | Recall (%) | F1-Score (%) | mAP (%)
---|---|---|---|---|---|---|---
Aparna et al. (2019) [43] | Road thermal images | Pothole | Classification via CNN-based ResNet | 81.15 | - | - | -
M. H. Yousaf et al. (2018) [44] | Private dataset: 120 pavement images | Pothole | Classification via SVM | 71.59 | - | - | -
Ji-Won Baek et al. (2020) [45] | Private road damage images | Pothole | YOLO-based algorithm | 83.45 | - | - | -
Pham et al. (2020) [28] | 2020 IEEE Global Road Damage Cup Challenge | Longitudinal crack, transverse crack, alligator crack, and pothole | Faster R-CNN | - | - | 51.40 | -
Proposed * | Private MOMRAH Dataset | Excavation barriers, potholes, and dilapidated sidewalks | Simultaneous detection and classification via AI-based VPP framework | 89.0 | 88.0 | 89.0 | 93.0
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Citation: AlElaiwi, M.; Al-antari, M.A.; Ahmad, H.F.; Azhar, A.; Almarri, B.; Hussain, J. VPP: Visual Pollution Prediction Framework Based on a Deep Active Learning Approach Using Public Road Images. Mathematics 2023, 11, 186. https://doi.org/10.3390/math11010186