AquaPile-YOLO: Pioneering Underwater Pile Foundation Detection with Forward-Looking Sonar Image Processing
Abstract
1. Introduction
2. Methods
2.1. Forward-Looking Sonar
2.2. AquaPile-YOLO Network
2.2.1. Data Augmentation
2.2.2. Transfer Learning Strategy
2.2.3. Multi-Scale Feature Fusion
2.2.4. Attention Mechanism
- (1) C3 Module with CBAM Attention
- (2) MPConv Module
- (3) C3N Module
2.2.5. Loss Function Optimization
2.2.6. Soft-NMS (Soft-Non-Maximum Suppression)
3. Experiments
3.1. Experimental Design
3.2. Data Collection
3.3. Experimental Procedure
4. Results
4.1. Ablation Studies
4.2. Comparisons
5. Discussion
- Multi-scale Feature Fusion: By incorporating a multi-scale feature fusion scheme, this study effectively captures underwater targets of varying sizes, thereby improving small target detection accuracy.
- Enhanced Attention Mechanism: The attention mechanism is improved by combining the Normalized Wasserstein Distance (NWD) and Intersection over Union (IoU), enhancing the model’s ability to distinguish small targets and reducing scale sensitivity. This enhancement is complemented by structural modifications within the YOLOv5 network, allowing for a more nuanced focus on critical image regions.
- Application of Soft-NMS: Rather than traditional NMS, Soft-NMS better handles occlusions and overlapping targets, limiting missed and false detections in complex scenes.
- Data Augmentation Strategy: The model’s generalization and adaptability to diverse environmental conditions are bolstered through data augmentation techniques like rotation, random cropping, and noise addition.
- Algorithm Optimization: While the AquaPile-YOLO algorithm has demonstrated high accuracy, there is a need to continue optimizing the model structure. Reducing computational resource consumption and improving detection speed are essential to meet the demands of real-time detection, particularly in resource-constrained environments.
- Multimodal Data Fusion: To further improve detection accuracy and robustness, exploring the combination of sonar images with other sensor data, such as optical images or LiDAR data, is a promising avenue. Multi-modal data fusion could provide a more comprehensive understanding of the underwater environment and enhance the algorithm’s capabilities.
- Broader Environmental Adaptability: Assessing the model’s performance across a broader range of underwater environments is crucial. Testing the algorithm in various water qualities, lighting conditions, and underwater structures will enhance the model’s generality and adaptability, ensuring its effectiveness in diverse marine settings.
- Automation and Intelligence: The development of an automated sonar image collection system, integrated into underwater robots or autonomous underwater vehicles (AUV/USV/ROV/UUV), is essential for achieving fully autonomous underwater detection tasks. This advancement would increase the efficiency and safety of underwater operations.
- Engineering Application Deployment: Integrating the AquaPile-YOLO model into existing underwater monitoring systems for long-term deployment and performance evaluation is vital. Such integration will provide insights into the model’s practical performance and longevity, facilitating its adoption in marine engineering and environmental monitoring projects.
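The multi-scale fusion idea discussed above can be illustrated with a minimal sketch. This is not the paper's exact implementation: the channels-first shapes, the nearest-neighbor upsampling, and the function name `fuse_two_scales` are all illustrative assumptions.

```python
import numpy as np

def fuse_two_scales(fine, coarse):
    """FPN-style fusion sketch: upsample the coarse feature map to the fine
    map's resolution by nearest-neighbor repetition, then concatenate along
    the channel axis. Tensors are channels-first: (C, H, W)."""
    scale = fine.shape[1] // coarse.shape[1]
    up = coarse.repeat(scale, axis=1).repeat(scale, axis=2)
    return np.concatenate([fine, up], axis=0)

# The fused map carries both fine spatial detail (helpful for small piles)
# and coarse semantic context from the deeper layer.
fused = fuse_two_scales(np.zeros((8, 16, 16)), np.ones((16, 8, 8)))
# fused.shape == (24, 16, 16)
```

A detection head convolving this fused map sees both resolutions at once, which is the mechanism behind the improved small-target accuracy claimed above.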
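The NWD term mentioned in the attention-mechanism bullet has a closed form when each box is modeled as a 2-D Gaussian. Below is a sketch of the metric and one plausible way to blend it with IoU; the constant `c` and the weight `alpha` are illustrative assumptions, and the paper's exact formulation may differ.

```python
import math

def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein Distance between two boxes in [cx, cy, w, h]
    format. Each box is modeled as a 2-D Gaussian; c is a dataset-dependent
    normalizing constant (the value here is an assumption)."""
    w2 = math.sqrt((box_a[0] - box_b[0]) ** 2 + (box_a[1] - box_b[1]) ** 2
                   + ((box_a[2] - box_b[2]) / 2) ** 2
                   + ((box_a[3] - box_b[3]) / 2) ** 2)
    return math.exp(-w2 / c)

def blended_similarity(nwd_val, iou_val, alpha=0.5):
    """One plausible blend: a weighted sum of NWD and IoU. Unlike IoU, NWD
    stays informative when tiny boxes barely overlap, which is what reduces
    scale sensitivity for small sonar targets."""
    return alpha * nwd_val + (1 - alpha) * iou_val
```

For two tiny boxes with zero overlap, IoU is exactly 0 and carries no gradient signal, while NWD still decreases smoothly with center distance; blending the two keeps the loss informative across target scales.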
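Gaussian Soft-NMS, as proposed by Bodla et al., can be sketched as follows; `sigma` and the score floor are typical defaults, not necessarily the values used in this paper.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, format [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter + 1e-9)

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: instead of deleting boxes that overlap the current
    top detection, decay their scores by exp(-iou^2 / sigma), so occluded
    neighbors can still survive with a reduced score."""
    scores = scores.astype(float).copy()
    keep = []
    idx = np.arange(len(scores))
    while len(idx) > 0:
        m = idx[np.argmax(scores[idx])]   # highest remaining score
        keep.append(int(m))
        idx = idx[idx != m]
        if len(idx) == 0:
            break
        ov = iou(boxes[m], boxes[idx])
        scores[idx] *= np.exp(-(ov ** 2) / sigma)   # Gaussian penalty
        idx = idx[scores[idx] > score_thresh]       # drop only near-zero boxes
    return keep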
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Gan, J. Development of an Underwater Detection Robot for the Structures with Pile Foundation. J. Mar. Sci. Eng. 2024, 12, 1051. [Google Scholar] [CrossRef]
- Lu, Y.; Sang, E. Feature extraction techniques of underwater objects based on active sonars—An overview. J. Harbin Eng. Univ. 1997, 18, 43–54. (In Chinese) [Google Scholar]
- Calder, B.R.; Linnett, L.M.; Carmichael, D.R. Bayesian approach to object detection in sidescan sonar. IEE Proc.-Vis. Image Signal Process. 1998, 145, 221–228. [Google Scholar] [CrossRef]
- Foresti, G.L.; Gentili, S. A Vision Based System for Object Detection in Underwater Images. Int. J. Pattern Recognit. Artif. Intell. 2000, 14, 167–188. [Google Scholar] [CrossRef]
- Guo, H. Post-Image Processing of High-Resolution Imaging Sonar. Master’s Thesis, Harbin Engineering University, Harbin, China, 2002. (In Chinese). [Google Scholar]
- Liu, C.C.; Sang, E.F. Underwater Acoustic image processing based on mathematical morphology. J. Jilin Univ. Inf. Sci. Ed. 2003, 21, 52–57. (In Chinese) [Google Scholar]
- Kelly, J.G.; Carpenter, R.N.; Tague, J.A. Object classification and acoustic imaging with active sonar. J. Acoust. Soc. Am. 1992, 91 Pt 1, 2073–2081. [Google Scholar] [CrossRef]
- Ye, X.F.; Zhang, Z.H.; Liu, P.X.; Guan, H.L. Sonar image segmentation based on GMRF and level-set models. Ocean. Eng. 2010, 37, 891–901. [Google Scholar] [CrossRef]
- Wang, X. Research on Underwater Sonar Images Objective Detection and Based Respectively on MRF and Level-Set. Ph.D. Thesis., Harbin Engineering University, Harbin, China, 2010. (In Chinese). [Google Scholar]
- Sheng, H.; Meng, F.; Li, Q.; Ma, G.; Cao, Y. Enhancement Algorithm of Side-scan Sonar Image in Curvelet Transform Domain. Ocean. Surv. Mapp. 2012, 32, 8–17. (In Chinese) [Google Scholar]
- Sheng, Z.; Huo, G. Detection of underwater mine target in sidescan sonar image based on sample simulation and transfer learning. CAAI Trans. Intell. Syst. 2021, 16, 385–392. (In Chinese) [Google Scholar]
- Valdenegro-Toro, M. End-to-end object detection and recognition in forward-looking sonar images with convolutional neural networks. In Proceedings of the 2016 IEEE/OES Autonomous Underwater Vehicles (AUV), Tokyo, Japan, 6–9 November 2016. [Google Scholar]
- Gong, W.; Tian, J.; Huang, H. Underwater sonar image small target recognition method based on shape features. J. Appl. Acoust. 2021, 40, 294–302. (In Chinese) [Google Scholar]
- Bian, H.Y.; Sang, E.F.; Ji, X.C.; Zhao, J.Y. Simulation research on acoustic lens beamforming. J. Harbin Eng. Univ. 2004, 25, 43–45. (In Chinese) [Google Scholar]
- Yang, C.Y.; Xu, F.; Wei, J.J. Seafloor sediments classification using a neighborhood gray level co-occurrence matrix. J. Harbin Eng. Univ. 2005, 26, 561–564. (In Chinese) [Google Scholar]
- Gao, S.; Xu, J.; Zhang, P. Automatic target recognition of mine-like objects in sonar images. Mine Warf. Ship Prot. 2006, 1, 42–45. (In Chinese) [Google Scholar]
- Fandos, R.; Zoubir, A.M.; Siantidis, K. Unified Design of a Feature-Based ADAC System for Mine Hunting Using Synthetic Aperture Sonar. IEEE Trans. Geosci. Remote Sens. 2014, 52, 2413–2426. [Google Scholar] [CrossRef]
- Valdenegro-Toro, M. Objectness Scoring and Detection Proposals in Forward-Looking Sonar Images with Convolutional Neural Networks. In Artificial Neural Networks in Pattern Recognition, Proceedings of the 7th IAPR TC3 Workshop, ANNPR 2016, Ulm, Germany, 28–30 September 2016; Lecture Notes in Computer Science; Springer International Publishing: Cham, Switzerland, 2016; pp. 209–219. [Google Scholar]
- Zhu, K.; Tian, J.; Huang, H. Underwater objects classification method in high-resolution sonar images using deep neural network. Acta Acust. 2019, 44, 595–603. (In Chinese) [Google Scholar]
- Sawas, J.; Petillot, Y.; Pailhas, Y. Cascade of Boosted Classifiers for Rapid Detection of Underwater Objects. In Proceedings of the 10th European Conference on Underwater Acoustics, Istanbul, Turkey, 5–9 July 2010. [Google Scholar]
- Reed, S.; Petillot, Y.; Bell, J. Automated approach to classification of mine-like objects in sidescan sonar using highlight and shadow information. IEE Proc. Radar Sonar Navig. 2004, 151, 48–56. [Google Scholar] [CrossRef]
- Isaacs, J.C. Sonar automatic target recognition for underwater UXO remediation. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Boston, MA, USA, 7–12 June 2015. [Google Scholar]
- Williams, D.P.; Groen, J. A fast physics-based, environmentally adaptive underwater object detection algorithm. In Proceedings of the OCEANS 2011 IEEE—Spain, Santander, Spain, 6–9 June 2011. [Google Scholar]
- Fandos, R.; Zoubir, A.M. Optimal Feature Set for Automatic Detection and Classification of Underwater Objects in SAS Images. IEEE J. Sel. Top. Signal Process. 2011, 5, 454–468. [Google Scholar] [CrossRef]
- Myers, V.; Fawcett, J. A Template Matching Procedure for Automatic Target Recognition in Synthetic Aperture Sonar Imagery. IEEE Signal Process. Lett. 2010, 17, 683–686. [Google Scholar] [CrossRef]
- Hurtós, N.; Palomeras, N.; Nagappa, S.; Salvi, J. Automatic detection of underwater chain links using a forward-looking sonar. In Proceedings of the 2013 MTS/IEEE OCEANS—Bergen, Bergen, Norway, 10–14 June 2013. [Google Scholar] [CrossRef]
- Kocak, D.M.; Dalgleish, F.R.; Caimi, F.M.; Schechner, Y.Y. A focus on recent developments and trends in underwater imaging. Mar. Technol. Soc. J. 2008, 42, 52–67. [Google Scholar] [CrossRef]
- Fan, Z.; Xia, W.; Liu, X.; Li, H. Detection and segmentation of underwater objects from forward-looking sonar based on a modified Mask RCNN. Signal Image Video Process. 2021, 15, 1135–1143. [Google Scholar] [CrossRef]
- Zhang, P.; Tang, J.; Zhong, H.; Ning, M.; Liu, D.; Wu, K. Self-Trained Target Detection of Radar and Sonar Images Using Automatic Deep Learning. IEEE Trans. Geosci. Remote Sens. 2021, 60, 1–4. [Google Scholar] [CrossRef]
- Xie, K.; Yang, J.; Qiu, K. A Dataset with Multibeam Forward-Looking Sonar for Underwater Object Detection. Sci. Data 2022, 9, 739. [Google Scholar] [CrossRef] [PubMed]
- Zhang, H.; Tian, M.; Shao, G.; Cheng, J.; Liu, J. Target Detection of Forward-Looking Sonar Image Based on Improved YOLOv5. IEEE Access 2022, 10, 18023–18034. [Google Scholar] [CrossRef]
- Gaspar, A.R.; Matos, A. Feature-Based Place Recognition Using Forward-Looking Sonar. J. Mar. Sci. Eng. 2023, 11, 2198. [Google Scholar] [CrossRef]
- Jiao, W.; Zhang, J.; Zhang, C. Open-set recognition with long-tail sonar images. Expert Syst. Appl. 2024, 249 Pt A, 123495. [Google Scholar] [CrossRef]
- Li, Y.; Ye, X.; Zhang, W. TransYOLO: High-Performance Object Detector for Forward Looking Sonar Images. IEEE Signal Process. Lett. 2022, 29, 2098–2102. [Google Scholar]
- Haiying Marine. HY1645 Imaging Sonar. Available online: https://www.haiyingmarine.com/index.php?a=shows&catid=74&id=106 (accessed on 15 January 2025).
- Xia, W.; Jin, X.; Dou, F. Thinned Array Design With Minimum Number of Transducers for Multibeam Imaging Sonar. IEEE J. Ocean. Eng. 2017, 42, 892–900. [Google Scholar] [CrossRef]
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016. [Google Scholar]
- Redmon, J.; Farhadi, A. YOLOv3: An Incremental Improvement. arXiv 2018, arXiv:1804.02767v1. [Google Scholar]
- Bochkovsky, A.; Wang, C.Y.; Liao, H.Y. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv 2020, arXiv:2004.10934. [Google Scholar]
- Huo, G.; Wu, Z.; Li, J. Underwater Object Classification in Sidescan Sonar Images Using Deep Transfer Learning and Semisynthetic Training Data. IEEE Access 2020, 8, 47407–47418. [Google Scholar] [CrossRef]
- Wang, C.Y.; Liao, H.Y.M.; Wu, Y.H.; Chen, P.Y.; Hsieh, J.W.; Yeh, I.H. CSPNet: A New Backbone that can Enhance Learning Capability of CNN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 16–17 June 2019. [Google Scholar]
- Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022. [Google Scholar]
- Liu, Z.; Mao, H.; Wu, C.Y.; Feichtenhofer, C.; Darrell, T.; Xie, S. A ConvNet for the 2020s. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022. [Google Scholar]
- Xing, B.; Sun, M.; Ding, M.; Han, C. Fish sonar image recognition algorithm based on improved YOLOv5. Math. Biosci. Eng. 2024, 21, 1321–1341. [Google Scholar] [CrossRef] [PubMed]
- Qin, K.S.; Liu, D.; Wang, F.; Zhou, J.; Yang, J.; Zhang, W. Improved YOLOv7 model for underwater sonar image object detection. J. Vis. Commun. Image Represent. 2024, 100, 104124. [Google Scholar] [CrossRef]
- Bodla, N.; Singh, B.; Chellappa, R.; Davis, L.S. Soft-NMS--Improving Object Detection With One Line of Code. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017. [Google Scholar] [CrossRef]
Parameter | Value |
---|---|
Operating Frequency | 450 kHz |
Field of View | 90° × 20° |
Maximum Range | 100 m |
Beam Width (Horizontal × Vertical) | 1° × 20° |
Number of Beams | 538 |
Beam Spacing | 0.17° |
Range Resolution | 2.5 cm |
Maximum Sampling Rate | 15 Hz |
Parameter | Setup |
---|---|
Ubuntu | 18.04.5 LTS |
Pytorch | 2.0.0 |
Python | 3.8 |
CUDA | 11.8 |
GPU | Tesla V100-PCIE-32GB |
CPU | Intel(R) Xeon(R) Gold 6130 |
Parameter | Setup |
---|---|
Epoch | 300 |
Batch | 32 |
NMS IoU | 0.6 |
Initial Learning Rate | 0.01 |
Final Learning Rate | 0.01 |
Momentum | 0.937 |
Weight Decay | 0.0005 |
Multi-Scale Feature Fusion | CBAM | Sonar Loss | Soft-NMS | Precision | Recall | mAP50 | mAP50-95 |
---|---|---|---|---|---|---|---|
0.886 | 0.76 | 0.789 | 0.517 | ||||
✓ | 0.9 | 0.764 | 0.8 | 0.521 | |||
✓ | 0.912 | 0.764 | 0.808 | 0.524 | |||
✓ | ✓ | 0.919 | 0.771 | 0.811 | 0.525 | ||
✓ | ✓ | ✓ | 0.896 | 0.785 | 0.819 | 0.528 | |
✓ | ✓ | ✓ | ✓ | 0.888 | 0.798 | 0.821 | 0.529 |
Model | Precision | Recall | mAP@50 | Params/M | FPS |
---|---|---|---|---|---|
SSD300 | 0.238 | 0.403 | 0.670 | 23.88 | 9.1 |
YOLOv3 | 0.364 | 0.455 | 0.783 | 61.52 | 46.7 |
Faster R-CNN | 0.328 | 0.429 | 0.760 | 41.35 | 19.4 |
Cascade R-CNN | 0.333 | 0.438 | 0.752 | 69.15 | 15.5 |
AquaPile-YOLO | 0.888 | 0.798 | 0.821 | 46.60 | 111.1 |
Model | AP (Ball) | AP (Cube) | AP (Tyre) | AP (sc) | AP (cc) | AP (Pile) |
---|---|---|---|---|---|---|
Faster-RCNN (Resnet-18) | 0.869 | 0.717 | 0.847 | 0.547 | 0.666 | - |
Faster-RCNN(Resnet-50) | 0.870 | 0.686 | 0.889 | 0.621 | 0.538 | 0.328 |
Faster-RCNN(Resnet-101) | 0.865 | 0.697 | 0.840 | 0.572 | 0.491 | 0.333 |
YOLOv3 (Darknet-53) | 0.860 | 0.669 | 0.874 | 0.470 | 0.519 | - |
YOLOv3 (MobilenetV2) | 0.868 | 0.573 | 0.738 | 0.518 | 0.498 | 0.364 |
AquaPile-YOLO | - | - | - | - | - | 0.888 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Xu, Z.; Wang, R.; Cao, T.; Guo, W.; Shi, B.; Ge, Q. AquaPile-YOLO: Pioneering Underwater Pile Foundation Detection with Forward-Looking Sonar Image Processing. Remote Sens. 2025, 17, 360. https://doi.org/10.3390/rs17030360
Xu Z, Wang R, Cao T, Guo W, Shi B, Ge Q. AquaPile-YOLO: Pioneering Underwater Pile Foundation Detection with Forward-Looking Sonar Image Processing. Remote Sensing. 2025; 17(3):360. https://doi.org/10.3390/rs17030360
Chicago/Turabian StyleXu, Zhongwei, Rui Wang, Tianyu Cao, Wenbo Guo, Bo Shi, and Qiqi Ge. 2025. "AquaPile-YOLO: Pioneering Underwater Pile Foundation Detection with Forward-Looking Sonar Image Processing" Remote Sensing 17, no. 3: 360. https://doi.org/10.3390/rs17030360
APA StyleXu, Z., Wang, R., Cao, T., Guo, W., Shi, B., & Ge, Q. (2025). AquaPile-YOLO: Pioneering Underwater Pile Foundation Detection with Forward-Looking Sonar Image Processing. Remote Sensing, 17(3), 360. https://doi.org/10.3390/rs17030360