DSA-SOLO: Double Split Attention SOLO for Side-Scan Sonar Target Segmentation
Abstract
1. Introduction
- (1) We propose a novel model, DSA-SOLO, for instance segmentation of side-scan sonar images and experimentally demonstrate its effectiveness in segmenting objects in such images.
- (2) We propose a DSA module that fuses spatial and channel attention to extract target features. The module improves segmentation accuracy without noticeably reducing inference speed.
- (3) Comparative experiments against existing instance segmentation methods on the SCTD dataset [21] show that the proposed DSA-SOLO achieves better performance.
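To make the idea in contribution (2) concrete, the following is a minimal sketch of how a channel gate and a spatial gate can be applied in sequence to one feature map. It is not the DSA module itself: the gates here are parameter-free (a sigmoid over simple averages) and the function names `channel_gate`, `spatial_gate`, and `fused_attention` are hypothetical, chosen only for illustration. A trained module would use learned weights in place of the plain averages.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_gate(feat):
    """One weight per channel, from the channel's global average (assumed, unlearned gate)."""
    return [sigmoid(sum(map(sum, ch)) / (len(ch) * len(ch[0]))) for ch in feat]

def spatial_gate(feat):
    """One weight per pixel, from the average over channels at that pixel (assumed, unlearned gate)."""
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    return [[sigmoid(sum(ch[i][j] for ch in feat) / c) for j in range(w)]
            for i in range(h)]

def fused_attention(feat):
    """Apply the channel gate first, then the spatial gate, to a C x H x W nested list."""
    cg = channel_gate(feat)
    scaled = [[[v * g for v in row] for row in ch] for ch, g in zip(feat, cg)]
    sg = spatial_gate(scaled)
    return [[[v * sg[i][j] for j, v in enumerate(row)]
             for i, row in enumerate(ch)] for ch in scaled]

# Toy feature map with 2 channels of size 2 x 2.
feat = [[[1.0, 2.0], [3.0, 4.0]],
        [[0.0, -1.0], [0.5, 0.5]]]
out = fused_attention(feat)
```

Because both gates lie strictly between 0 and 1, the output keeps the input shape and each value is a damped copy of the corresponding input value. Stronger channels and locations are damped less.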
2. Literature Review
2.1. Instance Segmentation for Sonar Images
2.2. Attention Mechanisms
3. Methods
3.1. DSA-SOLO
3.2. Double Split Attention (DSA) Module
3.2.1. Channel Attention
3.2.2. Spatial Attention
3.3. Loss Function
4. Experimental Results
4.1. Dataset
4.2. Implementation Detail and Evaluation Indexes
4.3. Comparative Experiments
4.4. Ablation Experiments
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Guo, Y.; Wei, L.; Xu, X. A sonar image segmentation algorithm based on quantum-inspired particle swarm optimization and fuzzy clustering. Neural Comput. Appl. 2018, 32, 16775–16782.
- Huo, G.; Yang, S.; Li, Q.; Zhou, Y. A Robust and Fast Method for Sidescan Sonar Image Segmentation Using Nonlocal Despeckling and Active Contour Model. IEEE Trans. Cybern. 2017, 47, 855–872.
- Steele, S.; Ejdrygiewicz, J.; Dillon, J. Automated Synthetic Aperture Sonar Image Segmentation using Spatially Coherent Clustering. In Proceedings of the OCEANS 2021: San Diego—Porto, San Diego, CA, USA, 20–23 September 2021.
- Chabane, A.N.; Islam, N.; Zerr, B. Incremental clustering of sonar images using self-organizing maps combined with fuzzy adaptive resonance theory. Ocean Eng. 2017, 142, 133–144.
- Liu, Y.; Li, Q.; Huo, G. Robust and fast-converging level set method for side-scan sonar image segmentation. J. Electron. Imaging 2017, 26, 063021.
- Imen, K.; Fablet, R.; Boucher, J.M.; Augustin, J.M. Region-based and incidence angle dependent segmentation of seabed sonar images using a level set approach combined to local texture statistics. In Proceedings of the OCEANS 2006—Asia Pacific, Singapore, 16–19 May 2006.
- Wang, L.; Ye, X.; Wang, G.; Wang, L. A Fast Hierarchical MRF Sonar Image Segmentation Algorithm. Int. J. Robot. Autom. 2017, 32, 48–54.
- Li, J.; Jiang, P.; Zhu, H. A Local Region-Based Level Set Method With Markov Random Field for Side-Scan Sonar Image Multi-Level Segmentation. IEEE Sens. J. 2021, 21, 510–519.
- Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2014, arXiv:1409.1556.
- Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going Deeper with Convolutions. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 1–9.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Identity Mappings in Deep Residual Networks. arXiv 2016, arXiv:1603.05027.
- Shelhamer, E.; Long, J.; Darrell, T. Fully Convolutional Networks for Semantic Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 640–651.
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. Proc. Int. Conf. Med. Image Comput. Comput. Assist. Interv. 2015, 9351, 234–241.
- Lin, G.; Milan, A.; Shen, C.; Reid, I. RefineNet: Multi-path refinement networks for high-resolution semantic segmentation. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 5168–5177.
- Chen, L.C.; Papandreou, G.; Schroff, F.; Adam, H. Rethinking Atrous Convolution for Semantic Image Segmentation. arXiv 2017, arXiv:1706.05587.
- Sledge, I.J.; Emigh, M.S.; King, J.L.; Woods, D.L.; Cobb, J.T.; Principe, J.C. Target Detection and Segmentation in Circular-Scan Synthetic Aperture Sonar Images Using Semisupervised Convolutional Encoder–Decoders. arXiv 2022, arXiv:2101.03603.
- Yu, F.; He, B.; Li, K.; Yan, T.; Shen, Y.; Wang, Q.; Wu, M. Side-scan sonar images segmentation for AUV with recurrent residual convolutional neural network module and self-guidance module. Appl. Ocean Res. 2021, 113, 102608.
- Wang, Z.; Guo, J.; Huang, W.; Zhang, S. Side-Scan Sonar Image Segmentation Based on Multi-Channel Fusion Convolution Neural Networks. IEEE Sens. J. 2022, 22, 5911–5928.
- Ma, N.; Zhang, X.; Zheng, H.-T.; Sun, J. ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design. arXiv 2018, arXiv:1807.11164.
- Woo, S.; Park, J.; Lee, J.; Kweon, I.S. CBAM: Convolutional Block Attention Module. arXiv 2018, arXiv:1807.06521v2.
- Zhang, P.; Tang, J.; Zhong, H.; Ning, M.; Liu, D.; Wu, K. Self-Trained Target Detection of Radar and Sonar Images Using Automatic Deep Learning. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–14.
- He, K.; Gkioxari, G.; Dollar, P.; Girshick, R. Mask R-CNN. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 386–397.
- Liu, S.; Jia, J.; Fidler, S.; Urtasun, R. SGN: Sequential Grouping Networks for Instance Segmentation. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017.
- Gao, N.; Shan, Y.; Wang, Y.; Zhao, X.; Huang, K. SSAP: Single-Shot Instance Segmentation With Affinity Pyramid. IEEE Trans. Circuits Syst. Video Technol. 2021, 31, 661–673.
- Bolya, D.; Zhou, C.; Xiao, F.; Lee, Y.J. YOLACT Real-time Instance Segmentation. arXiv 2019, arXiv:1904.02689v2.
- Xie, E.; Sun, P.; Song, X.; Wang, W.; Liang, D.; Shen, C.; Luo, P. PolarMask: Single Shot Instance Segmentation with Polar Representation. arXiv 2020, arXiv:1909.13226v4.
- Wang, X.; Zhang, R.; Shen, C.; Kong, T.; Li, L. SOLO: A Simple Framework for Instance Segmentation. arXiv 2021, arXiv:2106.15974v1.
- Xu, F.; Huang, H.; Wu, J.; Jiang, L. Active Mask-Box Scoring R-CNN for Sonar Image Instance Segmentation. Electronics 2022, 11, 2048.
- Fan, Z.; Xia, W.; Liu, X.; Li, H. Detection and segmentation of underwater objects from forward-looking sonar based on a modified Mask RCNN. Signal Image Video Process. 2021, 15, 1135–1143.
- Kessel, R.T. Using sonar speckle to identify regions of interest and for mine detection. Proc. Detect. Remediat. Technol. Mines Minelike Targets 2002, 4742, 440–451.
- Jaderberg, M.; Simonyan, K.; Zisserman, A.; Kavukcuoglu, K. Spatial transformer networks. arXiv 2015, arXiv:1506.02025.
- Almahairi, A.; Ballas, N.; Cooijmans, T.; Zheng, Y.; Larochelle, H.; Courville, A. Dynamic Capacity Networks. arXiv 2015, arXiv:1511.07838v7.
- Hu, J.; Shen, L.; Albanie, S.; Sun, G.; Wu, E. Squeeze-and-Excitation Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 2011–2023.
- Park, J.; Woo, S.; Lee, J.; Kweon, I.S. BAM: Bottleneck Attention Module. arXiv 2018, arXiv:1807.06514v2.
- Lin, T.-Y.; Dollar, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. arXiv 2016, arXiv:1612.03144v2.
- Wu, Y.; He, K. Group Normalization. arXiv 2018, arXiv:1803.08494.
- Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. Proc. Int. Conf. Mach. Learn. 2015, 37, 448–456.
- Ba, J.L.; Kiros, J.R.; Hinton, G.E. Layer normalization. arXiv 2016, arXiv:1607.06450.
- Ulyanov, D.; Vedaldi, A.; Lempitsky, V. Instance normalization: The missing ingredient for fast stylization. arXiv 2016, arXiv:1607.08022.
- Lin, T.-Y.; Goyal, P.; Girshick, R.; He, K.; Dollar, P. Focal Loss for Dense Object Detection. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 318–327.
- Fu, J.; Liu, J.; Tian, H.; Li, Y.; Bao, Y.; Fang, Z.; Lu, H. Dual Attention Network for Scene Segmentation. arXiv 2019, arXiv:1809.02983v4.
Stage | Output Size | Backbone |
---|---|---|
Conv1 | 112 × 112 | 7 × 7 conv, 64 |
Conv2 | 56 × 56 | 3 × 3 max pooling |
Conv3 | 28 × 28 | |
Conv4 | 14 × 14 | |
Conv5 | 7 × 7 | |

Label | Train | Val |
---|---|---|
ship | 295 | 83 |
plane | 72 | 18 |
body | 37 | 9 |

Model | mAP.5:.95 | mAP.5 | mAP.75 | FPS |
---|---|---|---|---|
Mask R-CNN | 37.8% | 71.8% | 31.6% | 7.9 |
YOLACT | 33.6% | 65.7% | 15.3% | 13.13 |
PolarMask | 34.5% | 68.6% | 17.6% | 11.97 |
SOLOv2 | 40.0% | 73.3% | 35.8% | 18.64 |
DSA-SOLO | 42.7% | 78.4% | 43.2% | 18.14 |
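
The column headers above are not defined in this excerpt. Assuming they follow the common COCO-style convention, mAP.5 and mAP.75 count a predicted mask as correct when its IoU with the ground truth reaches 0.5 or 0.75, and mAP.5:.95 averages the precision over the ten thresholds 0.50, 0.55, ..., 0.95. The sketch below shows only the mask-IoU test behind that convention, not a full AP computation; `mask_iou` and `thresholds_passed` are illustrative names.

```python
def mask_iou(pred, gt):
    """Intersection over union of two equal-sized binary masks (nested lists of 0/1)."""
    pairs = [(p, g) for prow, grow in zip(pred, gt) for p, g in zip(prow, grow)]
    inter = sum(1 for p, g in pairs if p and g)
    union = sum(1 for p, g in pairs if p or g)
    return inter / union if union else 0.0

# Assumed COCO-style thresholds: 0.50 to 0.95 in steps of 0.05.
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]

def thresholds_passed(iou):
    """Thresholds at which a prediction with this IoU counts as a true positive."""
    return [t for t in IOU_THRESHOLDS if iou >= t]

# Toy example: the prediction covers 4 of the 6 ground-truth pixels and nothing else.
pred = [[1, 1, 0],
        [1, 1, 0],
        [0, 0, 0]]
gt = [[1, 1, 1],
      [1, 1, 1],
      [0, 0, 0]]
iou = mask_iou(pred, gt)            # 4 / 6
passed = thresholds_passed(iou)     # [0.5, 0.55, 0.6, 0.65]
```

Under this reading, a large gap between mAP.5 and mAP.75 for a model indicates masks that find the target but trace its boundary loosely.
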
Attention Module | mAP.5:.95 | mAP.5 | mAP.75 | FPS |
---|---|---|---|---|
SENet | 37.8% | 72.6% | 31.8% | 18.89 |
STN | 36.9% | 73.5% | 32.8% | 18.35 |
CBAM | 39.2% | 74.3% | 39.8% | 19.04 |
DANet | 38.1% | 73.8% | 31.6% | 19.02 |
DSA | 42.7% | 78.4% | 43.2% | 18.14 |

Model | mAP.5:.95 | mAP.5 | mAP.75 | FPS |
---|---|---|---|---|
SOLOv2 | 40.0% | 73.3% | 35.8% | 18.64 |
C-S Unit Only | 40.2% | 75.2% | 35.7% | 19.02 |
S-C Unit Only | 41.9% | 75.9% | 44.3% | 18.39 |
DSA | 42.7% | 78.4% | 43.2% | 18.14 |
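
The ablation rows are easier to compare as differences from the SOLOv2 baseline. The short script below copies the values from the table above and computes those differences; the dictionary layout and the helper name `deltas` are illustrative only. Accuracy differences are in percentage points and the last entry is the change in FPS.

```python
# (mAP.5:.95, mAP.5, mAP.75, FPS), copied from the ablation table.
ablation = {
    "SOLOv2": (40.0, 73.3, 35.8, 18.64),
    "C-S Unit Only": (40.2, 75.2, 35.7, 19.02),
    "S-C Unit Only": (41.9, 75.9, 44.3, 18.39),
    "DSA": (42.7, 78.4, 43.2, 18.14),
}

def deltas(name, baseline="SOLOv2"):
    """Per-column difference between a variant and the baseline, rounded to 2 decimals."""
    return tuple(round(a - b, 2) for a, b in zip(ablation[name], ablation[baseline]))

for name in ablation:
    if name != "SOLOv2":
        print(name, deltas(name))

# Relative FPS change of the full DSA variant, in percent of the baseline speed.
fps_change_pct = round(100 * deltas("DSA")[3] / ablation["SOLOv2"][3], 1)
```

For the full DSA variant this gives gains of 2.7, 5.1, and 7.4 points on the three accuracy columns at a cost of 0.5 FPS, roughly 2.7% of the baseline speed.
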
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Huang, H.; Zuo, Z.; Sun, B.; Wu, P.; Zhang, J. DSA-SOLO: Double Split Attention SOLO for Side-Scan Sonar Target Segmentation. Appl. Sci. 2022, 12, 9365. https://doi.org/10.3390/app12189365