G-Net: An Efficient Convolutional Network for Underwater Object Detection
Abstract
1. Introduction
- We propose a data-driven, end-to-end object detection model that fuses image enhancement with object detection in the feature extraction stage. It accurately detects targets in blurred underwater images from the enhanced features and generalizes to real-world underwater scenarios.
- To simplify the traditional pipeline of image enhancement, feature extraction, object classification, and bounding-box regression, we integrate image enhancement into the feature extraction of the detection network. Given a hazy input image, our G-Net enhances the degraded image's detection features and directly outputs the detection results, greatly improving the perception efficiency of unmanned underwater vehicles.
- To better capture degradation characteristics, we propose a feature map learning (FML) module that uses clear-image features to guide the network in enhancing the detection features of degraded images.
- Our lightweight network not only performs well on the benchmark dataset but also on images of real underwater environments, achieving a significant performance improvement over other underwater object detection algorithms.
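The FML guidance described above can be illustrated as a feature-consistency objective: features extracted from a clear image serve as the target for the features of the corresponding degraded image. The sketch below is a minimal NumPy stand-in; the L1 loss form, the array shapes, and the function name are illustrative assumptions, not the paper's exact module.

```python
import numpy as np

def fml_guidance_loss(degraded_feats, clear_feats):
    # Mean L1 distance between the two feature maps; minimizing it pushes
    # the backbone to produce "enhanced" features for hazy inputs.
    # (Illustrative stand-in for the paper's FML module.)
    assert degraded_feats.shape == clear_feats.shape
    return float(np.mean(np.abs(degraded_feats - clear_feats)))

# Toy feature maps with shape (channels, height, width).
rng = np.random.default_rng(0)
clear = rng.standard_normal((8, 4, 4))
degraded = clear + 0.1 * rng.standard_normal((8, 4, 4))  # simulated degradation
loss = fml_guidance_loss(degraded, clear)  # small positive value
```

In training, such a term would be added to the detection loss so that enhancement is learned jointly with detection rather than as a separate preprocessing stage.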
2. Related Works
2.1. Object Detection Based on Deep Learning
2.2. Underwater Object Detection Based on Deep Learning
3. Proposed Methods
3.1. Model Workflow
3.2. ODFME Module
3.3. DTR Detection Module
3.4. FML Module
3.5. Loss Function
4. Experiment
4.1. Datasets
4.2. Evaluation Metric
4.3. Implementation Details
4.4. Contrast Experiment
4.5. Ablation Study
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
| | True Detect | False Detect |
|---|---|---|
| True ground truth | TP | FN |
| False ground truth | FP | TN |
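From these counts, precision and recall follow directly, and the reported mAP is the mean of the four per-class APs (e.g., averaging G-Net's C1–C4 values reproduces its tabulated mAP of 75.46). A minimal pure-Python sketch (the full AP computation over precision–recall curves is not reproduced here):

```python
def precision(tp, fp):
    # Fraction of detections that are correct.
    return tp / (tp + fp)

def recall(tp, fn):
    # Fraction of ground-truth objects that are detected.
    return tp / (tp + fn)

def mean_ap(per_class_aps):
    # mAP is the mean of the per-class average precisions.
    return sum(per_class_aps) / len(per_class_aps)

# G-Net's per-class APs (C1..C4) from the comparison table:
gnet_map = mean_ap([87.08, 59.63, 74.02, 81.11])  # ~75.46, the tabulated mAP
```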
| Image Enhance Method | Detection Method | mAP (%) | AP C1 (%) | AP C2 (%) | AP C3 (%) | AP C4 (%) | Params (M) | FLOPs (G) | Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| – | YOLOv4 | 70.92 | 86.67 | 52.86 | 67.19 | 76.96 | 63.98 | 22.72 | 4.5 |
| UDCP | YOLOv4 | 71.62 | 86.74 | 53.54 | 64.30 | 81.89 | 63.98 | 22.72 | 12.0 |
| UGan | YOLOv4 | 70.64 | 83.86 | 53.96 | 70.48 | 74.27 | 105.81 | 58.85 | 8.2 |
| CycleGan | YOLOv4 | 72.49 | 87.29 | 54.43 | 70.29 | 77.49 | 78.12 | 113.75 | 7.7 |
| PUIE | YOLOv4 | 74.02 | 87.94 | 58.38 | 71.48 | 78.27 | 65.38 | 82.69 | 7.3 |
| – | G-Net | 75.46 | 87.08 | 59.63 | 74.02 | 81.11 | 78.75 | 33.41 | 7.1 |
| FML Module | Att_res Module | mAP (%) | AP C1 (%) | AP C2 (%) | AP C3 (%) | AP C4 (%) | Params (M) | FLOPs (G) | Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| ✓ | | 74.86 | 88.06 | 58.90 | 73.40 | 79.08 | 65.031 | 27.351 | 6.3 |
| | ✓ | 74.52 | 87.43 | 58.62 | 75.30 | 79.28 | 78.75 | 33.41 | 7.1 |
| ✓ | ✓ | 75.46 | 87.08 | 59.63 | 74.02 | 81.11 | 78.75 | 33.41 | 7.1 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zhao, X.; Wang, Z.; Deng, Z.; Qin, H. G-Net: An Efficient Convolutional Network for Underwater Object Detection. J. Mar. Sci. Eng. 2024, 12, 116. https://doi.org/10.3390/jmse12010116