RTL-YOLOv8n: A Lightweight Model for Efficient and Accurate Underwater Target Detection
Abstract
1. Introduction
- Improved the feature-extraction method in the backbone by introducing Manhattan self-attention (MaSA) with its spatial-decay matrix, which reduces the computational burden of global information modeling while improving accuracy.
- Enhanced feature representation through a parameter-free attention mechanism that captures cross-dimensional interactions, addressing the limitations of existing methods and achieving significant performance gains with minimal computational overhead while strengthening the network's feature-extraction capability.
- Designed a lightweight coupled detection head that significantly reduces the parameter count through shared convolutions; a scale layer then rescales the features at each detection level, balancing model accuracy and complexity at reduced computational cost.
- Combined Focaler-IoU with MPDIoU (Focaler–MPDIoU) to enhance detector performance by focusing on bounding-box regression accuracy and on the distribution of easy and hard samples, effectively improving detection accuracy.
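To make the last contribution concrete, below is a minimal pure-Python sketch of a Focaler–MPDIoU-style bounding-box loss. The thresholds `d` and `u`, the normalization of the corner distances by the squared image diagonal, and the combination `L_MPDIoU + IoU − IoU_focaler` follow the published Focaler-IoU and MPDIoU formulations; they are assumptions here and may differ in detail from the exact variant used in this paper.

```python
def iou(box_a, box_b):
    """Plain IoU for boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def focaler_mpdiou_loss(pred, gt, img_w, img_h, d=0.0, u=0.95):
    """Sketch of a Focaler–MPDIoU loss (thresholds d, u are assumptions).

    MPDIoU penalizes the squared distances between matching top-left and
    bottom-right corners, normalized by the squared image diagonal;
    Focaler-IoU linearly rescales IoU between d and u to reweight easy
    vs. hard samples.
    """
    i = iou(pred, gt)
    diag2 = img_w ** 2 + img_h ** 2
    d1 = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2  # top-left corners
    d2 = (pred[2] - gt[2]) ** 2 + (pred[3] - gt[3]) ** 2  # bottom-right corners
    mpdiou = i - d1 / diag2 - d2 / diag2
    iou_focaler = min(1.0, max(0.0, (i - d) / (u - d)))   # linear remap, clamped
    # Combined loss in the Focaler-IoU style: L_MPDIoU + IoU - IoU_focaler
    return (1.0 - mpdiou) + i - iou_focaler
```

A perfectly overlapping prediction gives zero loss, while disjoint boxes are penalized both for zero overlap and for corner distance, which is what drives the regression toward tighter boxes.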
2. Related Work
3. Materials and Methods
3.1. Data Source
3.2. Method
3.2.1. YOLOv8n Baseline
3.2.2. RetBlock
3.2.3. Triplet Attention
- First branch: Processes the interaction between the height dimension (H) and the channel dimension (C). The input tensor is rotated 90° counterclockwise about the H axis to obtain a tensor of shape (W × H × C). This tensor is reduced by Z-pool and passed through a standard convolution layer with kernel size k = 7 followed by batch normalization, yielding an intermediate output of shape (1 × H × C). A sigmoid activation then generates the attention weights, which are applied to the rotated input tensor. Finally, the result is rotated 90° clockwise about the H axis to restore the original input shape.
- Second branch: Processes the interaction between the width dimension (W) and the channel dimension (C). The input tensor is rotated 90° counterclockwise about the W axis to obtain a tensor of shape (H × C × W). As in the first branch, it passes through a Z-pool and a convolution layer to generate the attention weights, and is finally rotated 90° clockwise about the W axis to restore the original input shape.
- Third branch: Constructs spatial attention, handling the dependency between the height and width dimensions (H and W). The input tensor is reduced by Z-pool to two channels, giving a tensor of shape (2 × H × W), which is then processed by a convolution layer to generate attention weights of shape (1 × H × W) that are applied to the input tensor. Finally, the refined tensors of shape (C × H × W) produced by the three branches are aggregated by averaging.
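The three-branch flow above can be sketched with NumPy to make the rotate / Z-pool / gate / rotate-back structure concrete. This is a shape-level illustration only: where the real module collapses the two Z-pooled maps with a learned 7 × 7 convolution plus batch normalization, the sketch substitutes a fixed mean followed by a sigmoid, so the outputs illustrate the data flow rather than trained behavior.

```python
import numpy as np


def z_pool(x, axis):
    # Concatenate max- and mean-pooling along `axis` into 2 "channels".
    return np.stack([x.max(axis=axis), x.mean(axis=axis)], axis=0)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def branch(x, perm):
    """One triplet-attention branch: rotate, Z-pool the leading dim,
    collapse the two pooled maps to one attention map (stand-in for the
    7x7 conv + BN of the real module), gate, and rotate back."""
    xr = np.transpose(x, perm)            # "rotation" as an axis permutation
    pooled = z_pool(xr, axis=0)           # shape (2, d1, d2)
    attn = sigmoid(pooled.mean(axis=0))   # shape (d1, d2); conv stand-in
    out = xr * attn[None, :, :]           # apply attention to rotated tensor
    inv = np.argsort(perm)                # inverse permutation
    return np.transpose(out, inv)         # rotate back to (C, H, W)


def triplet_attention(x):
    # x: (C, H, W); three branches for the H-C, W-C, and H-W interactions.
    b1 = branch(x, (2, 1, 0))  # first branch: (W, H, C), pools over W
    b2 = branch(x, (1, 0, 2))  # second branch: (H, C, W), pools over H
    b3 = branch(x, (0, 1, 2))  # third branch: pools over C for spatial attention
    return (b1 + b2 + b3) / 3.0  # aggregate the refined tensors by averaging
```

Because every sigmoid gate lies in (0, 1), each branch attenuates rather than amplifies the input, and the averaged output keeps the input shape (C × H × W).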
3.2.4. LCD-Head
3.2.5. Focaler–MPDIoU
4. Experiments
4.1. Experiment Platform
4.2. Experiment Result
4.3. Ablation Experiments
4.4. Comparative Experiments
5. Discussion and Future Development
5.1. Discussion
5.2. Future Development
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
Metric | YOLOv8n | RTL | V8n-Head | LCD | V8n-Head/V8n (%) | LCD/RTL (%) | LCD/V8n-Head (%) |
---|---|---|---|---|---|---|---|
FLOPs (G) | 8.1 | 5.8 | 2.98 | 1.4 | 36.8 | 24.1 | 47.0 |
Parameters (M) | 3.01 | 2.06 | 0.75 | 0.1 | 25.0 | 4.9 | 13.3 |
Parameter | Value |
---|---|
Epochs | 250 |
Batch | 8 |
Optimizer | SGD |
CUDA | 11.3.1 |
PyTorch | 1.12.1 |
Python | 3.9.18 |
Models | RetBlock | Triplet | LCD | FMIoU | Parameters (M) | FLOPs (G) | mAP0.5 (%) | mAP0.5:0.95 (%) |
---|---|---|---|---|---|---|---|---|
Baseline (YOLOv8n) | − | − | − | − | 3.01 | 8.2 | 79.8 | 46.2 |
Model1 | √ | − | − | − | 2.83 (−6%) | 7.3 (−11%) | 80.3 | 46.5 |
Model2 | − | √ | − | − | − | − | 80.0 | 46.5 |
Model3 | − | − | √ | − | 2.37 (−21.3%) | 6.6 (−19.5%) | 80.0 | 46.4 |
Model4 | − | − | − | √ | − | − | 80.4 | 46.4 |
Model5 | √ | √ | − | − | 2.70 (−10.3%) | 7.3 (−11%) | 80.4 | 46.5 |
Model6 | √ | √ | √ | − | 2.06 (−31.6%) | 5.8 (−29.3%) | 80.5 | 46.7 |
RTL | √ | √ | √ | √ | 2.06 (−31.6%) | 5.8 (−29.3%) | 80.8 | 46.9 |
B | C | D | E | F | G | |
---|---|---|---|---|---|---|
Experimental Figures (1) | starfish: 0.81 holothurian: None | starfish: 0.79 holothurian: None | starfish: 0.79 holothurian: None | starfish: 0.72 holothurian: None | starfish: 0.81 holothurian: None | starfish: 0.82 holothurian: 0.27 |
Experimental Figures (2) | echinus: 0.75 scallop: None holothurian: None | echinus: 0.67 scallop: None holothurian: None | echinus: 0.58 scallop: None holothurian: None | echinus: 0.55 scallop: 0.31 holothurian: None | echinus: 0.75 scallop: 0.30 holothurian: None | echinus: 0.76 scallop: 0.56 holothurian: 0.31 |
Experimental Figures (3) | holothurian: 0.79 echinus: None | holothurian: 0.77 echinus: None | holothurian: 0.72 echinus: 0.34 echinus: 0.33 | holothurian: 0.70 holothurian: 0.22 echinus: 0.41 | holothurian: 0.75 echinus: None | holothurian: 0.79 echinus: 0.52 |
Experimental Figures (4) | dojou: None | dojou: None | dojou: None | dojou: 0.33 | dojou: None | dojou: 0.44 |
Experimental Figures (5) | dojou1: 0.68 dojou2: None | dojou1: 0.62 dojou2: None | dojou1: 0.68 dojou2: None | dojou1: 0.66 dojou2: None | dojou1: 0.53 dojou2: None | dojou1: 0.78 dojou2: 0.48 |
Experimental Figures (6) | dojou1: 0.67 dojou2: None funa: 0.88 | dojou1: 0.65 dojou2: 0.32 funa: 0.77 | dojou1: 0.71 dojou2: 0.37 funa: 0.89 | dojou1: 0.73 dojou2: 0.37 funa: 0.80 | dojou1: 0.75 dojou2: 0.33 funa: 0.84 | dojou1: 0.78 dojou2: 0.43 funa: 0.88 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Feng, G.; Xiong, Z.; Pang, H.; Gao, Y.; Zhang, Z.; Yang, J.; Ma, Z. RTL-YOLOv8n: A Lightweight Model for Efficient and Accurate Underwater Target Detection. Fishes 2024, 9, 294. https://doi.org/10.3390/fishes9080294