YOLOv7-Ship: A Lightweight Algorithm for Ship Object Detection in Complex Marine Environments
Abstract
1. Introduction
- We introduce the improved CA-M attention module into the YOLOv7 backbone to suppress background feature weights, introduce ODConv in the neck, and propose an improved aggregation network module, OD-ELAN, which efficiently enhances the network’s feature extraction capacity for ships in complex scenes at only a small computational cost.
- We use the lightweight CARAFE method in the feature fusion layer, which applies learnable interpolation weights to upsample low-resolution feature maps, thus reducing the loss of feature information for small ship targets.
- We adopt SIoU as the loss function, which more accurately captures the orientation-matching information between target bounding boxes and speeds up training convergence.
- We construct a ship target detection dataset containing thousands of accurately labeled visible ship images in complex marine environments.
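To make the adopted loss concrete, the following is a minimal NumPy sketch of the SIoU bounding-box loss (Gevorgyan, 2022), combining the IoU term with angle, distance, and shape costs. The function and variable names are ours for illustration; the exact formulation used in the YOLOv7-Ship training code may differ in detail.

```python
import numpy as np

def siou_loss(box1, box2, theta=4.0, eps=1e-9):
    """SIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # IoU term
    ix1, iy1 = max(box1[0], box2[0]), max(box1[1], box2[1])
    ix2, iy2 = min(box1[2], box2[2]), min(box1[3], box2[3])
    inter = max(ix2 - ix1, 0.0) * max(iy2 - iy1, 0.0)
    w1, h1 = box1[2] - box1[0], box1[3] - box1[1]
    w2, h2 = box2[2] - box2[0], box2[3] - box2[1]
    iou = inter / (w1 * h1 + w2 * h2 - inter + eps)

    # Smallest enclosing box (used to normalize center offsets)
    cw = max(box1[2], box2[2]) - min(box1[0], box2[0])
    ch = max(box1[3], box2[3]) - min(box1[1], box2[1])

    # Center offsets between prediction and ground truth
    dx = (box2[0] + box2[2] - box1[0] - box1[2]) / 2.0
    dy = (box2[1] + box2[3] - box1[1] - box1[3]) / 2.0
    sigma = np.hypot(dx, dy) + eps

    # Angle cost: largest when the center offset is at 45 degrees
    sin_alpha = min(abs(dy) / sigma, 1.0)
    angle = 1 - 2 * np.sin(np.arcsin(sin_alpha) - np.pi / 4) ** 2

    # Distance cost, modulated by the angle cost
    gamma = 2 - angle
    rho_x, rho_y = (dx / (cw + eps)) ** 2, (dy / (ch + eps)) ** 2
    dist = (1 - np.exp(-gamma * rho_x)) + (1 - np.exp(-gamma * rho_y))

    # Shape cost: penalizes width/height mismatch between the two boxes
    omega_w = abs(w1 - w2) / max(w1, w2)
    omega_h = abs(h1 - h2) / max(h1, h2)
    shape = (1 - np.exp(-omega_w)) ** theta + (1 - np.exp(-omega_h)) ** theta

    return 1 - iou + (dist + shape) / 2.0
```

Identical boxes yield a loss of (numerically) zero, and the loss grows as the boxes drift apart in position, angle, or shape.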
2. Related Work
3. Methods
3.1. YOLOv7 Network Structure
3.2. OD-ELAN Module
3.3. CA-M Attention Mechanism
3.4. CARAFE Upsampler
3.5. SIoU Loss Function
3.6. The YOLOv7-Ship Model
4. Experiments
4.1. Data Collection and Processing
4.2. Experimental Environment
4.3. Evaluation Metrics
5. Results and Analysis
5.1. Effectiveness of CA-M Module
5.2. Comparative Analysis of Loss Functions
5.3. Ablation Experiment
5.4. Comparison Experiment
5.5. Analysis of the Detection Results
5.6. Qualitative Analysis of Detection Effects
6. Conclusions and Discussions
- The research in this paper is limited to the algorithm level; the algorithm has not yet been deployed on an embedded computing platform.
- In this paper’s self-constructed dataset, the category labels are imbalanced: liner and container ship labels are few, leading to insufficient feature learning and model training for these two categories. In addition, the virtual dataset may not fully simulate real scenes, which may degrade the model’s performance in real applications.
- Although the YOLOv7-Ship model improves accuracy on small targets, ships are still missed in foggy and low-light scenes.
- Compared with the latest YOLOv8 model, YOLOv7-Ship has a more complex network structure and requires more computational resources at inference.
- Therefore, our future research directions and work include the following:
- We plan to design a complete embedded object detection system capable of running the YOLOv7-Ship model.
- We will expand the dataset with more real-scene images from different environments and mitigate the unbalanced category labeling through data augmentation and resampling.
- We will investigate defogging algorithms and multimodal information fusion techniques, fusing multisource information from infrared or radar sensors into the model to enhance perception in different environments.
- We will consider a more lightweight network structure and employ methods such as pruning and knowledge distillation to reduce the number of model parameters, aiming to maintain the model’s detection accuracy while addressing its shortcomings in detection speed.
- We will also explore applying the optimized algorithms to complex tasks such as ship object tracking and ship trajectory planning, offering more dependable technical support for intelligence and safety in the maritime field.
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Zhou, W.; Peng, Y. Ship detection based on multi-scale weighted fusion. Displays 2023, 78, 102448.
- Xing, B.; Wang, W.; Qian, J.; Pan, C.; Le, Q. A Lightweight Model for Real-Time Monitoring of Ships. Electronics 2023, 12, 3804.
- Zhang, M.; Rong, X.; Yu, X. Light-SDNet: A Lightweight CNN Architecture for Ship Detection. IEEE Access 2022, 10, 86647–86662.
- Xu, F.; Liu, J.; Sun, M.; Zeng, D.; Wang, X. A hierarchical maritime target detection method for optical remote sensing imagery. Remote Sens. 2017, 9, 280.
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149.
- He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2961–2969.
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single shot multibox detector. In Proceedings of Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; Proceedings, Part I; Springer: Berlin/Heidelberg, Germany, 2016; pp. 21–37.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
- Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988.
- Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 7464–7475.
- Liang, Q.; Dong, W.; Kai, C.L.; Wei, W.; Liang, D. Ship target detection method based on SRM segmentation and hierarchical line segment features. In Proceedings of the 2019 Chinese Control and Decision Conference (CCDC), Nanchang, China, 3–5 June 2019.
- Zhu, C.; Zhou, H.; Wang, R.; Guo, J. A Novel Hierarchical Method of Ship Detection from Spaceborne Optical Image Based on Shape and Texture Features. IEEE Trans. Geosci. Remote Sens. 2010, 48, 3446–3456.
- Yang, F.; Xu, Q.; Li, B. Ship Detection from Optical Satellite Images Based on Saliency Segmentation and Structure-LBP Feature. IEEE Geosci. Remote Sens. Lett. 2017, 14, 602–606.
- Yang, Y.; Chen, P.; Ding, K.; Chen, Z.; Hu, K. Object detection of inland waterway ships based on improved SSD model. Ships Offshore Struct. 2023, 18, 1192–1200.
- Li, D.; Zhang, Z.; Fang, Z.; Cao, F. Ship detection with optical image based on CA-YOLO v3 Network. In Proceedings of the 2023 3rd International Conference on Frontiers of Electronics, Information and Computation Technologies (ICFEICT), Yangzhou, China, 26–29 May 2023; pp. 589–598.
- Huang, Q.; Sun, H.; Wang, Y.; Yuan, Y.; Guo, X.; Gao, Q. Ship detection based on YOLO algorithm for visible images. IET Image Process. 2023.
- Zhou, S.; Yin, J. YOLO-Ship: A Visible Light Ship Detection Method. In Proceedings of the 2022 2nd International Conference on Consumer Electronics and Computer Engineering (ICCECE), Guangzhou, China, 14–16 January 2022; pp. 113–118.
- Gao, Z.; Zhang, Y.; Wang, S. Lightweight Small Ship Detection Algorithm Combined with Infrared Characteristic Analysis for Autonomous Navigation. J. Mar. Sci. Eng. 2023, 11, 1114.
- Wu, W.; Li, X.; Hu, Z.; Liu, X. Ship Detection and Recognition Based on Improved YOLOv7. Comput. Mater. Contin. 2023, 76, 489–498.
- Cen, J.; Feng, H.; Liu, X.; Hu, Y.; Li, H.; Li, H.; Huang, W. An Improved Ship Classification Method Based on YOLOv7 Model with Attention Mechanism. Wirel. Commun. Mob. Comput. 2023, 2023, 7196323.
- Lang, C.; Yu, X.; Rong, X. LSDNet: A Lightweight Ship Detection Network with Improved YOLOv7. J. Real-Time Image Process. 2023.
- Er, M.J.; Zhang, Y.; Chen, J.; Gao, W. Ship detection with deep learning: A survey. Artif. Intell. Rev. 2023, 56, 11825–11865.
- Escorcia-Gutierrez, J.; Gamarra, M.; Beleño, K.; Soto, C.; Mansour, R.F. Intelligent deep learning-enabled autonomous small ship detection and classification model. Comput. Electr. Eng. 2022, 100, 107871.
- Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. YOLOv4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934.
- Ge, Z.; Liu, S.; Wang, F.; Li, Z.; Sun, J. YOLOX: Exceeding YOLO series in 2021. arXiv 2021, arXiv:2107.08430.
- Li, C.; Li, L.; Jiang, H.; Weng, K.; Geng, Y.; Li, L.; Ke, Z.; Li, Q.; Cheng, M.; Nie, W.; et al. YOLOv6: A single-stage object detection framework for industrial applications. arXiv 2022, arXiv:2209.02976.
- Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125.
- Liu, S.; Qi, L.; Qin, H.; Shi, J.; Jia, J. Path aggregation network for instance segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 8759–8768.
- Yang, B.; Bender, G.; Le, Q.V.; Ngiam, J. CondConv: Conditionally parameterized convolutions for efficient inference. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS 2019), Vancouver, BC, Canada, 8–14 December 2019; Volume 32.
- Chen, Y.; Dai, X.; Liu, M.; Chen, D.; Yuan, L.; Liu, Z. Dynamic convolution: Attention over convolution kernels. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 11030–11039.
- Li, C.; Zhou, A.; Yao, A. Omni-dimensional dynamic convolution. arXiv 2022, arXiv:2209.07947.
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141.
- Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. CBAM: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
- Hou, Q.; Zhou, D.; Feng, J. Coordinate attention for efficient mobile network design. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 13713–13722.
- Wang, J.; Chen, K.; Xu, R.; Liu, Z.; Loy, C.C.; Lin, D. CARAFE: Content-aware reassembly of features. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 3007–3016.
- Zheng, Z.; Wang, P.; Ren, D.; Liu, W.; Ye, R.; Hu, Q.; Zuo, W. Enhancing geometric factors in model learning and inference for object detection and instance segmentation. IEEE Trans. Cybern. 2021, 52, 8574–8586.
- Rezatofighi, H.; Tsoi, N.; Gwak, J.; Sadeghian, A.; Reid, I.; Savarese, S. Generalized intersection over union: A metric and a loss for bounding box regression. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 658–666.
- Zheng, Z.; Wang, P.; Liu, W.; Li, J.; Ye, R.; Ren, D. Distance-IoU loss: Faster and better learning for bounding box regression. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 12993–13000.
- Gevorgyan, Z. SIoU loss: More powerful learning for bounding box regression. arXiv 2022, arXiv:2205.12740.
- Zheng, J.; Wu, H.; Zhang, H.; Wang, Z.; Xu, W. Insulator-defect detection algorithm based on improved YOLOv7. Sensors 2022, 22, 8801.
- Shao, Z.; Wu, W.; Wang, Z.; Du, W.; Li, C. SeaShips: A large-scale precisely annotated dataset for ship detection. IEEE Trans. Multimed. 2018, 20, 2593–2604.
- Wang, Q.; Wu, B.; Zhu, P.; Li, P.; Zuo, W.; Hu, Q. ECA-Net: Efficient channel attention for deep convolutional neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 11534–11542.
- Liu, Y.; Shao, Z.; Hoffmann, N. Global attention mechanism: Retain information to enhance channel-spatial interactions. arXiv 2021, arXiv:2112.05561.
- Yang, L.; Zhang, R.Y.; Li, L.; Xie, X. SimAM: A simple, parameter-free attention module for convolutional neural networks. In Proceedings of the International Conference on Machine Learning, PMLR, Virtual, 18–24 July 2021; pp. 11863–11874.
- Zhang, Y.F.; Ren, W.; Zhang, Z.; Jia, Z.; Wang, L.; Tan, T. Focal and efficient IOU loss for accurate bounding box regression. Neurocomputing 2022, 506, 146–157.
- Redmon, J.; Farhadi, A. YOLOv3: An incremental improvement. arXiv 2018, arXiv:1804.02767.
- Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 618–626.
Condition | Training | Validation | Test | Total | Percentage
---|---|---|---|---|---
Sunny | 2522 | 316 | 319 | 3157 | 61.9%
Rainy, foggy, and snowy | 1021 | 128 | 140 | 1289 | 25.3%
Dusk or darkness | 537 | 66 | 51 | 654 | 12.8%
Objects | Metric (Square Pixels)
---|---
Small | Area < 32²
Medium | 32² < Area < 96²
Large | Area > 96²
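The size buckets above follow the COCO convention (thresholds of 32² and 96² square pixels). A small helper illustrating the bucketing might look like the following; the function name is ours, and the treatment of the exact boundary values is an assumption since the table leaves them open:

```python
def size_bucket(width, height):
    """Assign a box to a COCO-style size bucket by pixel area:
    small: area < 32^2, medium: 32^2 <= area < 96^2, large: area >= 96^2.
    (Boundary handling at exactly 32^2 / 96^2 is our assumption.)"""
    area = width * height
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:
        return "medium"
    return "large"
```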
Category | Small | Medium | Large | Total |
---|---|---|---|---|
Liner | 735 | 150 | 42 | 927 |
Container ship | 362 | 20 | 8 | 390 |
Bulk carrier | 1338 | 308 | 100 | 1746 |
Island reef | 1926 | 1869 | 468 | 4263 |
Sailboat | 1339 | 1011 | 755 | 3105 |
Other ship | 2037 | 3617 | 4155 | 9809 |
Configuration | Versions |
---|---|
Operating system | Windows 10
CPU | Intel(R) Core(TM) i7-10750H CPU @ 2.60 GHz |
GPU | NVIDIA GeForce RTX 2060 |
RAM | 16.0 GB |
Toolkit | CUDA 11.7 |
Compiler | Python 3.9 |
Framework | PyTorch 2.0.0 |
Component | Name/Value |
---|---|
Epochs | 200 |
Image size | 640 × 640 |
Batch size | 8 |
Initial learning rate | 0.01 |
Final learning rate | 0.1 |
Momentum | 0.8 |
Optimizer | SGD |
Mosaic | 0.9 |
Mixup | 0.05 |
Copy_paste | 0.05 |
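One reading of the table above: in YOLOv5/YOLOv7-style training configs, the "final learning rate" entry is conventionally the multiplier `lrf` applied to the initial rate `lr0`, with the rate decaying linearly from `lr0` to `lr0 * lrf` over training. A sketch under that assumption (the schedule shape is the framework default, not something this paper states explicitly):

```python
def linear_lr(epoch, epochs=200, lr0=0.01, lrf=0.1):
    """Linear learning-rate schedule in the YOLOv5/YOLOv7 style:
    interpolate from lr0 at epoch 0 down to lr0 * lrf at the final epoch."""
    return lr0 * ((1 - epoch / epochs) * (1 - lrf) + lrf)
```

With the values in the table, the rate starts at 0.01 and ends at 0.001.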
Model | P/% | R/% | mAP@0.5/% | mAP@0.5:0.95/%
---|---|---|---|---
(a) | 80.7 | 74.2 | 78.6 | 53.7 |
(b) | 81.1 | 75.4 | 78.7 | 53.7 |
(c) | 78.8 | 76.7 | 78.1 | 52.5 |
(d) | 80.4 | 75.1 | 78.6 | 53.4 |
(e) | 79.9 | 76.2 | 78.4 | 53.7 |
(f) | 80.2 | 75.3 | 78.9 | 53.9 |
Model | P/% | R/% | mAP@0.5/% | mAP@0.5:0.95/%
---|---|---|---|---
YOLOv7-Tiny | 80.1 | 74.1 | 78.3 | 53.5 |
+SE | 80.4 | 73.4 | 78.4 | 51.6 |
+CBAM | 80.8 | 72.9 | 78.2 | 51.9 |
+ECA | 80.3 | 74.2 | 78.6 | 52.2 |
+GAM | 79.3 | 72.0 | 77.1 | 49.9 |
+SimAM | 80.9 | 73.8 | 78.7 | 52.4 |
+CA | 79.9 | 74.8 | 78.7 | 53.6 |
+CA-M | 80.2 | 75.3 | 78.9 | 53.9 |
Model | Loss Function | Loss | mAP@0.5/%
---|---|---|---
YOLOv7-Ship | CIoU | 0.04281 | 80.3
 | EIoU | 0.04278 | 80.2
 | DIoU | 0.04385 | 80.1
 | GIoU | 0.04264 | 80.4
 | SIoU | 0.04229 | 80.5
Model | Group | CA-M | OD-ELAN | CARAFE | SIoU | mAP@0.5/% | mAP@0.5:0.95/% | GFLOPS | FPS
---|---|---|---|---|---|---|---|---|---
YOLOv7-Tiny | 1 | ✕ | ✕ | ✕ | ✕ | 78.3 | 53.5 | 13.1 | 77
 | 2 | ✓ | ✕ | ✕ | ✕ | 78.9 | 53.9 | 13.1 | 79
 | 3 | ✓ | ✓ | ✕ | ✕ | 79.6 | 54.2 | 12.7 | 71
 | 4 | ✓ | ✓ | ✓ | ✕ | 80.3 | 55.0 | 12.8 | 64
 | 5 | ✓ | ✓ | ✓ | ✓ | 80.5 | 55.4 | 12.8 | 75
Model | P/% | R/% | mAP@0.5/% | mAP@0.5:0.95/% | AP_S/% | AP_L/% | Params/M | GFLOPS | FPS
---|---|---|---|---|---|---|---|---|---
YOLOv7-Tiny | 80.1 | 74.1 | 78.3 | 53.5 | 35.2 | 68.4 | 6.0 | 13.1 | 77 |
YOLOv7-Ship | 81.4 | 75.8 | 80.5 | 55.4 | 37.7 | 67.6 | 6.1 | 12.8 | 75 |
Model | mAP@0.5/% | mAP@0.5:0.95/% | AP_S/% | AP_M/% | AP_L/% | Params/M | GFLOPS | FPS
---|---|---|---|---|---|---|---|---
Faster R-CNN | 74.9 | 51.1 | 29.8 | 51.3 | 64.9 | 72.1 | 47.6 | 21 |
SSD | 72.2 | 47.1 | 25.3 | 48.6 | 61.3 | 38.6 | 28.8 | 43 |
YOLOv3 | 75.1 | 48.1 | 27.7 | 52.1 | 64.2 | 12.6 | 19.9 | 56 |
YOLOv4 | 74.8 | 50.1 | 29.1 | 50.9 | 64.7 | 52.5 | 54 | 31 |
YOLOv5s | 77.2 | 52 | 31.6 | 53.1 | 66.4 | 7.1 | 13.2 | 59 |
YOLOv5m | 78 | 55.8 | 32.0 | 54.8 | 69.2 | 20.9 | 47.9 | 41 |
YOLOv7-Tiny | 78.3 | 53.5 | 35.2 | 55.6 | 68.4 | 6.0 | 13.1 | 77 |
YOLOv8 | 78.5 | 55.7 | 31.2 | 56.3 | 68.9 | 3.0 | 8.1 | 120 |
YOLOv7-Ship (Ours) | 80.5 | 55.4 | 37.7 | 56.4 | 67.6 | 6.1 | 12.8 | 75 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Jiang, Z.; Su, L.; Sun, Y. YOLOv7-Ship: A Lightweight Algorithm for Ship Object Detection in Complex Marine Environments. J. Mar. Sci. Eng. 2024, 12, 190. https://doi.org/10.3390/jmse12010190