Research on the Application of Visual Recognition in the Engine Room of Intelligent Ships
Abstract
1. Introduction
- There are no public datasets for the main equipment of marine engine rooms, and relevant research material and reliable data are scarce. No trustworthy open-source dataset for marine engine room visual recognition is currently known, and the few existing publications on marine equipment detection offer little reference value.
- A marine engine room contains a wide variety of densely laid-out equipment, and the sizes of adjacent items may differ by several orders of magnitude: consider the scale difference between the main engine and a valve, or between a reservoir and a meter. This increases the difficulty of detection and recognition.
- Equipment in marine engine rooms is densely arranged, and pipelines crisscross the space. Owing to the compact layout and the way pipelines connect the equipment, occlusion and overlap among pieces of equipment are widespread.
- The MEMER dataset is built by processing photos taken in actual engine rooms, drawing on the resources of Dalian Maritime University’s three-dimensional virtual marine engine room project team. The ship types covered by the dataset include a very large crude oil carrier (VLCC), a very large container ship (VLCS), and a very large ore carrier (VLOC); the equipment categories include diesel engines, pumps, coolers, oil separators, meters, reservoirs, and valves. The details of data processing are given in Section 4.1.
- Channel pruning based on the BN layer [3] weight values is used to accelerate recognition. To improve recognition accuracy in complicated engine rooms, the CIoU_Loss loss function and the hard-swish activation function are used to optimize the original algorithm. Meanwhile, soft-NMS is adopted as the non-maximum suppression method to reduce the false-detection and missed-detection rates.
2. Preliminary
3. Amelioration
3.1. Model Principle
3.1.1. Input
3.1.2. Backbone
3.1.3. Neck Structure
3.1.4. Output
3.2. YOLOv5 Improvement
3.2.1. Channel Pruning Based on BN Layer Weight Value
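The paper's exact pruning configuration is not reproduced here. As a minimal sketch of BN-based channel selection in the style of network slimming (Liu et al., cited in the References), and assuming a PyTorch model plus a hypothetical `prune_ratio` parameter, the absolute BatchNorm scale factors (γ) can be pooled across all layers, a single global threshold chosen at the target ratio, and per-layer keep masks derived from it:

```python
import torch
import torch.nn as nn

def bn_channel_masks(model: nn.Module, prune_ratio: float = 0.5):
    """Select channels to keep based on the magnitude of BN scale factors (gamma)."""
    # Pool the absolute gamma values of every BatchNorm2d layer in the model.
    gammas = torch.cat([
        m.weight.detach().abs().flatten()
        for m in model.modules() if isinstance(m, nn.BatchNorm2d)
    ])
    # Global threshold: the gamma value below which `prune_ratio` of all channels fall.
    sorted_gammas, _ = torch.sort(gammas)
    cutoff = int(prune_ratio * sorted_gammas.numel())
    threshold = sorted_gammas[cutoff]

    masks = {}
    for name, m in model.named_modules():
        if isinstance(m, nn.BatchNorm2d):
            # True = keep the channel, False = candidate for removal.
            masks[name] = m.weight.detach().abs() > threshold
    return masks, threshold.item()
```

Channels whose γ has been driven toward zero (typically by sparsity regularization during training) contribute little to the following convolution, so removing them trades a small accuracy loss for faster inference.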
3.2.2. CIoU_Loss Loss Function
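For reference, the CIoU loss of Zheng et al. (cited in the References) augments the IoU term with a normalized center-distance penalty and an aspect-ratio consistency term; it is restated here rather than taken verbatim from this paper:

```latex
\mathcal{L}_{\mathrm{CIoU}} = 1 - \mathrm{IoU}
  + \frac{\rho^{2}\!\left(\mathbf{b}, \mathbf{b}^{gt}\right)}{c^{2}}
  + \alpha v,
\qquad
v = \frac{4}{\pi^{2}}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^{2},
\qquad
\alpha = \frac{v}{(1 - \mathrm{IoU}) + v}
```

where b and b^{gt} are the centers of the predicted and ground-truth boxes, ρ(·) is the Euclidean distance, c is the diagonal length of the smallest box enclosing both, and w, h denote box width and height.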
3.2.3. Soft-NMS
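Soft-NMS (Bodla et al., cited in the References) decays the scores of boxes that overlap an already-selected detection instead of discarding them outright. A minimal NumPy sketch of the Gaussian-decay variant, with hypothetical `sigma` and `score_thresh` defaults, is:

```python
import numpy as np

def soft_nms_gaussian(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian soft-NMS sketch: decay overlapping boxes' scores instead of removing them.

    boxes:  (N, 4) array of [x1, y1, x2, y2]
    scores: (N,) array of confidence scores
    Returns the indices of kept boxes, in order of selection.
    """
    boxes = boxes.astype(np.float64).copy()
    scores = scores.astype(np.float64).copy()
    indices = np.arange(len(scores))
    keep = []

    while indices.size > 0:
        # Pick the remaining box with the highest (possibly decayed) score.
        top = np.argmax(scores[indices])
        best = indices[top]
        keep.append(best)
        indices = np.delete(indices, top)
        if indices.size == 0:
            break

        # IoU between the selected box and all remaining boxes.
        x1 = np.maximum(boxes[best, 0], boxes[indices, 0])
        y1 = np.maximum(boxes[best, 1], boxes[indices, 1])
        x2 = np.minimum(boxes[best, 2], boxes[indices, 2])
        y2 = np.minimum(boxes[best, 3], boxes[indices, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_best = (boxes[best, 2] - boxes[best, 0]) * (boxes[best, 3] - boxes[best, 1])
        area_rest = (boxes[indices, 2] - boxes[indices, 0]) * (boxes[indices, 3] - boxes[indices, 1])
        iou = inter / (area_best + area_rest - inter)

        # Gaussian decay: heavily overlapping boxes keep only a reduced score.
        scores[indices] *= np.exp(-(iou ** 2) / sigma)
        # Drop boxes whose decayed score falls below the threshold.
        indices = indices[scores[indices] > score_thresh]

    return keep
```

Compared with hard NMS, this keeps genuinely distinct but overlapping objects (common in a cluttered engine room) while still suppressing duplicates.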
3.2.4. Hard-Swish Activation Function
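Hard-swish approximates the swish activation with a piecewise-linear form that is cheap to compute; its standard definition is:

```latex
\mathrm{hard\mbox{-}swish}(x) = x \cdot \frac{\mathrm{ReLU6}(x + 3)}{6}
= x \cdot \frac{\min\!\left(\max(x + 3,\, 0),\, 6\right)}{6}
```

PyTorch exposes this activation as `torch.nn.Hardswish` in recent releases.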
4. Experiments
4.1. MEMER Dataset
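The exact directory layout and annotation pipeline of MEMER are not reproduced here. Below is a hypothetical sketch of how the seven equipment categories listed in the Introduction could be declared for YOLOv5-style training; the file paths, split names, and function name are placeholders, not the authors' actual setup.

```python
# Hypothetical class list and split layout for the MEMER dataset; the actual
# directory names and split ratios used by the authors are not specified here.
MEMER_CLASSES = [
    "diesel_engine", "pump", "cooler", "oil_separator",
    "meter", "reservoir", "valve",
]

def write_yolov5_data_config(path="memer.yaml",
                             train="images/train", val="images/val"):
    """Write a YOLOv5-style dataset description file for the seven classes."""
    lines = [
        f"train: {train}",
        f"val: {val}",
        f"nc: {len(MEMER_CLASSES)}",
        f"names: {MEMER_CLASSES}",
    ]
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")
```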
4.2. Configurations and Situation
4.3. Criteria
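The result tables below report per-class AP and overall mAP. As a standard restatement (not taken verbatim from the paper), these metrics follow from precision and recall:

```latex
\mathrm{Precision} = \frac{TP}{TP + FP},\qquad
\mathrm{Recall} = \frac{TP}{TP + FN},\qquad
\mathrm{AP} = \int_{0}^{1} p(r)\,\mathrm{d}r,\qquad
\mathrm{mAP} = \frac{1}{N}\sum_{i=1}^{N} \mathrm{AP}_{i}
```

where p(r) is precision as a function of recall and N is the number of classes (seven for MEMER).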
4.4. Verification
4.4.1. Comparison on PASCAL VOC
4.4.2. Validations Using MEMER
5. Conclusions and Discussions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Murray, B.; Perera, L.P. Ship behavior prediction via trajectory extraction-based clustering for maritime situation awareness. J. Ocean Eng. Sci. 2022, 7, 1–13. [Google Scholar] [CrossRef]
- Chen, P.; Huang, Y.; Papadimitriou, E.; Mou, J.; van Gelder, P. Global path planning for autonomous ship: A hybrid approach of Fast Marching Square and velocity obstacles methods. Ocean Eng. 2020, 214, 107793. [Google Scholar] [CrossRef]
- Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International Conference on Machine Learning; PMLR: Lille, France, 7–9 July 2015; pp. 448–456. [Google Scholar]
- Bovcon, B.; Kristan, M. WaSR—A Water Segmentation and Refinement Maritime Obstacle Detection Network. IEEE Trans. Cybern. 2021, 1–14. [Google Scholar] [CrossRef] [PubMed]
- Lee, W.J.; Roh, M.I.; Lee, H.W.; Ha, J.; Cho, Y.M.; Lee, S.J.; Son, N.S. Detection and tracking for the awareness of surroundings of a ship based on deep learning. J. Comput. Des. Eng. 2021, 8, 1407–1430. [Google Scholar] [CrossRef]
- Donahue, J.; Anne Hendricks, L.; Guadarrama, S.; Rohrbach, M.; Venugopalan, S.; Saenko, K.; Darrell, T. Long-Term Recurrent Convolutional Networks for Visual Recognition and Description. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 677–691. [Google Scholar] [CrossRef]
- Shao, Z.; Wang, L.; Wang, Z.; Du, W.; Wu, W. Saliency-Aware Convolution Neural Network for Ship Detection in Surveillance Video. IEEE Trans. Circuits Syst. Video Technol. 2020, 30, 781–794. [Google Scholar] [CrossRef]
- Chen, Y.; Zhang, X.; Chen, W.; Li, Y.; Wang, J. Research on Recognition of Fly Species Based on Improved RetinaNet and CBAM. IEEE Access 2020, 8, 102907–102919. [Google Scholar] [CrossRef]
- Zheng, G.; Zhao, J.; Li, S.; Feng, J. Zero-Shot Pipeline Detection for Sub-Bottom Profiler Data Based on Imaging Principles. Remote Sens. 2021, 13, 4401. [Google Scholar] [CrossRef]
- Li, J.; Xu, C.; Jiang, L.; Xiao, Y.; Deng, L.; Han, Z. Detection and Analysis of Behavior Trajectory for Sea Cucumbers Based on Deep Learning. IEEE Access 2020, 8, 18832–18840. [Google Scholar] [CrossRef]
- Neubeck, A.; Van Gool, L. Efficient non-maximum suppression. In Proceedings of the 18th International Conference on Pattern Recognition, Hong Kong, China, 20–24 August 2006. [Google Scholar]
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014. [Google Scholar]
- Zhu, C.; He, Y.; Savvides, M. Feature selective anchor-free module for single-shot object detection. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 840–849. [Google Scholar]
- Zhu, C.; Chen, F.; Shen, Z.; Savvides, M. Soft anchor-point object detection. In European Conference on Computer Vision; Springer: Cham, Switzerland, 2020; pp. 91–107. [Google Scholar]
- Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017. [Google Scholar]
- Qi, J.; Zhang, J.; Meng, Q.; Ju, J.; Jiang, H. Detection of Auxiliary Equipment in Engine Room Based on Improved SSD. J. Phys. Conf. Ser. 2022, 2173, 012060. [Google Scholar] [CrossRef]
- Qi, J.; Zhang, J.; Meng, Q. Auxiliary Equipment Detection in Marine Engine Rooms Based on Deep Learning Model. J. Mar. Sci. Eng. 2021, 9, 1006. [Google Scholar] [CrossRef]
- Zhu, X.; Lyu, S.; Wang, X.; Zhao, Q. TPH-YOLOv5: Improved YOLOv5 based on transformer prediction head for object detection on drone-captured scenarios. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 2778–2788. [Google Scholar]
- Guo, Z.; Wang, C.; Yang, G.; Huang, Z.; Li, G. MSFT-YOLO: Improved YOLOv5 Based on Transformer for Detecting Defects of Steel Surface. Sensors 2022, 22, 3467. [Google Scholar] [CrossRef] [PubMed]
- Ting, L.; Baijun, Z.; Yongsheng, Z.; Shun, Y. Ship detection algorithm based on improved YOLO V5. In Proceedings of the 2021 6th International Conference on Automation, Control and Robotics Engineering (CACRE), Dalian, China, 15–17 July 2021; pp. 483–487. [Google Scholar]
- Han, S.; Pool, J.; Tran, J.; Dally, W. Learning both weights and connections for efficient neural network. Adv. Neural Inf. Process. Syst. 2015, 28. Available online: https://proceedings.neurips.cc/paper/2015/file/ae0eb3eed39d2bcef4622b2499a05fe6-Paper.pdf (accessed on 20 August 2022).
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar]
- Wang, C.Y.; Liao, H.Y.M.; Wu, Y.H.; Chen, P.Y.; Hsieh, J.W.; Yeh, I.H. CSPNet: A new backbone that can enhance learning capability of CNN. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Seattle, WA, USA, 14–19 June 2020; pp. 390–391. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 37, 1904–1916. [Google Scholar] [CrossRef]
- Hu, D.; Zhu, J.; Liu, J.; Wang, J.; Zhang, X. Gesture recognition based on modified Yolov5s. IET Image Process. 2022, 16, 2124–2132. [Google Scholar] [CrossRef]
- Rezatofighi, H.; Tsoi, N.; Gwak, J.; Sadeghian, A.; Reid, I.; Savarese, S. Generalized intersection over union: A metric and a loss for bounding box regression. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 658–666. [Google Scholar]
- Han, S.; Mao, H.; Dally, W.J. Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. arXiv 2015, arXiv:1510.00149. [Google Scholar]
- Liu, Z.; Li, J.; Shen, Z.; Huang, G.; Yan, S.; Zhang, C. Learning efficient convolutional networks through network slimming. In Proceedings of the 2017 IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2736–2744. [Google Scholar]
- Efraimidis, P.S.; Spirakis, P.G. Weighted random sampling with a reservoir. Inf. Process. Lett. 2006, 97, 181–185. [Google Scholar] [CrossRef]
- Zheng, Z.; Wang, P.; Liu, W.; Li, J.; Ye, R.; Ren, D. Distance-IoU loss: Faster and better learning for bounding box regression. Proc. AAAI Conf. Artif. Intell. 2020, 34, 12993–13000. [Google Scholar] [CrossRef]
- Bodla, N.; Singh, B.; Chellappa, R.; Davis, L. Soft-NMS: Improving object detection with one line of code. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 5561–5569. [Google Scholar]
- Zhong, Z.; Zheng, L.; Kang, G.; Li, S.; Yang, Y. Random erasing data augmentation. Proc. AAAI Conf. Artif. Intell. 2020, 34, 13001–13008. [Google Scholar] [CrossRef]
- Li, J.; Tang, S.; Li, K.; Zhang, S.; Tang, L.; Cao, L.; Ji, F. Automatic recognition and classification of microseismic waveforms based on computer vision. Tunn. Undergr. Space Technol. 2022, 121, 104327. [Google Scholar] [CrossRef]
- Yi, J.; Wu, P.; Metaxas, D.N. ASSD: Attentive single shot multibox detector. Comput. Vis. Image Underst. 2019, 189, 102827. [Google Scholar] [CrossRef]
Configuration | Detail |
---|---|
Operating System | Windows 10 |
GPU | NVIDIA GeForce GTX1660Ti |
CPU | Intel i7-9700 (3.00 GHz) 8-core |
RAM | 16 GB |
IDE | PyCharm 2020.1.4 |
Framework | GPU-based PyTorch-1.4.0 |
Toolkit | CUDA 11.3 |
Model | Pre-Train | Input Size | GPU | FPS | mAP (%) |
---|---|---|---|---|---|
Faster R-CNN | √ | 600 × 1000 | Titan X | 7 | 73.2 |
Faster R-CNN | √ | 600 × 1000 | K40 | 2.4 | 76.4 |
YOLOv3 | √ | 352 × 352 | Titan X | 19.9 | 75.7 |
SSD | √ | 300 × 300 | Titan X | 46 | 77.2 |
DSSD | √ | 321 × 321 | Titan X | 9.5 | 78.6 |
RSSD | √ | 300 × 300 | Titan X | 35 | 78.5 |
FSSD | √ | 300 × 300 | 1080Ti | 65.8 | 78.8 |
RetinaNet | √ | 600 × 600 | 1660Ti | 17.4 | 79.3 |
YOLOv5 | √ | 640 × 640 | 1660Ti | 18.3 | 79.5 |
Improved YOLOv5 | √ | 640 × 640 | 1660Ti | 22.6 | 79.9 |
| Model | CIoU_Loss | Soft-NMS | Hard-Swish | Time (ms) | mAP (%) |
|---|---|---|---|---|---|
| Baseline | - | - | - | 56 | 78.91 |
| Schemes | √ | - | - | 60 | 82.58 |
|  | - | √ | - | 57 | 80.23 |
|  | - | - | √ | 49 | 79.35 |
|  | - | √ | √ | 56 | 80.44 |
|  | √ | - | √ | 57 | 82.36 |
|  | √ | √ | - | 62 | 83.62 |
|  | √ | √ | √ | 52 | 84.07 |
| Model | mAP (%) on D1 | mAP (%) on D2 | mAP (%) on D3 |
|---|---|---|---|
| M1 | 84.2 | 81.3 | 78.6 |
| M2 | 84.8 | 84.7 | 84.2 |
| Model | Engine AP (%) | Pump AP (%) | Cooler AP (%) | Separator AP (%) | Meter AP (%) | Reservoir AP (%) | Valve AP (%) | FPS | mAP (%) |
|---|---|---|---|---|---|---|---|---|---|
| Faster R-CNN | 93.77 | 82.11 | 90.96 | 84.83 | 43.81 | 86.95 | 50.49 | 8.53 | 76.13 |
| SSD | 100 | 89.46 | 83.53 | 91.71 | 46.22 | 71.05 | 51.48 | 27.99 | 76.21 |
| RSSD | 100 | 90.39 | 85.85 | 93.90 | 49.53 | 78.57 | 55.18 | 17.94 | 79.06 |
| FSSD | 100 | 89.79 | 84.30 | 93.60 | 47.94 | 76.49 | 53.62 | 24.26 | 77.96 |
| RetinaNet | 100 | 94.03 | 93.91 | 95.69 | 57.21 | 48.81 | 70.86 | 17.24 | 79.34 |
| YOLOv5 | 100 | 94.52 | 94.02 | 94.97 | 55.02 | 53.35 | 58.54 | 19.05 | 78.91 |
| Improved YOLOv5 | 100 | 95.91 | 94.29 | 98.54 | 64.21 | 60.23 | 75.32 | 25.07 | 84.07 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).