Light-YOLOv5: A Lightweight Algorithm for Improved YOLOv5 in Complex Fire Scenarios
Abstract
1. Introduction
- We replace the last layers of the backbone network with the SepViT Block, strengthening the network's connection to global feature information.
- We propose a Light-BiFPN structure to reduce the computational cost and parameters while enhancing the fusion of multi-scale features and enriching the semantic features.
- We incorporate the global attention mechanism into YOLOv5 to enhance the overall feature-extraction capability of the network.
- We verify the validity of the Mish activation and SIoU loss functions.
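The ablation study in Section 4 compares LeakyReLU, HSwish, and Mish activations. As a quick reference for the last of these, Mish is defined as x · tanh(softplus(x)); the sketch below is illustrative only and is not taken from the paper's code.

```python
import math

def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x)).

    A smooth, non-monotonic alternative to LeakyReLU; softplus is
    computed as log(1 + e^x) via log1p for numerical stability.
    """
    return x * math.tanh(math.log1p(math.exp(x)))

# Mish passes through the origin and approaches the identity for large x.
print(mish(0.0))   # 0.0
print(mish(10.0))  # very close to 10.0
```

In practice the function would be applied elementwise to feature maps (e.g., `torch.nn.Mish` in PyTorch); the scalar version above only shows the shape of the nonlinearity.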
2. Related Work
3. Methods
3.1. Baseline
3.2. Separable Vision Transformer
3.3. Light-BiFPN Neck
3.4. Global Attention Mechanism
3.5. IoU Loss and Activation
4. Experiment
4.1. Datasets
4.2. Training Environment and Details
4.3. Model Evaluation
4.4. Result Analysis and Ablation Experiments
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
Model | Params (M) | FLOPs (G) | mAP@0.5 (%) | FPS |
---|---|---|---|---|
YOLOv5n | 1.77 | 4.2 | 67.6 | 111.1 |
YOLOv5s | 7.02 | 15.9 | 69.3 | 100.0 |
YOLOv5m | 20.87 | 48.0 | 70.4 | 87.78 |
Model | Activation/IoU Loss | Params (M) | mAP@0.5 (%) | FPS |
---|---|---|---|---|
YOLOv5n | LeakyReLU/CIoU | 1.77 | 67.8 | 107.3 |
YOLOv5n | HSwish/CIoU | 1.77 | 67.3 | 109.2 |
YOLOv5n | Mish/CIoU | 1.77 | 68.0 | 95.3 |
YOLOv5n | LeakyReLU/SIoU | 1.77 | 68.3 | 107.5 |
YOLOv5n | HSwish/SIoU | 1.77 | 67.6 | 109.4 |
YOLOv5n | Mish/SIoU | 1.77 | 68.7 | 95.6 |
Model | Params (M) | FLOPs (G) | mAP@0.5 (%) | FPS |
---|---|---|---|---|
MobileNetv3-YOLOv5n | 1.93 | 3.5 | 62.8 | 98.2 |
ShuffleNetv2-YOLOv5n | 0.71 | 1.0 | 61.8 | 126.3 |
GhostNet-YOLOv5n | 1.39 | 3.3 | 64.8 | 116.4 |
PPLCNet-YOLOv5n | 0.95 | 2.0 | 63.5 | 112.5 |
Light-BiFPN-YOLOv5n | 1.25 | 3.3 | 68.6 | 125.6 |
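The lightweight backbones compared above (MobileNet, ShuffleNet, GhostNet, PP-LCNet) and the proposed Light-BiFPN all lean on depthwise-separable convolutions. A back-of-envelope parameter count, assuming bias-free 3×3 convolutions (an illustrative calculation, not figures from the paper), shows where the savings in the Params column come from:

```python
def conv_params(k: int, c_in: int, c_out: int) -> int:
    # Weights of a standard k x k convolution (bias omitted).
    return k * k * c_in * c_out

def dw_separable_params(k: int, c_in: int, c_out: int) -> int:
    # Depthwise k x k conv (one filter per input channel)
    # followed by a 1 x 1 pointwise conv.
    return k * k * c_in + c_in * c_out

std = conv_params(3, 256, 256)          # 589,824 weights
sep = dw_separable_params(3, 256, 256)  # 2,304 + 65,536 = 67,840 weights
print(f"{std / sep:.1f}x fewer parameters")
```

For a 3×3 layer with 256 input and output channels this works out to roughly an 8.7× reduction, which is consistent in spirit with the sub-2M parameter counts in the table.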
Model | Params (M) | FLOPs (G) | F1 (%) | mAP@0.5 (%) | FPS |
---|---|---|---|---|---|
Baseline (YOLOv5n) | 1.77 | 4.2 | 66.0 | 67.6 | 111.1 |
Baseline + Light-BiFPN | 1.25 | 3.3 | 66.5 | 68.6 | 128.6 |
Baseline + Light-BiFPN + SepViT | 1.26 | 3.3 | 67.0 | 69.8 | 120.4 |
Baseline + Light-BiFPN + SepViT + GAM | 1.29 | 3.4 | 67.0 | 70.3 | 106.5 |
Baseline + Light-BiFPN + SepViT + GAM + Mish + SIoU | 1.29 | 3.4 | 68.0 | 70.9 | 91.1 |
Model | Params (M) | FLOPs (G) | F1 (%) | mAP@0.5 (%) | FPS |
---|---|---|---|---|---|
YOLOv3-tiny | 8.67 | 12.9 | 63.0 | 64.8 | 201.5 |
YOLOX-s | 8.93 | 26.8 | 64.0 | 65.4 | 64.6 |
YOLOv7-tiny | 6.01 | 13.1 | 63.0 | 64.1 | 285.3 |
Light-YOLOv5 | 1.29 | 3.4 | 68.0 | 70.9 | 91.1 |
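The mAP@0.5 metric reported in the tables above counts a prediction as correct only if its intersection over union (IoU) with a ground-truth box reaches 0.5. A minimal IoU sketch for axis-aligned boxes (a hypothetical helper, not from the authors' code):

```python
def iou(box_a: tuple, box_b: tuple) -> float:
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap width/height, clamped at zero for disjoint boxes.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

# Two 10x10 boxes overlapping on half of each: IoU = 50 / 150 = 1/3,
# below the 0.5 threshold, so this match would not count toward mAP@0.5.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
```

Loss variants such as CIoU and SIoU (compared in the ablation) extend this plain overlap ratio with distance, angle, and shape penalty terms to improve bounding-box regression.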
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Xu, H.; Li, B.; Zhong, F. Light-YOLOv5: A Lightweight Algorithm for Improved YOLOv5 in Complex Fire Scenarios. Appl. Sci. 2022, 12, 12312. https://doi.org/10.3390/app122312312