Semantic Segmentation Algorithm Fusing Infrared and Natural Light Images for Automatic Navigation in Transmission Line Inspection
Abstract
1. Introduction
- (1) Use a global information block and a multi-level information aggregation block, arranged in a feature pyramid structure, to mine and fuse multi-scale contextual information, so that the multi-scale features of the infrared and natural light images better characterize the essential features of the images;
- (2) Design differentiated fusion strategies for the different levels of natural light and infrared dual-modal feature maps, and use a cross-modal attention interaction activation mechanism to exploit the complementarity between the natural light and infrared modalities, thereby improving the semantic segmentation results;
- (3) Evaluate seven semantic segmentation algorithms on our self-made TTS200 multi-modal dataset for transmission pylon equipment and analyze the experimental results in depth.
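As a rough illustration of the idea behind contribution (2), the sketch below shows cross-modal channel gating, a simplified stand-in for cross-modal attention: each modality's feature map is re-weighted by gates computed from the other modality before the two are summed. It is not the LLFB or HLFB of this paper. The function names (`global_avg_pool`, `cross_modal_fuse`) and the `[C][H][W]` nested-list layout are assumptions made for this example, and a trained network would typically compute the gates with learned layers rather than a bare sigmoid.

```python
import math


def global_avg_pool(feat):
    """Return the spatial mean of each channel; feat is a [C][H][W] nested list."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feat]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def cross_modal_fuse(rgb, ir):
    """Hypothetical cross-modal gating (illustrative, not the paper's block).

    The natural light (rgb) features are scaled by channel gates derived from
    the infrared (ir) features and vice versa; the two results are summed.
    """
    gate_from_ir = [sigmoid(v) for v in global_avg_pool(ir)]
    gate_from_rgb = [sigmoid(v) for v in global_avg_pool(rgb)]
    fused = []
    for c in range(len(rgb)):
        fused.append([
            [gate_from_ir[c] * r + gate_from_rgb[c] * t
             for r, t in zip(rgb_row, ir_row)]
            for rgb_row, ir_row in zip(rgb[c], ir[c])
        ])
    return fused


# One channel, 2x2 spatial size: all-ones natural light, all-zeros infrared.
rgb_feat = [[[1.0, 1.0], [1.0, 1.0]]]
ir_feat = [[[0.0, 0.0], [0.0, 0.0]]]
fused_feat = cross_modal_fuse(rgb_feat, ir_feat)
print(fused_feat)
```

The gating is symmetric by design here, so neither modality is privileged; a level-specific strategy as described in the list above would replace this single rule with different fusion rules for shallow and deep feature maps.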
2. Semantic Segmentation Algorithm for Transmission Pylon Equipment Based on Fusion of Infrared and Natural Light Images
2.1. Lower-Level Fusion Block (LLFB)
2.2. Higher-Level Fusion Block (HLFB)
2.3. Multi-Scale Feature Fusion
2.4. Loss Function
3. Results
3.1. Dataset
3.2. Evaluation Indicators
3.3. Experiment Results
4. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Luo, H.; Zhang, Y. A Survey of Image Semantic Segmentation Based on Deep Network. Acta Electron. Sin. 2019, 47, 2211–2220.
- Tian, X.; Wang, L.; Ding, Q. Review of Image Semantic Segmentation Based on Deep Learning. J. Softw. 2019, 30, 440–468.
- Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 3431–3440.
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention (MICCAI), Munich, Germany, 5–9 October 2015; pp. 234–241.
- He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2961–2969.
- Peng, D.; Lei, Y.; Hayat, M.; Guo, Y.; Li, W. Semantic-Aware Domain Generalized Segmentation. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 2584–2595.
- Lee, S.; Seong, H.; Lee, S.; Kim, E. WildNet: Learning Domain Generalized Semantic Segmentation from the Wild. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 9926–9936.
- Hoyer, L.; Dai, D.; Van Gool, L. DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 9914–9925.
- Zhou, Z.; Siddiquee, M.M.R.; Tajbakhsh, N.; Liang, J. UNet++: Redesigning skip connections to exploit multiscale features in image segmentation. IEEE Trans. Med. Imaging 2020, 39, 1856–1867.
- Chen, L.-C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs. arXiv 2014, arXiv:1412.7062.
- Chen, L.-C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 834–848.
- Chen, L.-C.; Papandreou, G.; Schroff, F.; Adam, H. Rethinking Atrous Convolution for Semantic Image Segmentation. arXiv 2017, arXiv:1706.05587.
- Chen, L.-C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 833–851.
- Fan, Q.; Pei, W.; Tai, Y.-W.; Tang, C.-K. Self-Support Few-Shot Semantic Segmentation. In European Conference on Computer Vision, Proceedings of the 17th European Conference, Tel Aviv, Israel, 23–27 October 2022; Springer Nature: Cham, Switzerland, 2022.
- Liu, Q.; Wen, Y.; Han, J.; Xu, C.; Xu, H.; Liang, X. Open-World Semantic Segmentation via Contrasting and Clustering Vision-Language Embedding. In European Conference on Computer Vision, Proceedings of the 17th European Conference, Tel Aviv, Israel, 23–27 October 2022; Springer Nature: Cham, Switzerland, 2022.
- Kirillov, A.; Mintun, E.; Ravi, N.; Mao, H.; Rolland, C.; Gustafson, L.; Xiao, T.; Whitehead, S.; Berg, A.C.; Lo, W.-Y.; et al. Segment Anything. arXiv 2023, arXiv:2304.02643.
- Liu, S.; Wang, B.; Gao, K.; Wang, Y.; Gao, C.; Chen, J. Object Detection Method for Aerial Inspection Image Based on Region-based Fully Convolutional Network. Autom. Electr. Power Syst. 2019, 43, 162–168.
- Dai, J.; Li, Y.; He, K.; Sun, J. R-FCN: Object detection via region-based fully convolutional networks. In Proceedings of the International Conference on Neural Information Processing Systems (NIPS), Barcelona, Spain, 5–10 December 2016; pp. 379–387.
- Liu, H.; Zhao, T.; Liu, J.; Jiao, L.; Xu, Z.; Yuan, X. Deep Residual UNet Network-based Infrared Image Segmentation Method for Electrical Equipment. Infrared Technol. 2022, 44, 1351–1357.
- Xiong, S.; Liu, Y.; Rui, X.; He, K.; Dollár, P. Power equipment recognition method based on mask R-CNN and bayesian context network. In Proceedings of the IEEE Power & Energy Society General Meeting (PESGM), Montreal, QC, Canada, 2–6 August 2020; pp. 1–5.
- Chen, G.; Hao, K.; Wang, B.; Li, Z.; Zhao, X. A power line segmentation model in aerial images based on an efficient multibranch concatenation network. Expert Syst. Appl. 2023, 228, 120359.
- Ha, Q.; Watanabe, K.; Karasawa, T.; Ushiku, Y.; Harada, T. MFNet: Towards real-time semantic segmentation for autonomous vehicles with multi-spectral scenes. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada, 24–28 September 2017; pp. 5108–5115.
- Sun, Y.; Zuo, W.; Liu, M. RTFNet: RGB-Thermal fusion network for semantic segmentation of urban scenes. IEEE Robot. Autom. Lett. 2019, 4, 2576–2583.
- Sun, Y.; Zuo, W.; Yun, P.; Wang, H.; Liu, M. FuseSeg: Semantic segmentation of urban scenes based on RGB and thermal data fusion. IEEE Trans. Autom. Sci. Eng. 2020, 18, 1000–1011.
- Zhou, W.; Lv, Y.; Lei, J.; Yu, L. Embedded Control Gate Fusion and Attention Residual Learning for RGB–Thermal Urban Scene Parsing. IEEE Trans. Intell. Transp. Syst. 2023, 24, 4794–4803.
- Wu, W.; Chu, T.; Liu, Q. Complementarity-aware cross-modal feature fusion network for RGB-T semantic segmentation. Pattern Recognit. 2022, 131, 108881.
- Wang, Y.; Li, G.; Liu, Z. SGFNet: Semantic-Guided Fusion Network for RGB-Thermal Semantic Segmentation. IEEE Trans. Circuits Syst. Video Technol. 2023.
- Yan, N.; Zhou, T.; Gu, C.; Jiang, A.; Lu, W. Bimodal-based object detection and instance segmentation models for substation equipments. In Proceedings of the Annual Conference of the IEEE Industrial Electronics Society (IES), Singapore, 18–21 October 2020; pp. 428–434.
- Shu, J.; He, J.; Li, L. MSIS: Multispectral instance segmentation method for power equipment. Comput. Intell. Neurosci. 2022, 2022, 2864717.
- Lin, T.-Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal Loss for Dense Object Detection. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017.
- Yu, C.; Wang, J.; Peng, C.; Gao, C.; Yu, G.; Sang, N. BiSeNet: Bilateral segmentation network for real-time semantic segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 325–341.
- Zhou, W.; Liu, J.; Lei, J.; Yu, L.; Hwang, J.-N. GMNet: Graded-feature multilabel-learning network for RGB-thermal urban scene semantic segmentation. IEEE Trans. Image Process. 2021, 30, 7790–7802.
- Zhang, Q.; Zhao, S.; Luo, Y.; Zhang, D.; Huang, N.; Han, J. ABMDRNet: Adaptive-weighted bi-directional modality difference reduction network for RGB-T semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 2633–2642.
- Li, G.; Wang, Y.; Liu, Z.; Zhang, X.; Zeng, D. RGB-T semantic segmentation with location, activation, and sharpening. IEEE Trans. Circuits Syst. Video Technol. 2022, 33, 1223–1235.
Method | Pylon Acc | Pylon IOU | Vertical Insulator Acc | Vertical Insulator IOU | Shockproof Hammer Acc | Shockproof Hammer IOU | Connecting Hardware Acc | Connecting Hardware IOU | MAcc | MIOU |
---|---|---|---|---|---|---|---|---|---|---|
BiSeNet | 89.7 | 85.0 | 89.2 | 35.6 | 62.8 | 12.1 | 79.2 | 26.2 | 80.2 | 39.7 |
RTFNet | 89.5 | 86.1 | 87.3 | 64.8 | 85.3 | 37.2 | 85.8 | 48.4 | 87.0 | 59.1 |
FuseSeg | 89.8 | 86.8 | 88.2 | 76.4 | 80.6 | 64.3 | 87.1 | 67.1 | 86.4 | 73.7 |
ABMDRNet | 89.2 | 87.7 | 89.4 | 75.8 | 86.2 | 63.1 | 86.6 | 67.4 | 87.9 | 73.5 |
GMNet | 89.0 | 88.2 | 89.5 | 72.0 | 88.8 | 44.4 | 88.4 | 54.7 | 88.9 | 64.8 |
LASNet | 89.5 | 88.0 | 89.5 | 79.2 | 88.6 | 61.5 | 88.6 | 67.8 | 89.0 | 74.1 |
Ours | 89.9 | 88.3 | 89.4 | 81.8 | 88.9 | 68.1 | 89.5 | 73.1 | 89.6 | 77.9 |
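
The Acc, IOU, MAcc, and MIOU columns in the table above are per-class and class-averaged segmentation metrics. The sketch below shows one common way to compute them from a pixel-level confusion matrix. It assumes that per-class Acc means TP / (TP + FN) and that IOU means TP / (TP + FP + FN), which is the usual convention in RGB-thermal segmentation work but is not confirmed by the text here. The function names and the toy labels are hypothetical.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """cm[t][p] counts pixels whose true class is t and predicted class is p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def per_class_acc_iou(cm):
    """Return (accs, ious), one value per class; None if a class is undefined."""
    n = len(cm)
    accs, ious = [], []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, true class differs
        accs.append(tp / (tp + fn) if tp + fn else None)
        ious.append(tp / (tp + fp + fn) if tp + fp + fn else None)
    return accs, ious


def mean_defined(values):
    """Average over the classes for which the metric is defined."""
    defined = [v for v in values if v is not None]
    return sum(defined) / len(defined)


# Toy example with three classes and six flattened pixels (assumed data).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
accs, ious = per_class_acc_iou(cm)
macc, miou = mean_defined(accs), mean_defined(ious)
print(accs, ious, macc, miou)
```

Averaging per class rather than per pixel keeps small objects such as shockproof hammers from being swamped by large ones such as pylons, which is why the class-averaged columns can differ sharply between methods even when the pylon scores are close.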
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Yuan, J.; Wang, T.; Huo, G.; Jin, R.; Wang, L. Semantic Segmentation Algorithm Fusing Infrared and Natural Light Images for Automatic Navigation in Transmission Line Inspection. Electronics 2023, 12, 4810. https://doi.org/10.3390/electronics12234810