Real-Time Semantics-Driven Infrared and Visible Image Fusion Network
Abstract
1. Introduction
- This paper proposes a real-time semantics-driven framework for infrared and visible image fusion that makes full use of semantic information so that the fused images retain clearer semantic targets;
- This paper proposes a local target content loss that guides the fusion network to locally fuse the same targets extracted from the infrared and visible images, effectively improving the local target quality of the fused images (a minimal sketch of such a mask-based loss follows this list);
- Experimental results show that the proposed algorithm outperforms existing popular fusion algorithms in both subjective visualization and objective evaluation.
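The local target content loss is only described at a high level here, so the following PyTorch sketch illustrates one plausible mask-based formulation of the idea. The function names, the max-based aggregation of the two sources, and the weight `alpha` are illustrative assumptions rather than the paper's exact definition; the binary mask is assumed to come from the semantic segmentation module (Section 3.3).

```python
import torch
import torch.nn.functional as F

def sobel_gradient(img: torch.Tensor) -> torch.Tensor:
    """Gradient magnitude via Sobel filters; img has shape (B, 1, H, W)."""
    kx = torch.tensor([[-1.0, 0.0, 1.0],
                       [-2.0, 0.0, 2.0],
                       [-1.0, 0.0, 1.0]], device=img.device).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)  # Sobel y-kernel is the transpose of the x-kernel
    return F.conv2d(img, kx, padding=1).abs() + F.conv2d(img, ky, padding=1).abs()

def local_target_content_loss(fused, ir, vis, mask, alpha=10.0):
    """Hypothetical local target content loss (not the authors' exact loss).

    Inside the binary semantic mask (1 = target region) the fused image is
    pulled toward the brighter source pixel, which for thermal targets is
    usually the infrared intensity; the gradient term pushes the fused image
    to keep the stronger source texture everywhere.
    """
    intensity = F.l1_loss(fused * mask, torch.max(ir, vis) * mask)
    gradient = F.l1_loss(sobel_gradient(fused),
                         torch.max(sobel_gradient(ir), sobel_gradient(vis)))
    return intensity + alpha * gradient
```

The mask restricts the intensity constraint to the segmented targets, while the background is governed mainly by the gradient term; the paper's actual loss may weight the infrared and visible contributions differently.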
2. Related Work
2.1. Deep Learning-Based Image Fusion Methods
2.2. Semantic Segmentation
3. Proposed Method
3.1. General Framework
3.2. Network Architecture
3.3. Semantic Segmentation Module
3.4. Loss Function
4. Experimental Validation
4.1. Experimental Configuration
4.2. Comparative Experiments
4.3. Generalization Comparison
4.4. Efficiency Comparison
4.5. Ablation Experiments
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Zhang, H.; Xu, H.; Tian, X.; Jiang, J.; Ma, J. Image fusion meets deep learning: A survey and perspective. Inf. Fusion 2021, 76, 323–336.
2. Ma, J.; Ma, Y.; Li, C. Infrared and visible image fusion methods and applications: A survey. Inf. Fusion 2019, 45, 153–178.
3. Cao, Y.; Guan, D.; Huang, W.; Yang, J.; Cao, Y.; Qiao, Y. Pedestrian detection with unsupervised multispectral feature learning using deep neural networks. Inf. Fusion 2019, 46, 206–217.
4. Bhatnagar, G.; Wu, Q.J.; Liu, Z. Directive contrast based multimodal medical image fusion in NSCT domain. IEEE Trans. Multimed. 2013, 15, 1014–1024.
5. Ha, Q.; Watanabe, K.; Karasawa, T.; Ushiku, Y.; Harada, T. MFNet: Towards real-time semantic segmentation for autonomous vehicles with multi-spectral scenes. In Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada, 24–28 September 2017; pp. 5108–5115.
6. Simone, G.; Farina, A.; Morabito, F.C.; Serpico, S.B.; Bruzzone, L. Image fusion techniques for remote sensing applications. Inf. Fusion 2002, 3, 3–15.
7. Ben Hamza, A.; He, Y.; Krim, H.; Willsky, A. A multiscale approach to pixel-level image fusion. Integr. Comput.-Aided Eng. 2005, 12, 135–146.
8. Chen, J.; Li, X.; Luo, L.; Mei, X.; Ma, J. Infrared and visible image fusion based on target-enhanced multiscale transform decomposition. Inf. Sci. 2020, 508, 64–78.
9. Wang, J.; Lu, C.; Wang, M.; Li, P.; Yan, S.; Hu, X. Robust face recognition via adaptive sparse representation. IEEE Trans. Cybern. 2014, 44, 2368–2378.
10. Liu, Y.; Chen, X.; Ward, R.K.; Wang, Z.J. Image fusion with convolutional sparse representation. IEEE Signal Process. Lett. 2016, 23, 1882–1886.
11. Li, H.; Wu, X.J.; Kittler, J. MDLatLRR: A novel decomposition method for infrared and visible image fusion. IEEE Trans. Image Process. 2020, 29, 4733–4746.
12. Cvejic, N.; Bull, D.; Canagarajah, N. Region-based multimodal image fusion using ICA bases. IEEE Sens. J. 2007, 7, 743–751.
13. Ma, J.; Zhou, Z.; Wang, B.; Zong, H. Infrared and visible image fusion based on visual saliency map and weighted least square optimization. Infrared Phys. Technol. 2017, 82, 8–17.
14. Li, H.; Wu, X.J. DenseFuse: A fusion approach to infrared and visible images. IEEE Trans. Image Process. 2018, 28, 2614–2623.
15. Li, H.; Wu, X.J.; Durrani, T. NestFuse: An infrared and visible image fusion architecture based on nest connection and spatial/channel attention models. IEEE Trans. Instrum. Meas. 2020, 69, 9645–9656.
16. Liu, Y.; Chen, X.; Cheng, J.; Peng, H.; Wang, Z. Infrared and visible image fusion with convolutional neural networks. Int. J. Wavelets Multiresolut. Inf. Process. 2018, 16, 1850018.
17. Zhang, Y.; Liu, Y.; Sun, P.; Yan, H.; Zhao, X.; Zhang, L. IFCNN: A general image fusion framework based on convolutional neural network. Inf. Fusion 2020, 54, 99–118.
18. Tang, L.; Yuan, J.; Ma, J. Image fusion in the loop of high-level vision tasks: A semantic-aware real-time infrared and visible image fusion network. Inf. Fusion 2022, 82, 28–42.
19. Ma, J.; Yu, W.; Liang, P.; Li, C.; Jiang, J. FusionGAN: A generative adversarial network for infrared and visible image fusion. Inf. Fusion 2019, 48, 11–26.
20. Ma, J.; Xu, H.; Jiang, J.; Mei, X.; Zhang, X.P. DDcGAN: A dual-discriminator conditional generative adversarial network for multi-resolution image fusion. IEEE Trans. Image Process. 2020, 29, 4980–4995.
21. Li, J.; Huo, H.; Li, C.; Wang, R.; Feng, Q. AttentionFGAN: Infrared and visible image fusion using attention-based generative adversarial networks. IEEE Trans. Multimed. 2020, 23, 1383–1396.
22. Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3431–3440.
23. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In Proceedings of Medical Image Computing and Computer-Assisted Intervention (MICCAI 2015), Munich, Germany, 5–9 October 2015; Springer: Cham, Switzerland, 2015; pp. 234–241.
24. Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 834–848.
25. Xie, E.; Wang, W.; Yu, Z.; Anandkumar, A.; Alvarez, J.M.; Luo, P. SegFormer: Simple and efficient design for semantic segmentation with transformers. Adv. Neural Inf. Process. Syst. 2021, 34, 12077–12090.
26. Hou, J.; Zhang, D.; Wu, W.; Ma, J.; Zhou, H. A generative adversarial network for infrared and visible image fusion based on semantic segmentation. Entropy 2021, 23, 376.
27. Ciocca, G.; Napoletano, P.; Schettini, R. CNN-based features for retrieval and classification of food images. Comput. Vis. Image Underst. 2018, 176, 70–77.
28. Zhou, B.; Zhao, H.; Puig, X.; Fidler, S.; Barriuso, A.; Torralba, A. Scene parsing through ADE20K dataset. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 633–641.
29. Ma, K.; Zeng, K.; Wang, Z. Perceptual quality assessment for multi-exposure image fusion. IEEE Trans. Image Process. 2015, 24, 3345–3356.
30. Roberts, J.W.; Van Aardt, J.A.; Ahmed, F.B. Assessment of image fusion procedures using entropy, image quality, and multispectral classification. J. Appl. Remote Sens. 2008, 2, 023522.
31. Rao, Y.J. In-fibre Bragg grating sensors. Meas. Sci. Technol. 1997, 8, 355.
32. Qu, G.; Zhang, D.; Yan, P. Information measure for performance of image fusion. Electron. Lett. 2002, 38, 313–315.
33. Han, Y.; Cai, Y.; Cao, Y.; Xu, X. A new image fusion performance metric based on visual information fidelity. Inf. Fusion 2013, 14, 127–135.
34. Aslantas, V.; Bendes, E. A new image quality metric for image fusion: The sum of the correlations of differences. AEU-Int. J. Electron. Commun. 2015, 69, 1890–1896.
35. Ma, J.; Chen, C.; Li, C.; Huang, J. Infrared and visible image fusion via gradient transfer and total variation minimization. Inf. Fusion 2016, 31, 100–109.
36. Li, H.; Wu, X.J.; Kittler, J. RFN-Nest: An end-to-end residual fusion network for infrared and visible images. Inf. Fusion 2021, 73, 72–86.
37. Xu, H.; Ma, J.; Jiang, J.; Guo, X.; Ling, H. U2Fusion: A unified unsupervised image fusion network. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 44, 502–518.
| Method | EN | SD | MI | MS-SSIM | VIF | SCD |
|---|---|---|---|---|---|---|
| GTF [35] | 7.4558 | 73.0327 | 14.9208 | 0.8030 | 0.5246 | 1.0011 |
| MDLatLRR [11] | 6.9506 | 50.5803 | 13.9012 | 0.8990 | 0.4695 | 1.2325 |
| IFCNN [17] | 7.1369 | 58.7900 | 14.2737 | 0.9101 | 0.7329 | 1.3293 |
| FusionGAN [19] | 7.1118 | 50.3857 | 13.8964 | 0.8768 | 0.4781 | 1.0773 |
| U2Fusion [37] | 6.9555 | 48.2608 | 13.7975 | 0.8961 | 0.4809 | 1.0971 |
| DenseFuse [14] | 7.2675 | 62.7135 | 14.5349 | 0.9238 | 0.6312 | 1.6136 |
| NestFuse [15] | 7.4417 | 74.0860 | 14.8834 | 0.8834 | 0.8916 | 1.6369 |
| RFN-Nest [36] | 7.3366 | 66.2510 | 14.6733 | 0.8649 | 0.5902 | 1.6387 |
| SeAFusion [18] | 7.3590 | 71.5472 | 14.7181 | 0.8970 | 0.8869 | 1.5209 |
| RSDFusion (Ours) | 7.4611 | 82.3759 | 14.9719 | 0.8704 | 0.9090 | 1.5684 |
| Method | EN | SD | MI | MS-SSIM | VIF | SCD |
|---|---|---|---|---|---|---|
| GTF [35] | 6.5416 | 90.9200 | 13.0832 | 0.9319 | 0.7593 | 1.6746 |
| MDLatLRR [11] | 6.2199 | 49.6511 | 12.4397 | 0.8956 | 0.3477 | 1.6494 |
| IFCNN [17] | 6.5953 | 66.8668 | 13.1907 | 0.9053 | 0.5903 | 1.7137 |
| FusionGAN [19] | 6.5633 | 84.6054 | 13.1265 | 0.8246 | 0.4355 | 1.3814 |
| U2Fusion [37] | 6.2466 | 46.9912 | 12.4932 | 0.9009 | 0.4264 | 1.6276 |
| DenseFuse [14] | 6.6813 | 67.6328 | 13.3627 | 0.9290 | 0.6476 | 1.7990 |
| NestFuse [15] | 6.9198 | 82.7524 | 13.8397 | 0.8625 | 0.7895 | 1.7335 |
| RFN-Nest [36] | 6.8414 | 71.9016 | 13.6827 | 0.9146 | 0.6577 | 1.8368 |
| SeAFusion [18] | 7.0111 | 87.4057 | 14.0221 | 0.8816 | 0.9326 | 1.7449 |
| RSDFusion (Ours) | 7.2231 | 103.8587 | 14.4462 | 0.8428 | 1.1158 | 1.6476 |
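For reference, the six columns in the two tables above are entropy (EN), standard deviation (SD), mutual information (MI), multiscale structural similarity (MS-SSIM), visual information fidelity (VIF), and the sum of correlations of differences (SCD); higher values indicate better fusion for all six. Below is a minimal NumPy sketch of the three histogram-based metrics, assuming 8-bit grayscale inputs; the exact implementations behind the reported numbers may differ in binning and normalization.

```python
import numpy as np

def en(img):
    """Entropy (EN): information content of an 8-bit grayscale image."""
    hist, _ = np.histogram(img, bins=256, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def sd(img):
    """Standard deviation (SD): intensity spread, a proxy for contrast."""
    return float(img.astype(np.float64).std())

def mi(a, b):
    """Mutual information between two images via their joint histogram."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=256,
                                 range=[[0, 256], [0, 256]])
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())

# Fusion MI is commonly reported as MI(fused, ir) + MI(fused, vis), which is
# consistent with the MI column being roughly twice EN in the tables above.
```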
| Method | RoadScene (mean ± std) | TNO (mean ± std) |
|---|---|---|
| GTF [35] | 0.8749 ± 1.0116 | 2.3931 ± 1.3194 |
| MDLatLRR [11] | 20.1372 ± 4.2147 | 40.9576 ± 13.8720 |
| IFCNN [17] | 0.0077 ± 0.0030 | 0.0149 ± 0.0078 |
| FusionGAN [19] | 1.3039 ± 0.2174 | 1.4587 ± 0.3558 |
| U2Fusion [37] | 0.5178 ± 0.1064 | 1.0105 ± 0.4211 |
| DenseFuse [14] | 0.2139 ± 0.0254 | 0.3629 ± 0.1102 |
| NestFuse [15] | 0.1778 ± 0.4340 | 0.2915 ± 0.4584 |
| RFN-Nest [36] | 0.3153 ± 0.0685 | 0.3445 ± 0.4505 |
| SeAFusion [18] | 0.0652 ± 0.2783 | 0.0863 ± 0.3041 |
| RSDFusion (Ours) | 0.0604 ± 0.2514 | 0.0672 ± 0.2424 |
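A minimal PyTorch timing harness for producing mean ± std runtime figures of this kind is sketched below; `model` and `pairs` are hypothetical stand-ins for the fusion network and a list of (ir, vis) tensors, and the paper's actual measurement protocol may differ. Note that CUDA kernels launch asynchronously, so synchronization is required for meaningful GPU timings.

```python
import time
import numpy as np
import torch

@torch.no_grad()
def benchmark(model, pairs, device="cuda"):
    """Return (mean, std) runtime per image pair in seconds."""
    model.eval().to(device)
    times = []
    for ir, vis in pairs:
        ir, vis = ir.to(device), vis.to(device)
        if device == "cuda":
            torch.cuda.synchronize()  # flush pending kernels before timing
        t0 = time.perf_counter()
        model(ir, vis)
        if device == "cuda":
            torch.cuda.synchronize()  # wait for the forward pass to finish
        times.append(time.perf_counter() - t0)
    return float(np.mean(times)), float(np.std(times))
```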
| Method | EN | SD | MI | MS-SSIM | VIF | SCD |
|---|---|---|---|---|---|---|
| W/o semantic | 7.2876 | 65.4618 | 14.5753 | 0.8916 | 0.7518 | 1.6608 |
| W/o SSIMLoss | 7.3703 | 81.4396 | 14.7407 | 0.7464 | 0.7799 | 1.3501 |
| RSDFusion | 7.4611 | 82.3759 | 14.9719 | 0.8704 | 0.9090 | 1.5684 |
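The "W/o SSIMLoss" ablation row removes the structural similarity term from the training objective. As a reference for what such a term computes, here is a compact single-scale SSIM in PyTorch using a uniform window; the paper may instead use a Gaussian window or the multiscale variant (MS-SSIM), so this is an illustrative sketch rather than the authors' implementation, and `ssim_loss` is a hypothetical combination of the two source constraints.

```python
import torch
import torch.nn.functional as F

def ssim(x, y, window=11, c1=0.01 ** 2, c2=0.03 ** 2):
    """Single-scale SSIM with a uniform window; inputs scaled to [0, 1]."""
    pad = window // 2
    mu_x = F.avg_pool2d(x, window, 1, pad)
    mu_y = F.avg_pool2d(y, window, 1, pad)
    sigma_x = F.avg_pool2d(x * x, window, 1, pad) - mu_x ** 2
    sigma_y = F.avg_pool2d(y * y, window, 1, pad) - mu_y ** 2
    sigma_xy = F.avg_pool2d(x * y, window, 1, pad) - mu_x * mu_y
    num = (2 * mu_x * mu_y + c1) * (2 * sigma_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (sigma_x + sigma_y + c2)
    return (num / den).mean()

def ssim_loss(fused, ir, vis):
    # Structural term: penalize loss of structure relative to both sources.
    return 2.0 - ssim(fused, ir) - ssim(fused, vis)
```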
Zheng, B.; Xiang, T.; Lin, M.; Cheng, S.; Zhang, P. Real-Time Semantics-Driven Infrared and Visible Image Fusion Network. Sensors 2023, 23, 6113. https://doi.org/10.3390/s23136113