Hierarchical Spectral–Spatial Transformer for Hyperspectral and Multispectral Image Fusion
Abstract
1. Introduction
- We introduce a Hierarchical Spectral–Spatial Transformer network (HSST) for fusing hyperspectral images (HSI) and multispectral images (MSI). HSST extracts and merges deep spectral and spatial features with hierarchical Spectral–Spatial Transformers, then reconstructs the high-resolution HSI (HR-HSI) through hierarchical progressive fusion.
- We propose a hierarchical Spectral–Spatial Transformer that captures cross-modality spectral and spatial features at multiple scales. Alongside conventional multi-head self-attention, it incorporates cross-attention to strengthen the extraction of cross-modality features.
- To refine the spatial detail of the reconstructed HR-HSI, we propose hierarchical progressive fusion, which recovers spatial detail step by step through repeated upsampling and fusion, so that the HR-HSI result is built up gradually rather than in a single step.
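The cross-attention named in the second contribution can be illustrated with a minimal single-head sketch. It is not the HSST implementation: the choice of HSI tokens as queries and MSI tokens as keys and values, the toy token values, the identity projection matrices, and all function names are assumptions made only for illustration.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def softmax_rows(m):
    """Numerically stable softmax applied to each row independently."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def cross_attention(query_tokens, context_tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product cross-attention.

    Queries come from one modality, keys and values from the other, so each
    output token is a weighted mix of the other modality's value vectors.
    """
    q = matmul(query_tokens, w_q)
    k = matmul(context_tokens, w_k)
    v = matmul(context_tokens, w_v)
    scale = 1.0 / math.sqrt(len(q[0]))
    scores = [[s * scale for s in row] for row in matmul(q, transpose(k))]
    weights = softmax_rows(scores)
    return matmul(weights, v), weights

# Toy data (hypothetical): three HSI tokens attend over two MSI tokens.
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for learned projections
hsi_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
msi_tokens = [[2.0, 0.0], [0.0, 2.0]]
fused, weights = cross_attention(hsi_tokens, msi_tokens, identity, identity, identity)
```

Self-attention is the special case in which `query_tokens` and `context_tokens` are the same sequence; a multi-head variant would run several such heads on split feature channels and concatenate the results.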
2. Materials and Methods
2.1. Related Works
2.1.1. HSI and MSI Fusion
2.1.2. Hyperspectral Image Transformer
2.2. Proposed Method
2.2.1. Spectral–Spatial Transformer
2.2.2. Hierarchical Progressive Fusion
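As a rough illustration of recovering spatial detail through progressive upsampling and fusion, the sketch below upsamples a low-resolution band by a factor of two per step and, at each scale, adds the high-frequency detail of a guide image at that scale. This is a fixed, Laplacian-pyramid-style stand-in for the learned fusion in HSST, not the authors' method; the single-band setting, the nearest-neighbour upsampling, the 2×2 average pooling, and all function names are assumptions.

```python
def upsample2x(image):
    """Nearest-neighbour 2x upsampling of a 2-D list of floats."""
    out = []
    for row in image:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def downsample2x(image):
    """2x2 average pooling; height and width are assumed to be even."""
    return [
        [(image[i][j] + image[i][j + 1] + image[i + 1][j] + image[i + 1][j + 1]) / 4.0
         for j in range(0, len(image[0]), 2)]
        for i in range(0, len(image), 2)
    ]

def progressive_fuse(low_res, guide, steps):
    """Upsample `low_res` by 2x per step, injecting guide detail at each scale.

    `guide` must be 2**steps times larger than `low_res` along each axis.
    """
    # Guide pyramid from full resolution down to 2x the low-resolution size.
    pyramid = [guide]
    for _ in range(steps - 1):
        pyramid.append(downsample2x(pyramid[-1]))

    current = low_res
    for level in reversed(pyramid):  # coarsest guide level first
        current = upsample2x(current)
        smooth = upsample2x(downsample2x(level))
        # Detail = guide minus its own smoothed version at this scale.
        current = [
            [c + g - s for c, g, s in zip(c_row, g_row, s_row)]
            for c_row, g_row, s_row in zip(current, level, smooth)
        ]
    return current

# Toy data (hypothetical): one low-resolution pixel and a 4x4 guide band.
low_res = [[5.0]]
guide = [[float(4 * i + j) for j in range(4)] for i in range(4)]
fused = progressive_fuse(low_res, guide, steps=2)
```

Because the injected detail averages to zero over every 2×2 block, pooling the output back down returns the original low-resolution values, so each step adds spatial detail without altering the coarse content.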
3. Results
3.1. Experimental Settings
3.2. Evaluation Metrics
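The four fusion metrics reported in the result tables (PSNR, SAM, ERGAS, RMSE) can be sketched with their commonly used definitions. The exact variants used in the paper (for example, the peak value for PSNR or any value scaling before RMSE) are not given here, so the `peak` and `ratio` parameters, the band-major data layout, and the function names below are assumptions.

```python
import math

# An image is stored band-major: cube[b][p] is band b at flattened pixel p.

def band_mse(ref_band, est_band):
    """Mean squared error of a single band."""
    return sum((r - e) ** 2 for r, e in zip(ref_band, est_band)) / len(ref_band)

def rmse(ref, est):
    """Root-mean-square error over all bands and pixels (lower is better)."""
    sq = [(r - e) ** 2 for rb, eb in zip(ref, est) for r, e in zip(rb, eb)]
    return math.sqrt(sum(sq) / len(sq))

def psnr(ref, est, peak):
    """Peak signal-to-noise ratio in dB, averaged over bands (higher is better)."""
    values = []
    for rb, eb in zip(ref, est):
        mse = band_mse(rb, eb)
        values.append(float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse))
    return sum(values) / len(values)

def sam_degrees(ref, est):
    """Mean spectral angle in degrees between per-pixel spectra (lower is better).

    Assumes no pixel has an all-zero spectrum.
    """
    angles = []
    for p in range(len(ref[0])):
        r = [band[p] for band in ref]
        e = [band[p] for band in est]
        dot = sum(a * b for a, b in zip(r, e))
        norm = math.sqrt(sum(a * a for a in r)) * math.sqrt(sum(b * b for b in e))
        cos = max(-1.0, min(1.0, dot / norm))  # guard against rounding
        angles.append(math.degrees(math.acos(cos)))
    return sum(angles) / len(angles)

def ergas(ref, est, ratio):
    """ERGAS; `ratio` is the spatial resolution ratio between HR and LR images."""
    terms = [band_mse(rb, eb) / (sum(rb) / len(rb)) ** 2 for rb, eb in zip(ref, est)]
    return 100.0 / ratio * math.sqrt(sum(terms) / len(terms))

# Toy data (hypothetical): two bands, two pixels.
ref = [[1.0, 2.0], [2.0, 4.0]]
shifted = [[2.0, 3.0], [3.0, 5.0]]  # every value off by +1
scaled = [[2.0, 4.0], [4.0, 8.0]]   # every spectrum doubled
```

In the toy data, `scaled` has a large RMSE but a spectral angle of essentially zero, which shows why SAM is reported alongside the intensity-based metrics: it measures spectral shape independently of brightness.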
3.3. Experimental Results on the Pavia Center Dataset
3.4. Experimental Results on the Botswana Dataset
3.5. Experimental Results on the Urban Dataset
4. Discussion
4.1. Ablation Studies
4.2. Classification Performance Studies
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Methods | PSNR | SAM | ERGAS | RMSE |
---|---|---|---|---|
CNMF | 25.2221 | 4.3635 | 11.4361 | 13.9777 |
MSD-CNN | 35.7566 | 5.2540 | 4.7125 | 4.1563 |
TFNET | 35.6575 | 4.8931 | 3.7617 | 4.2040 |
SSF-CNN | 34.9898 | 4.7308 | 4.9807 | 4.5401 |
MCT-NET | 36.9809 | 4.1504 | 4.1325 | 3.6099 |
HSST | 37.5183 | 4.1338 | 3.8451 | 3.3933 |
Methods | PSNR | SAM | ERGAS | RMSE |
---|---|---|---|---|
CNMF | 26.3457 | 2.4866 | 9.4849 | 26.3457 |
MSD-CNN | 35.7160 | 2.7977 | 3.2249 | 0.5964 |
TFNET | 36.5435 | 2.4479 | 2.9630 | 0.5422 |
SSF-CNN | 30.0626 | 5.1641 | 16.2764 | 1.1434 |
MCT-NET | 37.8955 | 2.1803 | 2.6303 | 0.4640 |
HSST | 37.1824 | 2.2274 | 2.4898 | 0.5037 |
Methods | PSNR | SAM | ERGAS | RMSE |
---|---|---|---|---|
CNMF | 36.0468 | 2.4971 | 1.5612 | 4.0198 |
MSD-CNN | 35.7440 | 3.2117 | 1.8428 | 3.2133 |
TFNET | 35.9584 | 3.0255 | 1.7885 | 3.1350 |
SSF-CNN | 37.3912 | 2.5904 | 1.4638 | 2.6582 |
MCT-NET | 37.3294 | 2.7076 | 1.4312 | 2.6772 |
HSST | 37.6249 | 2.4644 | 1.3954 | 2.5877 |
Methods | PSNR | SAM | ERGAS | RMSE |
---|---|---|---|---|
Spectral Transformer only | 34.3125 | 3.4971 | 2.9380 | 3.7891 |
Spatial Transformer only | 21.2898 | 8.0370 | 9.4513 | 16.9695 |
Without progressive fusion | 37.3294 | 2.7076 | 1.4312 | 2.6772 |
HSST | 37.6249 | 2.4644 | 1.3954 | 2.5877 |
Metric | Unfused LR-HSI | CNMF (fused HR-HSI) | MSD-CNN (fused HR-HSI) | TFNET (fused HR-HSI) | SSF-CNN (fused HR-HSI) | MCT-NET (fused HR-HSI) | HSST (fused HR-HSI) |
---|---|---|---|---|---|---|---|
OA | 81.93 | 82.54 | 94.37 | 96.68 | 94.25 | 96.11 | 96.74 |
AA | 48.26 | 71.15 | 68.95 | 83.25 | 78.07 | 91.40 | 86.72 |
Kappa | 0.73 | 0.76 | 0.92 | 0.95 | 0.92 | 0.94 | 0.95 |
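The three classification scores in the table above (OA, AA, Kappa) are conventionally derived from a confusion matrix. The sketch below uses the standard definitions of overall accuracy, average per-class accuracy, and Cohen's kappa; the confusion matrix values and the function name are hypothetical, and the scores are returned as fractions, whereas the table appears to report OA and AA as percentages.

```python
def classification_scores(confusion):
    """Return (OA, AA, kappa) as fractions from a square confusion matrix.

    confusion[i][j] counts samples of true class i predicted as class j.
    Every class is assumed to have at least one sample.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(n)) for j in range(n)]

    # Overall accuracy: fraction of all samples classified correctly.
    oa = sum(confusion[i][i] for i in range(n)) / total

    # Average accuracy: mean of the per-class recalls, so small classes
    # count as much as large ones.
    aa = sum(confusion[i][i] / row_sums[i] for i in range(n)) / n

    # Cohen's kappa: agreement beyond what class frequencies alone predict.
    expected = sum(row_sums[i] * col_sums[i] for i in range(n)) / total ** 2
    kappa = (oa - expected) / (1.0 - expected)
    return oa, aa, kappa

# Toy data (hypothetical): a large class classified well, a small one poorly.
confusion = [[90, 10],
             [5, 5]]
oa, aa, kappa = classification_scores(confusion)
```

In this toy case OA is about 0.86 while AA is 0.70, because the small class is classified poorly. A gap of that kind between OA and AA, as in the unfused LR-HSI column, usually points to weak accuracy on under-represented classes.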
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhu, T.; Liu, Q.; Zhang, L. Hierarchical Spectral–Spatial Transformer for Hyperspectral and Multispectral Image Fusion. Remote Sens. 2024, 16, 4127. https://doi.org/10.3390/rs16224127