Hyperspectral Object Detection Based on Spatial–Spectral Fusion and Visual Mamba
Abstract
1. Introduction
- (1) We propose an edge-preserving dimensionality reduction (EPDR) method based on the weighted fusion of spatial texture features, so that dimensionality reduction retains the main spectral features while also preserving the key edge and texture information in the image;
- (2) We propose a multi-scale spatial feature enhancement module (SFEM) that fuses a CNN with Mamba; the experimental results demonstrate the effectiveness of the proposed module;
- (3) We analyze the processing speeds of pixel-level and object-level algorithms, demonstrating that object-level algorithms are more efficient.
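The idea behind contribution (1), reducing the spectral dimension while keeping edges visible, can be illustrated with a minimal sketch. This is not the authors' EPDR method: it is a generic stand-in that combines plain PCA with a gradient-magnitude weight, and the function names, the `alpha` parameter, and the multiplicative fusion rule are all assumptions made for illustration. The 16-band input matches the channel count listed for the raw data in the ablation table.

```python
import numpy as np


def pca_reduce(cube, n_components=3):
    """Project an (H, W, B) cube onto its leading principal components."""
    h, w, b = cube.shape
    pixels = cube.reshape(-1, b).astype(np.float64)  # astype returns a copy
    pixels -= pixels.mean(axis=0, keepdims=True)     # centre every band
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(pixels, full_matrices=False)
    return (pixels @ vt[:n_components].T).reshape(h, w, n_components)


def edge_weight(cube):
    """Gradient magnitude of the band-averaged image, scaled to [0, 1]."""
    gray = cube.mean(axis=2)
    gy, gx = np.gradient(gray)
    magnitude = np.hypot(gx, gy)
    peak = magnitude.max()
    return magnitude / peak if peak > 0 else magnitude


def reduce_with_edge_boost(cube, n_components=3, alpha=0.5):
    """Hypothetical fusion rule: PCA components amplified near strong edges."""
    reduced = pca_reduce(cube, n_components)
    # Rescale each component to [0, 1] so the boost acts on a common range.
    low = reduced.min(axis=(0, 1), keepdims=True)
    span = reduced.max(axis=(0, 1), keepdims=True) - low
    span[span == 0] = 1.0
    reduced = (reduced - low) / span
    weight = edge_weight(cube)[..., np.newaxis]      # broadcast over channels
    return reduced * (1.0 + alpha * weight)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo_cube = rng.random((32, 32, 16))             # synthetic 16-band image
    print(reduce_with_edge_boost(demo_cube).shape)
```

The output has three channels, so it can be passed to a detector built for RGB input. A real edge-preserving pipeline would likely replace the simple gradient weight with a dedicated edge-preserving filter.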
2. Related Works
2.1. Pixel-Level Hyperspectral Object Detection
2.2. Target-Level Hyperspectral Object Detection
2.3. Hyperspectral Feature Fusion
2.4. Target-Level Object Detection Based on Pre-Fusion Methods
3. Method
3.1. Edge-Preserving Dimensionality Reduction
3.2. Spatial Feature Enhancement Module (SFEM)
4. Experiments
4.1. Dataset
4.1.1. HOD1
4.1.2. HOD3K
4.2. Experimental Environment
4.3. Comparative Experiment
4.3.1. Comparison Between Pixel-Level and Target-Level Detection
4.3.2. Comparative Experiments with State-of-the-Art Algorithms
4.3.3. Visual Comparison of Detection Results
4.4. Ablation Experiment
4.4.1. Effectiveness of the Proposed EPDR Module
4.4.2. Effectiveness of the Proposed SFEM
5. Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Liu, Z.; Wang, X.; Zhong, Y.; Shu, M.; Sun, C. SiamHYPER: Learning a Hyperspectral Object Tracker from an RGB-Based Tracker. IEEE Trans. Image Process. 2022, 31, 7116–7129.
- Ömrüuzun, F.; Çetin, Y.Y.; Leloğlu, U.M.; Demir, B. A Novel Semantic Content-Based Retrieval System for Hyperspectral Remote Sensing Imagery. Remote Sens. 2024, 16, 1462.
- Zheng, H.; Li, D.; Zhang, M.; Gong, M.; Qin, A.K.; Liu, T.; Jiang, F. Spectral Knowledge Transfer for Remote Sensing Change Detection. IEEE Trans. Geosci. Remote Sens. 2024, 62, 4501316.
- Zhang, W.; Li, Z.; Li, G.; Zhuang, P.; Hou, G.; Zhang, Q.; Li, C. GACNet: Generate Adversarial-Driven Cross-Aware Network for Hyperspectral Wheat Variety Identification. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5503314.
- Chen, M.; Feng, S.; Zhao, C.; Qu, B.; Su, N.; Li, W.; Tao, R. Fractional Fourier-Based Frequency-Spatial–Spectral Prototype Network for Agricultural Hyperspectral Image Open-Set Classification. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5514014.
- Neri, I.; Caponi, S.; Bonacci, F.; Clementi, G.; Cottone, F.; Gammaitoni, L.; Figorilli, S.; Ortenzi, L.; Aisa, S.; Pallottino, F.; et al. Real-Time AI-Assisted Push-Broom Hyperspectral System for Precision Agriculture. Sensors 2024, 24, 344.
- Darvishi, P.; Karimi, D. Environmental Studies of the Khorramrood River in Iran, Based on Transformed High-Resolution Remotely Sensed Spectroscopic Data. Egypt. J. Remote Sens. Space Sci. 2024, 27, 298–316.
- Liu, B.; Li, T. A Machine-Learning-Based Framework for Retrieving Water Quality Parameters in Urban Rivers Using UAV Hyperspectral Images. Remote Sens. 2024, 16, 905.
- Yang, Z.; Albrow-Owen, T.; Cai, W.; Hasan, T. Miniaturization of Optical Spectrometers. Science 2021, 371, eabe0722.
- Geelen, B.; Blanch, C.; Gonzalez, P.; Tack, N.; Lambrechts, A. A Tiny VIS-NIR Snapshot Multispectral Camera. In Advanced Fabrication Technologies for Micro/Nano Optics and Photonics VIII, Proceedings of the SPIE OPTO 2015, San Francisco, CA, USA, 13 March 2015; Von Freymann, G., Schoenfeld, W.V., Rumpf, R.C., Helvajian, H., Eds.; SPIE: Bellingham, WA, USA, 2015; p. 937414.
- Geelen, B.; Tack, N.; Lambrechts, A. A Compact Snapshot Multispectral Imager with a Monolithically Integrated Per-Pixel Filter Mosaic. In Advanced Fabrication Technologies for Micro/Nano Optics and Photonics VII, Proceedings of the SPIE MOEMS-MEMS 2014, San Francisco, CA, USA, 7 March 2014; Von Freymann, G., Schoenfeld, W.V., Rumpf, R.C., Eds.; SPIE: Bellingham, WA, USA, 2014; p. 89740L.
- Yan, L.; Zhao, M.; Wang, X.; Zhang, Y.; Chen, J. Object Detection in Hyperspectral Images. IEEE Signal Process. Lett. 2021, 28, 508–512.
- Ding, N.; Zhang, C.; Eskandarian, A. SalienDet: A Saliency-Based Feature Enhancement Algorithm for Object Detection for Autonomous Driving. IEEE Trans. Intell. Veh. 2024, 9, 2624–2635.
- Shao, Z.; Wang, L.; Wang, Z.; Du, W.; Wu, W. Saliency-Aware Convolution Neural Network for Ship Detection in Surveillance Video. IEEE Trans. Circuits Syst. Video Technol. 2020, 30, 781–794.
- Fu, J.; Zong, L.; Li, Y.; Li, K.; Yang, B.; Liu, X. Model Adaption Object Detection System for Robot. In Proceedings of the 2020 39th Chinese Control Conference (CCC), Shenyang, China, 27–29 July 2020; IEEE: New York, NY, USA, 2020; pp. 3659–3664.
- He, X.; Tang, C.; Liu, X.; Zhang, W.; Sun, K.; Xu, J. Object Detection in Hyperspectral Image via Unified Spectral-Spatial Feature Aggregation. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5521213.
- Chang, C.-I. An Information-Theoretic Approach to Spectral Variability, Similarity, and Discrimination for Hyperspectral Image Analysis. IEEE Trans. Inf. Theory 2000, 46, 1927–1932.
- Settle, J. On Constrained Energy Minimization and the Partial Unmixing of Multispectral Images. IEEE Trans. Geosci. Remote Sens. 2002, 40, 718–721.
- Su, H.; Wu, Z.; Zhang, H.; Du, Q. Hyperspectral Anomaly Detection: A Survey. IEEE Geosci. Remote Sens. Mag. 2022, 10, 64–90.
- Reed, I.S.; Yu, X. Adaptive Multiple-Band CFAR Detection of an Optical Pattern with Unknown Spectral Distribution. IEEE Trans. Acoust. Speech Signal Process. 1990, 38, 1760–1770.
- Matteoli, S.; Veracini, T.; Diani, M.; Corsini, G. A Locally Adaptive Background Density Estimator: An Evolution for RX-Based Anomaly Detectors. IEEE Geosci. Remote Sens. Lett. 2014, 11, 323–327.
- Xu, Y.; Wu, Z.; Li, J.; Plaza, A.; Wei, Z. Anomaly Detection in Hyperspectral Images Based on Low-Rank and Sparse Representation. IEEE Trans. Geosci. Remote Sens. 2016, 54, 1990–2000.
- Huyan, N.; Zhang, X.; Zhou, H.; Jiao, L. Hyperspectral Anomaly Detection via Background and Potential Anomaly Dictionaries Construction. IEEE Trans. Geosci. Remote Sens. 2019, 57, 2263–2276.
- Cheng, T.; Wang, B. Graph and Total Variation Regularized Low-Rank Representation for Hyperspectral Anomaly Detection. IEEE Trans. Geosci. Remote Sens. 2020, 58, 391–406.
- Li, W.; Wu, G.; Du, Q. Transferred Deep Learning for Anomaly Detection in Hyperspectral Imagery. IEEE Geosci. Remote Sens. Lett. 2017, 14, 597–601.
- Gong, M.; Zhao, H.; Wu, Y.; Tang, Z.; Feng, K.-Y.; Sheng, K. Dual Appearance-Aware Enhancement for Oriented Object Detection. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5602914.
- Yao, Y.; Cheng, G.; Lang, C.; Yuan, X.; Xie, X.; Han, J. Hierarchical Mask Prompting and Robust Integrated Regression for Oriented Object Detection. IEEE Trans. Circuits Syst. Video Technol. 2024, 10, 3444795.
- Wu, A.; Deng, C. TIB: Detecting Unknown Objects via Two-Stream Information Bottleneck. IEEE Trans. Pattern Anal. Mach. Intell. 2024, 46, 611–625.
- Wu, A.; Deng, C.; Liu, W. Unsupervised Out-of-Distribution Object Detection via PCA-Driven Dynamic Prototype Enhancement. IEEE Trans. Image Process. 2024, 33, 2431–2446.
- Girshick, R. Fast R-CNN. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; IEEE: New York, NY, USA, 2015; pp. 1440–1448.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 39, 1137–1149.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; IEEE: New York, NY, USA, 2016; pp. 779–788.
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.-Y.; Berg, A.C. SSD: Single Shot MultiBox Detector. In Proceedings of the Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; Volume 9905, pp. 21–37.
- Lin, T.-Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal Loss for Dense Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2018, Salt Lake City, UT, USA, 18–23 June 2018.
- Duan, K.; Bai, S.; Xie, L.; Qi, H.; Huang, Q.; Tian, Q. CenterNet: Keypoint Triplets for Object Detection. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; IEEE: New York, NY, USA, 2019; pp. 6568–6577.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need 2023. Available online: https://arxiv.org/pdf/1706.03762 (accessed on 25 November 2024).
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image Is Worth 16 × 16 Words: Transformers for Image Recognition at Scale. In Proceedings of the ICLR 2021, Vienna, Austria, 4 May 2021.
- Zhu, X.; Su, W.; Lu, L.; Li, B.; Wang, X.; Dai, J. Deformable DETR: Deformable Transformers for End-to-End Object Detection. In Proceedings of the ICLR 2021, Vienna, Austria, 4 May 2021.
- Zhao, Y.; Lv, W.; Xu, S.; Wei, J.; Wang, G.; Dang, Q.; Liu, Y.; Chen, J. DETRs Beat YOLOs on Real-Time Object Detection. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 16–22 June 2024.
- Gao, L.; Chen, L.; Liu, P.; Jiang, Y.; Xie, W.; Li, Y. A Transformer-Based Network for Hyperspectral Object Tracking. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5528211.
- Ahmad, M.; Ghous, U.; Usama, M.; Mazzara, M. WaveFormer: Spectral–Spatial Wavelet Transformer for Hyperspectral Image Classification. IEEE Geosci. Remote Sens. Lett. 2024, 21, 5502405.
- Gong, Z.; Zhou, X.; Yao, W. MultiScale Spectral–Spatial Convolutional Transformer for Hyperspectral Image Classification. IET Image Process. 2024, 18, 4328–4340.
- Chen, J.; Yang, C.; Zhang, L.; Yang, L.; Bian, L.; Luo, Z.; Wang, J. TCCU-Net: Transformer and CNN Collaborative Unmixing Network for Hyperspectral Image. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 8073–8089.
- Dong, X.; Bao, J.; Chen, D.; Zhang, W.; Yu, N.; Yuan, L.; Chen, D.; Guo, B. CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; IEEE: New York, NY, USA, 2022; pp. 12114–12124.
- Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; IEEE: New York, NY, USA, 2021; pp. 9992–10002.
- Hassani, A.; Walton, S.; Li, J.; Li, S.; Shi, H. Neighborhood Attention Transformer. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; IEEE: New York, NY, USA, 2023; pp. 6185–6194.
- Zhu, L.; Wang, X.; Ke, Z.; Zhang, W.; Lau, R. BiFormer: Vision Transformer with Bi-Level Routing Attention. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; IEEE: New York, NY, USA, 2023; pp. 10323–10333.
- Xia, Z.; Pan, X.; Song, S.; Li, L.E.; Huang, G. Vision Transformer with Deformable Attention. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; IEEE: New York, NY, USA, 2022; pp. 4784–4793.
- Wang, W.; Xie, E.; Li, X.; Fan, D.-P.; Song, K.; Liang, D.; Lu, T.; Luo, P.; Shao, L. Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; IEEE: New York, NY, USA, 2021; pp. 548–558.
- Katharopoulos, A.; Vyas, A.; Pappas, N.; Fleuret, F. Transformers Are RNNs: Fast Autoregressive Transformers with Linear Attention. In Proceedings of the International Conference on Machine Learning, Online, 13–18 July 2020.
- Liu, Y.; Tian, Y.; Zhao, Y.; Yu, H.; Xie, L.; Wang, Y.; Ye, Q.; Liu, Y. VMamba: Visual State Space Model 2024. Available online: https://arxiv.org/abs/2401.10166 (accessed on 26 May 2024).
- Yang, C.; Chen, Z.; Espinosa, M.; Ericsson, L.; Wang, Z.; Liu, J.; Crowley, E.J. PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition. arXiv 2024, arXiv:2403.17695.
- Wang, C.; Huang, J.; Lv, M.; Du, H.; Wu, Y.; Qin, R. A Local Enhanced Mamba Network for Hyperspectral Image Classification. Int. J. Appl. Earth Obs. Geoinf. 2024, 133, 104092.
- Huang, L.; Chen, Y.; He, X. Spectral-Spatial Mamba for Hyperspectral Image Classification. Remote Sens. 2024, 16, 2449.
- Fauvel, M.; Chanussot, J.; Benediktsson, J.A. Kernel Principal Component Analysis for the Classification of Hyperspectral Remote Sensing Data over Urban Areas. EURASIP J. Adv. Signal Process. 2009, 2009, 783194.
- Wang, J.; Chang, C.-I. Independent Component Analysis-Based Dimensionality Reduction with Applications in Hyperspectral Image Analysis. IEEE Trans. Geosci. Remote Sens. 2006, 44, 1586–1600.
- Kang, X.; Li, S.; Benediktsson, J.A. Spectral–Spatial Hyperspectral Image Classification with Edge-Preserving Filtering. IEEE Trans. Geosci. Remote Sens. 2014, 52, 2666–2677.
- Kang, X.; Li, S.; Benediktsson, J.A. Feature Extraction of Hyperspectral Images with Image Fusion and Recursive Filtering. IEEE Trans. Geosci. Remote Sens. 2014, 52, 3742–3752.
- Kang, X.; Xiang, X.; Li, S.; Benediktsson, J.A. PCA-Based Edge-Preserving Features for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 7140–7151.
- Duan, P.; Kang, X.; Li, S.; Ghamisi, P.; Benediktsson, J.A. Fusion of Multiple Edge-Preserving Operations for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2019, 57, 10336–10349.
- Ben Hamida, A.; Benoit, A.; Lambert, P.; Ben Amar, C. 3-D Deep Learning Approach for Remote Sensing Image Classification. IEEE Trans. Geosci. Remote Sens. 2018, 56, 4420–4434.
- Zhao, Z.; Xu, X.; Li, S.; Plaza, A. Hyperspectral Image Classification Using Groupwise Separable Convolutional Vision Transformer Network. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5511817.
- Hong, D.; Han, Z.; Yao, J.; Gao, L.; Zhang, B.; Plaza, A.; Chanussot, J. SpectralFormer: Rethinking Hyperspectral Image Classification with Transformers. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5518615.
- Sun, L.; Zhao, G.; Zheng, Y.; Wu, Z. Spectral–Spatial Feature Tokenization Transformer for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5522214.
- Sun, L.; Zhang, H.; Zheng, Y.; Wu, Z.; Ye, Z.; Zhao, H. MASSFormer: Memory-Augmented Spectral-Spatial Transformer for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5516415.
- Mei, S.; Song, C.; Ma, M.; Xu, F. Hyperspectral Image Classification Using Group-Aware Hierarchical Transformer. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5539014.
- Wang, Z.; Li, C.; Xu, H.; Zhu, X. Mamba YOLO: SSMs-Based YOLO For Object Detection. arXiv 2024, arXiv:2406.05835.
- Tian, Z.; Shen, C.; Chen, H.; He, T. FCOS: A Simple and Strong Anchor-Free Object Detector. 2020. Available online: https://arxiv.org/abs/2006.09214 (accessed on 12 October 2020).
- Zhou, X.; Wang, D.; Krähenbühl, P. Objects as Points. arXiv 2019, arXiv:1904.07850.
- Lin, T.-Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal Loss for Dense Object Detection. arXiv 2017, arXiv:1708.02002.
- Redmon, J.; Farhadi, A. YOLOv3: An Incremental Improvement. arXiv 2018, arXiv:1804.02767.
- Li, C.; Li, L.; Jiang, H.; Weng, K.; Geng, Y.; Li, L.; Ke, Z.; Li, Q.; Cheng, M.; Nie, W.; et al. YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications. arXiv 2022, arXiv:2209.02976.
Platform | Name |
---|---|
CPU | 16 vCPU Intel(R) Xeon(R) Platinum 8481C |
GPU | RTX 4090D |
System | Ubuntu 20.04 |
Memory | 24 GB RAM |
GPU acceleration tool | CUDA 11.8 |
Algorithm | Backbone | Channel | Precision | Recall | mAP50 | mAP50:95 | Params (M) | GFLOPS |
---|---|---|---|---|---|---|---|---|
RT-DETR | Transformer | 3 | 0.812 | 0.717 | 0.794 | 0.412 | 61 | 191.4 |
Mamba-YOLO [67] | ODMamba | 3 | 0.778 | 0.689 | 0.769 | 0.406 | 5.98 | 13.6 |
FCOS [68] | ResNet50 | 3 | 0.723 | 0.721 | 0.764 | 0.397 | 32.11 | 161.21 |
CenterNet [69] | ResNet50 | 3 | 0.585 | 0.41 | 0.502 | 0.247 | 32.6 | 70.21 |
RetinaNet [70] | ResNet50 | 3 | 0.737 | 0.475 | 0.562 | 0.253 | 37.96 | 170 |
Faster RCNN | ResNet50 | 3 | 0.34 | 0.845 | 0.654 | 0.31 | 137 | 370 |
YOLOv3 [71] | Darknet53 | 3 | 0.799 | 0.655 | 0.756 | 0.405 | 12.13 | 19 |
YOLOv5 | Darknet53 | 3 | 0.724 | 0.664 | 0.786 | 0.43 | 2.51 | 7.2 |
YOLOv6 [72] | Darknet53 | 3 | 0.778 | 0.683 | 0.775 | 0.448 | 4.23 | 11.9 |
YOLOv8 | Darknet53 | 3 | 0.782 | 0.675 | 0.789 | 0.432 | 3 | 8.2 |
YOLOv9t | Darknet53 | 3 | 0.784 | 0.713 | 0.766 | 0.429 | 2 | 7.9 |
YOLOv10n | Darknet53 | 3 | 0.733 | 0.682 | 0.735 | 0.409 | 2.70 | 8.4 |
YOLOv11 | Darknet53 | 3 | 0.742 | 0.635 | 0.766 | 0.446 | 2.59 | 6.4 |
S2ADet [16] | Darknet53 | 3 + 3 | 0.739 | 0.764 | 0.792 | 0.438 | 222.96 | 169.2 |
Ours | Darknet53 | 3 | 0.865 | 0.722 | 0.808 | 0.442 | 18 | 24.2 |
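Several rows in the table above trade precision against recall in opposite directions (Faster RCNN, for example, has the highest recall and the lowest precision). A single harmonic-mean score makes such rows easier to compare. The F1 values below are not reported in the source; they are derived here from the table's Precision and Recall columns, and the helper name `f1_score` is chosen for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    total = precision + recall
    return 2.0 * precision * recall / total if total > 0 else 0.0


# (precision, recall) pairs copied from the comparison table above.
rows = {
    "RT-DETR": (0.812, 0.717),
    "Faster RCNN": (0.34, 0.845),
    "S2ADet": (0.739, 0.764),
    "Ours": (0.865, 0.722),
}

for name, (precision, recall) in rows.items():
    print(f"{name:12s} F1 = {f1_score(precision, recall):.3f}")
```

Because the harmonic mean is dominated by the smaller of the two values, a detector with very high recall but low precision still scores poorly.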
Algorithm | Backbone | Channel | Precision | Recall | mAP50 | mAP50:95 | Params (M) | GFLOPS |
---|---|---|---|---|---|---|---|---|
RT-DETR | Transformer | 3 | 0.958 | 0.906 | 0.948 | 0.778 | 61 | 191.4 |
Mamba-YOLO | ODMamba | 3 | 0.937 | 0.844 | 0.922 | 0.758 | 5.98 | 13.6 |
FCOS | ResNet50 | 3 | 0.942 | 0.866 | 0.937 | 0.776 | 32.11 | 161.21 |
CenterNet | ResNet50 | 3 | 0.891 | 0.848 | 0.913 | 0.752 | 32.6 | 70.21 |
RetinaNet | ResNet50 | 3 | 0.811 | 0.749 | 0.884 | 0.717 | 37.96 | 170 |
Faster RCNN | ResNet50 | 3 | 0.947 | 0.894 | 0.947 | 0.772 | 137 | 370 |
YOLOv3 | Darknet53 | 3 | 0.925 | 0.845 | 0.917 | 0.744 | 12.13 | 19 |
YOLOv5 | Darknet53 | 3 | 0.951 | 0.869 | 0.942 | 0.774 | 2.51 | 7.2 |
YOLOv6 | Darknet53 | 3 | 0.872 | 0.721 | 0.891 | 0.722 | 4.23 | 11.9 |
YOLOv8 | Darknet53 | 3 | 0.952 | 0.852 | 0.944 | 0.762 | 3 | 8.2 |
YOLOv9t | Darknet53 | 3 | 0.944 | 0.809 | 0.915 | 0.759 | 2 | 7.9 |
YOLOv10n | Darknet53 | 3 | 0.876 | 0.792 | 0.868 | 0.744 | 2.70 | 8.4 |
YOLOv11 | Darknet53 | 3 | 0.956 | 0.866 | 0.942 | 0.756 | 2.59 | 6.4 |
S2ADet | Darknet53 | 3 + 3 | 0.962 | 0.872 | 0.933 | 0.769 | 222.96 | 169.2 |
Ours | Darknet53 | 3 | 0.964 | 0.878 | 0.958 | 0.783 | 18 | 24.2 |
YOLOv8n | EPDR | SFEM | TL | mAP50 | mAP50:95 | Precision | Recall |
---|---|---|---|---|---|---|---|
√ | 0.561 | 0.296 | 0.741 | 0.48 | |||
√ | √ | 0.781 | 0.432 | 0.842 | 0.711 | ||
√ | √ | 0.792 | 0.512 | 0.851 | 0.749 | ||
√ | √ | √ | 0.808 | 0.442 | 0.865 | 0.722 | |
√ | √ | √ | √ | 0.845 | 0.534 | 0.877 | 0.768 |
Algorithm | Channel | Time (ms) | People | Bike | Car | mAP50 | mAP50:95 | Precision | Recall |
---|---|---|---|---|---|---|---|---|---|
Raw Data | 16 | 9.9 | 0.386 | 0.629 | 0.667 | 0.561 | 0.296 | 0.741 | 0.48 |
PCA | 3 | 1.4 | 0.628 | 0.652 | 0.874 | 0.718 | 0.414 | 0.807 | 0.62 |
PCA + EPF | 3 | 1.6 | 0.701 | 0.711 | 0.892 | 0.768 | 0.429 | 0.821 | 0.642 |
FNGBS | 3 | 3.5 | 0.638 | 0.704 | 0.869 | 0.737 | 0.421 | 0.775 | 0.688 |
EFDPC | 3 | 1.8 | 0.431 | 0.691 | 0.876 | 0.666 | 0.381 | 0.743 | 0.597 |
ASPS | 3 | 1.8 | 0.697 | 0.695 | 0.863 | 0.751 | 0.424 | 0.713 | 0.705 |
Ours | 3 | 1.7 | 0.717 | 0.724 | 0.902 | 0.781 | 0.432 | 0.842 | 0.711 |
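The Time (ms) column in the table above can be converted into throughput and into a speed-up relative to the 16-channel raw data. The sketch below assumes that the column reports processing time per image, which the table itself does not state; the helper names are illustrative.

```python
# Values copied from the "Time (ms)" column of the table above.
time_ms = {
    "Raw Data": 9.9,
    "PCA": 1.4,
    "PCA + EPF": 1.6,
    "FNGBS": 3.5,
    "EFDPC": 1.8,
    "ASPS": 1.8,
    "Ours": 1.7,
}


def throughput_per_second(ms):
    """Items processed per second, given the time per item in milliseconds."""
    return 1000.0 / ms


def speedup_over(baseline_ms, ms):
    """How many times faster a method is than the baseline."""
    return baseline_ms / ms


baseline = time_ms["Raw Data"]
for name, ms in time_ms.items():
    rate = throughput_per_second(ms)
    gain = speedup_over(baseline, ms)
    print(f"{name:10s} {rate:7.1f} per second   {gain:4.2f}x vs raw data")
```

Under this assumption, every 3-channel method is several times faster than processing the raw 16-channel data, and the differences among the 3-channel methods are comparatively small.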
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Li, W.; Yuan, F.; Zhang, H.; Lv, Z.; Wu, B. Hyperspectral Object Detection Based on Spatial–Spectral Fusion and Visual Mamba. Remote Sens. 2024, 16, 4482. https://doi.org/10.3390/rs16234482