1. Introduction
A hyperspectral image is an image that covers a continuous spectral range with many narrow wavebands. As a data cube, it contains hundreds of contiguous spectral bands covering a wide spectral range. For any point in space, a hyperspectral image can characterize the corresponding material area through its continuous, fine spectral curve, recovering both spatial and material properties. It is often used to obtain the spectral characteristics of surface materials, enabling quantitative analysis and the identification of material components. With the rapid development of aerospace and remote sensing technology, hyperspectral images from remote sensing satellites are widely used in land detection [
1,
2], urban planning [
3], road network layout [
4], agricultural yield estimation [
5], disaster prevention and control [
6,
7], and other fields.
Due to the unique imaging characteristics of hyperspectral images, spatial resolution and spectral resolution are the two key criteria for measuring image quality in hyperspectral imaging systems. Spatial resolution refers to the smallest target size the sensor can resolve and determines how accurately spatial details are described. Spectral resolution refers to the ability to resolve fine detail along the spectral dimension of an image; it makes it possible to distinguish materials that appear identical to the human eye, which is why it is widely exploited in remote sensing. In hyperspectral imaging, each spectral image corresponds to a very narrow spectral window. Only by using a larger instantaneous field of view and a longer exposure time to collect enough photons can the signal-to-noise ratio of a spectral image be improved and a higher spatial resolution be obtained [
8]. However, spectral resolution is inversely proportional to the size of the instantaneous field of view, so a balance between spatial and spectral resolution must be struck during hyperspectral imaging. Because many spectral bands are stacked, hyperspectral imagers typically sacrifice spatial resolution to achieve higher spectral resolution [
9]. Relying on hardware improvements alone not only challenges current engineering technology but also runs counter to the lightweight, commercial design philosophy of remote sensing satellites. Under the current limitations of hyperspectral imaging technology, improving spatial resolution through software algorithms while maintaining high spectral resolution has therefore become an urgent problem.
Image super-resolution reconstruction infers a high-resolution image from one or multiple consecutive low-resolution images. It can break through the limitations of the imaging system and improve the spatial resolution of an image in the post-processing stage. While natural image super-resolution has matured rapidly, there is still much room for progress in hyperspectral image (HSI) super-resolution. We classify existing methods into three categories: single-frame HSI super-resolution; HSI super-resolution by fusion with an auxiliary image (such as a panchromatic, RGB, or multispectral image); and multi-frame fusion super-resolution within hyperspectral images.
Single-frame hyperspectral image super-resolution methods are derived from natural image super-resolution and mainly comprise interpolation-based, reconstruction-based, and learning-based methods. Interpolation-based methods include nearest-neighbor, bilinear, and bicubic interpolation; however, interpolation causes edge blur and artifacts and cannot exploit abstract image information. Akgun et al. proposed a novel hyperspectral image acquisition model and a convex-set projection algorithm to reconstruct high-resolution hyperspectral images [
10]. Huang et al. proposed a dictionary-based super-resolution method by combining low-rank and group sparsity properties [
11]. Wang et al. proposed a super-resolution method based on a non-local approximate tensor [
12]. However, these methods require solving complex and time-consuming optimization problems during the testing phase and also require prior knowledge of the image, which makes it difficult to flexibly apply them to hyperspectral images. With the rapid development of convolutional neural networks, deep learning has shown superior performance in computer vision tasks. Super-resolution algorithms such as SRCNN [
13], VDSR [
14], EDSR [
15], D-DBPN [
16], and SAN [
17] have been proposed successively. They have shown superior performance in natural image super-resolution. However, when it comes to hyperspectral image super-resolution, the above-mentioned algorithms cannot explore spectral and spatial information, and their network representation ability is weak. Moreover, these natural image super-resolution methods have large parameters and are difficult to apply to multi-band hyperspectral images. In addition, there are few hyperspectral datasets available, which makes it difficult to support the learning of these algorithms. Yuan et al. proposed a method to transfer the knowledge learned from natural images to reconstruct high-resolution hyperspectral images [
18]. However, this approach yields only limited improvement in spatial and spectral resolution.
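As a concrete point of reference for the interpolation baselines discussed above, nearest-neighbor upsampling of a hyperspectral cube takes only a few lines of NumPy; bilinear and bicubic interpolation follow the same band-wise pattern with smoother kernels. This is an illustrative sketch (the 31-band cube and the scale factor are arbitrary choices, not from the paper):

```python
import numpy as np

def nearest_upsample(hsi, scale):
    """Nearest-neighbor spatial upsampling of a hyperspectral cube.

    hsi: array of shape (bands, H, W). Each pixel is repeated `scale`
    times along both spatial axes; the spectral axis is left untouched,
    so the spectral curve at each location is preserved exactly.
    """
    return np.repeat(np.repeat(hsi, scale, axis=1), scale, axis=2)

# Toy example: a 31-band low-resolution cube upsampled by 4x.
lr = np.random.rand(31, 16, 16).astype(np.float32)
sr = nearest_upsample(lr, 4)  # shape (31, 64, 64)
```

The spatial blockiness of this baseline is exactly the edge blur and artifact problem the text describes; learning-based methods aim to recover the high-frequency detail that such kernels cannot.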
The auxiliary-image-fusion approach to hyperspectral image super-resolution combines low-resolution hyperspectral images with high-resolution RGB, multispectral, or panchromatic images. Starting from both spectral and spatial information, it aims to obtain hyperspectral images with both high spatial and high spectral resolution. These methods can be categorized into five types: pan-sharpening extension [
19,
20], Bayesian inference [
21,
22], matrix decomposition [
23,
24], tensor decomposition [
25,
26], and deep learning [
27,
28,
29]. These methods require high-resolution auxiliary images and suffer from image registration issues, which make them difficult to implement in practical applications.
In the absence of auxiliary images, the multi-frame fusion super-resolution method within a single hyperspectral image has received widespread attention. Jiang et al. [
30] proposed a spatial-spectral prior network that utilizes the correlation between spatial and spectral information in hyperspectral images through group convolutions with progressive upsampling and shared parameters, but the network has a huge number of parameters. Wang et al. [
31] proposed a sequential recursive feedback network that explores complementary and continuous information in hyperspectral images and preserves the spatial and spectral structures of spectral images. Hu et al. [
32] proposed a hyperspectral image super-resolution method based on a deep information distillation network and internal fusion. Due to the strong correlation between bands in hyperspectral images, inspired by video multi-frame super-resolution, 3D convolution has begun to enter the field of hyperspectral image super-resolution [
33]. Li et al. proposed a hybrid 2D/3D module for image reconstruction, which alleviates network redundancy to some extent while maintaining performance [
34]. To save parameters, Li et al. proposed a combined spectrum and feature context network [
35]. To address the excessive parameter count of 3D convolutions, Jia et al. proposed a diffused convolutional neural network for hyperspectral image super-resolution [
36], which achieves good results.
It is difficult to simultaneously improve the spatial resolution of hyperspectral images while preserving their spectral characteristics. Hu et al. proposed a spectral difference network that separates spatial and spectral information for learning, which improves spatial resolution while preserving spectral characteristics [
37]. However, the network structure is too redundant, and the improvement in spatial resolution is limited. Hu et al. [
38] integrated the spectral difference module with the super-resolution reconstruction module, reducing the number of network parameters and enhancing the network’s generalization ability. References [
39,
40] propose spectral-angle-based loss functions to preserve spectral features.
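The spectral angle referenced here measures, at each pixel, the angle between the reconstructed spectral vector and the reference spectral vector; identical spectra (up to a positive scale) give an angle of zero, which is why it is a natural loss for spectral fidelity. A minimal NumPy sketch of this general idea, not the exact formulation of [39,40]:

```python
import numpy as np

def spectral_angle_loss(pred, ref, eps=1e-8):
    """Mean spectral angle (in radians) between two hyperspectral cubes.

    pred, ref: arrays of shape (bands, H, W). At each pixel, the angle is
    the arccosine of the cosine similarity between the two spectral
    vectors; eps guards against division by zero for dark pixels.
    """
    dot = np.sum(pred * ref, axis=0)
    norms = np.linalg.norm(pred, axis=0) * np.linalg.norm(ref, axis=0)
    cos = np.clip(dot / (norms + eps), -1.0, 1.0)
    return float(np.mean(np.arccos(cos)))
```

Because the angle is invariant to per-pixel brightness scaling, minimizing it constrains the *shape* of each spectral curve, complementing pixel-wise losses that constrain intensity.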
In hyperspectral image super-resolution, traditional interpolation, Bayesian, matrix decomposition, and tensor decomposition methods require a significant amount of prior knowledge and are difficult to solve. Deep learning methods, relying on their excellent feature learning capabilities, construct models that recover high-resolution images. However, current deep learning methods for hyperspectral super-resolution still have limitations: inadequate exploration of spectral and spatial correlations, redundant network structures, excessive parameters, and an inability to capture global features in the spatial and spectral domains. This paper proposes a hybrid convolution and spectral symmetry preservation network for hyperspectral super-resolution reconstruction. It uses a spatial-spectral symmetric 3D convolution to extract features of the low-resolution band and its adjacent bands, thereby exploring the spatial and spectral correlation of hyperspectral images. A 2D convolution module composed of deformable convolutions and attention mechanisms is designed to extract features of the low-resolution band and learn as much spatial information as possible. Finally, through a fusion module and a Fourier-transform reconstruction module, the network efficiently learns global and local information to obtain high-resolution hyperspectral images with high spectral fidelity.
Traditional algorithms for hyperspectral image super-resolution require a significant amount of prior knowledge and face challenges in the actual solving process. In contrast, deep-learning-based methods can autonomously learn a large amount of feature information and construct high-resolution images. Therefore, this paper proposes a network framework based on deep learning. Compared to natural images, hyperspectral images have significantly more bands, and each band contains different spatial and spectral information; however, the spectral information of adjacent bands is highly similar. To exploit the spatial information of adjacent bands while preserving the spectral information of the current band, this paper adopts a design based on supplementing information from neighboring bands. The network consists of two parallel branches: a single-band feature extraction network and a multi-band feature extraction network, which process the target super-resolution band and its adjacent bands, respectively. The target low-resolution band is processed by the single-band feature extraction network, which consists of residual 2D convolutions, attention mechanisms, and deformable convolutions and focuses on the feature information of the target band; the residual 2D convolutions extract both shallow and deep information of the band. To capture more spatial and spectral information across channels, we employ residual 3D convolutional modules in the multi-band feature extraction network. Compared to 2D convolutions, 3D convolutions have an additional spectral dimension, making them widely used in hyperspectral image processing. However, this increased exploratory capability across dimensions also leads to an explosion of parameters.
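The parameter growth mentioned above is easy to quantify: for the same channel counts, a k×k 2D kernel stores k² weights per input/output channel pair, while a k×k×k 3D kernel stores k³. A back-of-the-envelope check (the 64-channel, 3×3 configuration is illustrative, not the paper's actual layer sizes):

```python
def conv2d_params(c_in, c_out, k, bias=True):
    # Each output channel holds c_in kernels of k*k weights (+1 bias).
    return c_out * (c_in * k * k + (1 if bias else 0))

def conv3d_params(c_in, c_out, k, bias=True):
    # A 3D kernel adds a third (spectral) extent of size k.
    return c_out * (c_in * k * k * k + (1 if bias else 0))

# 64 -> 64 channels: a 3x3x3 layer is roughly 3x larger than a 3x3 one,
# and the factor compounds across every layer of a deep network.
p2d = conv2d_params(64, 64, 3)  # 36,928
p3d = conv3d_params(64, 64, 3)  # 110,656
```

The factor-of-k blow-up per layer is why parameter-efficient variants of 3D convolution, such as the spectral-symmetric design below, are attractive for deep hyperspectral networks.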
To address this issue, this paper introduces a novel spectral-symmetric 3D convolution, which significantly reduces the number of network parameters. In hyperspectral images, spectrally distant bands may contain similar spectral information and mutually complementary spatial information. We therefore propose a context feature fusion module to integrate information from such distant, complementary bands. Furthermore, in the reconstruction module, conventional upsampling methods struggle to account for global information, focusing only on the current pixel and its surrounding features. To overcome this limitation, we introduce Fourier-transform upsampling for reconstructing high-resolution hyperspectral images; it takes global information into account and improves the overall quality of the reconstructed images.
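The global character of Fourier-domain upsampling can be illustrated by its classical form: zero-padding the centered spectrum of a band enlarges the image, and because the inverse FFT mixes every frequency coefficient into every output pixel, each output value depends on the whole input rather than on a local neighborhood. The sketch below shows this principle for a single 2D band; it illustrates the general mechanism, not the paper's exact reconstruction module:

```python
import numpy as np

def fourier_upsample(band, scale):
    """Upsample a 2D band by zero-padding its centered FFT spectrum.

    band: real-valued array of shape (h, w). The existing frequency
    content is kept and surrounded by zeros, so the output is a
    band-limited enlargement in which every pixel aggregates global
    information, unlike local interpolation kernels.
    """
    h, w = band.shape
    H, W = h * scale, w * scale
    spec = np.fft.fftshift(np.fft.fft2(band))      # DC moved to center
    pad = np.zeros((H, W), dtype=complex)
    top, left = (H - h) // 2, (W - w) // 2
    pad[top:top + h, left:left + w] = spec          # embed old spectrum
    # Rescale so mean intensity is preserved on the larger grid.
    return np.real(np.fft.ifft2(np.fft.ifftshift(pad))) * scale * scale
```

Learned variants replace or augment this fixed zero-padding with trainable frequency-domain operations, but the key property motivating the choice here, access to global context in a single step, is already visible in this classical form.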