Article

A Remote Sensing Image Fusion Method Combining Low-Level Visual Features and Parameter-Adaptive Dual-Channel Pulse-Coupled Neural Network

1 Key Laboratory of Mine Environmental Monitoring and Improving around Poyang Lake of Ministry of Natural Resources, East China University of Technology, Nanchang 330013, China
2 Faculty of Geomatics, East China University of Technology, Nanchang 330013, China
3 State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, Wuhan 430079, China
* Author to whom correspondence should be addressed.
Remote Sens. 2023, 15(2), 344; https://doi.org/10.3390/rs15020344
Submission received: 31 October 2022 / Revised: 23 December 2022 / Accepted: 3 January 2023 / Published: 6 January 2023

Abstract

Remote sensing image fusion can effectively resolve the inherent contradiction between the spatial resolution and spectral resolution of imaging systems. At present, fusion methods for remote sensing images based on multi-scale transforms usually set fusion rules according to local feature information and the pulse-coupled neural network (PCNN), but they suffer from several problems: a fusion rule built on a single local feature cannot effectively extract feature information, PCNN parameter setting is complex, and spatial correlation is poor. To this end, a fusion method for remote sensing images that combines low-level visual features and a parameter-adaptive dual-channel pulse-coupled neural network (PADCPCNN) in the non-subsampled shearlet transform (NSST) domain is proposed in this paper. In the low-frequency sub-band fusion process, a low-level visual feature fusion rule is constructed by combining three local features, namely local phase congruency, local abrupt measure, and local energy information, to enhance the ability to extract feature information. In the high-frequency sub-band fusion process, the structure and parameters of the dual-channel pulse-coupled neural network (DCPCNN) are optimized: (1) the multi-scale morphological gradient is used as an external stimulus to enhance the spatial correlation of the DCPCNN; and (2) the parameters are represented adaptively according to the differential box-counting, the Otsu threshold, and the image intensity, overcoming the complexity of parameter setting. Five sets of remote sensing image data from different satellite platforms and ground objects are selected for experiments. The proposed method is compared with 16 other methods and evaluated qualitatively and quantitatively. The experimental results show that, compared with the average value of the sub-optimal method over the five sets of data, the proposed method improves the seven evaluation indexes of information entropy, mutual information, average gradient, spatial frequency, spectral distortion, ERGAS, and visual information fidelity by 0.006, 0.009, 0.009, 0.035, 0.037, 0.042, and 0.020, respectively, indicating that the proposed method has the best fusion effect.

1. Introduction

Remote sensing image fusion is the basis of information extraction, target recognition, and ground object classification, which plays an important role in remote sensing applications [1]. Panchromatic (PAN) and multispectral (MS) image fusion can effectively solve the inherent contradiction between spatial resolution and spectral resolution of imaging system, so it has attracted extensive attention in the field of remote sensing image fusion. How to combine the advantages of the two types of data to the greatest extent to achieve redundancy control and complementary advantages of remote sensing data, thereby improving its interpretation ability and application value, is a hot topic in the field of remote sensing image processing [2].
The four main categories of panchromatic and multispectral image fusion methods are component replacement methods, multi-scale transform methods, model-based methods, and hybrid methods [3]. The classical methods of component replacement include intensity–hue–saturation (IHS) [4], principal component analysis (PCA) [5], and Gram–Schmidt (GS) [6], etc. These methods have high computational efficiency and retain spatial details well, but they introduce different degrees of spectral distortion [7]. Multi-scale transforms usually decompose the image into sub-bands of different scales and then apply fusion rules tailored to the characteristics of each sub-band, so as to accurately extract the features of each sub-band; this effectively reduces aliasing artifacts and offers better spectral-domain behavior [8]. Multi-scale transform methods include the discrete wavelet transform (DWT) [9], dual-tree complex wavelet transform (DTCWT) [10], curvelet transform [11], non-subsampled contourlet transform (NSCT) [12], and non-subsampled shearlet transform (NSST) [13]. In addition, the fusion performance of multi-scale transform methods largely depends on the setting of the different sub-band fusion rules [14]. These fusion rules can be roughly divided into pixel level, block level, and region level, and different activity measures are usually used to measure the active degree of pixels, blocks, and regions [15]. The pixel-level fusion rules mainly include taking the largest absolute value and weighted averaging, which are simple and efficient to compute on a single pixel but easily affected by noise. The block-level fusion rules take into account the correlation between pixels and measure image features in the form of blocks. The region-level fusion rule divides different sub-bands into several regions and selects the optimal region, as measured by the activity measure, to construct the fused sub-band. Therefore, activity measure construction, as the key to setting fusion rules, has attracted the attention of most researchers. Popular activity measures include spatial frequency (SF) [16], the sum-modified Laplacian (SML) [17], and the multi-scale morphological gradient (MSMG) [18].
The main model-based methods are variational optimization models and sparse representations. The variational model-based methods mainly include two steps: the construction of the energy function and the optimal solution of the model function. For example, Khademi et al. combined a Markov prior model and the Bayesian framework to enhance the fusion effect [19]. Wang et al. constructed a posterior model to improve the preservation of spatial and spectral information based on a spatial consistency prior, a spectral consistency prior, and similarity assumptions [20]. Sparse representation theory approximates the image by a linear combination of a small number of atoms, in which the acquisition of dictionaries is important for the performance of the algorithm. Liu et al. obtained a set of compact sub-dictionaries from a large number of high-quality training images and selected a sub-dictionary for sparse representation based on the gradient information of the image blocks [21]. Deng et al. proposed a fusion method based on tensor-based nonconvex sparse modeling [22]. Although some scholars have improved the sparse representation model, sparse encoding and dictionary construction are still challenging problems. In recent years, the pulse-coupled neural network (PCNN) model has gained wide attention as a third-generation artificial neural network with the advantages of pulse synchronization, global coupling, and strong adaptability. It can effectively extract useful information in complex environments without learning or training [23]. However, the traditional PCNN cannot process two images at the same time and has a complex structure and low efficiency. To make up for these shortcomings, the dual-channel pulse-coupled neural network (DCPCNN), which can simultaneously use the pixels of two images as external stimuli, was developed. Several improved DCPCNN models have also emerged. Yin et al. took edge energy as the external stimulus of the DCPCNN, stimulating the network according to the edge information of the image [24]. Liu et al. took the average gradient as the linking strength of the DCPCNN to enhance the feature extraction ability [25]. Cheng et al. proposed a triple-linking-strength DCPCNN, which took local structure information, directional gradient, and Laplacian energy as the linking strengths of the DCPCNN and consolidated the stability of the linking strength values [26].
Hybrid methods are usually combinations of the previous types of fusion methods and combine the advantages of each fusion method. Therefore, in this paper, the IHS transform from the component replacement methods, the NSST from the multi-scale transform methods, and the PCNN model are combined, so that the respective advantages of the IHS transform, NSST, and PCNN are brought together. Among them, although NSST can effectively suppress the loss of spectral information, the fusion rules of the different sub-bands need to be reasonably designed to obtain excellent fusion results. At present, the fusion rules of multi-scale transform methods are usually designed based on local feature information, such as local energy, local spatial frequency, and the local Laplacian. Such local feature information focuses on a single image feature and cannot extract the feature information of the image effectively. Meanwhile, pulse-coupled neural network models are widely used for the design of high-frequency sub-band fusion rules due to their pulse synchronization and global coupling properties, but the parameters of these models need to be set manually. Although some of the models adaptively modulate the linking strength, the other parameters are still chosen as empirical values. In addition, using only the image grayscale or simple measures as the external stimulus of the model results in poor spatial correlation of the fused images.
In order to solve the above problems, a method of remote sensing image fusion combining low-level visual features (LLVF) and parameter-adaptive dual-channel pulse-coupled neural network (PADCPCNN) in the NSST domain, namely the NSST-LLVF-PADCPCNN method, is proposed in this paper. The main contributions and advantages of this paper can be concluded as follows.
(1)
Low-frequency sub-band fusion based on LLVF. In traditional low-frequency sub-band fusion, the fusion rules are constructed based only on a single local feature, which cannot effectively extract the feature information of the image. To this end, following the principle that the human visual system understands an image through underlying visual features such as saliency, contrast, and brightness, a fusion rule that is more in line with the visual characteristics of the human eye is constructed by combining three local features: local phase congruency, local abrupt measure, and local energy information.
(2)
High-frequency sub-band fusion based on PADCPCNN. Because MSMG can integrate gradient information at multiple scales, it is used as the external stimulus of the DCPCNN, thereby enhancing its spatial correlation. The parameters of the DCPCNN are represented adaptively through the differential box-counting, the Otsu threshold, and the image intensity to overcome the complexity of parameter setting.
(3)
A remote sensing image fusion method combining LLVF and PADCPCNN. By combining the fusion strategies proposed above, a novel NSST domain fusion method is proposed, which more fully considers the energy preservation and detail extraction of remote sensing images. In order to verify the effectiveness of the method, five sets of remote sensing image data from different platforms and ground objects are selected to conduct comparative experiments between the proposed method and 16 other methods, and the experimental results are compared and analyzed from qualitative and quantitative aspects.
The paper is organized as follows: in Section 2, the fusion rules and steps of the proposed method are introduced in detail. The experimental design, including the experimental data, the compared methods, and the selection of evaluation indexes, is described in Section 3. In Section 4, the experimental results are evaluated and analyzed. Finally, conclusions are drawn in Section 5.

2. Methodology

The proposed method improves the fusion rules of low-frequency sub-bands and high-frequency sub-bands, mainly including low-level visual features and PADCPCNN. The framework of the proposed NSST-LLVF-PADCPCNN method is shown in Figure 1.

2.1. Low-Level Visual Features

The human visual system understands images mainly based on underlying visual features such as saliency, contrast, and brightness, and the local phase congruency, local abrupt measure, and local energy information are used to reflect the three features, respectively.
  • Phase congruency is a dimensionless measure, often used in image edge detection, which can better measure the saliency of image features. The phase congruency at position ( i , j ) can be expressed as
    PC(i,j) = \frac{\sum_k \sqrt{\left(\sum_n e_{n,\theta_k}(i,j)\right)^2 + \left(\sum_n o_{n,\theta_k}(i,j)\right)^2}}{\varepsilon + \sum_n \sum_k A_{n,\theta_k}(i,j)}
    where ε = 0.001 is a small positive constant, θ_k is the k-th orientation angle, and A_{n,θ_k} is the amplitude of the n-th Fourier component at orientation θ_k. e_{n,θ_k}(i,j) = I(i,j) * M_n^e and o_{n,θ_k}(i,j) = I(i,j) * M_n^o are the convolution results of the input image at (i,j), where I(i,j) represents the pixel value at (i,j), and M_n^e and M_n^o are the two-dimensional Log-Gabor even-symmetric and odd-symmetric filters at scale n, respectively.
  • Local abrupt measure mainly reflects the contrast information of the image, which can overcome the difficulty that phase congruency is insensitive to the image contrast information. The expression of the abrupt measure S C M is
    SCM(i,j) = \sum_{(i_0,j_0) \in \Omega_0} \left( I(i,j) - I(i_0,j_0) \right)^2
    where Ω_0 represents a local region of size 3 × 3, and (i_0, j_0) denotes a pixel location within Ω_0. In order to calculate the contrast at (i, j) over the neighborhood range (2M + 1) × (2N + 1), the local abrupt measure LSCM is proposed, and its expression is
    LSCM(i,j) = \sum_{a=-M}^{M} \sum_{b=-N}^{N} SCM(i+a, j+b)
  • Local energy information L E can reflect the intensity of image brightness variation, and its equation can be expressed as
    LE(i,j) = \sum_{a=-M}^{M} \sum_{b=-N}^{N} \left( I(i+a, j+b) \right)^2
By combining three local features, P C , L S C M , and L E , a new activity measure N A M , which can fully reflect the saliency of image features, is constructed [27]. Its expression is
NAM(i,j) = PC(i,j) \cdot \left( LSCM(i,j) \right)^2 \cdot \left( LE(i,j) \right)^2
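To make the activity measure concrete, the following NumPy sketch computes SCM, LSCM, LE, and NAM for a low-frequency sub-band. It is a minimal illustration under stated assumptions, not the authors' code: the phase congruency map pc is assumed to be supplied by an external Log-Gabor-based routine, and the window half-sizes M = N = 1 are illustrative choices.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def scm(img):
    """Abrupt measure SCM: squared difference between each pixel and its 3x3 neighbours
    (periodic boundary handling via np.roll is used here for simplicity)."""
    img = img.astype(float)
    out = np.zeros_like(img)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue
            shifted = np.roll(np.roll(img, di, axis=0), dj, axis=1)
            out += (img - shifted) ** 2
    return out

def local_sum(img, half_m=1, half_n=1):
    """Sum over a (2M+1) x (2N+1) neighbourhood, here with M = N = 1."""
    size = (2 * half_m + 1, 2 * half_n + 1)
    return uniform_filter(img, size=size, mode='reflect') * size[0] * size[1]

def nam(low_band, pc):
    """New activity measure NAM = PC * LSCM^2 * LE^2, as defined above."""
    lscm = local_sum(scm(low_band))                  # local abrupt measure
    le = local_sum(low_band.astype(float) ** 2)      # local energy information
    return pc * lscm ** 2 * le ** 2
```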

2.2. PADCPCNN

DCPCNN consists of three parts: a receptive field, a modulation field, and a pulse generator. The architecture of the model is shown in Figure 2, and its mathematical expression is
\begin{cases}
F_{ij}^A(n) = S_{ij}^A, \quad F_{ij}^B(n) = S_{ij}^B \\
L_{ij}(n) = V_L \sum_{kl} W_{ijkl} Y_{kl}(n-1) \\
U_{ij}^A(n) = F_{ij}^A(n) \left[ 1 + \beta^A L_{ij}(n) \right] \\
U_{ij}^B(n) = F_{ij}^B(n) \left[ 1 + \beta^B L_{ij}(n) \right] \\
U_{ij}(n) = e^{-\alpha_f} U_{ij}(n-1) + \max\left\{ U_{ij}^A(n), U_{ij}^B(n) \right\} \\
Y_{ij}(n) = \begin{cases} 1, & \text{if } U_{ij}(n) > E_{ij}(n-1) \\ 0, & \text{otherwise} \end{cases} \\
E_{ij}(n) = e^{-\alpha_e} E_{ij}(n-1) + V_E Y_{ij}(n)
\end{cases}
where F_{ij}^X(n), L_{ij}(n), U_{ij}^X(n), Y_{ij}(n), and E_{ij}(n) are the feeding input, linking input, internal activity, output, and dynamic threshold of the neuron at position (i, j) at the n-th iteration, respectively, with X ∈ {A, B}; S_{ij}^A and S_{ij}^B are the external inputs of image A and image B at position (i, j), that is, the gray values at the corresponding positions of the images; V_L and W_{ijkl} are the amplitude coefficient and synaptic weight matrix of the linking input, respectively, with W_{ijkl} = \begin{bmatrix} 0.5 & 1 & 0.5 \\ 1 & 0 & 1 \\ 0.5 & 1 & 0.5 \end{bmatrix}; α_f is the attenuation coefficient of the internal activity; β^A and β^B are the linking strengths corresponding to S_{ij}^A and S_{ij}^B, respectively; and α_e and V_E are the attenuation coefficient and amplitude coefficient of the dynamic threshold.
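The following NumPy sketch illustrates the iteration defined by the equations above. It is an illustrative reading of the model rather than the authors' implementation: the linking kernel is the W matrix given above, V_L is folded into the weight-linking coefficients gamma (as done in the parameter adaptation described below), and the parameter values, number of iterations, and initialization are placeholders.

```python
import numpy as np
from scipy.ndimage import convolve

# Synaptic weight matrix W given in the text
W = np.array([[0.5, 1.0, 0.5],
              [1.0, 0.0, 1.0],
              [0.5, 1.0, 0.5]])

def dcpcnn(stim_a, stim_b, gamma_a, gamma_b, alpha_f, alpha_e, v_e, n_iter=110):
    """Dual-channel PCNN iteration.

    gamma_a and gamma_b stand for beta^A * V_L and beta^B * V_L (the merged
    weight-linking coefficients), so V_L does not appear separately.
    Returns the two channel internal activities and the merged internal
    activity after the final iteration."""
    U = np.zeros_like(stim_a, dtype=float)   # merged internal activity
    Y = np.zeros_like(stim_a, dtype=float)   # pulse output
    E = np.ones_like(stim_a, dtype=float)    # dynamic threshold (illustrative init)
    U_a = np.zeros_like(stim_a, dtype=float)
    U_b = np.zeros_like(stim_a, dtype=float)
    for _ in range(n_iter):
        L = convolve(Y, W, mode='constant')               # linking input
        U_a = stim_a * (1.0 + gamma_a * L)                # channel A modulation
        U_b = stim_b * (1.0 + gamma_b * L)                # channel B modulation
        U = np.exp(-alpha_f) * U + np.maximum(U_a, U_b)   # merged internal activity
        Y = (U > E).astype(float)                         # pulse generator
        E = np.exp(-alpha_e) * E + v_e * Y                # threshold update
    return U_a, U_b, U
```

In the fusion rule of Section 2.3, the channel internal activities at the final iteration are compared to decide which high-frequency coefficient is kept.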
Due to the weak spatial correlation and complex parameter setting of the traditional DCPCNN model, the external stimulus and free parameters of the model are optimized, respectively. MSMG can achieve multi-scale expansion by changing the size of the structural elements of the morphological gradient operator and can integrate the gradient information extracted at different scales into an effective sharpness measure [28]. The MSMG of the high-frequency sub-bands is selected as the external stimulus signal of the model, which can better quantify image sharpness and enhance the spatial correlation of the image [28]. Its expression is
F_{ij}^A(n) = MSMG^A, \quad F_{ij}^B(n) = MSMG^B
where MSMG^A and MSMG^B are the multi-scale morphological gradients of input images A and B. Equation (5) shows that the DCPCNN mainly has six free parameters, namely V_L, β^A, β^B, α_f, α_e, and V_E. Because β^A, β^B, and V_L are all coefficients of \sum_{kl} W_{ijkl} Y_{kl}(n-1), they are integrated into the weight-linking coefficients γ^A = β^A V_L and γ^B = β^B V_L. Because differential box-counting can measure the contrast of a specific grid and reflect the intensity variation on that grid through the number of boxes, γ^A and γ^B are expressed adaptively according to the idea of differential box-counting and normalized by the Sigmoid function [29]. Its expression is
\gamma^X = \frac{1}{1 + e^{-\eta \left( g_{ij,\max} - g_{ij,\min} \right)_X}}
where g_{ij,max} and g_{ij,min} represent the maximum and minimum gray levels within the 3 × 3 window centered at pixel (i, j) of image X ∈ {A, B}, respectively, and η = 0.01 is a constant. The parameters α_f, V_E, and α_e are set according to the image intensity and the Otsu threshold, and the weights are allocated through the differential box-counting. The setting rule is
\begin{cases}
\alpha_f = \log\left( 1 / \left( w_1 \sigma_A(S) + w_2 \sigma_B(S) \right) \right) \\
V_E = e^{-\alpha_f} + \lambda \\
\alpha_e = \ln\left( \dfrac{V_E}{\left( w_1 S_{otsu}^A + w_2 S_{otsu}^B \right) \dfrac{1 - e^{-3\alpha_f}}{1 - e^{-\alpha_f}} + (\lambda - 1) e^{-\alpha_f}} \right)
\end{cases}
\lambda = w_1 S_{\max}^A / S_{otsu}^A + w_2 S_{\max}^B / S_{otsu}^B
w_1 = w_A / (w_A + w_B), \quad w_2 = w_B / (w_A + w_B)
where σ A ( S ) and σ B ( S ) represent the standard deviation of image A and image B, respectively; S o t s u A and S o t s u B represent the optimal histogram threshold of image A and image B determined by the Otsu method, respectively [30,31]; S max A and S max B represent the maximum intensity of image A and image B, respectively; and w A and w B represent the differential box-counting of image A and image B, respectively [32].
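A minimal sketch of the parameter adaptation is given below, assuming grayscale NumPy arrays. The Otsu threshold is taken from scikit-image, the differential box-counting weights w_A and w_B are passed in as precomputed scalars (their estimation follows [29,32] and is not reproduced here), and the grouping of terms in alpha_e follows the reconstruction of the formula above, so it should be checked against the original references.

```python
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter
from skimage.filters import threshold_otsu

def linking_coefficient(img, eta=0.01):
    """Gamma: Sigmoid of the local 3x3 gray-level range, following the
    differential box-counting idea described in the text."""
    img = img.astype(float)
    local_range = maximum_filter(img, size=3) - minimum_filter(img, size=3)
    return 1.0 / (1.0 + np.exp(-eta * local_range))

def adaptive_parameters(img_a, img_b, w_a, w_b):
    """alpha_f, V_E, alpha_e from image statistics, Otsu thresholds,
    and the box-counting weights w_a, w_b."""
    w1, w2 = w_a / (w_a + w_b), w_b / (w_a + w_b)
    sigma = w1 * img_a.std() + w2 * img_b.std()
    alpha_f = np.log(1.0 / sigma)
    otsu_a, otsu_b = threshold_otsu(img_a), threshold_otsu(img_b)
    otsu = w1 * otsu_a + w2 * otsu_b
    lam = w1 * img_a.max() / otsu_a + w2 * img_b.max() / otsu_b
    v_e = np.exp(-alpha_f) + lam
    denom = (otsu * (1 - np.exp(-3 * alpha_f)) / (1 - np.exp(-alpha_f))
             + (lam - 1) * np.exp(-alpha_f))
    alpha_e = np.log(v_e / denom)
    return alpha_f, v_e, alpha_e
```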

2.3. NSST-LLVF-PADCPCNN Method

The proposed method mainly consists of NSST decomposition, low-frequency sub-band fusion, high-frequency sub-band fusion, and NSST reconstruction. Before NSST decomposition, a luminance component Y and two chromaticity components U and V are obtained by the YUV space transform of the multispectral image I_MS. NSST is mainly composed of a non-subsampled pyramid filter bank (NSPFB) and a shearlet filter bank (SFB): the NSPFB realizes the multi-scale decomposition of the image, and the SFB realizes its multi-directional decomposition. The NSST transform is used to decompose the luminance component Y obtained by the YUV space transform and the panchromatic image I_Pan obtained by preprocessing, giving {H_Pan^{l,k}, L_Pan} = nsst_de(I_Pan) and {H_MS^{l,k}, L_MS} = nsst_de(Y), where l and k represent the decomposition level and direction, respectively, and nsst_de is the NSST decomposition function. The corresponding low-frequency components {L_Pan, L_MS} and high-frequency components {H_Pan^{l,k}, H_MS^{l,k}} are obtained, and the corresponding fusion rules are then designed according to the characteristics of the different sub-bands.
The low-frequency sub-band coefficient reflects the whole structure of the image and contains most of the energy of the image. Traditional low-frequency sub-bands fusion methods usually use local energy to measure activity levels of the image. However, an image contains rich visual information, and the local energy can only reflect the brightness information of the image, which lacks the representation of other low-level visual features. In order to overcome this defect, a fusion strategy based on low-level visual features is adopted in this paper. It can fully consider the saliency of image features, image contrast information, and image brightness information, which is more consistent with human visual features. The new activity measure N A M is used to fuse the low-frequency sub-band coefficients L P a n and L M S , and the fused low-frequency sub-band coefficients can be expressed as
L_F(i,j) = \begin{cases} L_{Pan}(i,j), & \text{if } S_i(i,j) > \dfrac{\bar{M} \times \bar{N}}{2} \\ L_{MS}(i,j), & \text{otherwise} \end{cases}
S_i(i,j) = \left| \left\{ (i_0, j_0) \in \Omega_1 \,\middle|\, NAM_i(i_0,j_0) > \max\left( NAM_1(i_0,j_0), \ldots, NAM_{i-1}(i_0,j_0), NAM_{i+1}(i_0,j_0), \ldots, NAM_K(i_0,j_0) \right) \right\} \right|
where |·| represents the cardinality of the set, Ω_1 represents the sliding window of size M̄ × N̄ centered at (i, j), and K is the number of low-frequency sub-band images.
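For the two-sub-band case used in this paper (K = 2, the panchromatic and multispectral luminance low-frequency sub-bands), the selection rule can be sketched as follows; the 3 × 3 sliding window is an illustrative choice, and nam_pan and nam_ms are NAM maps computed as in Section 2.1.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def fuse_low_frequency(low_pan, low_ms, nam_pan, nam_ms, win=3):
    """Keep the PAN coefficient where its NAM dominates in more than half of the window."""
    dominance = (nam_pan > nam_ms).astype(float)
    # S_i: number of pixels in the sliding window where NAM_pan exceeds NAM_ms
    count = uniform_filter(dominance, size=win, mode='reflect') * win * win
    use_pan = count > (win * win) / 2.0
    return np.where(use_pan, low_pan, low_ms)
```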
The high-frequency sub-band coefficients contain abundant texture details and edge information, which directly reflect the sharpness of the image. DCPCNN is a commonly used model for high-frequency sub-band fusion, but its performance is limited by weak spatial correlation and complex parameter settings. Therefore, the external stimulus and free parameters of the model are optimized, respectively, to enhance the fusion performance on the high-frequency sub-bands. The MSMG expresses the clarity of the image well due to its large dispersion, and it also provides a higher normalized value for the image, thus having a higher prediction rate than other activity measures [28]. Therefore, MSMG is chosen as the external stimulus of the DCPCNN to enhance spatial correlation. In addition, the parameters of the DCPCNN are set by combining the differential box-counting, the Otsu threshold, and the image intensity to achieve an adaptive representation. With these improvements, the proposed PADCPCNN is used to fuse the high-frequency sub-band coefficients H_Pan^{l,k} and H_MS^{l,k}. The activity degree of the high-frequency coefficients is evaluated by comparing the internal activities U_{Pan,ij}^{l,k}(N) and U_{MS,ij}^{l,k}(N) of the high-frequency sub-bands H_Pan^{l,k} and H_MS^{l,k}, where N is the total number of iterations. The fused high-frequency sub-band H_F^{l,k} is obtained according to the following equation:
H_F^{l,k}(i,j) = \begin{cases} H_{Pan}^{l,k}(i,j), & \text{if } U_{Pan,ij}^{l,k}(N) \geq U_{MS,ij}^{l,k}(N) \\ H_{MS}^{l,k}(i,j), & \text{otherwise} \end{cases}
Finally, the NSST inverse transform is performed on the fused high- and low-frequency coefficients to obtain the new luminance component Y′, namely Y′ = nsst_re{H_F^{l,k}, L_F}, where nsst_re is the NSST reconstruction function. The new fused image is then obtained by the YUV space inverse transform of the new luminance component Y′ and the chromaticity components U and V.

2.4. Steps

The procedure of the proposed method is shown in Figure 3, and the specific steps are as follows (a minimal end-to-end sketch in code is given after the list):
  • One luminance component Y and two chromaticity components U and V are obtained by YUV space transform of multispectral image I M S .
  • The luminance component Y obtained by the YUV space transform and the panchromatic image I_Pan obtained by preprocessing are multi-scale decomposed by the NSST transform, and the low-frequency sub-bands {L_Pan, L_MS} and high-frequency sub-bands {H_Pan^{l,k}, H_MS^{l,k}} are obtained, respectively, where l and k represent the decomposition level and direction, respectively.
  • The fusion strategy of low-level visual features is used, the low frequency sub-bands are fused according to Equations (6)–(8), and the fused low-frequency sub-band coefficient L F is obtained.
  • The PADCPCNN model is applied to the high-frequency sub-bands through Equations (9)–(14), and the fused high-frequency sub-band coefficient H_F^{l,k} is obtained.
  • The new luminance component Y is obtained by NSST inverse transform of the fused low-frequency and high-frequency fusion coefficients.
  • Then the new fused image is obtained by YUV space inverse transform of the new luminance component Y and chromaticity component U and V .
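The sketch below ties the steps above together. It is a schematic outline rather than the authors' implementation: nsst_decompose and nsst_reconstruct stand in for an NSST library (none is assumed here), phase_congruency, msmg, and box_counting denote hypothetical helpers for the measures discussed in Section 2, the remaining functions are the sketches given earlier in this section, and the BT.601 YUV matrices are an illustrative choice of YUV transform.

```python
import numpy as np

def rgb_to_yuv(rgb):
    """Illustrative BT.601 YUV transform; the paper only states that a YUV transform is used."""
    m = np.array([[ 0.299,  0.587,  0.114],
                  [-0.147, -0.289,  0.436],
                  [ 0.615, -0.515, -0.100]])
    return rgb @ m.T

def yuv_to_rgb(yuv):
    m = np.array([[1.0,  0.000,  1.140],
                  [1.0, -0.395, -0.581],
                  [1.0,  2.032,  0.000]])
    return yuv @ m.T

def fuse(pan, ms_rgb, nsst_decompose, nsst_reconstruct):
    """Schematic NSST-LLVF-PADCPCNN pipeline (helpers as described in the lead-in)."""
    yuv = rgb_to_yuv(ms_rgb)
    y, u, v = yuv[..., 0], yuv[..., 1], yuv[..., 2]

    high_pan, low_pan = nsst_decompose(pan)   # flattened list of directional sub-bands + low-pass
    high_ms, low_ms = nsst_decompose(y)

    # Low-frequency rule: low-level visual features (Section 2.1)
    low_fused = fuse_low_frequency(low_pan, low_ms,
                                   nam(low_pan, phase_congruency(low_pan)),
                                   nam(low_ms, phase_congruency(low_ms)))

    # High-frequency rule: PADCPCNN driven by MSMG (Section 2.2)
    high_fused = []
    for h_pan, h_ms in zip(high_pan, high_ms):
        a_f, v_e, a_e = adaptive_parameters(h_pan, h_ms,
                                            box_counting(h_pan), box_counting(h_ms))
        u_pan, u_ms, _ = dcpcnn(msmg(h_pan), msmg(h_ms),
                                linking_coefficient(h_pan), linking_coefficient(h_ms),
                                a_f, a_e, v_e)
        high_fused.append(np.where(u_pan >= u_ms, h_pan, h_ms))

    y_new = nsst_reconstruct(high_fused, low_fused)
    return yuv_to_rgb(np.stack([y_new, u, v], axis=-1))
```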

3. Experimental Design

3.1. Experimental Data

In order to verify the adaptability and effectiveness of the proposed method on remote sensing data from different satellite platforms, five sets of remote sensing images of different satellite platforms and ground objects are selected for the experiments. The first set of experimental data is QuickBird data with 0.61 m panchromatic resolution and 2.44 m multispectral resolution, as shown in Figure 4(a1,a2). The second set is SPOT-6 data with 1.5 m panchromatic resolution and 6 m multispectral resolution, as shown in Figure 4(b1,b2). The third set is WorldView-2 data with 0.5 m panchromatic resolution and 1.8 m multispectral resolution, as shown in Figure 4(c1,c2). The fourth set is WorldView-3 data with 0.31 m panchromatic resolution and 1.24 m multispectral resolution, as shown in Figure 4(d1,d2). The fifth set is Pleiades data with 0.5 m panchromatic resolution and 2 m multispectral resolution, as shown in Figure 4(e1,e2). The size of the multispectral images is 256 × 256, and the size of the panchromatic images is 1024 × 1024. Pre-processing operations such as noise reduction and registration are performed on the original images to suppress the effects of factors such as noise and spatial misalignment on image fusion [33,34].

3.2. Compared Methods

In order to better illustrate the experimental effect, the proposed method is compared with 16 other fusion methods, namely, Curvelet [35], dual-tree complex wavelet transform (DTCWT) [35], convolutional neural network (CNN) [36], contrast and structure extraction (CSE) [37], total variational model (TVM) [38], relative total variational decomposition (RTVD) [39], adaptive sparse representation (ASR) [22], convolutional sparse representation (CSR) [40], convolutional sparsity and morphological component analysis (CSMCA) [41], rolling guidance filter (RGF) [42], multi-level Gaussian curvature filtering (MLGCF) [43], visual saliency map and weighted least squares (VSM-WLS) [44], the NSST domain fusion method combining energy attribute (EA) and DCPCNN (EA-DCPCNN) [45], the NSST domain fusion method combining EA and PAPCNN (EA-PAPCNN) [46], the NSST domain fusion method combining weighted local energy (WLE) and PAPCNN (WLE-PAPCNN) [30], and the NSST domain fusion method combining LLVF and PAPCNN (LLVF-PAPCNN) [47]. These compared methods can be divided into five main categories, among which Curvelet and DTCWT are classical multi-scale transform methods; TVM and RTVD are variational model-based methods; ASR, CSR, and CSMCA are sparse representation-based methods; RGF, MLGCF, and VSM-WLS are edge-preserving filter-based methods; EA-DCPCNN, EA-PAPCNN, WLE-PAPCNN, and LLVF-PAPCNN are methods combining multi-scale transforms and PCNN models; and CNN and CSE are other methods. In order to ensure the rigor of the experiment, the same experimental environment is used for the proposed method and the compared methods. In addition, all parameters in the compared methods are set to the default values given by their authors. In this paper, the decomposition filter is maxflat, the decomposition level is 4, and the number of iterations of the PADCPCNN is set to 110 [15,30,46,47].

3.3. Evaluation Indexes

Qualitative evaluation is mainly based on the observation of the human visual system. According to the expert knowledge base, the visual effect, texture details, color information, spatial structure, and other aspects of the fused image are compared and analyzed, and a subjective evaluation of each set of fusion results is made. Quantitative evaluation is an objective evaluation of the experimental results through evaluation indexes. In the experiment, two information abundance evaluation indexes, Information Entropy (IE) and Mutual Information (MI), are selected. Average Gradient (AG), Spatial Frequency (SF), and Spatial Correlation Coefficient (SCC) are selected as spatial information evaluation indexes. Spectral Distortion (SD), Spectral Angle Mapper (SAM), and Erreur Relative Globale Adimensionnelle de Synthèse (ERGAS) are selected as spectral information evaluation indexes. Two overall quality evaluation indexes, Q4 [48] and Visual Information Fidelity for Fusion (VIFF) [49], are also used. Details of the indexes are given below, followed by a small computational sketch of several of them.
  • IE: It is an evaluation index to measure the amount of information contained in the fused image, and the greater the information entropy, the richer the information. The mathematical expression of IE is
    IE = -\sum_{i=0}^{L-1} Z_i \log(Z_i)
    where L is the gray level of the image and Z i is the statistical probability of the gray histogram.
  • MI: It measures the extent to which the fused image acquires information from the original image. A larger MI indicates that more information is retained from the original image. The mathematical expression of MI is
    MI = \sum_{i=1}^{M} \sum_{j=1}^{N} h_{I_A I_B}(i,j) \times \log_2 \left( \frac{h_{I_A I_B}(i,j)}{h_{I_A}(i,j)\, h_{I_B}(i,j)} \right)
    where I_A and I_B are the fused image and the reference image, respectively; M and N are the length and width of the image, respectively, so M × N denotes the resolution of the image. h_{I_A I_B} is the joint grayscale histogram of I_A and I_B, and h_{I_A} and h_{I_B} are the corresponding marginal histograms.
  • AG: It measures the clarity of the image, and the greater the AG, the higher the clarity and the better the quality of the fusion. The mathematical expression of AG is
    AG = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \sqrt{\frac{F_x^2(i,j) + F_y^2(i,j)}{2}}
    where F x 2 ( i , j ) and F y 2 ( i , j ) represent the first-order differences of the image F in the x- and y-directions, respectively.
  • SF: It reflects the grayscale rate of change of the image and reflects the active level of the image. A larger SF indicates clearer and higher fusion quality. The mathematical expression of SF is
    SF = \sqrt{RF^2 + CF^2}
    where RF and CF represent the row spatial frequency and column spatial frequency of image F, respectively. They can be expressed as
    RF = \sqrt{\frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( F(i,j) - F(i,j-1) \right)^2}
    CF = \sqrt{\frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( F(i,j) - F(i-1,j) \right)^2}
  • SCC: It reflects the spatial correlation between the two images, and the larger the correlation coefficient, the better the fusion effect. It can be expressed as
    SCC(I_A, I_B) = \frac{\sum_{i=1}^{M} \sum_{j=1}^{N} \left( I_A(i,j) - \bar{I}_A \right)\left( I_B(i,j) - \bar{I}_B \right)}{\sqrt{\sum_{i=1}^{M} \sum_{j=1}^{N} \left( I_A(i,j) - \bar{I}_A \right)^2 \times \sum_{i=1}^{M} \sum_{j=1}^{N} \left( I_B(i,j) - \bar{I}_B \right)^2}}
  • SD: It measures the spectral difference between the fused image and the reference image. The larger the SD, the more severe the spectral distortion. The mathematical expression of SD is
    SD = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left| I_A(i,j) - I_B(i,j) \right|
  • SAM: It measures the similarity between spectra by calculating the inclusion angle between two vectors. The smaller the inclusion angle is, the more similar the two spectra are. The mathematical expression of SAM is
    SAM(v, \hat{v}) = \arccos\left( \frac{\langle v, \hat{v} \rangle}{\| v \|_2 \cdot \| \hat{v} \|_2} \right)
    where v is the spectral pixel vector of the original image, and v ^ is the spectral pixel vector of the fused image.
  • ERGAS: It reflects the degree of spectral distortion between the fused image and the reference image. The larger the ERGAS, the more severe the spectral distortion. The mathematical expression of ERGAS is
    ERGAS = 100 \frac{h}{l} \sqrt{\frac{1}{K} \sum_{k=1}^{K} \left( \frac{RMSE(I_A, I_B)}{\mu_{I_A}} \right)^2}
    RMSE(I_A, I_B) = \sqrt{\frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( I_A(i,j) - I_B(i,j) \right)^2}
    where h/l is the ratio of the spatial resolution of the panchromatic image to that of the multispectral image, K denotes the number of bands, and μ_{I_A} denotes the mean value of the corresponding band of image I_A.
  • Q4: It is a global evaluation index based on the calculation of the hypercomplex correlation coefficient between the reference image and the fused image, which jointly measures spectral and spatial distortion. Its specific definition is detailed in reference [48].
  • VIFF: It is a recently proposed index that measures the fidelity of visual information between the fused image and each source image based on the Gaussian scale mixture model, the distortion model, and the HVS model. Its specific definition is detailed in reference [49].
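As a concrete reference for several of the scalar indexes, the following NumPy sketch computes IE, AG, SF, SD, and ERGAS for single-band float images; it follows the formulas above, with the base of the entropy logarithm, the 8-bit histogram range, and the per-band handling in ERGAS left as illustrative choices.

```python
import numpy as np

def information_entropy(img, levels=256):
    """IE over an assumed 8-bit gray-level range."""
    hist, _ = np.histogram(img, bins=levels, range=(0, levels))
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def average_gradient(img):
    gy, gx = np.gradient(img.astype(float))
    return np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2.0))

def spatial_frequency(img):
    img = img.astype(float)
    rf = np.sqrt(np.mean(np.diff(img, axis=1) ** 2))   # row frequency
    cf = np.sqrt(np.mean(np.diff(img, axis=0) ** 2))   # column frequency
    return np.sqrt(rf ** 2 + cf ** 2)

def spectral_distortion(fused, ref):
    return np.mean(np.abs(fused.astype(float) - ref.astype(float)))

def ergas(fused_bands, ref_bands, ratio):
    """ratio = h/l, the PAN-to-MS resolution ratio (e.g., 0.61/2.44 for QuickBird)."""
    terms = []
    for f, r in zip(fused_bands, ref_bands):
        rmse = np.sqrt(np.mean((f.astype(float) - r.astype(float)) ** 2))
        terms.append((rmse / r.mean()) ** 2)
    return 100.0 * ratio * np.sqrt(np.mean(terms))
```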

4. Experimental Results and Analysis

In order to comprehensively evaluate the experimental results, all the fusion results of the five sets of data are compared and analyzed from two aspects of subjective qualitative evaluation and objective quantitative evaluation, respectively.

4.1. Qualitative Evaluation

The fusion results of QuickBird images are shown in Figure 5, and the corresponding SAM error images are shown in Figure 6. Through comparative analysis, it can be seen that the spatial structure of TVM and RTVD based on variational models as well as ASR, CSR, and CSMCA based on sparse representation theory is relatively fuzzy, and there are obvious artifacts, among which TVM, RTVD, and ASR have the most serious artifacts. The fusion method of CNN and CSE has serious spectral distortion, which is quite different from the original multispectral image. The proposed method can clearly show the contours of buildings and the edge structure of roads and better retain the spatial and spectral information of the original image. Other fusion methods such as Curvelet, DTCWT, MLGCF, RGF, VSM-WLS, EA-DCPCNN, EA-PAPCNN, and WLE-PAPCNN are suboptimal.
The fusion results of SPOT-6 images are presented in Figure 7, and the corresponding SAM error images are shown in Figure 8. The comparative analysis shows that the spatial information of TVM and RTVD, based on variational models, as well as the ASR, CSR, and CSMCA fusion methods, based on sparse representation theory, is poor; in particular, the farmland boundaries and road edge structures are noticeably blurred. The fusion result of CNN is brighter than the original multispectral image, whereas the fusion result of RGF is lighter than the original multispectral image. The seven methods Curvelet, DTCWT, CSE, MLGCF, VSM-WLS, EA-DCPCNN, and EA-PAPCNN show no obvious spatial blur or spectral distortion. The fusion results of WLE-PAPCNN and the proposed method have high sharpness, retain the spectral information well, and give a good overall visual effect.
From the fusion results of WorldView-2 images shown in Figure 9 and the corresponding SAM error images in Figure 10, it can be seen that the sharpness of TVM and RTVD, based on variational models, as well as ASR and CSR, based on sparse representation theory, is obviously low, and the overall structure is fuzzy. The spectral fidelity of the CNN, CSE, and RGF fusion methods is poor, and the color of vegetation in their fusion results is obviously lighter than that of the original multispectral image. In the results of the seven fusion methods Curvelet, DTCWT, CSMCA, MLGCF, VSM-WLS, EA-DCPCNN, and EA-PAPCNN, the color of vegetation in the valley is slightly lighter than in the original multispectral image. WLE-PAPCNN and the proposed method can clearly show the texture information of ridgelines and mountains, and their overall effect is good.
Figure 11 presents the fusion results of WorldView-3 images, and Figure 12 shows the corresponding SAM error images. It shows that TVM, RTVD, ASR, CSR, and CSMCA fusion methods have poor retention ability of spatial information, and the water edge is blurred. The two fusion methods of CNN and CSE have different degrees of spectral distortion, and the color of vegetation and shallow water is brighter. The proposed method can clearly show the details of water shoreline and pond edge, and the overall color of water body and vegetation is not significantly different from the original multispectral image. The overall colors of Curvelet, DTCWT, RGF, MLGCF, VSM-WLS, EA-DCPCNN, and EA-PAPCNN fusion methods are slightly brighter, whereas the overall color of WLE-PAPCNN is slightly darker.
From the fusion results of the Pleiades image shown in Figure 13 and the corresponding SAM error images in Figure 14, it can be found that the spatial resolution of the TVM, RTVD, ASR, CSR, and CSMCA fusion methods is low; in particular, ASR cannot identify the contours of some small vehicles. The overall color of CSE is brighter than that of the original multispectral image, and the overall color of RGF is redder than that of the original multispectral image. The proposed method performs better on the texture details of roads, bridges, and vehicles, and the colors of vegetation, water, and bare land do not differ significantly from the original multispectral image. Curvelet, DTCWT, CNN, MLGCF, VSM-WLS, EA-DCPCNN, EA-PAPCNN, and WLE-PAPCNN retain the spatial and spectral information of the original image to a certain extent, and their overall visual effect is suboptimal.

4.2. Quantitative Evaluation

Table 1 lists the quantitative evaluation results of the proposed method and 16 compared methods corresponding to five different sets of image data. All values are the average values of R, G, and B bands of the fused image on the corresponding evaluation indexes. The symbol “↑” indicates that a larger value is better, the symbol “↓” indicates that a smaller value is better, bold indicates the optimal value, and underscore indicates the suboptimal value.
As can be seen from Table 1, the two classical multi-scale transform methods, Curvelet and DTCWT, perform similarly on all indexes, and their spectral distortion on the SPOT-6 data is more obvious, with the SD and ERGAS indexes exceeding 2 and 11, respectively. The CNN and CSE methods perform poorly on all indexes; although the CNN method achieves good results on the Pleiades data, it still performs poorly on the other four sets of data. The three sparse representation-based methods, ASR, CSR, and CSMCA, are significantly worse than the other methods on the three spatial information-based indexes AG, SF, and SCC, so their blurring is the most serious, and they are not prominent on the remaining indexes. Both TVM and RTVD, based on variational models, show extremely poor spatial information retention and spectral information fidelity, with severe spatial blurring and spectral distortion. RGF, MLGCF, and VSM-WLS, based on edge-preserving filtering, have no obvious advantages on any index. Among them, the stability of RGF is very poor: all its indexes are good on the WorldView-3 data but only mediocre on the SPOT-6 data. EA-DCPCNN, EA-PAPCNN, WLE-PAPCNN, and LLVF-PAPCNN are hybrid methods of NSST and improved PCNN, of which EA-DCPCNN is the worst of the four, and EA-PAPCNN does not perform well on various indexes. The WLE-PAPCNN and LLVF-PAPCNN methods are the two best among all the compared methods, showing obvious advantages especially on the three spectral information-based indexes SD, SAM, and ERGAS. However, the proposed method improves on all aspects: its performance on all indexes is almost always the best, and the sub-optimal values basically appear in the LLVF-PAPCNN method.
Comparing all fusion methods, the four methods CNN, CSE, TVM, and RTVD have poor spatial and spectral quality, with the two variational model-based methods, TVM and RTVD, being somewhat worse than the CNN and CSE methods. The three sparse representation-based methods, ASR, CSR, and CSMCA, have good spectral fidelity, but their spatial clarity is poor. The two classical multi-scale transform methods, Curvelet and DTCWT, represent spatial information well but spectral information less so. The three edge-preserving filtering-based methods, RGF, MLGCF, and VSM-WLS, perform unremarkably on all evaluation indexes, without significant advantages. The four hybrid methods based on NSST and improved PCNN, EA-DCPCNN, EA-PAPCNN, WLE-PAPCNN, and LLVF-PAPCNN, have some advantages over the other types of compared methods in terms of overall effectiveness and can better extract spatial and spectral information.
Compared with the average values of the LLVF-PAPCNN method over the five sets of data, the proposed method improves the 10 evaluation indexes IE, MI, AG, SF, SCC, SD, SAM, ERGAS, Q4, and VIFF by 0.006, 0.009, 0.009, 0.035, 0.016, 0.037, 0.062, 0.042, 0.030, and 0.020, respectively, showing that the proposed method has the best fusion effect. Among all the indexes, the proposed method performs well on IE and MI, the two evaluation indexes based on information content, indicating that it can remove redundant information and retain complementary information to a large extent. The results on AG and SF, the two evaluation indexes based on spatial information, show that the proposed method can effectively extract the detailed texture and edge information of the image and enhance image sharpness. At the same time, the proposed method better retains the spectral information of the original multispectral image and greatly reduces spectral distortion, which is confirmed by SD and ERGAS, the two evaluation indexes based on spectral information. In addition, the excellent performance of the proposed method on VIFF reveals that its design conforms to the essential features of the human visual system and can obtain better visual effects.

5. Conclusions

In this paper, a remote sensing image fusion method combining low-level visual features and PADCPCNN in the non-subsampled shearlet transform domain is proposed. Three local features are combined to overcome the singleness of the activity measure used in low-frequency sub-band fusion. In high-frequency sub-band fusion, the parameter settings and external inputs of the DCPCNN model are optimized, so that the spatial correlation is enhanced while the model selects its parameters adaptively. Five sets of remote sensing image data from different satellite platforms and ground objects are selected for the experiments, and the fusion results are evaluated qualitatively and quantitatively, showing the effectiveness and universality of the proposed method. The experimental results show that, compared with the other 16 methods, the proposed method is significantly better in all evaluation indexes, indicating that it improves the spatial resolution and spectral resolution of remote sensing images to a large extent.

Author Contributions

Conceptualization, K.L. and Z.H.; methodology, Z.H.; software, K.L.; validation, K.L. and X.G.; formal analysis, X.G.; investigation, Z.H.; resources, K.L. and X.G.; data curation, Z.H.; writing—original draft preparation, Z.H. and X.G.; writing—review and editing, K.L. and Y.W.; visualization, Y.W.; supervision, K.L. and X.G.; project administration, Y.W. and X.G.; funding acquisition, K.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by Open Fund of Key Laboratory of Mine Environmental Monitoring and Improving around Poyang Lake of Ministry of Natural Resources, grant number MEMI-2021-2022-01, and National Natural Science Foundation of China, grant number 42101457.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Acknowledgments

We would like to thank the editors and anonymous reviewers for their detailed review, valuable comments, and constructive suggestions.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ghassemian, H. A review of remote sensing image fusion methods. Inf. Fusion 2016, 32, 75–89.
  2. Arienzo, A.; Alparone, L.; Garzelli, A.; Lolli, S. Advantages of Nonlinear Intensity Components for Contrast-Based Multispectral Pansharpening. Remote Sens. 2022, 14, 3301.
  3. Kulkarni, S.C.; Rege, P.P. Pixel level fusion techniques for SAR and optical images: A review. Inf. Fusion 2020, 59, 13–29.
  4. Shao, Z.; Wu, W.; Guo, S. IHS-GTF: A fusion method for optical and synthetic aperture radar data. Remote Sens. 2020, 12, 2796.
  5. Batur, E.; Maktav, D. Assessment of surface water quality by using satellite images fusion based on PCA method in the Lake Gala, Turkey. IEEE Trans. Geosci. Remote Sens. 2018, 57, 2983–2989.
  6. Quan, Y.; Tong, Y.; Feng, W.; Dauphin, G.; Huang, W.; Xing, M. A novel image fusion method of multi-spectral and SAR images for land cover classification. Remote Sens. 2020, 12, 3801.
  7. Zhao, R.; Du, S. An Encoder–Decoder with a Residual Network for Fusing Hyperspectral and Panchromatic Remote Sensing Images. Remote Sens. 2022, 14, 1981.
  8. Wu, Y.; Feng, S.; Lin, C.; Zhou, H.; Huang, M. A Three Stages Detail Injection Network for Remote Sensing Images Pansharpening. Remote Sens. 2022, 14, 1077.
  9. Nair, R.R.; Singh, T. Multi-sensor medical image fusion using pyramid-based DWT: A multi-resolution approach. IET Image Process. 2019, 13, 1447–1459.
  10. Aishwarya, N.; Thangammal, C.B. Visible and infrared image fusion using DTCWT and adaptive combined clustered dictionary. Infrared Phys. Technol. 2018, 93, 300–309.
  11. Arif, M.; Wang, G. Fast curvelet transform through genetic algorithm for multimodal medical image fusion. Soft Comput. 2020, 24, 1815–1836.
  12. Wang, Z.; Li, X.; Duan, H.; Su, Y.; Zhang, X.; Guan, X. Medical image fusion based on convolutional neural networks and non-subsampled contourlet transform. Expert Syst. Appl. 2021, 171, 114574.
  13. Li, B.; Peng, H.; Luo, X.; Wang, J.; Song, X.; Pérez-Jiménez, M.J.; Riscos-Núñez, A. Medical image fusion method based on coupled neural P systems in nonsubsampled shearlet transform domain. Int. J. Neural Syst. 2021, 31, 2050050.
  14. Chen, J.; Li, X.; Luo, L.; Mei, X.; Ma, J. Infrared and visible image fusion based on target-enhanced multiscale transform decomposition. Inf. Sci. 2020, 508, 64–78.
  15. Panigrahy, C.; Seal, A.; Mahato, N.K. Fractal dimension based parameter adaptive dual channel PCNN for multi-focus image fusion. Opt. Lasers Eng. 2020, 133, 106141.
  16. Jin, X.; Jiang, Q.; Yao, S.; Zhou, D.; Nie, R.; Lee, S.J.; He, K. Infrared and visual image fusion method based on discrete cosine transform and local spatial frequency in discrete stationary wavelet transform domain. Infrared Phys. Technol. 2018, 88, 1–12.
  17. Xu, Y.; Sun, B.; Yan, X.; Hu, J.; Chen, M. Multi-focus image fusion using learning based matting with sum of the Gaussian-based modified Laplacian. Digit. Signal Process. 2020, 106, 102821.
  18. Zhang, Y.; Jin, M.; Huang, G. Medical image fusion based on improved multi-scale morphology gradient-weighted local energy and visual saliency map. Biomed. Signal Process. Control 2022, 74, 103535.
  19. Khademi, G.; Ghassemian, H. Incorporating an adaptive image prior model into Bayesian fusion of multispectral and panchromatic images. IEEE Geosci. Remote Sens. Lett. 2018, 15, 917–921.
  20. Wang, T.; Fang, F.; Li, F.; Zhang, G. High-quality Bayesian pansharpening. IEEE Trans. Image Process. 2018, 28, 227–239.
  21. Liu, Y.; Wang, Z. Simultaneous image fusion and denoising with adaptive sparse representation. IET Image Process. 2014, 9, 347–357.
  22. Deng, L.; Feng, M.; Tai, X. The fusion of panchromatic and multispectral remote sensing images via tensor-based sparse modeling and hyper-Laplacian prior. Inf. Fusion 2019, 52, 76–89.
  23. Huang, C.; Tian, G.; Lan, Y.; Peng, Y.; Ng, E.Y.K.; Hao, Y.; Che, W. A new pulse coupled neural network (PCNN) for brain medical image fusion empowered by shuffled frog leaping algorithm. Front. Neurosci. 2019, 13, 210.
  24. Yin, M.; Duan, P.; Liu, W.; Liang, X. A novel infrared and visible image fusion algorithm based on shift-invariant dual-tree complex shearlet transform and sparse representation. Neurocomputing 2017, 226, 182–191.
  25. Liu, Z.; Feng, Y.; Zhang, Y.; Li, X. A fusion algorithm for infrared and visible images based on RDU-PCNN and ICA-bases in NSST domain. Infrared Phys. Technol. 2016, 79, 183–190.
  26. Cheng, B.; Jin, L.; Li, G. Infrared and visual image fusion using LNSST and an adaptive dual-channel PCNN with triple-linking strength. Neurocomputing 2018, 310, 135–147.
  27. Li, H.; Qiu, H.; Yu, Z.; Zhang, Y. Infrared and visible image fusion scheme based on NSCT and low-level visual features. Infrared Phys. Technol. 2016, 76, 174–184.
  28. Zhang, Y.; Bai, X.; Wang, T. Boundary finding based multi-focus image fusion through multi-scale morphological focus-measure. Inf. Fusion 2017, 35, 81–101.
  29. Panigrahy, C.; Seal, A.; Mahato, N.K.; Bhattacharjee, D. Differential box counting methods for estimating fractal dimension of gray-scale images: A survey. Chaos Solitons Fractals 2019, 126, 178–202.
  30. Yin, M.; Liu, X.; Liu, Y.; Chen, X. Medical Image Fusion With Parameter-Adaptive Pulse Coupled Neural Network in Nonsubsampled Shearlet Transform Domain. IEEE Trans. Instrum. Meas. 2018, 68, 49–64.
  31. Chen, Y.; Park, S.K.; Ma, Y.; Ala, R. A New Automatic Parameter Setting Method of a Simplified PCNN for Image Segmentation. IEEE Trans. Neural Netw. 2011, 22, 880–892.
  32. Panigrahy, C.; Seal, A.; Mahato, N.K.; Krejcar, O.; Viedma, E.H. Multi-focus image fusion using fractal dimension. Appl. Opt. 2020, 59, 5642–5655.
  33. Singh, P.; Shankar, A.; Diwakar, M. Review on nontraditional perspectives of synthetic aperture radar image despeckling. J. Electron. Imaging 2022, 32, 021609.
  34. Singh, P.; Diwakar, M.; Shankar, A.; Shree, R.; Kumar, M. A Review on SAR Image and its Despeckling. Arch. Comput. Methods Eng. 2021, 28, 4633–4653.
  35. Liu, Y.; Liu, S.P.; Wang, Z.F. A general framework for image fusion based on multi-scale transform and sparse representation. Inf. Fusion 2015, 24, 147–164.
  36. Liu, Y.; Chen, X.; Peng, H.; Wang, Z. Multi-focus image fusion with a deep convolutional neural network. Inf. Fusion 2017, 36, 191–207.
  37. Sufyan, A.; Imran, M.; Shah, S.A.; Shahwani, H.; Wadood, A.A. A novel multimodality anatomical image fusion method based on contrast and structure extraction. Int. J. Imaging Syst. Technol. 2021, 32, 324–342.
  38. Du, Q.; Xu, H.; Ma, Y.; Huang, J.; Fan, F. Fusing infrared and visible images of different resolutions via total variation model. Sensors 2018, 18, 3827.
  39. Chen, J.; Li, X.; Wu, K. Infrared and visible image fusion based on relative total variation decomposition. Infrared Phys. Technol. 2022, 123, 104112.
  40. Liu, Y.; Chen, X.; Ward, R.K.; Wang, Z.J. Image Fusion With Convolutional Sparse Representation. IEEE Signal Process. Lett. 2016, 23, 1882–1886.
  41. Liu, Y.; Chen, X.; Ward, R.K.; Wang, Z.J. Medical Image Fusion via Convolutional Sparsity Based Morphological Component Analysis. IEEE Signal Process. Lett. 2019, 26, 485–489.
  42. Jian, L.; Yang, X.; Zhou, Z.; Zhou, K.; Liu, K. Multi-scale image fusion through rolling guidance filter. Future Gener. Comput. Syst. 2018, 83, 310–325.
  43. Tan, W.; Zhou, H.; Song, J.; Li, H.; Yu, Y.; Du, J. Infrared and visible image perceptive fusion through multi-level Gaussian curvature filtering image decomposition. Appl. Opt. 2019, 58, 3064–3073.
  44. Ma, J.; Zhou, Z.; Wang, B.; Zong, H. Infrared and visible image fusion based on visual saliency map and weighted least square optimization. Infrared Phys. Technol. 2017, 82, 8–17.
  45. Tan, W.; Tiwari, P.; Pandey, H.M.; Moreira, C.; Jaiswal, A.K. Multimodal medical image fusion algorithm in the era of big data. Neural Comput. Appl. 2020, 1–21.
  46. Cheng, F.; Fu, Z.; Huang, L.; Chen, P.; Huang, K. Non-subsampled shearlet transform remote sensing image fusion combined with parameter-adaptive PCNN. Acta Geod. Cartogr. Sin. 2021, 50, 1380–1389.
  47. Hou, Z.; Lv, K.; Gong, X.; Zhi, J.; Wang, N. Remote sensing image fusion based on low-level visual features and PAPCNN in NSST domain. Geomat. Inf. Sci. Wuhan Univ. 2022, accepted.
  48. Garzelli, A.; Nencini, F. Hypercomplex quality assessment of multi/hyperspectral images. IEEE Geosci. Remote Sens. Lett. 2009, 6, 662–665.
  49. Han, Y.; Cai, Y.; Cao, Y.; Xu, X. A new image fusion performance metric based on visual information fidelity. Inf. Fusion 2013, 14, 127–135.
Figure 1. Framework of the proposed NSST-LLVF-PADCPCNN method.
Figure 2. Architecture of the DCPCNN model.
Figure 3. Flowchart of the method design.
Figure 4. Five sets of multispectral and panchromatic images. (a1) PAN image of QuickBird; (a2) MS image of QuickBird; (b1) PAN image of SPOT-6; (b2) MS image of SPOT-6; (c1) PAN image of WorldView-2; (c2) MS image of WorldView-2; (d1) PAN image of WorldView-3; (d2) MS image of WorldView-3; (e1) PAN image of Pleiades; (e2) MS image of Pleiades.
Figure 5. Fusion effect of QuickBird image.
Figure 6. Error images corresponding to QuickBird images.
Figure 7. Fusion effect of SPOT-6 image.
Figure 8. Error images corresponding to SPOT-6 images.
Figure 9. Fusion effect of WorldView-2 image.
Figure 10. Error images corresponding to WorldView-2 images.
Figure 11. Fusion effect of WorldView-3 image.
Figure 12. Error images corresponding to WorldView-3 images.
Figure 13. Fusion effect of Pleiades image.
Figure 14. Error images corresponding to Pleiades images.
Table 1. Objective evaluation of 5 sets of image data.
QuickBird
Method         IE↑    MI↑     AG↑    SF↑    SCC↑   SD↓     SAM↓    ERGAS↓  Q4↑    VIFF↑
Curvelet       3.988  7.977   1.505  3.208  0.894  2.338   2.747   10.673  0.853  0.587
DTCWT          4.005  8.010   1.541  3.318  0.893  2.438   2.870   11.224  0.870  0.591
CNN            3.812  7.625   1.524  3.288  0.865  4.214   7.550   24.836  0.923  0.444
CSE            3.739  7.479   1.495  3.284  0.849  4.338   7.096   26.453  0.719  0.332
ASR            3.799  7.599   0.822  2.025  0.881  2.297   2.430   10.998  0.899  0.470
CSR            3.944  7.888   1.149  2.632  0.900  2.268   2.572   10.363  0.861  0.561
CSMCA          4.019  8.037   1.330  2.962  0.888  2.385   2.903   10.626  0.398  0.631
TVM            3.704  7.408   0.948  1.918  0.870  4.131   6.561   24.153  0.720  0.348
RTVD           3.109  6.217   0.734  1.560  0.858  5.120   4.057   38.677  0.883  0.182
RGF            4.051  8.102   1.558  3.318  0.862  2.049   1.841   8.213   0.749  0.559
MLGCF          4.017  8.026   1.521  3.228  0.895  2.284   2.734   10.099  0.881  0.631
VSM-WLS        4.042  8.085   1.583  3.401  0.896  2.439   2.909   10.687  0.801  0.655
EA-DCPCNN      3.977  7.954   1.451  3.122  0.892  2.517   2.691   11.469  0.864  0.620
EA-PAPCNN      4.007  8.015   1.524  3.322  0.888  2.560   2.940   12.127  0.884  0.520
WLE-PAPCNN     4.262  8.514   1.552  3.359  0.884  1.720   1.746   7.025   0.892  0.715
LLVF-PAPCNN    4.308  8.616   1.598  3.382  0.884  1.268   1.857   5.442   0.860  0.899
Proposed       4.319  8.639   1.633  3.487  0.899  1.242   1.637   5.396   0.938  0.948

SPOT-6
Method         IE↑    MI↑     AG↑    SF↑    SCC↑   SD↓     SAM↓    ERGAS↓  Q4↑    VIFF↑
Curvelet       3.620  7.200   0.763  1.562  0.964  2.001   2.890   11.129  0.861  0.550
DTCWT          3.622  7.213   0.787  1.666  0.965  2.036   3.089   11.415  0.869  0.564
CNN            3.273  6.546   0.776  1.631  0.929  3.526   5.881   16.812  0.837  0.337
CSE            3.601  5.931   0.677  1.447  0.957  1.991   3.133   10.730  0.734  0.641
ASR            3.600  7.202   0.350  0.925  0.952  1.920   2.799   10.650  0.851  0.461
CSR            3.533  7.066   0.475  1.194  0.963  1.937   2.452   10.985  0.816  0.501
CSMCA          3.584  7.168   0.590  1.398  0.963  1.970   2.720   11.010  0.804  0.609
TVM            3.038  6.076   0.436  0.999  0.937  3.227   4.933   29.552  0.561  0.260
RTVD           2.736  5.472   0.355  0.845  0.957  4.550   6.344   44.935  0.280  0.192
RGF            3.468  6.937   0.792  1.655  0.926  2.222   3.403   12.057  0.658  0.452
MLGCF          3.625  7.251   0.772  1.579  0.964  1.919   2.839   10.160  0.842  0.589
VSM-WLS        3.632  7.264   0.794  1.664  0.966  1.944   3.060   10.343  0.777  0.581
EA-DCPCNN      3.618  7.235   0.709  1.541  0.967  1.741   3.037   10.340  0.698  0.604
EA-PAPCNN      3.638  7.276   0.779  1.661  0.965  1.969   3.109   8.625   0.718  0.541
WLE-PAPCNN     4.002  8.004   0.799  1.695  0.966  0.848   2.108   3.883   0.815  0.842
LLVF-PAPCNN    4.021  8.042   0.809  1.712  0.967  0.630   1.965   3.441   0.814  0.958
Proposed       4.023  8.047   0.813  1.712  0.968  0.601   1.903   3.328   0.878  0.954

WorldView-2
Method         IE↑    MI↑     AG↑    SF↑    SCC↑   SD↓     SAM↓    ERGAS↓  Q4↑    VIFF↑
Curvelet       5.385  10.771  2.667  4.778  0.920  8.608   0.704   4.095   0.781  0.563
DTCWT          5.409  10.818  2.802  5.040  0.917  8.707   0.736   4.146   0.792  0.594
CNN            5.209  10.417  2.796  5.037  0.869  16.035  1.629   6.942   0.727  0.519
CSE            5.083  10.166  2.784  5.019  0.823  16.337  1.379   9.168   0.731  0.471
ASR            5.685  11.370  1.362  2.961  0.875  7.203   0.787   3.273   0.801  0.723
CSR            5.360  10.720  1.977  3.804  0.876  8.605   0.700   4.047   0.795  0.530
CSMCA          5.386  10.771  2.450  4.560  0.918  8.786   0.744   4.116   0.792  0.553
TVM            4.935  9.869   1.202  2.309  0.846  16.284  1.318   9.156   0.711  0.360
RTVD           4.234  8.469   1.208  2.167  0.911  38.444  0.615   41.154  0.413  0.120
RGF            5.116  10.234  2.799  5.041  0.827  8.729   0.868   4.081   0.754  0.474
MLGCF          4.413  10.820  2.627  4.709  0.921  8.433   0.690   3.975   0.817  0.580
VSM-WLS        5.439  10.878  2.742  4.914  0.919  8.735   0.751   4.109   0.804  0.629
EA-DCPCNN      5.456  10.912  2.619  4.754  0.920  10.205  0.478   4.995   0.783  0.635
EA-PAPCNN      5.435  10.873  2.809  5.057  0.916  8.629   0.743   3.590   0.807  0.635
WLE-PAPCNN     5.735  11.470  2.823  5.079  0.916  2.898   0.451   1.969   0.884  0.594
LLVF-PAPCNN    5.758  11.516  2.839  5.102  0.909  3.059   0.464   1.482   0.879  0.809
Proposed       5.760  11.521  2.839  5.103  0.923  3.020   0.458   1.441   0.893  0.833

WorldView-3
Method         IE↑    MI↑     AG↑    SF↑    SCC↑   SD↓     SAM↓    ERGAS↓  Q4↑    VIFF↑
Curvelet       3.870  7.740   0.777  1.631  0.979  2.407   9.160   14.849  0.641  0.535
DTCWT          3.874  7.748   0.788  1.661  0.978  2.124   9.257   14.891  0.632  0.546
CNN            3.277  6.555   0.651  1.534  0.956  4.683   22.300  49.153  0.357  0.383
CSE            3.290  6.793   0.632  1.455  0.959  4.632   21.682  49.419  0.341  0.205
ASR            3.564  7.129   0.418  1.103  0.970  3.531   13.961  26.933  0.501  0.370
CSR            3.774  7.548   0.531  1.331  0.978  2.340   7.571   14.489  0.695  0.534
CSMCA          3.790  7.579   0.609  1.525  0.975  2.333   7.758   13.868  0.756  0.683
TVM            3.229  6.459   0.449  1.180  0.963  4.640   23.004  49.765  0.409  0.283
RTVD           3.049  6.097   0.398  0.993  0.962  4.846   6.014   43.436  0.395  0.258
RGF            4.062  8.123   0.846  1.808  0.966  1.473   3.908   6.551   0.627  0.826
MLGCF          3.887  7.774   0.792  1.695  0.978  2.241   8.799   12.765  0.668  0.618
VSM-WLS        3.876  7.751   0.804  1.742  0.977  2.219   8.674   12.388  0.726  0.644
EA-DCPCNN      3.674  7.528   0.680  1.582  0.975  2.558   5.498   14.371  0.706  0.645
EA-PAPCNN      3.857  7.715   0.767  1.590  0.977  2.289   8.698   13.656  0.628  0.504
WLE-PAPCNN     4.092  8.185   0.805  1.659  0.972  0.807   2.735   4.078   0.691  0.488
LLVF-PAPCNN    4.094  8.188   0.850  1.815  0.972  0.614   2.495   3.570   0.758  0.929
Proposed       4.101  8.202   0.851  1.855  0.981  0.518   2.484   2.574   0.759  0.943

Pleiades
Method         IE↑    MI↑     AG↑    SF↑    SCC↑   SD↓     SAM↓    ERGAS↓  Q4↑    VIFF↑
Curvelet       4.089  8.177   1.276  2.582  0.822  2.484   2.030   8.004   0.579  0.453
DTCWT          4.116  8.231   1.330  2.720  0.824  2.571   2.121   8.206   0.827  0.458
CNN            4.522  9.043   1.350  2.732  0.756  1.932   2.806   6.370   0.830  0.657
CSE            3.586  7.172   1.265  2.608  0.736  4.707   3.529   16.998  0.561  0.148
ASR            3.929  7.858   0.679  1.773  0.784  2.341   1.954   8.183   0.807  0.446
CSR            4.037  8.073   0.911  2.120  0.820  2.416   1.953   7.875   0.826  0.454
CSMCA          4.191  8.383   1.094  2.343  0.812  2.384   2.069   7.477   0.752  0.611
TVM            3.478  6.956   0.704  1.424  0.746  4.532   3.349   16.335  0.713  0.163
RTVD           3.194  6.388   0.616  1.318  0.766  6.873   2.668   35.893  0.699  0.158
RGF            4.048  8.096   1.367  2.719  0.772  3.817   3.471   9.207   0.728  0.488
MLGCF          4.135  8.271   1.282  2.587  0.821  2.344   1.979   7.241   0.764  0.508
VSM-WLS        4.157  8.314   1.345  2.727  0.827  2.567   2.191   7.670   0.825  0.504
EA-DCPCNN      4.195  8.390   1.242  2.620  0.814  2.209   1.950   7.369   0.810  0.600
EA-PAPCNN      4.135  8.269   1.309  2.688  0.819  2.608   2.174   8.162   0.836  0.408
WLE-PAPCNN     4.408  8.818   1.349  2.753  0.785  2.008   2.321   5.020   0.840  0.641
LLVF-PAPCNN    4.523  9.047   1.427  2.872  0.791  1.404   1.941   3.907   0.864  0.915
Proposed       4.532  9.064   1.433  2.911  0.832  1.404   1.930   3.901   0.867  0.931